From david.holmes at oracle.com Sun Jul 1 12:00:06 2018
From: david.holmes at oracle.com (David Holmes)
Date: Sun, 1 Jul 2018 22:00:06 +1000
Subject: Misbehaving exit status from Hotspot
In-Reply-To:
References: <297ee516-faa1-aaa6-f026-90287e360a99@oracle.com> <756b8d97-f11a-c1f2-3ef4-f76fd7253203@oracle.com> <12b0f7cf-3161-6ef2-bd23-aa386747da79@oracle.com> <2a183bbf-8be3-a036-f1c2-c3fd9c1669d3@redhat.com>
Message-ID: <9c072b52-0294-de03-81db-60e90bb6d1b3@oracle.com>

On 30/06/2018 2:41 PM, Charles Oliver Nutter wrote:
> On 06/29/2018 01:51 AM, David Holmes wrote:
>> "Such a handler should end by specifying the default action for the signal that happened and then reraising it; this will cause the program to terminate with that signal, as if it had not had a handler."
>
> Yes, this is really what set me down the path of wishing Hotspot would
> do the same thing. This and the fact that CRuby does it, and I can't
> fit into certain CRuby deployments because JRuby can't emulate the
> signal results.

I'd be sympathetic to providing some kind of hook, as John alludes, that
might allow you to provide this behaviour without disrupting anyone
else. At present you can't chain or install a handler for SIGTERM so a
new capability would need to be added. A simple VM flag does not suffice
because the VM does not perform the shutdown and so does not know when
it is safe to re-raise the SIGTERM.

> On Fri, Jun 29, 2018 at 4:50 AM, Florian Weimer wrote:
>> The advice seems appropriate to me for handlers that lead to termination, as generally intended for these signals. SIGQUIT doesn't do that for the JVM, so the advice doesn't apply. SIGTERM appears to do so. So why not preserve the information that the process was shut down by SIGTERM by reraising the signal? This might confer useful information to the caller.
>
> You've got my vote!
>
> I think it's worth enumerating the pros and cons, eh?
>
> Con:
>
> * waitpid-related macros would now show termination due to a signal
> rather than a normal exit.
>
> Pro:
>
> * waitpid-related macros would now show termination due to a signal
> rather than a normal exit.
> * They'd also show the actual signal value in termsig rather than the 128+N.
> * The actual command line exit code would remain unchanged.
>
> So...taking this another step...
>
> Currently, you can *only* rely on the command line exit code, because
> the waitpid macros just say it was a normal exit, and the rest of
> their values are nonsense.

No, the exitstatus is not nonsense and encodes the reason why the JVM
chose to terminate.

> So anyone writing process-management stuff
> for Hotspot subprocesses can only use the exit code (128+N) to detect
> that the exit was due to TERM.

And that is precisely what they should be using.

> And if we changed it? Well, the above would continue to work exactly
> as it does now, but folks expecting GNU-like TERM handling
> (propagation to default handler) would suddenly start to work with
> Hotspot.

If you change it then you potentially break everyone using waitpid who
will now see an abnormal termination and perform abnormal, rather than
normal, termination actions. They might be one and the same to you, but
that doesn't mean they are one and the same to everyone.

David
-----

> Obviously I'm in favor of this, so I'd like to understand what this
> change would break. It seems like a net positive.
>
> - Charlie
>

From david.holmes at oracle.com Sun Jul 1 12:16:14 2018
From: david.holmes at oracle.com (David Holmes)
Date: Sun, 1 Jul 2018 22:16:14 +1000
Subject: Misbehaving exit status from Hotspot
In-Reply-To: <2a183bbf-8be3-a036-f1c2-c3fd9c1669d3@redhat.com>
References: <297ee516-faa1-aaa6-f026-90287e360a99@oracle.com> <756b8d97-f11a-c1f2-3ef4-f76fd7253203@oracle.com> <12b0f7cf-3161-6ef2-bd23-aa386747da79@oracle.com> <2a183bbf-8be3-a036-f1c2-c3fd9c1669d3@redhat.com>
Message-ID: <8cbb5b46-9aaa-c710-74f9-83ce12314ef5@oracle.com>

On 29/06/2018 7:50 PM, Florian Weimer wrote:
> On 06/29/2018 01:51 AM, David Holmes wrote:
>> To be clear, I think the libc advice on this topic is just wrong-headed:
>>
>> https://www.gnu.org/software/libc/manual/html_node/Termination-Signals.html
>>
>>
>> "Such a handler should end by specifying the default action for the
>> signal that happened and then reraising it; this will cause the
>> program to terminate with that signal, as if it had not had a handler."
>>
>> It's not even self-consistent because it states this for the
>> "termination signals" but when caught some of these signals
>> intentionally do not trigger termination. So following that advice for
>> SIGQUIT would be completely wrong for the JVM!
>
> The advice seems appropriate to me for handlers that lead to
> termination, as generally intended for these signals.  SIGQUIT doesn't
> do that for the JVM, so the advice doesn't apply.

That's somewhat selective. The document claims these are all termination
signals and should be handled the same way.

> SIGTERM appears to do
> so.  So why not preserve the information that the process was shut
> down by SIGTERM by reraising the signal?  This might confer useful
> information to the caller.

The information is already preserved in the exit code of the JVM when it
calls exit(). There's no justification for changing 20+ years of
behaviour here - the JVM is doing nothing wrong.
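
For readers following the waitpid argument, here is a minimal C++ sketch (not taken from the thread) of what a parent process observes in each of the two styles being debated. It assumes a POSIX system. The names describe_status, exit_with_code, reraise and run_child are hypothetical and exist only for this illustration, and the child is a plain forked process standing in for a JVM.

```cpp
#include <cassert>
#include <csignal>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: turn a waitpid() status into a readable description.
std::string describe_status(int status) {
  if (WIFEXITED(status)) {
    return "exited:" + std::to_string(WEXITSTATUS(status));
  }
  if (WIFSIGNALED(status)) {
    return "signaled:" + std::to_string(WTERMSIG(status));
  }
  return "other";
}

// Style A (the behaviour the thread attributes to the JVM):
// handle the signal, then exit normally with code 128+N.
void exit_with_code(int sig) {
  _exit(128 + sig);  // _exit is async-signal-safe; exit() is not
}

// Style B (the glibc manual's advice): restore the default action and
// re-raise, so the process is reported as killed by the signal.
void reraise(int sig) {
  signal(sig, SIG_DFL);
  raise(sig);
}

// Fork a child that installs `handler` for SIGTERM and then sends itself
// SIGTERM; return what the parent sees via waitpid().
std::string run_child(void (*handler)(int)) {
  pid_t pid = fork();
  assert(pid >= 0);
  if (pid == 0) {
    signal(SIGTERM, handler);
    raise(SIGTERM);
    _exit(0);  // not expected to be reached with either handler
  }
  int status = 0;
  pid_t waited = waitpid(pid, &status, 0);
  assert(waited == pid);
  (void)waited;
  return describe_status(status);
}
```

With style A the parent sees a normal exit (WIFEXITED) whose exit status is 128+SIGTERM; with style B it sees WIFSIGNALED with WTERMSIG equal to SIGTERM. A caller that branches on WIFEXITED versus WIFSIGNALED would therefore behave differently if a program switched from one style to the other, which is the compatibility concern raised above.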
David
-----

> Thanks,
> Florian

From thomas.stuefe at gmail.com Sun Jul 1 12:18:12 2018
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Sun, 1 Jul 2018 14:18:12 +0200
Subject: RFR(S) 8205965: SIGSEGV on write to NativeCallStack::EMPTY_STACK
In-Reply-To:
References: <91a6eeb0-00d9-dea1-586b-b8ac515c2a15@redhat.com>
Message-ID:

Hi Zhengyu,

On Fri, Jun 29, 2018 at 10:17 PM, Zhengyu Gu wrote:
> Hi Thomas,
>
> On 06/29/2018 03:56 PM, Thomas Stüfe wrote:
>>
>> Hi Zhengyu,
>>
>> do I understand the problem right that the static initialization of
>> EMPTY_STACK can be preceded by a call to MemTracker::tracking_level()?
>> Otherwise I do not understand the placement new in
>> MemTracker::tracking_level().
>>
> Correct.
>
>> If yes, how? Do we really run that much complex code as part of C++
>> static initialization?
>
>
> Because there are other static objects that call os::malloc/new in their
> constructors, and they may be initialized prior to EMPTY_STACK.
>

this is terrible :)

We should probably fix that sometime, but in the meantime your change
makes sense to me. Reviewed.

Thanks, Thomas

> Thanks,
>
> -Zhengyu
>
>
>>
>> Thanks, Thomas
>>
>>
>> On Fri, Jun 29, 2018 at 3:04 PM, Zhengyu Gu wrote:
>>>
>>> Hi,
>>>
>>> clang-6.0 and above can deduce that NativeCallStack::EMPTY_STACK is all
>>> zeros, and since it is a static constant, it places the object in the
>>> read-only BSS data section.
>>>
>>> To work around the static initialization ordering issue, NMT has to ensure
>>> EMPTY_STACK is initialized before it turns itself on, which can happen in
>>> the middle of initialization of other static objects. In this case, it
>>> causes SIGSEGV while trying to write to the read-only memory.
>>>
>>> The solution is to make EMPTY_STACK private and non-constant, but hand
>>> out a constant version.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205965
>>> Webrev: http://cr.openjdk.java.net/~zgu/8205965/webrev.00/
>>>
>>> Test:
>>>
>>> hotspot_nmt on Linux 64 (fastdebug and release)
>>>
>>> Thanks,
>>>
>>> -Zhengyu

From david.holmes at oracle.com Sun Jul 1 21:48:10 2018
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 2 Jul 2018 07:48:10 +1000
Subject: RFR(M): 8203826: Chain class initialization exceptions into later NoClassDefFoundErrors
In-Reply-To:
References:
Message-ID: <06ed3db2-e98c-014b-564a-6080dec06837@oracle.com>

Hi Volker,

This doesn't really address any of the concerns I had with the original
proposal - it just moves the "field" from the Java side to the VM side.
There is still a massive amount of Java code execution in relation to
this - which itself may encounter secondary exceptions. It's very hard
to tell if you will leave things in a suitable state if such exceptions
arise.

My position remains that the primary place to deal with the
initialization error is when initialization occurs and the error
happens. Subsequent attempted uses of the erroneous class may benefit
from some additional information about the nature of the original
exceptions, but I don't think full stacktraces are necessary or
desirable (and I do believe they will confuse most users given the lack
of continuity in the stack frames and that they may have happened in a
different thread!).

That aside ...

There appears to be a race on constructing the Hashtable. At least it
was not obvious to me where a lock may be held during that process.

I can't determine that clearing backtrace in removeNativeBacktrace is
correct with respect to the overall protocol within Throwable for
dealing with backtrace and stackTrace. I have to wonder why nothing in
Throwable clears the backtrace today?

I'm not clear why you record the ExceptionInInitializerError wrapper
instead of the actual exception that occurred?
Throwable states:

+     * This method is currently only called from the VM for instances of
+     * ExceptionInInitializerError which are stored for later chaining into a
+     * NoClassDefFoundError in order to prevent keeping classes from the native
+     * backtrace alive.
+     */

but IIUC it will also be called for instances of Error that occurred
which do not get wrapped in EIIE.

Regards,
David
------

On 30/06/2018 12:53 AM, Volker Simonis wrote:
> Hi,
>
> can I please have a review for the following change which saves
> ExceptionInInitializerErrors thrown during class initialization and
> chains them as cause into potential NoClassDefFoundErrors for the same
> class. We have been using this feature for years in our commercial SAP
> JVM and it has proved extremely useful for detecting and fixing errors,
> especially in big deployments.
>
> This is a follow-up on a discussion previously started by Goetz [1].
> His first proposal (which is close to our current, internal
> implementation) inserted an additional field into java.lang.Class
> objects to save potential ExceptionInInitializerErrors. This was
> considered too much overhead in the initial discussion [1].
>
> http://cr.openjdk.java.net/~simonis/webrevs/2018/8203826.v2/
> https://bugs.openjdk.java.net/browse/JDK-8203826
>
> So in this change, I've completely re-implemented the feature by using
> a java.lang.Hashtable which is attached to the ClassLoaderData object.
> The Hashtable is lazily created when the first
> ExceptionInInitializerError is thrown and maps the Class which
> triggered the ExceptionInInitializerError during the execution of its
> static initializer to the corresponding ExceptionInInitializerError.
>
> If the same class is accessed once again, this will directly lead
> to a plain NoClassDefFoundError (as per the JVMS, 5.5 Initialization)
> because the static initializer won't be executed a second time. Until
> now, this NoClassDefFoundError wasn't linked in any way to the root
> cause of the problem (i.e.
> the first ExceptionInInitializerError
> together with the chained exception that happened during the execution
> of the static initializer). With this change, the NoClassDefFoundError
> will now chain the initial ExceptionInInitializerError as cause,
> making it much easier to detect the problem which led to the
> NoClassDefFoundError.
>
> Following is an example from the new JTreg tests which come with
> this change to demonstrate the feature. Until now, a typical stack
> trace from a NoClassDefFoundError looked as follows:
>
> java.lang.NoClassDefFoundError: Could not initialize class NoClassDefFound$ClassWithFailedInitializer
>     at java.base/java.lang.Class.forName0(Native Method)
>     at java.base/java.lang.Class.forName(Class.java:291)
>     at NoClassDefFound.main(NoClassDefFound.java:38)
>
> With this change, the same stack trace now looks as follows:
>
> java.lang.NoClassDefFoundError: Could not initialize class NoClassDefFound$ClassWithFailedInitializer
>     at java.base/java.lang.Class.forName0(Native Method)
>     at java.base/java.lang.Class.forName(Class.java:315)
>     at NoClassDefFound.main(NoClassDefFound.java:38)
> Caused by: java.lang.ExceptionInInitializerError
>     at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
>     at java.base/java.lang.Class.newInstance(Class.java:584)
>     at NoClassDefFound$ClassWithFailedInitializer.<clinit>(NoClassDefFound.java:20)
>     at java.base/java.lang.Class.forName0(Native Method)
>     at java.base/java.lang.Class.forName(Class.java:315)
>     at NoClassDefFound.main(NoClassDefFound.java:30)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 2 out of bounds for length 1
>     at NoClassDefFound$A.<clinit>(NoClassDefFound.java:9)
>     ... 9 more
>
> As you can see, the reason for the NoClassDefFoundError when accessing
> the class 'NoClassDefFound$ClassWithFailedInitializer' is actually not
> even in the class or its static initializer itself, but in the class
> 'NoClassDefFound$A' which is a base class of
> 'NoClassDefFound$ClassWithFailedInitializer'. This is not easily
> detectable from the old, plain NoClassDefFoundError.
>
> As I wrote, the only overhead we have with the new implementation is
> an additional OopHandle field per ClassLoaderData which I think is
> acceptable. The Hashtable object itself is only created lazily, after
> the first occurrence of an ExceptionInInitializerError in the
> corresponding class loader. The whole Hashtable creation and
> storing/querying of ExceptionInInitializerErrors in
> ClassLoaderData::record_init_exception()/ClassLoaderData::query_init_exception()
> is optional in the sense that any errors/exceptions occurring during
> the execution of these functions are ignored and cleared.
>
> Finally, we also take care to recursively convert all native
> backtraces in the stored ExceptionInInitializerErrors (and their
> suppressed and chained exceptions) into symbolic stack traces in order
> to avoid holding references to classes and prevent their unloading.
> This is implemented in the new private, static method
> java.lang.Throwable::removeNativeBacktrace() which is called for each
> ExceptionInInitializerError before it is stored in the Hashtable.
>
> Thank you and best regards,
> Volker
>
> [1] http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2018-June/028310.html
>

From david.holmes at oracle.com Sun Jul 1 23:57:57 2018
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 2 Jul 2018 09:57:57 +1000
Subject: RFR(S): 8206003: SafepointSynchronize with TLH: StoreStore barriers should be moved out of the loop
In-Reply-To: <21dc215e155e40af8e0f3f34603ff4e7@sap.com>
References: <46c20309-4c4e-33dc-b2be-d734b874885e@oracle.com> <3fd92d1e-5a52-6618-cf9c-53848f6b4fca@oracle.com> <21dc215e155e40af8e0f3f34603ff4e7@sap.com>
Message-ID:

Hi Martin,

On 29/06/2018 8:12 PM, Doerr, Martin wrote:
> Thank you for the reviews.
>
> I've created a new webrev with a "_release" version instead of "_no_release":
> http://cr.openjdk.java.net/~mdoerr/8206003_tlh_sync_membars/webrev.01/

That all seems fine.

> Due to this change, SafepointMechanism::initialize_header doesn't use a release barrier anymore which should be fine.

I agree. The JavaThread being constructed does not yet have a native
thread associated with it so there is no "acquire" for a "release" to
pair with in this case. Native thread creation/execution has its own
memory barriers.

> Pushed to jdk/submit11 and our internal testing.

I'll put this through our internal testing too.

Thanks,
David

>
> Best regards,
> Martin
>
>
> -----Original Message-----
> From: David Holmes [mailto:david.holmes at oracle.com]
> Sent: Friday, 29 June 2018 00:49
> To: Erik Österlund; Doerr, Martin; hotspot-runtime-dev at openjdk.java.net; Robbin Ehn; Andrew Haley (aph at redhat.com)
> Subject: Re: RFR(S): 8206003: SafepointSynchronize with TLH: StoreStore barriers should be moved out of the loop
>
> On 29/06/2018 1:28 AM, Erik Österlund wrote:
>> Hi Martin,
>>
>> This did catch my eye too. This looks good to me.
>> But could you consider
>> having _release in the name of the setter that uses release, and no
>> postfix for the one using a plain store, instead of giving that one a
>> _no_release postfix. I don't need another webrev.
>
> +1
>
> I'm assuming that nothing may be tripped up (i.e. an assertion somewhere) if
> the polling status of different threads can now be seen out-of-order.
>
> Thanks,
> David
>
>>
>> Thanks,
>> /Erik
>>
>> On 2018-06-28 16:52, Doerr, Martin wrote:
>>> Hi,
>>>
>>> I have recently come across a bad placement of memory barriers in
>>> SafepointSynchronize::begin() and end() which were changed for JEP
>>> 312: Thread-Local Handshakes. They iterate over all JavaThreads and
>>> call SafepointMechanism::arm_local_poll or disarm_local_poll.
>>> Unfortunately, the release barriers are inside the latter functions.
>>>
>>> Assume we have several thousand JavaThreads. This means the code executes
>>> several thousand release barriers on weak memory model platforms (PPC64
>>> and ARM/aarch64). Only one is needed.
>>>
>>> A goal of JEP 312 was to minimize latency of safepoints, which is
>>> defeated by this issue to some extent on these platforms.
>>>
>>> It could be fixed by this proposal:
>>> http://cr.openjdk.java.net/~mdoerr/8206003_tlh_sync_membars/webrev.00/
>>>
>>> Please review.
>>>
>>> Best regards,
>>> Martin
>>>
>>

From martin.doerr at sap.com Mon Jul 2 06:14:53 2018
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 2 Jul 2018 06:14:53 +0000
Subject: RFR(S): 8206003: SafepointSynchronize with TLH: StoreStore barriers should be moved out of the loop
In-Reply-To:
References: <46c20309-4c4e-33dc-b2be-d734b874885e@oracle.com> <3fd92d1e-5a52-6618-cf9c-53848f6b4fca@oracle.com> <21dc215e155e40af8e0f3f34603ff4e7@sap.com>
Message-ID:

Hi David,

thank you for your support. Please let me know if your testing has passed successfully. I didn't get a reply from submit11, but our internal builds and some tests were run.
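
The barrier placement discussed in this thread can be sketched in portable C++ roughly as follows. This is an illustration only and not the HotSpot code or the webrev: PollWord, arm_all_release_each, arm_all_single_fence and count_armed are hypothetical names, and std::atomic operations stand in for HotSpot's own barrier primitives. A release fence is also somewhat stronger than the StoreStore barrier named in the subject line.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one thread's local poll word.
struct PollWord {
  std::atomic<int> armed{0};
};

// Before: every store carries its own release ordering, so a weak memory
// model platform may execute one barrier per thread.
void arm_all_release_each(std::vector<PollWord>& polls) {
  for (PollWord& p : polls) {
    p.armed.store(1, std::memory_order_release);
  }
}

// After: a single release fence ahead of the loop, then plain (relaxed)
// stores. A reader that acquire-loads any one of these poll words and
// sees 1 still synchronizes with the fence.
void arm_all_single_fence(std::vector<PollWord>& polls) {
  std::atomic_thread_fence(std::memory_order_release);
  for (PollWord& p : polls) {
    p.armed.store(1, std::memory_order_relaxed);
  }
}

// Count armed poll words, using acquire loads as a polling thread would.
std::size_t count_armed(const std::vector<PollWord>& polls) {
  std::size_t n = 0;
  for (const PollWord& p : polls) {
    if (p.armed.load(std::memory_order_acquire) == 1) {
      ++n;
    }
  }
  return n;
}
```

Both variants leave every poll word armed; the difference is only in how many barriers are issued, one per element versus one in total. What the second variant gives up is any ordering between the individual stores, which matches the remark above that the polling status of different threads may now be seen out of order.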

Best regards,
Martin


-----Original Message-----
From: David Holmes [mailto:david.holmes at oracle.com]
Sent: Monday, 2 July 2018 01:58
To: Doerr, Martin; Erik Österlund; hotspot-runtime-dev at openjdk.java.net; Robbin Ehn; Andrew Haley (aph at redhat.com)
Subject: Re: RFR(S): 8206003: SafepointSynchronize with TLH: StoreStore barriers should be moved out of the loop

Hi Martin,

On 29/06/2018 8:12 PM, Doerr, Martin wrote:
> Thank you for the reviews.
>
> I've created a new webrev with a "_release" version instead of "_no_release":
> http://cr.openjdk.java.net/~mdoerr/8206003_tlh_sync_membars/webrev.01/

That all seems fine.

> Due to this change, SafepointMechanism::initialize_header doesn't use a release barrier anymore which should be fine.

I agree. The JavaThread being constructed does not yet have a native
thread associated with it so there is no "acquire" for a "release" to
pair with in this case. Native thread creation/execution has its own
memory barriers.

> Pushed to jdk/submit11 and our internal testing.

I'll put this through our internal testing too.

Thanks,
David

>
> Best regards,
> Martin
>
>
> -----Original Message-----
> From: David Holmes [mailto:david.holmes at oracle.com]
> Sent: Friday, 29 June 2018 00:49
> To: Erik Österlund; Doerr, Martin; hotspot-runtime-dev at openjdk.java.net; Robbin Ehn; Andrew Haley (aph at redhat.com)
> Subject: Re: RFR(S): 8206003: SafepointSynchronize with TLH: StoreStore barriers should be moved out of the loop
>
> On 29/06/2018 1:28 AM, Erik Österlund wrote:
>> Hi Martin,
>>
>> This did catch my eye too. This looks good to me. But could you consider
>> having _release in the name of the setter that uses release, and no
>> postfix for the one using a plain store, instead of giving that one a
>> _no_release postfix. I don't need another webrev.
>
> +1
>
> I'm assuming that nothing may be tripped up (i.e. an assertion somewhere) if
> the polling status of different threads can now be seen out-of-order.
>
> Thanks,
> David
>
>>
>> Thanks,
>> /Erik
>>
>> On 2018-06-28 16:52, Doerr, Martin wrote:
>>> Hi,
>>>
>>> I have recently come across a bad placement of memory barriers in
>>> SafepointSynchronize::begin() and end() which were changed for JEP
>>> 312: Thread-Local Handshakes. They iterate over all JavaThreads and
>>> call SafepointMechanism::arm_local_poll or disarm_local_poll.
>>> Unfortunately, the release barriers are inside the latter functions.
>>>
>>> Assume we have several thousand JavaThreads. This means the code executes
>>> several thousand release barriers on weak memory model platforms (PPC64
>>> and ARM/aarch64). Only one is needed.
>>>
>>> A goal of JEP 312 was to minimize latency of safepoints, which is
>>> defeated by this issue to some extent on these platforms.
>>>
>>> It could be fixed by this proposal:
>>> http://cr.openjdk.java.net/~mdoerr/8206003_tlh_sync_membars/webrev.00/
>>>
>>> Please review.
>>>
>>> Best regards,
>>> Martin
>>>
>>

From david.holmes at oracle.com Mon Jul 2 07:26:12 2018
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 2 Jul 2018 17:26:12 +1000
Subject: RFR(S): 8206003: SafepointSynchronize with TLH: StoreStore barriers should be moved out of the loop
In-Reply-To:
References: <46c20309-4c4e-33dc-b2be-d734b874885e@oracle.com> <3fd92d1e-5a52-6618-cf9c-53848f6b4fca@oracle.com> <21dc215e155e40af8e0f3f34603ff4e7@sap.com>
Message-ID: <18d8cac9-6092-4fd2-7e36-513b361e5f42@oracle.com>

Hi Martin,

On 2/07/2018 4:14 PM, Doerr, Martin wrote:
> Hi David,
>
> thank you for your support. Please let me know if your testing has passed successfully. I didn't get a reply from submit11, but our internal builds and some tests were run.

Yes all good - only one unrelated failure.

Cheers,
David

> Best regards,
> Martin
>
>
> -----Original Message-----
> From: David Holmes [mailto:david.holmes at oracle.com]
> Sent: Monday, 2 July 2018 01:58
> To: Doerr, Martin; Erik Österlund; hotspot-runtime-dev at openjdk.java.net; Robbin Ehn; Andrew Haley (aph at redhat.com)
> Subject: Re: RFR(S): 8206003: SafepointSynchronize with TLH: StoreStore barriers should be moved out of the loop
>
> Hi Martin,
>
> On 29/06/2018 8:12 PM, Doerr, Martin wrote:
>> Thank you for the reviews.
>>
>> I've created a new webrev with a "_release" version instead of "_no_release":
>> http://cr.openjdk.java.net/~mdoerr/8206003_tlh_sync_membars/webrev.01/
>
> That all seems fine.
>
>> Due to this change, SafepointMechanism::initialize_header doesn't use a release barrier anymore which should be fine.
>
> I agree. The JavaThread being constructed does not yet have a native
> thread associated with it so there is no "acquire" for a "release" to
> pair with in this case. Native thread creation/execution has its own
> memory barriers.
>
>> Pushed to jdk/submit11 and our internal testing.
>
> I'll put this through our internal testing too.
>
> Thanks,
> David
>
>>
>> Best regards,
>> Martin
>>
>>
>> -----Original Message-----
>> From: David Holmes [mailto:david.holmes at oracle.com]
>> Sent: Friday, 29 June 2018 00:49
>> To: Erik Österlund; Doerr, Martin; hotspot-runtime-dev at openjdk.java.net; Robbin Ehn; Andrew Haley (aph at redhat.com)
>> Subject: Re: RFR(S): 8206003: SafepointSynchronize with TLH: StoreStore barriers should be moved out of the loop
>>
>> On 29/06/2018 1:28 AM, Erik Österlund wrote:
>>> Hi Martin,
>>>
>>> This did catch my eye too. This looks good to me. But could you consider
>>> having _release in the name of the setter that uses release, and no
>>> postfix for the one using a plain store, instead of giving that one a
>>> _no_release postfix. I don't need another webrev.
>>
>> +1
>>
>> I'm assuming that nothing may be tripped up (i.e. an assertion somewhere) if
>> the polling status of different threads can now be seen out-of-order.
>>
>> Thanks,
>> David
>>
>>>
>>> Thanks,
>>> /Erik
>>>
>>> On 2018-06-28 16:52, Doerr, Martin wrote:
>>>> Hi,
>>>>
>>>> I have recently come across a bad placement of memory barriers in
>>>> SafepointSynchronize::begin() and end() which were changed for JEP
>>>> 312: Thread-Local Handshakes. They iterate over all JavaThreads and
>>>> call SafepointMechanism::arm_local_poll or disarm_local_poll.
>>>> Unfortunately, the release barriers are inside the latter functions.
>>>>
>>>> Assume we have several thousand JavaThreads. This means the code executes
>>>> several thousand release barriers on weak memory model platforms (PPC64
>>>> and ARM/aarch64). Only one is needed.
>>>>
>>>> A goal of JEP 312 was to minimize latency of safepoints, which is
>>>> defeated by this issue to some extent on these platforms.
>>>>
>>>> It could be fixed by this proposal:
>>>> http://cr.openjdk.java.net/~mdoerr/8206003_tlh_sync_membars/webrev.00/
>>>>
>>>> Please review.
>>>>
>>>> Best regards,
>>>> Martin
>>>>
>>>

From zgu at redhat.com Mon Jul 2 12:00:30 2018
From: zgu at redhat.com (Zhengyu Gu)
Date: Mon, 2 Jul 2018 08:00:30 -0400
Subject: RFR(S) 8205965: SIGSEGV on write to NativeCallStack::EMPTY_STACK
In-Reply-To:
References: <91a6eeb0-00d9-dea1-586b-b8ac515c2a15@redhat.com>
Message-ID:

Hi Dan,

Thanks for the reminder. jdk-submit came back clean.

-Zhengyu

On 06/29/2018 09:07 AM, Daniel D. Daugherty wrote:
> Please don't forget to do a jdk-submit run.
>
> Dan
>
>
> On 6/29/18 9:04 AM, Zhengyu Gu wrote:
>> Hi,
>>
>> clang-6.0 and above can deduce that NativeCallStack::EMPTY_STACK is
>> all zeros, and since it is a static constant, it places the object in
>> the read-only BSS data section.
>>
>> To work around the static initialization ordering issue, NMT has to ensure
>> EMPTY_STACK is initialized before it turns itself on, which can happen in
>> the middle of initialization of other static objects. In this case, it
>> causes SIGSEGV while trying to write to the read-only memory.
>>
>> The solution is to make EMPTY_STACK private and non-constant, but
>> hand out a constant version.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205965
>> Webrev: http://cr.openjdk.java.net/~zgu/8205965/webrev.00/
>>
>> Test:
>>
>>   hotspot_nmt on Linux 64 (fastdebug and release)
>>
>> Thanks,
>>
>> -Zhengyu
>>
>

From zgu at redhat.com Mon Jul 2 12:16:51 2018
From: zgu at redhat.com (Zhengyu Gu)
Date: Mon, 2 Jul 2018 08:16:51 -0400
Subject: RFR(S) 8205965: SIGSEGV on write to NativeCallStack::EMPTY_STACK
In-Reply-To:
References: <91a6eeb0-00d9-dea1-586b-b8ac515c2a15@redhat.com>
Message-ID: <25d78407-1da8-8c1b-a431-b8692ffd28e1@redhat.com>

Thanks, Thomas!

-Zhengyu

On 07/01/2018 08:18 AM, Thomas Stüfe wrote:
> Hi Zhengyu,
>
> On Fri, Jun 29, 2018 at 10:17 PM, Zhengyu Gu wrote:
>> Hi Thomas,
>>
>> On 06/29/2018 03:56 PM, Thomas Stüfe wrote:
>>>
>>> Hi Zhengyu,
>>>
>>> do I understand the problem right that the static initialization of
>>> EMPTY_STACK can be preceded by a call to MemTracker::tracking_level()?
>>> Otherwise I do not understand the placement new in
>>> MemTracker::tracking_level().
>>>
>> Correct.
>>
>>> If yes, how? Do we really run that much complex code as part of C++
>>> static initialization?
>>
>>
>> Because there are other static objects that call os::malloc/new in their
>> constructors, and they may be initialized prior to EMPTY_STACK.
>>
>
> this is terrible :)
>
> We should probably fix that sometime, but in the meantime your change
> makes sense to me. Reviewed.
>
> Thanks, Thomas
>
>
>> Thanks,
>>
>> -Zhengyu
>>
>>
>>>
>>> Thanks, Thomas
>>>
>>>
>>> On Fri, Jun 29, 2018 at 3:04 PM, Zhengyu Gu wrote:
>>>>
>>>> Hi,
>>>>
>>>> clang-6.0 and above can deduce that NativeCallStack::EMPTY_STACK is all
>>>> zeros, and since it is a static constant, it places the object in the
>>>> read-only BSS data section.
>>>>
>>>> To work around the static initialization ordering issue, NMT has to ensure
>>>> EMPTY_STACK is initialized before it turns itself on, which can happen in
>>>> the middle of initialization of other static objects. In this case, it
>>>> causes SIGSEGV while trying to write to the read-only memory.
>>>>
>>>> The solution is to make EMPTY_STACK private and non-constant, but hand
>>>> out a constant version.
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205965
>>>> Webrev: http://cr.openjdk.java.net/~zgu/8205965/webrev.00/
>>>>
>>>> Test:
>>>>
>>>> hotspot_nmt on Linux 64 (fastdebug and release)
>>>>
>>>> Thanks,
>>>>
>>>> -Zhengyu

From martinrb at google.com Mon Jul 2 19:26:39 2018
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 2 Jul 2018 12:26:39 -0700
Subject: RFR(S) 8205965: SIGSEGV on write to NativeCallStack::EMPTY_STACK
In-Reply-To: <91a6eeb0-00d9-dea1-586b-b8ac515c2a15@redhat.com>
References: <91a6eeb0-00d9-dea1-586b-b8ac515c2a15@redhat.com>
Message-ID:

Zhengyu,

Thanks for fixing this!

I would have tried to use *Construct On First Use Idiom*
https://isocpp.org/wiki/faq/ctors#static-init-order-on-first-use
but nothing here is easy, unlike in Java with its static finals.
(and I'm not (yet) a hotspot engineer)

On Fri, Jun 29, 2018 at 6:04 AM, Zhengyu Gu wrote:

> Hi,
>
> clang-6.0 and above can deduce that NativeCallStack::EMPTY_STACK is all
> zeros, and since it is a static constant, it places the object in the
> read-only BSS data section.
>
> To work around the static initialization ordering issue, NMT has to ensure
> EMPTY_STACK is initialized before it turns itself on, which can happen in the
> middle of initialization of other static objects. In this case, it causes
> SIGSEGV while trying to write to the read-only memory.
>
> The solution is to make EMPTY_STACK private and non-constant, but hand
> out a constant version.
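
A minimal sketch of the approach described above (private, non-const storage with a const accessor), using a simplified stand-in class. This is not the actual webrev: CallStackSketch and ensure_empty_stack_initialized are hypothetical names chosen for illustration, and only the accessor name empty_stack() also appears in a patch posted later in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical, simplified stand-in for a call-stack value type.
class CallStackSketch {
 public:
  static constexpr std::size_t kDepth = 4;

  CallStackSketch() {
    for (std::size_t i = 0; i < kDepth; ++i) {
      _frames[i] = nullptr;
    }
  }

  bool is_empty() const { return _frames[0] == nullptr; }

  // Callers only ever get a const view of the shared empty instance.
  static const CallStackSketch& empty_stack() { return _empty_stack; }

  // May be called before the static initializer of _empty_stack has run.
  // Because the storage is not declared const, the toolchain has no reason
  // to place it in read-only memory, so this write is permitted. The object
  // may end up constructed twice, with the same result both times.
  static void ensure_empty_stack_initialized() {
    ::new (static_cast<void*>(&_empty_stack)) CallStackSketch();
  }

 private:
  void* _frames[kDepth];
  static CallStackSketch _empty_stack;  // deliberately non-const
};

CallStackSketch CallStackSketch::_empty_stack;
```

The design point is that constness is enforced at the interface (the accessor returns a const reference) rather than on the storage itself, so early initialization code can still write to the object without allocating anything.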
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8205965
> Webrev: http://cr.openjdk.java.net/~zgu/8205965/webrev.00/
>
> Test:
>
> hotspot_nmt on Linux 64 (fastdebug and release)
>
> Thanks,
>
> -Zhengyu
>

From martinrb at google.com Mon Jul 2 19:35:36 2018
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 2 Jul 2018 12:35:36 -0700
Subject: RFR(S) 8205965: SIGSEGV on write to NativeCallStack::EMPTY_STACK
In-Reply-To:
References: <91a6eeb0-00d9-dea1-586b-b8ac515c2a15@redhat.com>
Message-ID:

Can we have this fix make it into jdk11?

From zgu at redhat.com Mon Jul 2 19:41:59 2018
From: zgu at redhat.com (Zhengyu Gu)
Date: Mon, 2 Jul 2018 15:41:59 -0400
Subject: RFR(S) 8205965: SIGSEGV on write to NativeCallStack::EMPTY_STACK
In-Reply-To:
References: <91a6eeb0-00d9-dea1-586b-b8ac515c2a15@redhat.com>
Message-ID: <1f05f4d7-0ffc-5aa0-1a0c-235b5e289eef@redhat.com>

Hi Martin,

On 07/02/2018 03:26 PM, Martin Buchholz wrote:
> Zhengyu,
>
> Thanks for fixing this!
>
> I would have tried to use /Construct On First Use Idiom/
> https://isocpp.org/wiki/faq/ctors#static-init-order-on-first-use
> but nothing here is easy, unlike in Java with its static finals.
> (and I'm not (yet) a hotspot engineer)

Thanks for the link. However, this pattern does not apply here, since we
*cannot* allocate any objects at this point. Hotspot disables the global
new operator, and allocating any CHeapObj here will loop back to
initializing NMT (when it is on).

We have to work around this problem in several places; see the comments
in services/mallocSiteTable.hpp for an example.

Thanks,

-Zhengyu

>
>
> On Fri, Jun 29, 2018 at 6:04 AM, Zhengyu Gu wrote:
>
> Hi,
>
> clang-6.0 and above can deduce that NativeCallStack::EMPTY_STACK is
> all zeros, and since it is a static constant, it places the object
> in the read-only BSS data section.
>
> To work around the static initialization ordering issue, NMT has to
> ensure EMPTY_STACK is initialized before it turns itself on, which can
> happen in the middle of initialization of other static objects. In
> this case, it causes SIGSEGV while trying to write to the read-only memory.
>
> The solution is to make EMPTY_STACK private and non-constant, but
> hand out a constant version.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8205965
>
> Webrev: http://cr.openjdk.java.net/~zgu/8205965/webrev.00/
>
>
> Test:
>
>   hotspot_nmt on Linux 64 (fastdebug and release)
>
> Thanks,
>
> -Zhengyu
>
>

From zgu at redhat.com Mon Jul 2 19:46:13 2018
From: zgu at redhat.com (Zhengyu Gu)
Date: Mon, 2 Jul 2018 15:46:13 -0400
Subject: RFR(S) 8205965: SIGSEGV on write to NativeCallStack::EMPTY_STACK
In-Reply-To:
References: <91a6eeb0-00d9-dea1-586b-b8ac515c2a15@redhat.com>
Message-ID: <600c0bea-3d4d-1d1b-f88d-63150295b790@redhat.com>

Hi,

Could I get a second review?

Thanks,

-Zhengyu

On 07/01/2018 08:18 AM, Thomas Stüfe wrote:
> Hi Zhengyu,
>
> On Fri, Jun 29, 2018 at 10:17 PM, Zhengyu Gu wrote:
>> Hi Thomas,
>>
>> On 06/29/2018 03:56 PM, Thomas Stüfe wrote:
>>>
>>> Hi Zhengyu,
>>>
>>> do I understand the problem right that the static initialization of
>>> EMPTY_STACK can be preceded by a call to MemTracker::tracking_level()?
>>> Otherwise I do not understand the placement new in
>>> MemTracker::tracking_level().
>>>
>> Correct.
>>
>>> If yes, how? Do we really run that much complex code as part of C++
>>> static initialization?
>>
>>
>> Because there are other static objects that call os::malloc/new in their
>> constructors, and they may be initialized prior to EMPTY_STACK.
>>
>
> this is terrible :)
>
> We should probably fix that sometime, but in the meantime your change
> makes sense to me. Reviewed.
> > Thanks, Thomas > > >> Thanks, >> >> -Zhengyu >> >> >> >> >> >>> >>> Thanks, Thomas >>> >>> >>> On Fri, Jun 29, 2018 at 3:04 PM, Zhengyu Gu wrote: >>>> >>>> Hi, >>>> >>>> clang-6.0 and above, can deduce that NativeCallStack::EMPTY_STACK is all >>>> zeros, and since it is a static constant, it places the object in the >>>> read-only BSS data section. >>>> >>>> To workaround static initialization ordering issue, NMT has to ensure >>>> EMPTY_STACK is initialized before turns itself on, which can happen in >>>> the >>>> middle of initialization of other static objects. In this case, it causes >>>> SIGSEGV while try to write to the read-only memory. >>>> >>>> The solution is to make EMPTY_STACk private and non-constant, but hands >>>> out >>>> constant version. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205965 >>>> Webrev: http://cr.openjdk.java.net/~zgu/8205965/webrev.00/ >>>> >>>> Test: >>>> >>>> hotspot_nmt on Linux 64 (fastdebug and release) >>>> >>>> Thanks, >>>> >>>> -Zhengyu From martinrb at google.com Mon Jul 2 20:12:56 2018 From: martinrb at google.com (Martin Buchholz) Date: Mon, 2 Jul 2018 13:12:56 -0700 Subject: RFR(S) 8205965: SIGSEGV on write to NativeCallStack::EMPTY_STACK In-Reply-To: <1f05f4d7-0ffc-5aa0-1a0c-235b5e289eef@redhat.com> References: <91a6eeb0-00d9-dea1-586b-b8ac515c2a15@redhat.com> <1f05f4d7-0ffc-5aa0-1a0c-235b5e289eef@redhat.com> Message-ID: Below is a possible untested attempt to /Construct On First Use Idiom/ that doesn't (appear to) allocate, and gets rid of the double construction. 
diff --git a/src/hotspot/share/services/mallocSiteTable.hpp b/src/hotspot/share/services/mallocSiteTable.hpp
--- a/src/hotspot/share/services/mallocSiteTable.hpp
+++ b/src/hotspot/share/services/mallocSiteTable.hpp
@@ -42,7 +42,7 @@ class MallocSite : public AllocationSite
  public:
   MallocSite() :
-    AllocationSite(NativeCallStack::EMPTY_STACK), _flags(mtNone) {}
+    AllocationSite(NativeCallStack::empty_stack()), _flags(mtNone) {}
   MallocSite(const NativeCallStack& stack, MEMFLAGS flags) :
     AllocationSite(stack), _flags(flags) {}
diff --git a/src/hotspot/share/services/memTracker.cpp b/src/hotspot/share/services/memTracker.cpp
--- a/src/hotspot/share/services/memTracker.cpp
+++ b/src/hotspot/share/services/memTracker.cpp
@@ -68,10 +68,6 @@ NMT_TrackingLevel MemTracker::init_track
     os::unsetenv(buf);
   }
-  // Construct NativeCallStack::EMPTY_STACK. It may get constructed twice,
-  // but it is benign, the results are the same.
-  ::new ((void*)&NativeCallStack::EMPTY_STACK) NativeCallStack(0, false);
-
   if (!MallocTracker::initialize(level) ||
       !VirtualMemoryTracker::initialize(level)) {
     level = NMT_off;
diff --git a/src/hotspot/share/services/memTracker.hpp b/src/hotspot/share/services/memTracker.hpp
--- a/src/hotspot/share/services/memTracker.hpp
+++ b/src/hotspot/share/services/memTracker.hpp
@@ -31,8 +31,8 @@
 #if !INCLUDE_NMT
-#define CURRENT_PC NativeCallStack::EMPTY_STACK
-#define CALLER_PC NativeCallStack::EMPTY_STACK
+#define CURRENT_PC NativeCallStack::empty_stack()
+#define CALLER_PC NativeCallStack::empty_stack()
 class Tracker : public StackObj {
  public:
@@ -86,9 +86,9 @@ class MemTracker : AllStatic {
 extern volatile bool NMT_stack_walkable;
 #define CURRENT_PC ((MemTracker::tracking_level() == NMT_detail && NMT_stack_walkable) ? \
-                    NativeCallStack(0, true) : NativeCallStack::EMPTY_STACK)
+                    NativeCallStack(0, true) : NativeCallStack::empty_stack())
 #define CALLER_PC ((MemTracker::tracking_level() == NMT_detail && NMT_stack_walkable) ? \
-                    NativeCallStack(1, true) : NativeCallStack::EMPTY_STACK)
+                    NativeCallStack(1, true) : NativeCallStack::empty_stack())
 class MemBaseline;
 class Mutex;
diff --git a/src/hotspot/share/services/virtualMemoryTracker.hpp b/src/hotspot/share/services/virtualMemoryTracker.hpp
--- a/src/hotspot/share/services/virtualMemoryTracker.hpp
+++ b/src/hotspot/share/services/virtualMemoryTracker.hpp
@@ -302,7 +302,7 @@ class ReservedMemoryRegion : public Virt
   ReservedMemoryRegion(address base, size_t size) :
-    VirtualMemoryRegion(base, size), _stack(NativeCallStack::EMPTY_STACK), _flag(mtNone) { }
+    VirtualMemoryRegion(base, size), _stack(NativeCallStack::empty_stack()), _flag(mtNone) { }
   // Copy constructor
   ReservedMemoryRegion(const ReservedMemoryRegion& rr) :
diff --git a/src/hotspot/share/utilities/nativeCallStack.cpp b/src/hotspot/share/utilities/nativeCallStack.cpp
--- a/src/hotspot/share/utilities/nativeCallStack.cpp
+++ b/src/hotspot/share/utilities/nativeCallStack.cpp
@@ -28,8 +28,6 @@
 #include "utilities/globalDefinitions.hpp"
 #include "utilities/nativeCallStack.hpp"
-const NativeCallStack NativeCallStack::EMPTY_STACK(0, false);
-
 NativeCallStack::NativeCallStack(int toSkip, bool fillStack) :
   _hash_value(0) {
diff --git a/src/hotspot/share/utilities/nativeCallStack.hpp b/src/hotspot/share/utilities/nativeCallStack.hpp
--- a/src/hotspot/share/utilities/nativeCallStack.hpp
+++ b/src/hotspot/share/utilities/nativeCallStack.hpp
@@ -53,7 +53,10 @@
  */
 class NativeCallStack : public StackObj {
  public:
-  static const NativeCallStack EMPTY_STACK;
+  inline static const NativeCallStack& empty_stack() {
+    static const NativeCallStack EMPTY_STACK(0, false);
+    return EMPTY_STACK;
+  }
  private:
   address _stack[NMT_TrackingStackDepth];
On Mon, Jul 2, 2018 at 12:41 PM, Zhengyu Gu wrote: > Hi Martin, > > > On 07/02/2018 03:26 PM, Martin Buchholz wrote: > >> Zhengyu, >> >> Thanks for fixing this!
>> >> I would have tried to use /Construct On First Use Idiom/ >> https://isocpp.org/wiki/faq/ctors#static-init-order-on-first-use >> but nothing here is easy, unlike in Java with its static finals. >> (and I'm not (yet) a hotspot engineer) >> > > Thanks for the link. > > However, this pattern does not apply here, since we *cannot* allocate any > objects at this point. > > Hotspot disables global new operator, and allocates any CHeapObj here will > loop back to initialize NMT (when it is on). We have to workaround this > problem in several places, see comments in services/mallocSiteTable.hpp, > for an example. > > Thanks, > > -Zhengyu > > > >> >> On Fri, Jun 29, 2018 at 6:04 AM, Zhengyu Gu > zgu at redhat.com>> wrote: >> >> Hi, >> >> clang-6.0 and above, can deduce that NativeCallStack::EMPTY_STACK is >> all zeros, and since it is a static constant, it places the object >> in the read-only BSS data section. >> >> To workaround static initialization ordering issue, NMT has to >> ensure EMPTY_STACK is initialized before turns itself on, which can >> happen in the middle of initialization of other static objects. In >> this case, it causes SIGSEGV while try to write to the read-only >> memory. >> >> The solution is to make EMPTY_STACk private and non-constant, but >> hands out constant version. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205965 >> >> Webrev: http://cr.openjdk.java.net/~zgu/8205965/webrev.00/ >> >> >> Test: >> >> hotspot_nmt on Linux 64 (fastdebug and release) >> >> Thanks, >> >> -Zhengyu >> >> >> From martinrb at google.com Mon Jul 2 20:17:01 2018 From: martinrb at google.com (Martin Buchholz) Date: Mon, 2 Jul 2018 13:17:01 -0700 Subject: RFR(S) 8205965: SIGSEGV on write to NativeCallStack::EMPTY_STACK In-Reply-To: <600c0bea-3d4d-1d1b-f88d-63150295b790@redhat.com> References: <91a6eeb0-00d9-dea1-586b-b8ac515c2a15@redhat.com> <600c0bea-3d4d-1d1b-f88d-63150295b790@redhat.com> Message-ID: Whether or not you use my suggestions, I am your second reviewer. Approved. On Mon, Jul 2, 2018 at 12:46 PM, Zhengyu Gu wrote: > Hi, > > Could I get second review? > > Thanks, > > -Zhengyu > > > On 07/01/2018 08:18 AM, Thomas Stüfe wrote: > >> Hi Zhengyu, >> >> On Fri, Jun 29, 2018 at 10:17 PM, Zhengyu Gu wrote: >> >>> Hi Thomas, >>> >>> On 06/29/2018 03:56 PM, Thomas Stüfe wrote: >>> >>>> >>>> Hi Zhengyu, >>>> >>>> do I understand the problem right that the static initialization of >>>> EMPTY_STACK can be preceded by a call to MemTracker::tracking_level()? >>>> Otherwise I do not understand the placement new in >>>> MemTracker::tracking_level(). >>>> >>>> Correct. >>> >>> If yes, how? Do we really run that much complex code as part of C++ >>>> static initialization? >>>> >>> >>> >>> Because there are other static objects that call os::malloc/new in their >>> constructors, and they may be initialized prior to EMPTY_STACK. >>> >>> >> this is terrible :) >> >> We should probably fix that sometime, but in the meantime your change >> makes sense to me. Reviewed.
>> >> Thanks, Thomas >> >> >> Thanks, >>> >>> -Zhengyu >>> >>> >>> >>> >>> >>> >>>> Thanks, Thomas >>>> >>>> >>>> On Fri, Jun 29, 2018 at 3:04 PM, Zhengyu Gu wrote: >>>> >>>>> >>>>> Hi, >>>>> >>>>> clang-6.0 and above, can deduce that NativeCallStack::EMPTY_STACK is >>>>> all >>>>> zeros, and since it is a static constant, it places the object in the >>>>> read-only BSS data section. >>>>> >>>>> To workaround static initialization ordering issue, NMT has to ensure >>>>> EMPTY_STACK is initialized before turns itself on, which can happen in >>>>> the >>>>> middle of initialization of other static objects. In this case, it >>>>> causes >>>>> SIGSEGV while try to write to the read-only memory. >>>>> >>>>> The solution is to make EMPTY_STACk private and non-constant, but hands >>>>> out >>>>> constant version. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205965 >>>>> Webrev: http://cr.openjdk.java.net/~zgu/8205965/webrev.00/ >>>>> >>>>> Test: >>>>> >>>>> hotspot_nmt on Linux 64 (fastdebug and release) >>>>> >>>>> Thanks, >>>>> >>>>> -Zhengyu >>>>> >>>> From zgu at redhat.com Mon Jul 2 20:36:45 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 2 Jul 2018 16:36:45 -0400 Subject: RFR(S) 8205965: SIGSEGV on write to NativeCallStack::EMPTY_STACK In-Reply-To: References: <91a6eeb0-00d9-dea1-586b-b8ac515c2a15@redhat.com> <600c0bea-3d4d-1d1b-f88d-63150295b790@redhat.com> Message-ID: Hi Martin, Thanks for the review. Let's go with current version, and file a new RFE to verify your untested version: https://bugs.openjdk.java.net/browse/JDK-8206183 -Zhengyu On 07/02/2018 04:17 PM, Martin Buchholz wrote: > Whether or not you use my suggestions, I am your second reviewer.? Approved. > > On Mon, Jul 2, 2018 at 12:46 PM, Zhengyu Gu > wrote: > > Hi, > > Could I get second review? 
> > Thanks, > > -Zhengyu > > > On 07/01/2018 08:18 AM, Thomas Stüfe wrote: > > Hi Zhengyu, > > On Fri, Jun 29, 2018 at 10:17 PM, Zhengyu Gu > wrote: > > Hi Thomas, > > On 06/29/2018 03:56 PM, Thomas Stüfe wrote: > > > Hi Zhengyu, > > do I understand the problem right that the static > initialization of > EMPTY_STACK can be preceded by a call to > MemTracker::tracking_level()? > Otherwise I do not understand the placement new in > MemTracker::tracking_level(). > > Correct. > > If yes, how? Do we really run that much complex code as > part of C++ > static initialization? > > > > Because there are other static objects that call > os::malloc/new in their > constructors, and they may be initialized prior to EMPTY_STACK. > > > this is terrible :) > > We should probably fix that sometime, but in the meantime your > change > makes sense to me. Reviewed. > > Thanks, Thomas > > > Thanks, > > -Zhengyu > > > > > > > Thanks, Thomas > > > On Fri, Jun 29, 2018 at 3:04 PM, Zhengyu Gu > > wrote: > > > Hi, > > clang-6.0 and above, can deduce that > NativeCallStack::EMPTY_STACK is all > zeros, and since it is a static constant, it places > the object in the > read-only BSS data section. > > To workaround static initialization ordering issue, > NMT has to ensure > EMPTY_STACK is initialized before turns itself on, > which can happen in > the > middle of initialization of other static objects. In > this case, it causes > SIGSEGV while try to write to the read-only memory. > > The solution is to make EMPTY_STACk private and > non-constant, but hands > out > constant version. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8205965 > > Webrev: > http://cr.openjdk.java.net/~zgu/8205965/webrev.00/ > > > Test: > >
hotspot_nmt on Linux 64 (fastdebug and release) > > Thanks, > > -Zhengyu > > From zgu at redhat.com Mon Jul 2 21:04:54 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 2 Jul 2018 17:04:54 -0400 Subject: RFR(S) 8205965: SIGSEGV on write to NativeCallStack::EMPTY_STACK In-Reply-To: References: <91a6eeb0-00d9-dea1-586b-b8ac515c2a15@redhat.com> Message-ID: <209f5eee-e387-1bdd-8d3c-ef848842ac19@redhat.com> Hi Martin, I pushed to jdk repo, but have no clue on the process to get it to jdk11. -Zhengyu On 07/02/2018 03:35 PM, Martin Buchholz wrote: > > Can we have this fix make it into jdk11? From david.holmes at oracle.com Mon Jul 2 21:12:44 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 3 Jul 2018 07:12:44 +1000 Subject: RFR(S) 8205965: SIGSEGV on write to NativeCallStack::EMPTY_STACK In-Reply-To: <209f5eee-e387-1bdd-8d3c-ef848842ac19@redhat.com> References: <91a6eeb0-00d9-dea1-586b-b8ac515c2a15@redhat.com> <209f5eee-e387-1bdd-8d3c-ef848842ac19@redhat.com> Message-ID: <7355bb55-45c0-019b-44ac-e434d3c16b4f@oracle.com> On 3/07/2018 7:04 AM, Zhengyu Gu wrote: > Hi Martin, > > I pushed to jdk repo, but have no clue on the process to get it to jdk11. Nothing special in RDP1. This is a P2 so has to be fixed in 11. Simply clone the jdk11 repo and apply the changeset there. David > -Zhengyu > > On 07/02/2018 03:35 PM, Martin Buchholz wrote: >> >> Can we have this fix make it into jdk11? From zgu at redhat.com Mon Jul 2 21:40:48 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 2 Jul 2018 17:40:48 -0400 Subject: RFR(S) 8205965: SIGSEGV on write to NativeCallStack::EMPTY_STACK In-Reply-To: <7355bb55-45c0-019b-44ac-e434d3c16b4f@oracle.com> References: <91a6eeb0-00d9-dea1-586b-b8ac515c2a15@redhat.com> <209f5eee-e387-1bdd-8d3c-ef848842ac19@redhat.com> <7355bb55-45c0-019b-44ac-e434d3c16b4f@oracle.com> Message-ID: <8feeb028-227c-ee2d-a36a-3321a9db8358@redhat.com> Thanks, David! I will push to jdk11. 
-Zhengyu On 07/02/2018 05:12 PM, David Holmes wrote: > On 3/07/2018 7:04 AM, Zhengyu Gu wrote: >> Hi Martin, >> >> I pushed to jdk repo, but have no clue on the process to get it to jdk11. > > Nothing special in RDP1. This is a P2 so has to be fixed in 11. Simply > clone the jdk11 repo and apply the changeset there. > > David > >> -Zhengyu >> >> On 07/02/2018 03:35 PM, Martin Buchholz wrote: >>> >>> Can we have this fix make it into jdk11? From david.holmes at oracle.com Tue Jul 3 09:21:56 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 3 Jul 2018 19:21:56 +1000 Subject: RFC: more robust handling of terminated but still attached threads Message-ID: <8778103f-5730-344d-a671-a15f4fc5bfc8@oracle.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 We hit asserts or trigger SEGVs when we try to operate on a native thread ID for a JNI-attached thread that has actually terminated but which did not detach first. It still appears in the threadsList and we try to process it during DumpOnExit (but there are probably other operations that could run into this in the general case). Fixing the tests is easy. But the more general question is how to make the VM code more robust in the face of this situation. At the lowest level we can watch for ESRCH from pthread_* functions and try to program in alternate logic that gives some "result" for that thread. At higher-level we may be able to heuristically guess that the native thread has terminated and so skip it in ALL_JAVA_THREADS and similar constructors. For example pthread_kill(t,0) can heuristically check if 't' is not alive as it may return ESRCH. But of course if t terminated then it is entirely possible that the pthread_t value for it has been reused. And if t is not going to detach we could be racing with its termination anyway - so the heuristic may pass and we still hit a low-level assert or SEGV. What do people think? Do we try to deal with this at the bottom, or at the top, or all the way through? 
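As a rough illustration of the heuristic described above, the sketch below probes a thread with pthread_kill(t, 0) and treats only an explicit ESRCH as "gone". The helper names thread_may_be_alive and run_if_thread_may_be_alive are hypothetical and are not Hotspot functions. POSIX does not require ESRCH for a thread that has terminated, and passing a pthread_t whose lifetime has ended is undefined behaviour, so this shows only the shape of the check and is meant to be exercised on a thread that is known to be live.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <signal.h>

// Hypothetical helper (not a Hotspot function). Returns false only when the
// thread library positively reports ESRCH for the zero "signal" probe.
// A true result proves nothing: the ID may be stale or already reused.
static bool thread_may_be_alive(pthread_t t) {
  const int rc = pthread_kill(t, 0);  // signal 0: error checking only
  return rc != ESRCH;
}

// Hypothetical caller: skip the per-thread work when the probe says "gone".
template <typename Work>
static bool run_if_thread_may_be_alive(pthread_t t, Work work) {
  if (!thread_may_be_alive(t)) {
    return false;  // caller substitutes a default "result" for this thread
  }
  work(t);  // still racy: t can terminate between the probe and this call
  return true;
}
```

The race noted in the last comment is the one the message itself points out, so a probe like this can reduce, but not remove, the chance of hitting a low-level assert.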
(There's obviously a diminishing return on effort versus benefit here.) Thanks, David From fweimer at redhat.com Tue Jul 3 11:28:49 2018 From: fweimer at redhat.com (Florian Weimer) Date: Tue, 3 Jul 2018 13:28:49 +0200 Subject: RFC: more robust handling of terminated but still attached threads In-Reply-To: <8778103f-5730-344d-a671-a15f4fc5bfc8@oracle.com> References: <8778103f-5730-344d-a671-a15f4fc5bfc8@oracle.com> Message-ID: <741e23e9-fe39-ae4c-1e1a-3cb652219e63@redhat.com> On 07/03/2018 11:21 AM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 > > We hit asserts or trigger SEGVs when we try to operate on a native > thread ID for a JNI-attached thread that has actually terminated but > which did not detach first. It still appears in the threadsList and we > try to process it during DumpOnExit (but there are probably other > operations that could run into this in the general case). This bug is not public. The use case isn't entirely clear to me. If you are sufficiently unlucky, the memory behind a pthread_t value is simply gone after thread exit (and potentially TCB/thread stack reclamation in the thread library). On glibc, this includes the internal TID, which is required for pthread_kill (thr, 0) actually sending the signal. I'm not familiar with the Hotspot run-time and why it needs to do this. Can you deregister the thread from a thread directory once it exits (using one of the TLS variants with a destructor)? Or is the concern there that the destructor would not run late enough? 
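The TLS-destructor idea raised above could look roughly like the sketch below. Everything in it is hypothetical: g_registered stands in for whatever thread directory the VM keeps, and register_current_thread stands in for the attach path; it is not how Hotspot tracks attached threads. It shows only that a pthread key destructor runs at thread exit even when the thread never detaches explicitly.

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>

// Stand-in for a thread directory: just a count of registered threads.
static std::atomic<int> g_registered{0};
static pthread_key_t g_key;
static pthread_once_t g_key_once = PTHREAD_ONCE_INIT;

// Runs automatically at thread exit for every thread whose key value is
// non-null; this is where the thread would be removed from the directory.
static void deregister_on_exit(void* /*value*/) {
  g_registered.fetch_sub(1);
}

static void make_key() {
  const int rc = pthread_key_create(&g_key, deregister_on_exit);
  assert(rc == 0);
  (void)rc;
}

// Hypothetical attach path: register once per thread and arm the destructor.
static void register_current_thread() {
  pthread_once(&g_key_once, make_key);
  if (pthread_getspecific(g_key) == nullptr) {
    g_registered.fetch_add(1);
    pthread_setspecific(g_key, &g_registered);  // any non-null value works
  }
}

// A thread body that "attaches" and then returns without detaching.
static void* attach_without_detach(void*) {
  register_current_thread();
  return nullptr;
}

// Returns the directory size after such a thread has exited, or -1 on error.
static int registered_after_thread_exit() {
  pthread_t t;
  if (pthread_create(&t, nullptr, attach_without_detach, nullptr) != 0) {
    return -1;
  }
  pthread_join(t, nullptr);
  return g_registered.load();
}
```

Whether such a destructor runs late enough, relative to other destructors installed by the native code, is exactly the open question in the paragraph above; the sketch does not address ordering.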
Thanks, Florian From david.holmes at oracle.com Tue Jul 3 12:09:26 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 3 Jul 2018 22:09:26 +1000 Subject: RFC: more robust handling of terminated but still attached threads In-Reply-To: <741e23e9-fe39-ae4c-1e1a-3cb652219e63@redhat.com> References: <8778103f-5730-344d-a671-a15f4fc5bfc8@oracle.com> <741e23e9-fe39-ae4c-1e1a-3cb652219e63@redhat.com> Message-ID: On 3/07/2018 9:28 PM, Florian Weimer wrote: > On 07/03/2018 11:21 AM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >> >> We hit asserts or trigger SEGVs when we try to operate on a native >> thread ID for a JNI-attached thread that has actually terminated but >> which did not detach first. It still appears in the threadsList and we >> try to process it during DumpOnExit (but there are probably other >> operations that could run into this in the general case). > > This bug is not public. Sorry, I'll try to get that changed. It's an issue with some of the newly opened tests in: vmTestbase/nsk/jvmti/scenarios/jni_interception/ when run with FlightRecorder set to use DumpOnExit. I will of course fix the tests. > The use case isn't entirely clear to me. If you are sufficiently > unlucky, the memory behind a pthread_t value is simply gone after thread > exit (and potentially TCB/thread stack reclamation in the thread > library). On glibc, this includes the internal TID, which is required > for pthread_kill (thr, 0) actually sending the signal. IIUC pthread_kill(thr,0) never sends any signal, but may look up the id to see if it is valid. I understand there's no guarantee and that there is an inherent race regardless. > I'm not familiar with the Hotspot run-time and why it needs to do this. > Can you deregister the thread from a thread directory once it exits > (using one of the TLS variants with a destructor)? Or is the concern > there that the destructor would not run late enough?
The issue is native process threads that attach to the VM through JNI but then don't detach themselves before terminating. While it may be possible to create such a mechanism as you describe it goes way beyond what I'm trying to do here and violates a basic principle that we try to interfere as little as possible with threads that attach to the VM directly (rather than being created by the VM). There was also a rather complex bug involving native threads that themselves provided such a TLS destructor (to detach themselves) and the VMs own (fairly recent) use of TLS. All I'm looking at is some basic robustness if the VM encounters such a thread (for which all the VM data structures remain intact - and effectively leak) so that we don't assert or crash when we do invoke a pthread function (pthread_getcpuclockid is the one in question in the bug report). It may be that it isn't really worth trying to do this given it can't be 100% reliable anyway. Thanks, David > Thanks, > Florian From fweimer at redhat.com Tue Jul 3 12:19:57 2018 From: fweimer at redhat.com (Florian Weimer) Date: Tue, 3 Jul 2018 14:19:57 +0200 Subject: RFC: more robust handling of terminated but still attached threads In-Reply-To: References: <8778103f-5730-344d-a671-a15f4fc5bfc8@oracle.com> <741e23e9-fe39-ae4c-1e1a-3cb652219e63@redhat.com> Message-ID: <9e6322c4-2d72-a8bb-28b1-664be427324c@redhat.com> On 07/03/2018 02:09 PM, David Holmes wrote: >> The use case isn't entirely clear to me.? If you are sufficiently >> unlucky, the memory behind a pthread_t value is simply gone after >> thread exit (and potentially TCB/thread stack reclamation in the >> thread library).? On glibc, this includes the internal TID, which is >> required for pthread_kill (thr, 0) actually sending the signal. > > IIUC pthread_kill(thr,0) never sends any signal, but may lookup the id > to see if it is valid. I understand there's no guarantee and that there > is an inherent race regardless. 
It still makes a system call to send the pseudo-signal 0. This is what I meant. It can bail out earlier in case of terminated threads which have not yet been joined, though. >> I'm not familiar with the Hotspot run-time and why it needs to do >> this. Can you deregister the thread from a thread directory once it >> exits (using one of the TLS variants with a destructor)? Or is the >> concern there that the destructor would not run late enough? > > The issue is native process threads that attach to the VM through JNI > but then don't detach themselves before terminating. While it may be > possible to create such a mechanism as you describe it goes way beyond > what I'm trying to do here and violates a basic principle that we try to > interfere as little as possible with threads that attach to the VM > directly (rather than being created by the VM). There was also a rather > complex bug involving native threads that themselves provided such a TLS > destructor (to detach themselves) and the VMs own (fairly recent) use of > TLS. > > All I'm looking at is some basic robustness if the VM encounters such a > thread (for which all the VM data structures remain intact - and > effectively leak) so that we don't assert or crash when we do invoke a > pthread function (pthread_getcpuclockid is the one in question in the > bug report). You could capture the TID and the task creation time from /proc when the thread is attached, and try to recover the information you need from /proc afterwards (possibly with a comparison to the startup time). You probably cannot ensure that the thread will not suddenly cease to exist, so none of the pthread_* functions can safely be called. The only in-process way I can imagine which ensures that the thread stays around is to send it a signal with an unblocked handler which you control, and which can then prevent the thread from exiting indefinitely. But that is a very heavy-handed approach. Out-of-process, you could use ptrace to freeze threads.
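A minimal, Linux-only sketch of the /proc idea above: record the kernel TID and the task's start time at attach, then later compare the start time to detect that the TID is gone or has been reused. The struct and function names are hypothetical. The only facts relied on are that /proc/self/task/<tid>/stat exists on Linux and that its 22nd field is the task start time in clock ticks, as documented in proc(5).

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical record captured when a native thread attaches.
struct ThreadIdentity {
  pid_t tid;                       // kernel thread ID
  unsigned long long start_ticks;  // field 22 of the task's stat file
};

// Reads the start time of task `tid` in this process.
// Returns false if the task no longer exists or the file cannot be parsed.
static bool read_start_ticks(pid_t tid, unsigned long long* out) {
  std::ifstream in("/proc/self/task/" + std::to_string(tid) + "/stat");
  std::string line;
  if (!in || !std::getline(in, line)) return false;
  // Field 2 (the command name) may contain spaces and parentheses, so
  // resume parsing after the last ')'.
  const std::string::size_type close = line.rfind(')');
  if (close == std::string::npos) return false;
  std::istringstream rest(line.substr(close + 1));
  std::string skipped;
  for (int field = 3; field <= 21; ++field) {  // skip fields 3..21
    if (!(rest >> skipped)) return false;
  }
  return static_cast<bool>(rest >> *out);      // field 22: starttime
}

// Called on the attaching thread itself.
static bool capture_current_thread(ThreadIdentity* id) {
  id->tid = static_cast<pid_t>(syscall(SYS_gettid));
  return read_start_ticks(id->tid, &id->start_ticks);
}

// True only if a task with this TID still exists and has the same start time.
static bool is_same_thread(const ThreadIdentity& id) {
  unsigned long long now = 0;
  return read_start_ticks(id.tid, &now) && now == id.start_ticks;
}
```

This avoids touching the pthread_t at all, which is the point of the suggestion; the answer can still become stale the moment after it is read.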
Thanks, Florian From aph at redhat.com Tue Jul 3 14:31:21 2018 From: aph at redhat.com (aph) Date: Tue, 3 Jul 2018 15:31:21 +0100 Subject: RFR: 8206267: Unsafe publication of StubCodeDesc leads to crashes Message-ID: The StubCodeDesc constructor is unsynchronized. However, it runs when the C2 compiler thread is initializing. The compiler thread reads the StubCodeDesc list while it is in an unstable state, resulting in a read from an uninitialized pointer field and it then segfaults, causing the VM to abort. http://cr.openjdk.java.net/~aph/8206267/ OK for 11 and 12? -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From shade at redhat.com Tue Jul 3 14:57:39 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 3 Jul 2018 16:57:39 +0200 Subject: RFR: 8206267: Unsafe publication of StubCodeDesc leads to crashes In-Reply-To: References: Message-ID: <782ea25b-dc66-3ad1-d023-70345f59ee58@redhat.com> On 07/03/2018 04:31 PM, aph wrote: > The StubCodeDesc constructor is unsynchronized. However, it runs when > the C2 compiler thread is initializing. The compiler thread reads the > StubCodeDesc list while it is in an unstable state, resulting in a > read from an uninitialized pointer field and it then segfaults, > causing the VM to abort. > > http://cr.openjdk.java.net/~aph/8206267/ > > OK for 11 and 12? Looks good for 12. Looks simple enough for 11. Star formatting is a bit awkward: 37 StubCodeDesc *volatile StubCodeDesc::_list = NULL; ... 42 static StubCodeDesc *volatile _list; // the list of all descriptors ...should probably be: 37 StubCodeDesc* volatile StubCodeDesc::_list = NULL; ...
42 static StubCodeDesc* volatile _list; // the list of all descriptors -Aleksey From patricio.chilano.mateo at oracle.com Tue Jul 3 15:15:44 2018 From: patricio.chilano.mateo at oracle.com (patricio.chilano.mateo at oracle.com) Date: Tue, 3 Jul 2018 11:15:44 -0400 Subject: 8134538: Duplicate implementations of os::lasterror Message-ID: Hi all, Could you please review this small change? Summary: Identical Linux, BSD, Solaris and AIX implementations of os::lasterror were replaced with a single os_posix one. Bug URL: https://bugs.openjdk.java.net/browse/JDK-8134538 Webrev URL:http://cr.openjdk.java.net/~coleenp/8134538.01/webrev/index.html The fix was tested with Mach5 on tiers 1-5 on all platforms. Thanks, Patricio From aph at redhat.com Tue Jul 3 15:16:04 2018 From: aph at redhat.com (Andrew Haley) Date: Tue, 3 Jul 2018 16:16:04 +0100 Subject: RFR: 8206267: Unsafe publication of StubCodeDesc leads to crashes In-Reply-To: <782ea25b-dc66-3ad1-d023-70345f59ee58@redhat.com> References: <782ea25b-dc66-3ad1-d023-70345f59ee58@redhat.com> Message-ID: <55629334-1f50-98c8-9530-c616234d2a97@redhat.com> On 07/03/2018 03:57 PM, Aleksey Shipilev wrote: > On 07/03/2018 04:31 PM, aph wrote: >> The StubCodeDesc constructor is unsychronized. However, it runs when >> the C2 compiler thread is initializing. The compiler thread reads the >> StubCodeDesc list while it is in an unstable state, resulting in a >> read from an uninitialized pointer field and it then segfaults, >> causing the VM to abort. >> >> http://cr.openjdk.java.net/~aph/8206267/ >> >> OK for 11 and 12? > > Looks good for 12. > Looks simple enough for 11. > > Star formatting is a bit awkward: > 37 StubCodeDesc *volatile StubCodeDesc::_list = NULL; > ... > 42 static StubCodeDesc *volatile _list; // the list of all descriptors > > ...should probably be: > 37 StubCodeDesc* volatile StubCodeDesc::_list = NULL; > ... > 42 static StubCodeDesc* volatile _list; // the list of all descriptors That's incorrect. 
Indirection binds to the right. You need to get it right for: int *a, b; which would be highly misleading as int *a, b; But anyway, I have withdrawn the bug report: it's been fixed a different way in current sources. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Tue Jul 3 15:17:21 2018 From: aph at redhat.com (Andrew Haley) Date: Tue, 3 Jul 2018 16:17:21 +0100 Subject: RFR: 8206267: Unsafe publication of StubCodeDesc leads to crashes In-Reply-To: <55629334-1f50-98c8-9530-c616234d2a97@redhat.com> References: <782ea25b-dc66-3ad1-d023-70345f59ee58@redhat.com> <55629334-1f50-98c8-9530-c616234d2a97@redhat.com> Message-ID: On 07/03/2018 04:16 PM, Andrew Haley wrote: > which would be highly misleading as > > int *a, b; int* a, b; LOL! It's too hot to think today. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From harold.seigel at oracle.com Tue Jul 3 15:21:52 2018 From: harold.seigel at oracle.com (Harold David Seigel) Date: Tue, 3 Jul 2018 11:21:52 -0400 Subject: 8134538: Duplicate implementations of os::lasterror In-Reply-To: References: Message-ID: <1ec45d31-485e-7974-ca4b-5d4793d0a1e1@oracle.com> Hi Patricio, This change looks good. Thanks, Harold On 7/3/2018 11:15 AM, patricio.chilano.mateo at oracle.com wrote: > Hi all, > > Could you please review this small change? > > Summary: Identical Linux, BSD, Solaris and AIX implementations of > os::lasterror were replaced with a single os_posix one. > > Bug URL: https://bugs.openjdk.java.net/browse/JDK-8134538 > Webrev > URL:http://cr.openjdk.java.net/~coleenp/8134538.01/webrev/index.html > > > The fix was tested with Mach5 on tiers 1-5 on all platforms. 
> > Thanks, > Patricio From patricio.chilano.mateo at oracle.com Tue Jul 3 15:28:33 2018 From: patricio.chilano.mateo at oracle.com (patricio.chilano.mateo at oracle.com) Date: Tue, 3 Jul 2018 11:28:33 -0400 Subject: 8134538: Duplicate implementations of os::lasterror In-Reply-To: <1ec45d31-485e-7974-ca4b-5d4793d0a1e1@oracle.com> References: <1ec45d31-485e-7974-ca4b-5d4793d0a1e1@oracle.com> Message-ID: Thanks Harold! Patricio On 7/3/18 11:21 AM, Harold David Seigel wrote: > Hi Patricio, > > This change looks good. > > Thanks, Harold > > On 7/3/2018 11:15 AM, patricio.chilano.mateo at oracle.com wrote: >> Hi all, >> >> Could you please review this small change? >> >> Summary: Identical Linux, BSD, Solaris and AIX implementations of >> os::lasterror were replaced with a single os_posix one. >> >> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8134538 >> Webrev >> URL:http://cr.openjdk.java.net/~coleenp/8134538.01/webrev/index.html >> >> >> The fix was tested with Mach5 on tiers 1-5 on all platforms. >> >> Thanks, >> Patricio > From coleen.phillimore at oracle.com Tue Jul 3 15:34:16 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 3 Jul 2018 11:34:16 -0400 Subject: [12] RFR (S) 8205534: Remove SymbolTable dependency from serviceability agent In-Reply-To: References: <88e391a8-78a2-8dbc-a489-fce9c6b922b5@oracle.com> Message-ID: <29f661c2-ce48-b174-cb2f-7300bd4aa8f2@oracle.com> Hi Jini, Thank you for reviewing this. On 6/29/18 12:02 PM, Jini George wrote: > Hi Coleen, > > Apologize for the delay. Your changes look good to me overall. A few > comments: > > It might make sense to also remove the corresponding lines in the > vmStructs files. Like: > > File Line > vmStructs.cpp 170 typedef RehashableHashtable > RehashableSymbolHashtable; > vmStructs.cpp 477 static_field(RehashableSymbolHashtable, _seed, > juint)
\ > vmStructs.cpp 1362 declare_type(RehashableSymbolHashtable, > BasicHashtable) \ > vmStructs.cpp 475 static_field(SymbolTable, _the_table, > SymbolTable*) \ > vmStructs.cpp 476 static_field(SymbolTable, _shared_table, > SymbolCompactHashTable) \ > Gerard has these changes in his changeset for rewriting the SymbolTable so I am going to leave this part of the change to him. > You could also remove the "friend class VMStructs" from the > corresponding C++ data types. > Good point. We'll make sure it's not there in his changes. > The test case: test/jdk/sun/tools/jhsdb/AlternateHashingTest.java with > the file: test/jdk/sun/tools/jhsdb/LingeredAppWithAltHashing.java were > created to test the alternate hashing mechanism of the SymbolTable in > SA. Don't know if it makes sense to retain these. > Ok, I was debating with myself whether to remove these. It makes sense not to test something that doesn't test what's intended anymore. I'll remove them. > One nit: > > Line 1079 of HeapHprofBinWriter.java: Extra spaces needed. > Fixed. Thanks! Coleen > Thanks, > Jini. > > > On 6/23/2018 3:10 AM, coleen.phillimore at oracle.com wrote: >> Summary: Modify SA code to not use SymbolTable and remove it. >> >> This is to support the concurrent hashtable for SymbolTable. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8205534.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8205534 >> >> Tested with hs-tier1-5. >> >> Thanks, >> Coleen From aph at redhat.com Tue Jul 3 15:42:33 2018 From: aph at redhat.com (Andrew Haley) Date: Tue, 3 Jul 2018 16:42:33 +0100 Subject: Patch withdrawn: 8206267: Unsafe publication of StubCodeDesc leads to crashes In-Reply-To: References: Message-ID: <7278f633-fe24-5356-cf80-17297395ae5e@redhat.com> FYI. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From kim.barrett at oracle.com Tue Jul 3 16:56:34 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 3 Jul 2018 12:56:34 -0400 Subject: 8134538: Duplicate implementations of os::lasterror In-Reply-To: References: Message-ID: <18E97724-82A5-407E-9248-251354AEC8B5@oracle.com> > On Jul 3, 2018, at 11:15 AM, patricio.chilano.mateo at oracle.com wrote: > > Hi all, > > Could you please review this small change? > > Summary: Identical Linux, BSD, Solaris and AIX implementations of os::lasterror were replaced with a single os_posix one. > > Bug URL: https://bugs.openjdk.java.net/browse/JDK-8134538 > Webrev URL:http://cr.openjdk.java.net/~coleenp/8134538.01/webrev/index.html > > The fix was tested with Mach5 on tiers 1-5 on all platforms. > > Thanks, > Patricio Looks good. From patricio.chilano.mateo at oracle.com Tue Jul 3 17:29:55 2018 From: patricio.chilano.mateo at oracle.com (patricio.chilano.mateo at oracle.com) Date: Tue, 3 Jul 2018 13:29:55 -0400 Subject: 8134538: Duplicate implementations of os::lasterror In-Reply-To: <18E97724-82A5-407E-9248-251354AEC8B5@oracle.com> References: <18E97724-82A5-407E-9248-251354AEC8B5@oracle.com> Message-ID: <6e86bd99-f56f-42bf-c59b-9185a38f47fb@oracle.com> Thanks Kim! Patricio On 7/3/18 12:56 PM, Kim Barrett wrote: >> On Jul 3, 2018, at 11:15 AM, patricio.chilano.mateo at oracle.com wrote: >> >> Hi all, >> >> Could you please review this small change? >> >> Summary: Identical Linux, BSD, Solaris and AIX implementations of os::lasterror were replaced with a single os_posix one. >> >> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8134538 >> Webrev URL:http://cr.openjdk.java.net/~coleenp/8134538.01/webrev/index.html >> >> The fix was tested with Mach5 on tiers 1-5 on all platforms. >> >> Thanks, >> Patricio > Looks good. 
> From harold.seigel at oracle.com Tue Jul 3 17:36:05 2018 From: harold.seigel at oracle.com (Harold David Seigel) Date: Tue, 3 Jul 2018 13:36:05 -0400 Subject: [12] RFR (S) 8205534: Remove SymbolTable dependency from serviceability agent In-Reply-To: <29f661c2-ce48-b174-cb2f-7300bd4aa8f2@oracle.com> References: <88e391a8-78a2-8dbc-a489-fce9c6b922b5@oracle.com> <29f661c2-ce48-b174-cb2f-7300bd4aa8f2@oracle.com> Message-ID: <17800e54-494a-299c-ad93-ece46b463c5e@oracle.com> Hi Coleen, These changes look good. Glad you got rid of a lot of code. Thanks, Harold On 7/3/2018 11:34 AM, coleen.phillimore at oracle.com wrote: > > Hi Jini, Thank you for reviewing this. > > On 6/29/18 12:02 PM, Jini George wrote: >> Hi Coleen, >> >> Apologize for the delay. Your changes look good to me overall. A few >> comments: >> >> It might make sense to also remove the corresponding lines in the >> vmStructs files. Like: >>
>> File           Line
>> vmStructs.cpp   170 typedef RehashableHashtable RehashableSymbolHashtable;
>> vmStructs.cpp   477 static_field(RehashableSymbolHashtable, _seed, juint) \
>> vmStructs.cpp  1362 declare_type(RehashableSymbolHashtable, BasicHashtable) \
>> vmStructs.cpp   475 static_field(SymbolTable, _the_table, SymbolTable*) \
>> vmStructs.cpp   476 static_field(SymbolTable, _shared_table, SymbolCompactHashTable) \
>> > > Gerard has these changes in his changeset for rewriting the > SymbolTable so I am going to leave this part of the change to him. > >> You could also remove the "friend class VMStructs" from the >> corresponding C++ data types. >> > > Good point. We'll make sure it's not there in his changes. > >> The test case: test/jdk/sun/tools/jhsdb/AlternateHashingTest.java >> with the file: >> test/jdk/sun/tools/jhsdb/LingeredAppWithAltHashing.java were created >> to test the alternate hashing mechanism of the SymbolTable in SA. >> Don't know if it makes sense to retain these. >> > > Ok, I was debating with myself whether to remove these. It makes > sense not to test something that doesn't test what's intended > anymore. I'll remove them. > > >> One nit: >> >> Line 1079 of HeapHprofBinWriter.java: Extra spaces needed. >> > Fixed. > > Thanks! > Coleen >> Thanks, >> Jini. >> >> >> On 6/23/2018 3:10 AM, coleen.phillimore at oracle.com wrote: >>> Summary: Modify SA code to not use SymbolTable and remove it. >>> >>> This is to support the concurrent hashtable for SymbolTable. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8205534.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8205534 >>> >>> Tested with hs-tier1-5. >>> >>> Thanks, >>> Coleen > From coleen.phillimore at oracle.com Tue Jul 3 17:55:48 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 3 Jul 2018 13:55:48 -0400 Subject: [12] RFR (S) 8205534: Remove SymbolTable dependency from serviceability agent In-Reply-To: <17800e54-494a-299c-ad93-ece46b463c5e@oracle.com> References: <88e391a8-78a2-8dbc-a489-fce9c6b922b5@oracle.com> <29f661c2-ce48-b174-cb2f-7300bd4aa8f2@oracle.com> <17800e54-494a-299c-ad93-ece46b463c5e@oracle.com> Message-ID: <7f6285fa-bb7b-1240-6a61-ca5caae75f2b@oracle.com> Thank you, Harold! Coleen On 7/3/18 1:36 PM, Harold David Seigel wrote: > Hi Coleen, > > These changes look good. Glad you got rid of a lot of code. > > Thanks, Harold > > On 7/3/2018 11:34 AM, coleen.phillimore at oracle.com wrote: >> >> Hi Jini, Thank you for reviewing this. >> >> On 6/29/18 12:02 PM, Jini George wrote: >>> Hi Coleen, >>> >>> Apologize for the delay. Your changes look good to me overall. A few >>> comments: >>> >>> It might make sense to also remove the corresponding lines in the >>> vmStructs files. Like: >>>
>>> File           Line
>>> vmStructs.cpp   170 typedef RehashableHashtable RehashableSymbolHashtable;
>>> vmStructs.cpp   477 static_field(RehashableSymbolHashtable, _seed, juint) \
>>> vmStructs.cpp  1362 declare_type(RehashableSymbolHashtable, BasicHashtable) \
>>> vmStructs.cpp   475 static_field(SymbolTable, _the_table, SymbolTable*) \
>>> vmStructs.cpp   476 static_field(SymbolTable, _shared_table, SymbolCompactHashTable) \
>>> >> >> Gerard has these changes in his changeset for rewriting the >> SymbolTable so I am going to leave this part of the change to him. >> >>> You could also remove the "friend class VMStructs" from the >>> corresponding C++ data types. >>> >> >> Good point. We'll make sure it's not there in his changes. >> >>> The test case: test/jdk/sun/tools/jhsdb/AlternateHashingTest.java >>> with the file: >>> test/jdk/sun/tools/jhsdb/LingeredAppWithAltHashing.java were created >>> to test the alternate hashing mechanism of the SymbolTable in SA. >>> Don't know if it makes sense to retain these. >>> >> >> Ok, I was debating with myself whether to remove these. It makes >> sense not to test something that doesn't test what's intended >> anymore. I'll remove them. >> >> >>> One nit: >>> >>> Line 1079 of HeapHprofBinWriter.java: Extra spaces needed. >>> >> Fixed. >> >> Thanks! >> Coleen >>> Thanks, >>> Jini. >>> >>> >>> On 6/23/2018 3:10 AM, coleen.phillimore at oracle.com wrote: >>>> Summary: Modify SA code to not use SymbolTable and remove it. >>>> >>>> This is to support the concurrent hashtable for SymbolTable. >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205534.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205534 >>>> >>>> Tested with hs-tier1-5.
>>>> >>>> Thanks, >>>> Coleen >> > From coleen.phillimore at oracle.com Tue Jul 3 18:15:58 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 3 Jul 2018 14:15:58 -0400 Subject: 8134538: Duplicate implementations of os::lasterror In-Reply-To: References: Message-ID: <25328158-1a55-2bb4-ecea-278b331fa129@oracle.com> This looks good, and I believe trivial enough to not need the 24 hour wait. Thanks, Coleen On 7/3/18 11:15 AM, patricio.chilano.mateo at oracle.com wrote: > Hi all, > > Could you please review this small change? > > Summary: Identical Linux, BSD, Solaris and AIX implementations of > os::lasterror were replaced with a single os_posix one. > > Bug URL: https://bugs.openjdk.java.net/browse/JDK-8134538 > Webrev > URL:http://cr.openjdk.java.net/~coleenp/8134538.01/webrev/index.html > > > The fix was tested with Mach5 on tiers 1-5 on all platforms. > > Thanks, > Patricio From patricio.chilano.mateo at oracle.com Tue Jul 3 18:30:37 2018 From: patricio.chilano.mateo at oracle.com (patricio.chilano.mateo at oracle.com) Date: Tue, 3 Jul 2018 14:30:37 -0400 Subject: 8134538: Duplicate implementations of os::lasterror In-Reply-To: <25328158-1a55-2bb4-ecea-278b331fa129@oracle.com> References: <25328158-1a55-2bb4-ecea-278b331fa129@oracle.com> Message-ID: <5aaede9a-0229-200c-afb1-17cc209e454c@oracle.com> Thanks Coleen! Patricio On 7/3/18 2:15 PM, coleen.phillimore at oracle.com wrote: > > This looks good, and I believe trivial enough to not need the 24 hour > wait. > > Thanks, > Coleen > > On 7/3/18 11:15 AM, patricio.chilano.mateo at oracle.com wrote: >> Hi all, >> >> Could you please review this small change? >> >> Summary: Identical Linux, BSD, Solaris and AIX implementations of >> os::lasterror were replaced with a single os_posix one. 
>> >> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8134538 >> Webrev >> URL:http://cr.openjdk.java.net/~coleenp/8134538.01/webrev/index.html >> >> >> The fix was tested with Mach5 on tiers 1-5 on all platforms. >> >> Thanks, >> Patricio > From coleen.phillimore at oracle.com Tue Jul 3 20:04:06 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 3 Jul 2018 16:04:06 -0400 Subject: RFR (trivial) 8206309: Tier1 SA tests fail Message-ID: <0c00a4eb-e9ab-1f0a-6424-a60759fc09ae@oracle.com> Summary: remove tests that should have been removed with last push open webrev at http://cr.openjdk.java.net/~coleenp/8206309.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8206309 Thanks, Coleen From harold.seigel at oracle.com Tue Jul 3 20:05:59 2018 From: harold.seigel at oracle.com (Harold David Seigel) Date: Tue, 3 Jul 2018 16:05:59 -0400 Subject: RFR (trivial) 8206309: Tier1 SA tests fail In-Reply-To: <0c00a4eb-e9ab-1f0a-6424-a60759fc09ae@oracle.com> References: <0c00a4eb-e9ab-1f0a-6424-a60759fc09ae@oracle.com> Message-ID: Looks good and trivial! Thanks, Harold On 7/3/2018 4:04 PM, coleen.phillimore at oracle.com wrote: > Summary: remove tests that should have been removed with last push > > open webrev at http://cr.openjdk.java.net/~coleenp/8206309.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8206309 > > Thanks, > Coleen From coleen.phillimore at oracle.com Tue Jul 3 20:07:21 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 3 Jul 2018 16:07:21 -0400 Subject: RFR (trivial) 8206309: Tier1 SA tests fail In-Reply-To: References: <0c00a4eb-e9ab-1f0a-6424-a60759fc09ae@oracle.com> Message-ID: <99ec8b87-544e-5bb5-0141-413dcec6b094@oracle.com> Thanks Harold. I'm going to push now to fix tier1. Coleen On 7/3/18 4:05 PM, Harold David Seigel wrote: > Looks good and trivial!
> > Thanks, Harold > > > On 7/3/2018 4:04 PM, coleen.phillimore at oracle.com wrote: >> Summary: remove tests that should have been removed with last push >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8206309.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8206309 >> >> Thanks, >> Coleen > From david.holmes at oracle.com Tue Jul 3 21:19:40 2018 From: david.holmes at oracle.com (David Holmes) Date: Wed, 4 Jul 2018 07:19:40 +1000 Subject: 8134538: Duplicate implementations of os::lasterror In-Reply-To: References: Message-ID: Hi Patricio, Not that you need another review, but this looks good to me too. :) Aside: it seems to me that the only non-Windows use of os::lasterror may actually be incorrect: _file = fopen(file, "r"); _line_no = 0; _interfaces = new (ResourceObj::C_HEAP, mtClass) GrowableArray(10, true); if (_file == NULL) { char errmsg[JVM_MAXPATHLEN]; os::lasterror(errmsg, JVM_MAXPATHLEN); vm_exit_during_initialization("Loading classlist failed", errmsg); } 'errno' from the fopen may have been overwritten inside the allocation logic (particularly if logging is enabled) by the time we call os::lasterror. Cheers, David On 4/07/2018 1:15 AM, patricio.chilano.mateo at oracle.com wrote: > Hi all, > > Could you please review this small change? > > Summary: Identical Linux, BSD, Solaris and AIX implementations of > os::lasterror were replaced with a single os_posix one. > > Bug URL: https://bugs.openjdk.java.net/browse/JDK-8134538 > Webrev > URL:http://cr.openjdk.java.net/~coleenp/8134538.01/webrev/index.html > > > The fix was tested with Mach5 on tiers 1-5 on all platforms. 
> > Thanks, > Patricio From calvin.cheung at oracle.com Tue Jul 3 21:38:13 2018 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 03 Jul 2018 14:38:13 -0700 Subject: RFR(S): 8205548: Remove multi-release jar related vm code Message-ID: <5B3BECC5.3040604@oracle.com> JBS: https://bugs.openjdk.java.net/browse/JDK-8205548 webrev: http://cr.openjdk.java.net/~ccheung/8205548/webrev.00/ Ran hs-tier{1,2,3,4} tests. thanks, Calvin From ioi.lam at oracle.com Tue Jul 3 21:52:56 2018 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 3 Jul 2018 14:52:56 -0700 Subject: RFR(S): 8205548: Remove multi-release jar related vm code In-Reply-To: <5B3BECC5.3040604@oracle.com> References: <5B3BECC5.3040604@oracle.com> Message-ID: <24195d97-5e57-0f83-776c-f4562edb9f4d@oracle.com> Hi Calvin, The changes look good to me. Thanks for doing the clean up. - Ioi On 7/3/18 2:38 PM, Calvin Cheung wrote: > JBS: https://bugs.openjdk.java.net/browse/JDK-8205548 > > webrev: http://cr.openjdk.java.net/~ccheung/8205548/webrev.00/ > > Ran hs-tier{1,2,3,4} tests. > > thanks, > Calvin From headius at headius.com Tue Jul 3 21:56:05 2018 From: headius at headius.com (Charles Oliver Nutter) Date: Tue, 3 Jul 2018 16:56:05 -0500 Subject: Misbehaving exit status from Hotspot In-Reply-To: <292F06F8-36AB-4BE1-A763-1D4816F885D6@oracle.com> References: <297ee516-faa1-aaa6-f026-90287e360a99@oracle.com> <756b8d97-f11a-c1f2-3ef4-f76fd7253203@oracle.com> <12b0f7cf-3161-6ef2-bd23-aa386747da79@oracle.com> <2a183bbf-8be3-a036-f1c2-c3fd9c1669d3@redhat.com> <292F06F8-36AB-4BE1-A763-1D4816F885D6@oracle.com> Message-ID: On Sat, Jun 30, 2018 at 1:20 AM, John Rose wrote: > I'm sympathetic with what you want to do, but I would be scared to just change > the behavior of the VM in response to a signal, even in this relatively innocuous > way. Surely something out there will break. And I am sympathetic to breaking something. 
To be honest, at this point the discussion is mostly academic for me...I want to understand why things are the way they are beyond "because that's how we've always done it." I believe I understand the reasons better now. > My next thought is to throw in a -XX:+DoSignalsCharliesWay flag (not its real > name). The objection to that is that it adds an obscure corner to our testing > matrix. Not insuperable, but it's not something our test harnesses are well > designed for. > > BTW, why doesn't -Xrs work for you? That's certainly closer to the mark. > Is it that you want to run some JRuby shutdown hooks and then trap out > (rather than exit)? If so, I suppose you want some sort of -Xrs0.5. I don't think we use exit hooks at all in JRuby (or at least not the built-in support for it in the JDK) so -Xrs may be an acceptable answer for limited cases. > Underneath this stuff is a very simple API called JVM_handle_linux_signal. > This is the 'secret identity' of all the JVM's signal handlers. If you know this > identity, then perhaps you can set your own signal handler in its place, > and delegate everything to JVM_handle_linux_signal. Here's the signal > handler HotSpot uses for *all signals*: Yeah that would be another option. We do have a native JRuby executable that directly boots libjvm and goes from there. That would be an opportunity to register our own top-level TERM handler that does what I want. > Sorry it's not a simpler answer? It never is :-) I don't agree with how Hotspot handles this, but I'm willing to grant that it's not a big enough difference to warrant changing it. At this point I'll be going back to my users (and the JRuby FAQ) to let folks know this is just the way it is, and the justification for it. They probably won't agree either, but my hands are kinda tied. 
- Charlie From jiangli.zhou at oracle.com Tue Jul 3 22:09:50 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 3 Jul 2018 15:09:50 -0700 Subject: RFR(S): 8205548: Remove multi-release jar related vm code In-Reply-To: <5B3BECC5.3040604@oracle.com> References: <5B3BECC5.3040604@oracle.com> Message-ID: <4E9C114C-D48A-4A5C-A844-A6BAC13DA08D@oracle.com> Looks good. Thanks, Jiangli > On Jul 3, 2018, at 2:38 PM, Calvin Cheung wrote: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8205548 > > webrev: http://cr.openjdk.java.net/~ccheung/8205548/webrev.00/ > > Ran hs-tier{1,2,3,4} tests. > > thanks, > Calvin From calvin.cheung at oracle.com Tue Jul 3 23:11:59 2018 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 03 Jul 2018 16:11:59 -0700 Subject: RFR(S): 8205548: Remove multi-release jar related vm code In-Reply-To: <24195d97-5e57-0f83-776c-f4562edb9f4d@oracle.com> References: <5B3BECC5.3040604@oracle.com> <24195d97-5e57-0f83-776c-f4562edb9f4d@oracle.com> Message-ID: <5B3C02BF.2060205@oracle.com> Hi Ioi, Thanks for your quick review. Calvin On 7/3/18, 2:52 PM, Ioi Lam wrote: > Hi Calvin, > > The changes look good to me. Thanks for doing the clean up. > > - Ioi > > > On 7/3/18 2:38 PM, Calvin Cheung wrote: >> JBS: https://bugs.openjdk.java.net/browse/JDK-8205548 >> >> webrev: http://cr.openjdk.java.net/~ccheung/8205548/webrev.00/ >> >> Ran hs-tier{1,2,3,4} tests. >> >> thanks, >> Calvin > From calvin.cheung at oracle.com Tue Jul 3 23:13:23 2018 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 03 Jul 2018 16:13:23 -0700 Subject: RFR(S): 8205548: Remove multi-release jar related vm code In-Reply-To: <4E9C114C-D48A-4A5C-A844-A6BAC13DA08D@oracle.com> References: <5B3BECC5.3040604@oracle.com> <4E9C114C-D48A-4A5C-A844-A6BAC13DA08D@oracle.com> Message-ID: <5B3C0313.2010600@oracle.com> Hi Jiangli, Thanks for your quick review. Calvin On 7/3/18, 3:09 PM, Jiangli Zhou wrote: > Looks good. 
> > Thanks, > Jiangli > >> On Jul 3, 2018, at 2:38 PM, Calvin Cheung wrote: >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8205548 >> >> webrev: http://cr.openjdk.java.net/~ccheung/8205548/webrev.00/ >> >> Ran hs-tier{1,2,3,4} tests. >> >> thanks, >> Calvin From david.holmes at oracle.com Wed Jul 4 06:53:04 2018 From: david.holmes at oracle.com (David Holmes) Date: Wed, 4 Jul 2018 16:53:04 +1000 Subject: RFC: more robust handling of terminated but still attached threads In-Reply-To: <8778103f-5730-344d-a671-a15f4fc5bfc8@oracle.com> References: <8778103f-5730-344d-a671-a15f4fc5bfc8@oracle.com> Message-ID: <8f27557c-4b35-b960-34a2-63f318166a52@oracle.com> After more experimentation and code scrutiny I could only find one place where this is actually a problem, so I will fix that under 820578. David On 3/07/2018 7:21 PM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 > > We hit asserts or trigger SEGVs when we try to operate on a native > thread ID for a JNI-attached thread that has actually terminated but > which did not detach first. It still appears in the threadsList and we > try to process it during DumpOnExit (but there are probably other > operations that could run into this in the general case). > > Fixing the tests is easy. But the more general question is how to make > the VM code more robust in the face of this situation. > > At the lowest level we can watch for ESRCH from pthread_* functions and > try to program in alternate logic that gives some "result" for that thread. > > At higher-level we may be able to heuristically guess that the native > thread has terminated and so skip it in ALL_JAVA_THREADS and similar > constructors. For example pthread_kill(t,0) can heuristically check if > 't' is not alive as it may return ESRCH. But of course if t terminated > then it is entirely possible that the pthread_t value for it has been > reused. 
And if t is not going to detach we could be racing with its > termination anyway - so the heuristic may pass and we still hit a > low-level assert or SEGV. > > What do people think? Do we try to deal with this at the bottom, or at > the top, or all the way through? (There's obviously a diminishing return > on effort versus benefit here.) > > Thanks, > David From goetz.lindenmaier at sap.com Wed Jul 4 08:21:02 2018 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 4 Jul 2018 08:21:02 +0000 Subject: RFR(M): 8203826: Chain class initialization exceptions into later NoClassDefFoundErrors In-Reply-To: <06ed3db2-e98c-014b-564a-6080dec06837@oracle.com> References: <06ed3db2-e98c-014b-564a-6080dec06837@oracle.com> Message-ID: <75e66ebc9ebe475d8c8fbcdba4722138@sap.com> Hi, Volker, thanks for improving on my original change and implementing David's and Karen's proposals. David, I think the change addresses a row of your concerns. > proposal - it just moves the "field" from the Java side to the VM side. No, the space overhead in case of successful initialization is reduced from O(classes) to O(classloaders). There is only runtime overhead if there was an error, and that should be acceptable. > dealing with backtrace and stackTrace. I have to wonder why nothing in > Throwable clears the backtrace today ? Maybe the concern about the backTraces is pointless and the conversion to stackTraces should be dropped. As you say, it's done nowhere else, and other backTraces should cause similar issues. > I'm not clear why you record the ExceptionInInitializerError wrapper > instead of the actual exception that occurred? Keeping the ExceptionInInitializerError is helpful for people to understand that this has happened during initialization and not directly where the NCDFE is thrown. They will understand that this might have happened in another thread. Best regards, Goetz. 
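For readers following the thread, the behaviour being discussed can be reproduced with a minimal sketch. The class names `InitFailureDemo` and `Broken` are hypothetical and are not taken from the webrev or its JTreg test; the sketch only relies on the rule in JVMS 5.5 that a class whose initializer failed is not initialized a second time. The first access should surface an ExceptionInInitializerError wrapping the original exception, and the second access a NoClassDefFoundError. Whether that second error carries a cause depends on the JDK in use, which is exactly the point under review here.

```java
import java.util.ArrayList;
import java.util.List;

public class InitFailureDemo {

    /** Hypothetical class whose static initializer always fails. */
    static class Broken {
        // Not a compile-time constant, so reading it triggers class initialization.
        static final int VALUE = failingInit();

        private static int failingInit() {
            int[] data = new int[1];
            return data[2]; // throws ArrayIndexOutOfBoundsException
        }
    }

    /** Throwables seen on the first and on the second access to Broken. */
    static final List<Throwable> OBSERVED = observe();

    private static List<Throwable> observe() {
        List<Throwable> seen = new ArrayList<>();
        for (int attempt = 0; attempt < 2; attempt++) {
            try {
                // Reading Broken.VALUE is expected to throw before anything is added.
                seen.add(new IllegalStateException(
                        "unexpected: Broken initialized, value " + Broken.VALUE));
            } catch (Throwable t) {
                seen.add(t);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        for (Throwable t : OBSERVED) {
            // The cause of the second error is the JDK-dependent part.
            System.out.println(t.getClass().getName() + ", cause: " + t.getCause());
        }
    }
}
```

Running `main` prints one line per access, so the presence or absence of a cause on the NoClassDefFoundError is directly visible for a given JDK build.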
> -----Original Message----- > From: core-libs-dev [mailto:core-libs-dev-bounces at openjdk.java.net] On > Behalf Of David Holmes > Sent: Sunday, 1 July 2018 23:48 > To: Volker Simonis ; hotspot-runtime- > dev at openjdk.java.net runtime ; > Java Core Libs > Subject: Re: RFR(M): 8203826: Chain class initialization exceptions into later > NoClassDefFoundErrors > > Hi Volker, > > This doesn't really address any of the concerns I had with the original > proposal - it just moves the "field" from the Java side to the VM side. > There is still a massive amount of Java code execution in relation to > this - which itself may encounter secondary exceptions. It's very hard > to tell if you will leave things in a suitable state if such exceptions > arise. > > My position remains that the primary place to deal with the > initialization error is when initialization occurs and the error > happens. Subsequent attempted uses of the erroneous class may benefit > from some additional information about the nature of the original > exceptions, but I don't think full stacktraces are necessary or > desirable (and I do believe they will confuse most users given the lack > of continuity in the stack frames and that they may have happened in a > different thread!). > > That aside ... > > There appears to be a race on constructing the Hashtable. At least it was > not obvious to me where a lock may be held during that process. > > I can't determine that clearing backtrace in removeNativeBacktrace is > correct with respect to the overall protocol within Throwable for > dealing with backtrace and stackTrace. I have to wonder why nothing in > Throwable clears the backtrace today? > > I'm not clear why you record the ExceptionInInitializerError wrapper > instead of the actual exception that occurred?
> > Throwable states: >
> + * This method is currently only called from the VM for instances of
> + * ExceptionInInitializerError which are stored for later chaining into a
> + * NoClassDefFoundError in order to prevent keeping classes from the native
> + * backtrace alive.
> + */
> > but IIUC it will also be called for instances of Error that occurred > which do not get wrapped in EIIE. > > > Regards, > David > ------ > > > On 30/06/2018 12:53 AM, Volker Simonis wrote: > > Hi, > > > > can I please have a review for the following change which saves > > ExceptionInInitializerError thrown during class initialization and > > chains them as cause into potential NoClassDefFoundErrors for the same > > class. We have been using this feature for years in our commercial SAP > > JVM and it has proved extremely useful for detecting and fixing errors > > especially in big deployments. > > > > This is a follow-up on a discussion previously started by Goetz [1]. > > His first proposal (which is close to our current, internal > > implementation) inserted an additional field into java.lang.Class > > objects to save potential ExceptionInInitializerErrors. This was > > considered too much overhead in the initial discussion [1]. > > > > http://cr.openjdk.java.net/~simonis/webrevs/2018/8203826.v2/ > > https://bugs.openjdk.java.net/browse/JDK-8203826 > > > > So in this change, I've completely re-implemented the feature by using > > a java.util.Hashtable which is attached to the ClassLoaderData object. > > The Hashtable is lazily created when the first > > ExceptionInInitializerError is thrown and maps the Class which > > triggered the ExceptionInInitializerError during the execution of its > > static initializer to the corresponding ExceptionInInitializerError. > > > > If the same class is accessed again, this will directly lead > > to a plain NoClassDefFoundError (as per the JVMS, 5.5 Initialization) > > because the static initializer won't be executed a second time. Until > > now, this NoClassDefFoundError wasn't linked in any way to the root > > cause of the problem (i.e. the first ExceptionInInitializerError > > together with the chained exception that happened during the execution > > of the static initializer). With this change, the NoClassDefFoundError > > will now chain the initial ExceptionInInitializerError as cause, > > making it much easier to detect the problem which led to the > > NoClassDefFoundError. > > > > Following is an example from the new JTreg tests which come with > > this change to demonstrate the feature. Until now, a typical stack > > trace from a NoClassDefFoundError looked as follows: > >
> > java.lang.NoClassDefFoundError: Could not initialize class NoClassDefFound$ClassWithFailedInitializer
> >   at java.base/java.lang.Class.forName0(Native Method)
> >   at java.base/java.lang.Class.forName(Class.java:291)
> >   at NoClassDefFound.main(NoClassDefFound.java:38)
> > > > With this change, the same stack trace now looks as follows: > >
> > java.lang.NoClassDefFoundError: Could not initialize class NoClassDefFound$ClassWithFailedInitializer
> >   at java.base/java.lang.Class.forName0(Native Method)
> >   at java.base/java.lang.Class.forName(Class.java:315)
> >   at NoClassDefFound.main(NoClassDefFound.java:38)
> > Caused by: java.lang.ExceptionInInitializerError
> >   at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> >   at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> >   at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> >   at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
> >   at java.base/java.lang.Class.newInstance(Class.java:584)
> >   at NoClassDefFound$ClassWithFailedInitializer.<clinit>(NoClassDefFound.java:20)
> >   at java.base/java.lang.Class.forName0(Native Method)
> >   at java.base/java.lang.Class.forName(Class.java:315)
> >   at NoClassDefFound.main(NoClassDefFound.java:30)
> > Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 2 out of bounds for length 1
> >   at NoClassDefFound$A.<clinit>(NoClassDefFound.java:9)
> >   ... 9 more
> > > > As you can see, the reason for the NoClassDefFoundError when accessing > > the class 'NoClassDefFound$ClassWithFailedInitializer' is actually not > > even in the class or its static initializer itself, but in the class > > 'NoClassDefFound$A' which is a base class of > > 'NoClassDefFound$ClassWithFailedInitializer'. This is not easily > > detectable from the old, plain NoClassDefFoundError. > > > > As I wrote, the only overhead we have with the new implementation is > > an additional OopHandle field per ClassLoaderData which I think is > > acceptable. The Hashtable object itself is only created lazily, after > > the first occurrence of an ExceptionInInitializerError in the > > corresponding class loader. The whole Hashtable creation and > > storing/querying of ExceptionInInitializerErrors in > > ClassLoaderData::record_init_exception()/ClassLoaderData::query_init_exception() > > is optional in the sense that any errors/exceptions occurring during > > the execution of these functions are ignored and cleared. > > > > Finally, we also take care to recursively convert all native > > backtraces in the stored ExceptionInInitializerErrors (and their > > suppressed and chained exceptions) into symbolic stack traces in order > > to avoid holding references to classes and prevent their unloading. > > This is implemented in the new private, static method > > java.lang.Throwable::removeNativeBacktrace() which is called for each > > ExceptionInInitializerError before it is stored in the Hashtable.
> > > > Thank you and best regards, > > Volker > > > > [1] http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2018- > June/028310.html > > From peter.levart at gmail.com Wed Jul 4 10:39:20 2018 From: peter.levart at gmail.com (Peter Levart) Date: Wed, 4 Jul 2018 12:39:20 +0200 Subject: RFR(M): 8203826: Chain class initialization exceptions into later NoClassDefFoundErrors In-Reply-To: <75e66ebc9ebe475d8c8fbcdba4722138@sap.com> References: <06ed3db2-e98c-014b-564a-6080dec06837@oracle.com> <75e66ebc9ebe475d8c8fbcdba4722138@sap.com> Message-ID: <593a5eb1-b268-44d5-c9e8-90e082f967f9@gmail.com> Hi Volker, It occurred to me that getting rid of backtrace-s of cause(s)/suppressed exception(s) might not be enough to prevent ClassLoader leaks... On 07/04/2018 10:21 AM, Lindenmaier, Goetz wrote: >> dealing with backtrace and stackTrace. I have to wonder why nothing in >> Throwable clears the backtrace today ? > Maybe the concern about the backTraces is pointless and the > conversion to stackTraces should be dropped. > As you say, it's done nowhere else, and other backTraces should > cause similar issues. > Exception objects are typically not retained for longer periods. They are normally caught, dumped to log and let gone. This change retains exception(s) so that they are reachable from a ClassLoader that loaded the failed class. It could happen that the chain of cause(s)/suppressed exception(s) of some ExceptionInInitializerError is an exception object of a class that is loaded by some child ClassLoader of the ClassLoader that loaded the failed class. Such child ClassLoader would have leaked. 
The solution would be to replace the chain of cause(s)/suppressed exception(s) with a chain of replacement exception objects like this one (this would also take care of backtraces of original exceptions as it would not retain the original exceptions at all):

/**
 * A {@link RuntimeException} that acts as a substitute for the original exception
 * (checked or unchecked) and mimics the original exception in every aspect except its type.
 */
public class ExceptionSubstitute extends RuntimeException {
    private static final long serialVersionUID = 1;

    private String originalExceptionClassName, localizedMessage;

    public ExceptionSubstitute(Throwable originalException) {
        super(originalException.getMessage());
        this.originalExceptionClassName = originalException.getClass().getName();
        this.localizedMessage = originalException.getLocalizedMessage();
        // substitute originalException's cause
        Throwable cause = originalException.getCause();
        initCause(cause == null ? null : new ExceptionSubstitute(cause));
        // substitute originalException's suppressed exceptions if any
        for (Throwable suppressed : originalException.getSuppressed()) {
            addSuppressed(new ExceptionSubstitute(suppressed));
        }
        // inherit stack trace elements from originalException
        setStackTrace(originalException.getStackTrace());
    }

    @Override
    public Throwable fillInStackTrace() {
        // don't need our backtrace - will inherit stack trace elements from originalException
        return this;
    }

    @Override
    public String getLocalizedMessage() {
        return localizedMessage;
    }

    /**
     * @return the class name of the original exception for which this exception is a substitute
     */
    public String getOriginalExceptionClassName() {
        return originalExceptionClassName;
    }

    /**
     * Emulate toString() method as if called upon originalException
     */
    @Override
    public String toString() {
        String message = getLocalizedMessage();
        return (message != null)
               ? (getOriginalExceptionClassName() + ": " + message)
               : getOriginalExceptionClassName();
    }
}

Regards, Peter From thomas.stuefe at gmail.com Wed Jul 4 15:15:38 2018 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 4 Jul 2018 17:15:38 +0200 Subject: RFC: more robust handling of terminated but still attached threads In-Reply-To: <8778103f-5730-344d-a671-a15f4fc5bfc8@oracle.com> References: <8778103f-5730-344d-a671-a15f4fc5bfc8@oracle.com> Message-ID: On Tue, Jul 3, 2018 at 11:21 AM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 > > We hit asserts or trigger SEGVs when we try to operate on a native thread ID > for a JNI-attached thread that has actually terminated but which did not > detach first. It still appears in the threadsList and we try to process it > during DumpOnExit (but there are probably other operations that could run > into this in the general case). > > Fixing the tests is easy. But the more general question is how to make the > VM code more robust in the face of this situation. > > At the lowest level we can watch for ESRCH from pthread_* functions and try > to program in alternate logic that gives some "result" for that thread. > > At higher-level we may be able to heuristically guess that the native thread > has terminated and so skip it in ALL_JAVA_THREADS and similar constructors. > For example pthread_kill(t,0) can heuristically check if 't' is not alive as > it may return ESRCH. But of course if t terminated then it is entirely > possible that the pthread_t value for it has been reused. And if t is not > going to detach we could be racing with its termination anyway - so the > heuristic may pass and we still hit a low-level assert or SEGV. > > What do people think? Do we try to deal with this at the bottom, or at the > top, or all the way through?
(There's obviously a diminishing return on > effort versus benefit here.) I think handling ESRCH in a couple of pthread APIs at the bottom and having a mechanism to quietly shoo away Thread objects for dead threads would cover most cases. I also would only do this for threads attached from the outside. I can see that reused pthread IDs of dead JNI threads could be a problem, but I do not see a cheap solution for that. And this is a JNI coding error, right? If we really are worried about this kind of problem, we can have a watcher task periodically probing the threads in the threadlist to check if their pthread_t s result in ESRCH - if you do this periodically, you reduce the chance of reuse. The period could even be adjustable, for analysis reasons (if you suspect this kind of error, decrease test period time) But I would only do this if really necessary. Just my 5 cent. ..Thomas > > Thanks, > David From david.holmes at oracle.com Wed Jul 4 23:01:34 2018 From: david.holmes at oracle.com (David Holmes) Date: Thu, 5 Jul 2018 09:01:34 +1000 Subject: RFR(M): 8203826: Chain class initialization exceptions into later NoClassDefFoundErrors In-Reply-To: <75e66ebc9ebe475d8c8fbcdba4722138@sap.com> References: <06ed3db2-e98c-014b-564a-6080dec06837@oracle.com> <75e66ebc9ebe475d8c8fbcdba4722138@sap.com> Message-ID: On 4/07/2018 6:21 PM, Lindenmaier, Goetz wrote: > Hi, > > Volker, thanks for improving on my original change and > implementing David's and Karen's proposals. > > David, I think the change addresses a row of your concerns. > >> proposal - it just moves the "field" from the Java side to the VM side. > No, the space overhead in case of successful initialization is > reduced from O(classes) to O(classloaders). I don't think space usage was one of my concerns. > There is only runtime overhead if there was an error, and > that should be acceptable. > >> dealing with backtrace and stackTrace. I have to wonder why nothing in >> Throwable clears the backtrace today ? 
> Maybe the concern about the backTraces is pointless and the > conversion to stackTraces should be dropped. > As you say, it's done nowhere else, and other backTraces should > cause similar issues. My comment on the clearing of Throwable.backtrace was purely in relation to the internal state protocol of Throwable, not about the conversion process as such. The conversion process is needed, as previously discussed, to avoid class/classloader leaks. Exceptions are normally thread-contained so there is limited scope for creating class/classloader leaks. However as soon as you allow this original exception to be stored and potentially later accessed in a different thread then the leak is possible - it might even be possible to hit some kind of loader constraint error if a same named type can be loaded by different classloaders used by the different threads. Add some extra information about the type of the original exception by all means, but not the stacktrace (unless you've stringified it). As I say the place the stack is of interest in when the original exception is encountered. If something is swallowing this then logging is the tool to expose it IMHO. >> I'm not clear why you record the ExceptionInInitializerError wrapper >> instead of the actual exception that occurred? > Keeping the ExceptionInInitializerError is helpful for people > to understand that this has happened during initialization and not directly > where the NCDFE is thrown. They will understand that this might have > happened in another thread. The NCDFE message already states something like "cannot initialize class because prior initialization attempt failed", which I think makes it quite clear. I dispute "they will understand this might have happened in another thread". Cheers, David > Best regards, > Goetz. > > > > > > >> -----Original Message----- >> From: core-libs-dev [mailto:core-libs-dev-bounces at openjdk.java.net] On >> Behalf Of David Holmes >> Sent: Sonntag, 1. 
Juli 2018 23:48 >> To: Volker Simonis ; hotspot-runtime- >> dev at openjdk.java.net runtime ; >> Java Core Libs >> Subject: Re: RFR(M): 8203826: Chain class initialization exceptions into later >> NoClassDefFoundErrors >> >> Hi Volker, >> >> This doesn't really address any of the concerns I had with the original >> proposal - it just moves the "field" from the Java side to the VM side. >> There is still a massive amount of Java code execution in relation to >> this - which itself may encounter secondary exceptions. It's very hard >> to tell if you will leave things in a suitable state if such exceptions >> arise. >> >> My position remains that the primary place to deal with the >> initialization error is when initialization occurs and the error >> happens. Subsequent attempted uses of the erroneous class may benefit >> from some additional information about the nature of the original >> exceptions, but I don't think full stacktraces are necessary or >> desirable (and I do believe they will confuse most users given the lack >> of continuity in the stack frames and that they may have happened in a >> different thread!). >> >> That aside ... >> >> There appears to a race on constructing the Hashtable. At least it was >> not obvious to me where a lock may be held during that process. >> >> I can't determine that clearing backtrace in removeNativeBacktrace is >> correct with respect to the overall protocol within Throwable for >> dealing with backtrace and stackTrace. I have to wonder why nothing in >> Throwable clears the backtrace today ? >> >> I'm not clear why you record the ExceptionInInitializerError wrapper >> instead of the actual exception that occurred? >> >> Throwable states: >> >> + * This method is currently only called from the VM for instances of >> + * ExceptionInInitializerError which are stored for later chaining >> into a >> + * NoClassDefFoundError in order to prevent keeping classes from >> the native >> + * backtrace alive. 
>> + */ >> >> but IIUC it will also be called for instances of Error that occurred >> which do not get wrapped in EIIE. >> >> >> Regards, >> David >> ------ >> >> >> On 30/06/2018 12:53 AM, Volker Simonis wrote: >>> Hi, >>> >>> can I please have a review for the following change which saves >>> ExceptionInInitializerError thrown during class initialization and >>> chains them as cause into potential NoClassDefFoundErrors for the same >>> class. We are using this features since years in our commercial SAP >>> JVM and it proved extremely useful for detecting and fixing errors >>> especially in big deployments. >>> >>> This is a follow-up on a discussion previously started by Goetz [1]. >>> His first proposal (which is close to our current, internal >>> implementation) inserted an additional field into java.lang.Class >>> objects to save potential ExceptionInInitializerErrors. This was >>> considered to much overhead in the initial discussion [1]. >>> >>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8203826.v2/ >>> https://bugs.openjdk.java.net/browse/JDK-8203826 >>> >>> So in this change, I've completely re-implemented the feature by using >>> a java.lang.Hashtable which is attached to the ClassLoaderData object. >>> The Hashtable is lazily created when the first >>> ExceptionInInitializerError is thrown and maps the Class which >>> triggered the ExceptionInInitializerError during the execution of its >>> static initializer to the corresponding ExceptionInInitializerError. >>> >>> If the same class will be accessed once again, this will directly lead >>> to a plain NoClassDefFoundError (as per the JVMS, 5.5 Initialization) >>> because the static initializer won't be executed a second time. Until >>> now, this NoClassDefFoundError wasn't linked in any way to the root >>> cause of the problem (i.e. the first ExceptionInInitializerError >>> together with the chained exception that happened during the execution >>> of the static initializer). 
With this change, the NoClassDefFoundError >>> will now chain the initial ExceptionInInitializerError as cause, >>> making it much easier to detect the problem which lead to the >>> NoClassDefFoundError. >>> >>> Following is an example from the new JTreg tests which comes which >>> this change to demonstrate the feature. Until know, a typical stack >>> trace from a NoClassDefFoundError looked as follows: >>> >>> java.lang.NoClassDefFoundError: Could not initialize class >>> NoClassDefFound$ClassWithFailedInitializer >>> at java.base/java.lang.Class.forName0(Native Method) >>> at java.base/java.lang.Class.forName(Class.java:291) >>> at NoClassDefFound.main(NoClassDefFound.java:38) >>> >>> With this change, the same stack trace now looks as follows: >>> >>> java.lang.NoClassDefFoundError: Could not initialize class >>> NoClassDefFound$ClassWithFailedInitializer >>> at java.base/java.lang.Class.forName0(Native Method) >>> at java.base/java.lang.Class.forName(Class.java:315) >>> at NoClassDefFound.main(NoClassDefFound.java:38) >>> Caused by: java.lang.ExceptionInInitializerError >>> at >> java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0( >> Native >>> Method) >>> at >> java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance( >> NativeConstructorAccessorImpl.java:62) >>> at >> java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstan >> ce(DelegatingConstructorAccessorImpl.java:45) >>> at >> java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490) >>> at java.base/java.lang.Class.newInstance(Class.java:584) >>> at >> NoClassDefFound$ClassWithFailedInitializer.(NoClassDefFound.java:2 >> 0) >>> at java.base/java.lang.Class.forName0(Native Method) >>> at java.base/java.lang.Class.forName(Class.java:315) >>> at NoClassDefFound.main(NoClassDefFound.java:30) >>> Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 2 out of >>> bounds for length 1 >>> at NoClassDefFound$A.(NoClassDefFound.java:9) 
>>> ... 9 more >>> >>> As you can see, the reason for the NoClassDefFoundError when accessing >>> the class 'NoClassDefFound$ClassWithFailedInitializer' is actually not >>> even in the class or its static initializer itself, but in the class >>> 'NoClassDefFound$A' which is a base class of >>> 'NoClassDefFound$ClassWithFailedInitializer'. This is not easily >>> detectible from the old, plain NoClassDefFoundError. >>> >>> As I wrote, the only overhead we have with the new implementation is >>> an additional OopHandle field per ClassLoaderData which I think is >>> acceptable. The Hashtable object itself is only created lazily, after >>> the first occurrence of an ExceptionInInitializerError in the >>> corresponding class loader. The whole Hashtable creation and >>> storing/quering of ExceptionInInitializerErrors in >>> >> ClassLoaderData::record_init_exception()/ClassLoaderData::query_init_exce >> ption() >>> is optional in the sense that any errors/exceptions occurring during >>> the execution of these functions are ignored and cleared. >>> >>> Finally, we also take care to recursively convert all native >>> backtraces in the stored ExceptionInInitializerErrors (and their >>> suppressed and chained exceptions) into symbolic stack traces in order >>> to avoid holding references to classes and prevent their unloading. >>> This is implemented in the new private, static method >>> java.lang.Throwable::removeNativeBacktrace() which is called for each >>> ExceptionInInitializerError before it is stored in the Hashtable. 
>>> >>> Thank you and best regards, >>> Volker >>> >>> [1] http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2018- >> June/028310.html >>> From david.holmes at oracle.com Thu Jul 5 08:19:17 2018 From: david.holmes at oracle.com (David Holmes) Date: Thu, 5 Jul 2018 18:19:17 +1000 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code Message-ID: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ Problem: The tests create native threads that attach to the VM through JNI_AttachCurrentThread but which then terminate without detaching themselves. When the VM exits and we're using Flight Recorder "dumponexit" this leads to a call to VM_PrintThreads that in part wants to print the per-thread CPU usage. When we encounter the threads that have already terminated, the low-level pthread_getcpuclockid call returns ESRCH, but the code doesn't expect that and so fails an assert in debug mode and can SEGV in product mode. Solution: Serviceability-side: fix the tests Change the tests so that the threads detach before terminating. The two tests are (surprisingly) written in completely different styles, so the solution also takes on two different styles. Runtime-side: make the VM more robust in the face of JNI attached threads that terminate before detaching, and add a regression test I took a good look at the low-level code for interacting with arbitrary threads and as far as I can see the problem only exists for this one case of pthread_getcpuclockid on Linux. Elsewhere the potential for a library call failure just reports an error value (such as -1 for the cpu time used). So the fix is simply to allow for ESRCH when calling pthread_getcpuclockid and return -1 for the cpu usage in that case. I created a new regression test to create a new native thread, attach it and then let it terminate while still attached.
The java code then calls various Thread and ThreadMXBean functions on it to ensure there are no crashes or unexpected exceptions. Testing: - old tests with fixed run-time - old run-time with fixed tests - mach tier4 (which exposed the problem - that's where we enable Flight recorder for the tests) [in progress] - mach5 tier 1-3 for good measure [in progress] - new regression test Thanks, David From david.holmes at oracle.com Thu Jul 5 09:58:39 2018 From: david.holmes at oracle.com (David Holmes) Date: Thu, 5 Jul 2018 19:58:39 +1000 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> Message-ID: Solaris compiler complains about doing a return from inside a do-while loop. I'll have to rework part of the fix tomorrow. David On 5/07/2018 6:19 PM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 > Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ > > Problem: > > The tests create native threads that attach to the VM through > JNI_AttachCurrentThread but which then terminate without detaching > themselves. When the VM exits and we're using Flight Recorder > "dumponexit" this leads to a call to VM_PrintThreads that in part wants > to print the per-thread CPU usage. When we encounter the threads that > have terminated already the low level pthread_getcpuclockid calls > returns ESRCH but the code doesn't expect that and so fails an assert in > debug mode and can SEGV in product mode. > > Solution: > > Serviceability-side: fix the tests > > Change the tests so that the threads detach before terminating. The two > tests are (surprisingly) written in completely different styles, so the > solution also takes on two different styles. 
> > Runtime-side: make the VM more robust in the fact of JNI attached > threads that terminate before detaching, and add a regression test > > I took a good look at the low-level code for interacting with arbitrary > threads and as far as I can see the problem only exists for this one > case of pthread_getcpuclockid on Linux. Elsewhere the potential for a > library call failure just reports an error value (such as -1 for the cpu > time used). > > So the fix is simply to allow for ESRCH when calling > pthread_getcpuclockid and return -1 for the cpu usage in that case. > > I created a new regression test to create a new native thread, attach it > and then let it terminate while still attached. The java code then calls > various Thread and ThreadMXBean functions on it to ensure there are no > crashes or unexpected exceptions. > > Testing: >  - old tests with fixed run-time >  - old run-time with fixed tests >  - mach tier4 (which exposed the problem - that's where we enable > Flight recorder for the tests) [in progress] >  - mach5 tier 1-3 for good measure [in progress] >  - new regression test > > Thanks, > David From harold.seigel at oracle.com Thu Jul 5 17:44:39 2018 From: harold.seigel at oracle.com (Harold David Seigel) Date: Thu, 5 Jul 2018 13:44:39 -0400 Subject: RFR 8203911: Test runtime/modules/getModuleJNI/GetModule fails with -Xcheck:jni Message-ID: <949fe16c-1271-fc83-abd1-c31f113bad96@oracle.com> Hi, Please review this small JDK-12 fix for bug JDK-8203911. The change contains the fix suggested in the bug. Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8203911/webrev/ JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8203911 This fix was tested by running the failing test with -Xcheck:jni. Regression testing included Mach5 tiers 1 and 2 tests and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, with tiers 3-5 tests on Linux-x64, and with JCK-11 Lang and VM tests.
Thanks, Harold From coleen.phillimore at oracle.com Thu Jul 5 20:00:20 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 5 Jul 2018 16:00:20 -0400 Subject: [12] RFR (S) 8205417: Obsolete UnlinkSymbolsALot debugging option Message-ID: Summary: Obsolete and remove support for UnlinkSymbolsALot open webrev at http://cr.openjdk.java.net/~coleenp/8205417.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8205417 Tested manually, with test/hotspot/jtreg/runtime/CommandLine, and mach5 hs-tier1,2. $ java -XX:+UnlinkSymbolsALot -version Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option UnlinkSymbolsALot; support was removed in 12.0 java version "12-internal" 2019-03-19 Java(TM) SE Runtime Environment 19.3 (fastdebug build 12-internal+0-2018-07-05-1710252.coleen.12unlink-symbols) Java HotSpot(TM) 64-Bit Server VM 19.3 (fastdebug build 12-internal+0-2018-07-05-1710252.coleen.12unlink-symbols, mixed mode) Thanks, Coleen From harold.seigel at oracle.com Thu Jul 5 20:34:28 2018 From: harold.seigel at oracle.com (Harold David Seigel) Date: Thu, 5 Jul 2018 16:34:28 -0400 Subject: Re: [12] RFR (S) 8205417: Obsolete UnlinkSymbolsALot debugging option In-Reply-To: References: Message-ID: <994635ef-6165-4fc3-ee7e-a234958f1aba@oracle.com> Hi Coleen, This change looks good! Thanks, Harold On 7/5/2018 4:00 PM, coleen.phillimore at oracle.com wrote: > Summary: Obsolete and remove support for UnlinkSymbolsALot > > open webrev at http://cr.openjdk.java.net/~coleenp/8205417.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8205417 > > Tested manually, with test/hotspot/jtreg/runtime/CommandLine, and > mach5 hs-tier1,2.
> > $ java -XX:+UnlinkSymbolsALot -version > Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option > UnlinkSymbolsALot; support was removed in 12.0 > java version "12-internal" 2019-03-19 > Java(TM) SE Runtime Environment 19.3 (fastdebug build > 12-internal+0-2018-07-05-1710252.coleen.12unlink-symbols) > Java HotSpot(TM) 64-Bit Server VM 19.3 (fastdebug build > 12-internal+0-2018-07-05-1710252.coleen.12unlink-symbols, mixed mode) > > Thanks, > Coleen From coleen.phillimore at oracle.com Thu Jul 5 21:19:55 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 5 Jul 2018 17:19:55 -0400 Subject: [12] RFR (S) 8202737: Obsolete AllowNonVirtualCalls option Message-ID: Summary: obsolete option and remove support. open webrev at http://cr.openjdk.java.net/~coleenp/8202737.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8202737 % java -XX:-AllowNonVirtualCalls -version Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option AllowNonVirtualCalls; support was removed in 12.0 java version "12-internal" 2019-03-19 Java(TM) SE Runtime Environment 19.3 (fastdebug build 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv) Java HotSpot(TM) 64-Bit Server VM 19.3 (fastdebug build 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv, mixed mode) Ran mach5 hs-tier1,2 and 3. Thanks, Coleen From chris.plummer at oracle.com Thu Jul 5 21:55:36 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 5 Jul 2018 14:55:36 -0700 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> Message-ID: Hi David, Solaris problems aside, overall it looks fine. Some minor things I noted: I noticed that exitCode is never modified in agentA() or agentB(), so there isn't much point to having it. If you reach the bottom of the function, it passed, so PASSED can be returned. The code would be more clear if it did this. 
As-is it is implied that you can reach the bottom when it fails. Is detaching the threads along the failure paths really needed? exit() is called, so this would seem to make it unnecessary. I prefer assignments not to be embedded inside the "if" condition. The DetachCurrentThread code in THREAD_return() is much more readable than the similar code in agentA() and agentB(). In the test: 54 // Generally as long as we don't crash of throw unexpected 55 // exceptions then the test passes. In some cases we know exactly "of" should be "or". Shouldn't you be catching exceptions for all the Thread methods you are calling? Otherwise the test will exit if one is thrown, and the above comment indicates that you don't want this. Don't we normally put these tests in a package? thanks, Chris On 7/5/18 2:58 AM, David Holmes wrote: > Solaris compiler complains about doing a return from inside a > do-while loop. I'll have to rework part of the fix tomorrow. > > David > > On 5/07/2018 6:19 PM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >> >> Problem: >> >> The tests create native threads that attach to the VM through >> JNI_AttachCurrentThread but which then terminate without detaching >> themselves. When the VM exits and we're using Flight Recorder >> "dumponexit" this leads to a call to VM_PrintThreads that in part >> wants to print the per-thread CPU usage. When we encounter the >> threads that have terminated already the low level >> pthread_getcpuclockid calls returns ESRCH but the code doesn't expect >> that and so fails an assert in debug mode and can SEGV in product mode. >> >> Solution: >> >> Serviceability-side: fix the tests >> >> Change the tests so that the threads detach before terminating. The >> two tests are (surprisingly) written in completely different styles, >> so the solution also takes on two different styles.
>> >> Runtime-side: make the VM more robust in the fact of JNI attached >> threads that terminate before detaching, and add a regression test >> >> I took a good look at the low-level code for interacting with >> arbitrary threads and as far as I can see the problem only exists for >> this one case of pthread_getcpuclockid on Linux. Elsewhere the >> potential for a library call failure just reports an error value >> (such as -1 for the cpu time used). >> >> So the fix is simply to allow for ESRCH when calling >> pthread_getcpuclockid and return -1 for the cpu usage in that case. >> >> I created a new regression test to create a new native thread, attach >> it and then let it terminate while still attached. The java code then >> calls various Thread and ThreadMXBean functions on it to ensure there >> are no crashes or unexpected exceptions. >> >> Testing: >>   - old tests with fixed run-time >>   - old run-time with fixed tests >>   - mach tier4 (which exposed the problem - that's where we enable >> Flight recorder for the tests) [in progress] >>   - mach5 tier 1-3 for good measure [in progress] >>   - new regression test >> >> Thanks, >> David From david.holmes at oracle.com Thu Jul 5 22:18:52 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Jul 2018 08:18:52 +1000 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> Message-ID: <5752c0cc-f2bd-ed5c-0579-51ed639ee4cb@oracle.com> On 5/07/2018 7:58 PM, David Holmes wrote: > Solaris compiler complains about doing a return from inside a > do-while loop. I'll have to rework part of the fix tomorrow. Webrev updated in-place.
The only change is to the makefile to disable a warning: + ifeq ($(TOOLCHAIN_TYPE), solstudio) + BUILD_HOTSPOT_JTREG_LIBRARIES_CFLAGS_libji06t001 += -erroff=E_END_OF_LOOP_CODE_NOT_REACHED + endif + David ----- > David > > On 5/07/2018 6:19 PM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >> >> Problem: >> >> The tests create native threads that attach to the VM through >> JNI_AttachCurrentThread but which then terminate without detaching >> themselves. When the VM exits and we're using Flight Recorder >> "dumponexit" this leads to a call to VM_PrintThreads that in part >> wants to print the per-thread CPU usage. When we encounter the threads >> that have terminated already the low level pthread_getcpuclockid calls >> returns ESRCH but the code doesn't expect that and so fails an assert >> in debug mode and can SEGV in product mode. >> >> Solution: >> >> Serviceability-side: fix the tests >> >> Change the tests so that the threads detach before terminating. The >> two tests are (surprisingly) written in completely different styles, >> so the solution also takes on two different styles. >> >> Runtime-side: make the VM more robust in the fact of JNI attached >> threads that terminate before detaching, and add a regression test >> >> I took a good look at the low-level code for interacting with >> arbitrary threads and as far as I can see the problem only exists for >> this one case of pthread_getcpuclockid on Linux. Elsewhere the >> potential for a library call failure just reports an error value (such >> as -1 for the cpu time used). >> >> So the fix is simply to allow for ESRCH when calling >> pthread_getcpuclockid and return -1 for the cpu usage in that case. >> >> I created a new regression test to create a new native thread, attach >> it and then let it terminate while still attached. 
The java code then >> calls various Thread and ThreadMXBean functions on it to ensure there >> are no crashes or unexpected exceptions. >> >> Testing: >>   - old tests with fixed run-time >>   - old run-time with fixed tests >>   - mach tier4 (which exposed the problem - that's where we enable >> Flight recorder for the tests) [in progress] >>   - mach5 tier 1-3 for good measure [in progress] >>   - new regression test >> >> Thanks, >> David From david.holmes at oracle.com Thu Jul 5 22:40:06 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Jul 2018 08:40:06 +1000 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> Message-ID: <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> Hi Chris, Thanks for looking at this. Updated webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ Only real changes in ji05t001.c. (And fixed typo in the new test) More below ... On 6/07/2018 7:55 AM, Chris Plummer wrote: > Hi David, > > Solaris problems aside, overall it looks fine. Some minor things I noted: > > I noticed that exitCode is never modified in agentA() or agentB(), so > there isn't much point to having it. If you reach the bottom of the > function, it passed, so PASSED can be returned. The code would be more > clear if it did this. As-is it is implied that you can reach the bottom > when it fails. I resisted any and all urges to do any kind of unrelated code cleanup in the tests - once you start you may end up doing a full rewrite. > Is detaching the threads along the failure paths really needed? exit() > is called, so this would seem to make it unnecessary. You're right that isn't necessary. I'll remove the changes from before the exits in ji05t001.c > I prefer assignments not to be embedded inside the "if" condition. The > DetachCurrentThread code in THREAD_return() is much more readable than > the similar code in agentA() and agentB().
It's an existing style already used in that test e.g. if ((res = JNI_ENV_PTR(vm)->AttachCurrentThread(JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) { and I don't mind it, so I'd prefer not to change it. > In the test: > >   54         // Generally as long as we don't crash of throw unexpected >   55         // exceptions then the test passes. In some cases we know > exactly > > "of" should be "or". Well spotted. Thanks. > Shouldn't you be catching exceptions for all the Thread methods you are > calling? Otherwise the test will exit if one is thrown, and the above > comment indicates that you don't want this. I'm not expecting there to be any exceptions from any of the called methods. That would potentially indicate a problem in handling the terminated native thread, so would indicate a test failure. > Don't we normally put these tests in a package? Doesn't seem to be any hard and fast rule. I only use packages when they are important for the test. In runtime we have 905 java files and only 116 have a package statement. It varies elsewhere. Thanks, David > thanks, > > Chris > > On 7/5/18 2:58 AM, David Holmes wrote: >> Solaris compiler complains about doing a return from inside a >> do-while loop. I'll have to rework part of the fix tomorrow. >> >> David >> >> On 5/07/2018 6:19 PM, David Holmes wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>> >>> Problem: >>> >>> The tests create native threads that attach to the VM through >>> JNI_AttachCurrentThread but which then terminate without detaching >>> themselves. When the VM exits and we're using Flight Recorder >>> "dumponexit" this leads to a call to VM_PrintThreads that in part >>> wants to print the per-thread CPU usage.
When we encounter the >>> threads that have terminated already the low level >>> pthread_getcpuclockid calls returns ESRCH but the code doesn't expect >>> that and so fails an assert in debug mode and can SEGV in product mode. >>> >>> Solution: >>> >>> Serviceability-side: fix the tests >>> >>> Change the tests so that the threads detach before terminating. The >>> two tests are (surprisingly) written in completely different styles, >>> so the solution also takes on two different styles. >>> >>> Runtime-side: make the VM more robust in the fact of JNI attached >>> threads that terminate before detaching, and add a regression test >>> >>> I took a good look at the low-level code for interacting with >>> arbitrary threads and as far as I can see the problem only exists for >>> this one case of pthread_getcpuclockid on Linux. Elsewhere the >>> potential for a library call failure just reports an error value >>> (such as -1 for the cpu time used). >>> >>> So the fix is simply to allow for ESRCH when calling >>> pthread_getcpuclockid and return -1 for the cpu usage in that case. >>> >>> I created a new regression test to create a new native thread, attach >>> it and then let it terminate while still attached. The java code then >>> calls various Thread and ThreadMXBean functions on it to ensure there >>> are no crashes or unexpected exceptions. >>> >>> Testing: >>> ??- old tests with fixed run-time >>> ??- old run-time with fixed tests >>> ??- mach tier4 (which exposed the problem - that's where we enable >>> Flight recorder for the tests) [in progress] >>> ??- mach5 tier 1-3 for good measure [in progress] >>> ??- new regression test >>> >>> Thanks, >>> David > > > From david.holmes at oracle.com Thu Jul 5 22:46:24 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Jul 2018 08:46:24 +1000 Subject: [12] RFR (S) 8202737: Obsolete AllowNonVirtualCalls option In-Reply-To: References: Message-ID: Looks good Coleen - thanks. 
David

On 6/07/2018 7:19 AM, coleen.phillimore at oracle.com wrote:
> Summary: obsolete option and remove support.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8202737.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8202737
>
> % java -XX:-AllowNonVirtualCalls -version
> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option
> AllowNonVirtualCalls; support was removed in 12.0
> java version "12-internal" 2019-03-19
> Java(TM) SE Runtime Environment 19.3 (fastdebug build
> 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv)
> Java HotSpot(TM) 64-Bit Server VM 19.3 (fastdebug build
> 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv, mixed mode)
>
> Ran mach5 hs-tier1,2 and 3.
>
> Thanks,
> Coleen

From david.holmes at oracle.com Thu Jul 5 22:57:05 2018
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Jul 2018 08:57:05 +1000
Subject: Re: [12] RFR (S) 8205417: Obsolete UnlinkSymbolsALot debugging option
In-Reply-To:
References:
Message-ID:

Looks good. Thanks Coleen.

David

On 6/07/2018 6:00 AM, coleen.phillimore at oracle.com wrote:
> Summary: Obsolete and remove support for UnlinkSymbolsALot
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8205417.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8205417
>
> Tested manually, with test/hotspot/jtreg/runtime/CommandLine, and mach5
> hs-tier1,2.
> > $ java -XX:+UnlinkSymbolsALot -version > Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option > UnlinkSymbolsALot; support was removed in 12.0 > java version "12-internal" 2019-03-19 > Java(TM) SE Runtime Environment 19.3 (fastdebug build > 12-internal+0-2018-07-05-1710252.coleen.12unlink-symbols) > Java HotSpot(TM) 64-Bit Server VM 19.3 (fastdebug build > 12-internal+0-2018-07-05-1710252.coleen.12unlink-symbols, mixed mode) > > Thanks, > Coleen From chris.plummer at oracle.com Thu Jul 5 23:00:39 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 5 Jul 2018 16:00:39 -0700 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> Message-ID: Hi David, Looks good. Regarding the test being in a package, looks like this was the convention for the nsk tests, so that's why I noted it. thanks, Chris On 7/5/18 3:40 PM, David Holmes wrote: > Hi Chris, > > Thanks for looking at this. > > Updated webrev: > > http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ > > Only real changes in ji05t001.c. (And fixed typo in the new test) > > More below ... > > On 6/07/2018 7:55 AM, Chris Plummer wrote: >> Hi David, >> >> Solaris problems aside, overall it looks fine. Some minor things I >> noted: >> >> I noticed that exitCode is never modified in agentA() or agentB(), so >> there isn't much point to having it. If you reach the bottom of the >> function, it passed, so PASSED can be returned. The code would be >> more clear if it did this. As-is it is implied that you can reach the >> bottom when it fails. > > I resisted any and all urges to do any kind of unrelated code cleanup > in the tests - once you start you may end up doing a full rewrite. > >> Is detaching the threads along the failure paths really needed? 
>> exit() is called, so this would seem to make it unnecessary. > > You're right that isn't necessary. I'll remove the changes from before > the exits in ji05t001.c > >> I prefer assignments not to be embedded inside the "if" condition. >> The DetachCurrentThread code in THREAD_return() is much more readable >> than the similar code in agentA() and agentB(). > > It's an existing style already used in that test e.g. > > ?287???? if ((res = > ?288???????????? JNI_ENV_PTR(vm)->AttachCurrentThread( > ?289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != > 0) { > > and I don't mind it, so I'd prefer not to change it. > >> In the test: >> >> ?? 54???????? // Generally as long as we don't crash of throw unexpected >> ?? 55???????? // exceptions then the test passes. In some cases we >> know exactly >> >> "of" should be "or". > > Well spotted. Thanks. > >> Shouldn't you be catching exceptions for all the Thread methods you >> are calling? Otherwise the test will exit if one is thrown, and the >> above comment indicates that you don't want this. > > I'm not expecting there to be any exceptions from any of the called > methods. That would potentially indicate a problem in handling the > terminated native thread, so would indicate a test failure. > >> Don't we normally put these tests in a package? > > Doesn't seem to be any hard and fast rule. I only uses packages when > they are important for the test. In runtime we have 905 java files and > only 116 have a package statement. It varies elsewhere. > > Thanks, > David > >> thanks, >> >> Chris >> >> On 7/5/18 2:58 AM, David Holmes wrote: >>> Solaris compiler complains about doing a return from inside a >>> do-while loop. I'll have to rework part of the fix tomorrow. 
>>> >>> David >>> >>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>> >>>> Problem: >>>> >>>> The tests create native threads that attach to the VM through >>>> JNI_AttachCurrentThread but which then terminate without detaching >>>> themselves. When the VM exits and we're using Flight Recorder >>>> "dumponexit" this leads to a call to VM_PrintThreads that in part >>>> wants to print the per-thread CPU usage. When we encounter the >>>> threads that have terminated already the low level >>>> pthread_getcpuclockid calls returns ESRCH but the code doesn't >>>> expect that and so fails an assert in debug mode and can SEGV in >>>> product mode. >>>> >>>> Solution: >>>> >>>> Serviceability-side: fix the tests >>>> >>>> Change the tests so that the threads detach before terminating. The >>>> two tests are (surprisingly) written in completely different >>>> styles, so the solution also takes on two different styles. >>>> >>>> Runtime-side: make the VM more robust in the fact of JNI attached >>>> threads that terminate before detaching, and add a regression test >>>> >>>> I took a good look at the low-level code for interacting with >>>> arbitrary threads and as far as I can see the problem only exists >>>> for this one case of pthread_getcpuclockid on Linux. Elsewhere the >>>> potential for a library call failure just reports an error value >>>> (such as -1 for the cpu time used). >>>> >>>> So the fix is simply to allow for ESRCH when calling >>>> pthread_getcpuclockid and return -1 for the cpu usage in that case. >>>> >>>> I created a new regression test to create a new native thread, >>>> attach it and then let it terminate while still attached. The java >>>> code then calls various Thread and ThreadMXBean functions on it to >>>> ensure there are no crashes or unexpected exceptions. 
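The ESRCH handling described in the quoted fix could look roughly like the sketch below. It assumes Linux with glibc; the helper name thread_cpu_time_ns is hypothetical and this is not the HotSpot code itself. It only illustrates returning -1 for the CPU time, rather than asserting, when pthread_getcpuclockid reports a failure such as ESRCH.

```cpp
#include <cassert>
#include <pthread.h>
#include <time.h>

// Hypothetical helper (not a HotSpot function): CPU time consumed by the
// given thread in nanoseconds, or -1 if it cannot be determined.
long long thread_cpu_time_ns(pthread_t tid) {
    clockid_t clock_id;
    int rc = pthread_getcpuclockid(tid, &clock_id);
    if (rc != 0) {
        // rc == ESRCH means the thread has already terminated. Other errors
        // are folded into the same "unknown" result (an assumption of this
        // sketch), so the caller never sees an assertion failure here.
        return -1;
    }
    timespec ts;
    if (clock_gettime(clock_id, &ts) != 0) {
        return -1;
    }
    return static_cast<long long>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}
```

A caller printing per-thread CPU usage would treat -1 as "not available". Using a thread ID after the thread's lifetime has ended is not something POSIX guarantees to be safe, so the -1 path should be read as best-effort hardening.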
>>>>
>>>> Testing:
>>>>   - old tests with fixed run-time
>>>>   - old run-time with fixed tests
>>>>   - mach tier4 (which exposed the problem - that's where we enable
>>>> Flight recorder for the tests) [in progress]
>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>   - new regression test
>>>>
>>>> Thanks,
>>>> David
>>
>>
>>

From coleen.phillimore at oracle.com Thu Jul 5 23:04:36 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Jul 2018 19:04:36 -0400
Subject: Re: [12] RFR (S) 8205417: Obsolete UnlinkSymbolsALot debugging option
In-Reply-To:
References:
Message-ID:

Thanks David and Harold for the reviews.
Coleen

On 7/5/18 6:57 PM, David Holmes wrote:
> Looks good. Thanks Coleen.
>
> David
>
> On 6/07/2018 6:00 AM, coleen.phillimore at oracle.com wrote:
>> Summary: Obsolete and remove support for UnlinkSymbolsALot
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8205417.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8205417
>>
>> Tested manually, with test/hotspot/jtreg/runtime/CommandLine, and
>> mach5 hs-tier1,2.
>>
>> $ java -XX:+UnlinkSymbolsALot -version
>> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option
>> UnlinkSymbolsALot; support was removed in 12.0
>> java version "12-internal" 2019-03-19
>> Java(TM) SE Runtime Environment 19.3 (fastdebug build
>> 12-internal+0-2018-07-05-1710252.coleen.12unlink-symbols)
>> Java HotSpot(TM) 64-Bit Server VM 19.3 (fastdebug build
>> 12-internal+0-2018-07-05-1710252.coleen.12unlink-symbols, mixed mode)
>>
>> Thanks,
>> Coleen

From coleen.phillimore at oracle.com Thu Jul 5 23:04:53 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Jul 2018 19:04:53 -0400
Subject: [12] RFR (S) 8202737: Obsolete AllowNonVirtualCalls option
In-Reply-To:
References:
Message-ID: <624c46aa-d6e6-d2f2-f8ae-842bed35d6a9@oracle.com>

Thanks for the code review, David.
Coleen On 7/5/18 6:46 PM, David Holmes wrote: > Looks good Coleen - thanks. > > David > > On 6/07/2018 7:19 AM, coleen.phillimore at oracle.com wrote: >> Summary: obsolete option and remove support. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8202737.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8202737 >> >> % java -XX:-AllowNonVirtualCalls -version >> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option >> AllowNonVirtualCalls; support was removed in 12.0 >> java version "12-internal" 2019-03-19 >> Java(TM) SE Runtime Environment 19.3 (fastdebug build >> 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv) >> Java HotSpot(TM) 64-Bit Server VM 19.3 (fastdebug build >> 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv, mixed mode) >> >> Ran mach5 hs-tier1,2 and 3. >> >> Thanks, >> Coleen From jiangli.zhou at oracle.com Thu Jul 5 23:05:24 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 5 Jul 2018 16:05:24 -0700 Subject: [12] RFR (S) 8202737: Obsolete AllowNonVirtualCalls option In-Reply-To: References: Message-ID: <749FC283-A2B6-4818-B916-B038EB1D772A@oracle.com> +1 Thanks, Jiangli > On Jul 5, 2018, at 3:46 PM, David Holmes wrote: > > Looks good Coleen - thanks. > > David > > On 6/07/2018 7:19 AM, coleen.phillimore at oracle.com wrote: >> Summary: obsolete option and remove support. >> open webrev at http://cr.openjdk.java.net/~coleenp/8202737.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8202737 >> % java -XX:-AllowNonVirtualCalls -version >> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option AllowNonVirtualCalls; support was removed in 12.0 >> java version "12-internal" 2019-03-19 >> Java(TM) SE Runtime Environment 19.3 (fastdebug build 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv) >> Java HotSpot(TM) 64-Bit Server VM 19.3 (fastdebug build 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv, mixed mode) >> Ran mach5 hs-tier1,2 and 3. 
>> Thanks, >> Coleen From coleen.phillimore at oracle.com Thu Jul 5 23:15:53 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 5 Jul 2018 19:15:53 -0400 Subject: [12] RFR (S) 8202737: Obsolete AllowNonVirtualCalls option In-Reply-To: <749FC283-A2B6-4818-B916-B038EB1D772A@oracle.com> References: <749FC283-A2B6-4818-B916-B038EB1D772A@oracle.com> Message-ID: Thank you, Jiangli! Coleen On 7/5/18 7:05 PM, Jiangli Zhou wrote: > +1 > > Thanks, > Jiangli > >> On Jul 5, 2018, at 3:46 PM, David Holmes wrote: >> >> Looks good Coleen - thanks. >> >> David >> >> On 6/07/2018 7:19 AM, coleen.phillimore at oracle.com wrote: >>> Summary: obsolete option and remove support. >>> open webrev at http://cr.openjdk.java.net/~coleenp/8202737.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8202737 >>> % java -XX:-AllowNonVirtualCalls -version >>> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option AllowNonVirtualCalls; support was removed in 12.0 >>> java version "12-internal" 2019-03-19 >>> Java(TM) SE Runtime Environment 19.3 (fastdebug build 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv) >>> Java HotSpot(TM) 64-Bit Server VM 19.3 (fastdebug build 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv, mixed mode) >>> Ran mach5 hs-tier1,2 and 3. >>> Thanks, >>> Coleen From kim.barrett at oracle.com Fri Jul 6 00:25:12 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 5 Jul 2018 20:25:12 -0400 Subject: [12] RFR (S) 8202737: Obsolete AllowNonVirtualCalls option In-Reply-To: References: Message-ID: > On Jul 5, 2018, at 5:19 PM, coleen.phillimore at oracle.com wrote: > > Summary: obsolete option and remove support. 
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8202737.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8202737
>
> % java -XX:-AllowNonVirtualCalls -version
> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option AllowNonVirtualCalls; support was removed in 12.0
> java version "12-internal" 2019-03-19
> Java(TM) SE Runtime Environment 19.3 (fastdebug build 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv)
> Java HotSpot(TM) 64-Bit Server VM 19.3 (fastdebug build 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv, mixed mode)
>
> Ran mach5 hs-tier1,2 and 3.
>
> Thanks,
> Coleen

Comment in arguments.cpp says

    As "deprecated" options age into "obsolete" options, move the entry into the
    "Obsolete Flags" section of the table.

I don't need another webrev for that.

I think there's another cleanup to do there, getting rid of the expired in 12 entries.

From ioi.lam at oracle.com Fri Jul 6 00:45:29 2018
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 5 Jul 2018 17:45:29 -0700
Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules
In-Reply-To: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com>
References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com>
Message-ID: <64c6255c-7e5f-7688-6848-018504f479be@oracle.com>

Hi Jiangli,

Thank you so much for working on this. I think it's great that we can get the
start-up improvement by archiving the ModuleDescriptor.

I just have some coding style comments regarding heapShared.cpp. This file
contains the code for copying objects and relocating pointers. By its nature,
this kind of code is usually complicated, so I think we should try to make
it as easy to understand as possible.

[1] HeapShared::walk_from_field_and_archiving:

    This name is not grammatically correct. How about
    HeapShared::archive_reachable_objects_from_static_field

[2] How about changing the parameter field_offset -> static_field_offset
    When I first read the code I was confused whether it's talking
    about static or instance fields. Usually, "field"
    implies instance field, so it's better to specifically
    say "static field".

[3] This code would fail if "f" is already archived.

    473   // get the archived copy of the field referenced object
    474   oop af = MetaspaceShared::archive_heap_object(f, THREAD);
    475   WalkOopAndArchiveClosure walker(1, subgraph_info, f, af);
    476   f->oop_iterate(&walker);

[4] There's duplicated code between walk_from_field_and_archiving and
    WalkOopAndArchiveClosure::do_oop_work

    403   assert(relocated_k == MetaspaceShared::get_relocated_klass(orig_k),
    404          "must be the relocated Klass in the shared space");
    405   _subgraph_info->add_subgraph_object_klass(orig_k, relocated_k);

    - vs -

    484   assert(relocated_k == MetaspaceShared::get_relocated_klass(orig_k),
    485          "must be the relocated Klass in the shared space");
    486   subgraph_info->add_subgraph_object_klass(orig_k, relocated_k);

[5] This code is also duplicated:

    375   RawAccess::oop_store(new_p, archived);
    376   log.print("--- archived copy existing, store archived " PTR_FORMAT " in " PTR_FORMAT,
    377             p2i(archived), p2i(new_p));

    - vs -

    395   RawAccess::oop_store(new_p, archived);
    396   log.print("=== store archived " PTR_FORMAT " in " PTR_FORMAT,
    397             p2i(archived), p2i(new_p));

[6] This code, even though it's correct, is hard to understand --
    why are we calculating the distance between the two objects?

    368   size_t delta = pointer_delta((HeapWord*)_archived_referencing_obj,
    369                                (HeapWord*)_orig_referencing_obj);
    370   T* new_p = (T*)((HeapWord*)p + delta);

    I think it would be easier to understand if we change the order of the
    two arithmetic operations:

    // new_p is the address of the same field inside _archived_referencing_obj.
    size_t field_offset_in_bytes = pointer_delta(p, _orig_referencing_obj, 1);
    T* new_p = (T*)(address(_orig_referencing_obj) + field_offset_in_bytes);

[7] I have a hard time understanding this log:

    376   log.print("--- archived copy existing, store archived " PTR_FORMAT " in " PTR_FORMAT,
    377             p2i(archived), p2i(new_p));

    How about this?

    log.print("--- updated embedded pointer @[" PTR_FORMAT "] => " PTR_FORMAT,
              p2i(new_p), p2i(archived));

For your consideration, I've incorporated my comments above into heapShared.cpp.
I've not tested it so it most likely won't build :-(

http://cr.openjdk.java.net/~iklam/misc/heapShared.old.cpp  [your version]
http://cr.openjdk.java.net/~iklam/misc/heapShared.new.cpp  [my version]

Please take a look and see if you like it.

Thanks
- Ioi

On 6/28/18 4:15 PM, Jiangli Zhou wrote:
> This is a follow-up RFE of JDK-8201650 (Move iteration order randomization of unmodifiable Set and Map to iterators), which was resolved to allow Set/Map objects being archived at CDS dump time (thanks Claes and Stuart Marks). In the current RFE, it archives the set of system ModuleReference and ModuleDescriptor objects (including their referenced objects) in 'open' archive heap region at CDS dump time. It allows reusing of the objects and bypassing the process of creating the system ModuleDescriptors and ModuleReferences at runtime for startup improvement. My preliminary measurements on linux-x64 showed ~5% startup improvement when running HelloWorld from -cp using archived module objects at runtime (without extra tuning).
>
> The library changes in the following webrev are contributed by Alan Bateman. Thanks Alan and Mandy for discussions and help. Thanks Karen, Lois and Ioi for discussion and suggestions on initialization ordering.
>
> The majority of the module object archiving code are in heapShared.hpp and heapShared.cpp. Thanks Coleen for pre-review and Eric Caspole for helping performance tests.
> > webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/ > RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921 > > Tested using tier1 - tier6 via mach5 including all new test cases added in the webrev. > > Following are the details of system module archiving, which are duplicated in above bug report. > --------------------------------------------------------------------------------------------------------------------------- > Support archiving system module graph when the initial module is unnamed module from -cp currently. > > Support G1 GC, 64-bit (non-Windows). Requires UseCompressedOops and UseCompressedClassPointers. > > Dump time system module object archiving > ================================= > At dump time, the following fields in ArchivedModuleGraph are set to record the system module information created by ModuleBootstrap for archiving. > > private static SystemModules archivedSystemModules; > private static ModuleFinder archivedSystemModuleFinder; > private static String archivedMainModule; > > The archiving process starts from a given static field in ArchivedModuleGraph class instance (java mirror object). The process archives the complete network of java heap objects that are reachable directly or indirectly from the starting object by following references. > > 1. Starts from a given static field within the Class instance (java mirror). If the static field is a refererence field and points to a non-null java object, proceed to the next step. The static field and it's value is recorded and stored outside the archived mirror. > 2. Archives the referenced java object. If an archived copy of the current object already exists, updates the pointer in the archived copy of the referencing object to point to the current archived object. Otherwise, proceed to the next step. > 3. 
Follows all references within the current java object and recursively archive the sub-graph of objects starting from each reference encountered within the object. > 4. Updates the pointer in the archived copy of referecing object to point to the current archived object. > 5. The Klass of the current java object is added to a list of Klasses for loading and initializing before any object in the archived graph can be accessed at runtime. > > Runtime initialization from archived system module objects > ============================================ > VM.initializeFromArchive() is called from ArchivedModuleGraph's static initializer to initialize from the archived module information. Klasses in the recorded list are loaded, linked and initialized. The static fields in ArchivedModuleGraph class instance are initialized using the archived field values. After initialization, the archived system module objects can be used directly. > > If the archived java heap data is not successfully mapped at runtime, or there is an error during VM.initializeFromArchive(), then all static fields in ArchivedModuleGraph are not initialized. In that case, system ModuleDescriptor and ModuleReference objects are created as normal. > > In non-CDS mode, VM.initializeFromArchive() returns immediately with minimum added overhead for normal execution. > > Thanks, > Jiangli > > From coleen.phillimore at oracle.com Fri Jul 6 00:50:46 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 5 Jul 2018 20:50:46 -0400 Subject: [12] RFR (S) 8202737: Obsolete AllowNonVirtualCalls option In-Reply-To: References: Message-ID: <000f9b99-c64a-17e7-fc23-423f19c88cf6@oracle.com> On 7/5/18 8:25 PM, Kim Barrett wrote: >> On Jul 5, 2018, at 5:19 PM, coleen.phillimore at oracle.com wrote: >> >> Summary: obsolete option and remove support. 
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8202737.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8202737
>>
>> % java -XX:-AllowNonVirtualCalls -version
>> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option AllowNonVirtualCalls; support was removed in 12.0
>> java version "12-internal" 2019-03-19
>> Java(TM) SE Runtime Environment 19.3 (fastdebug build 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv)
>> Java HotSpot(TM) 64-Bit Server VM 19.3 (fastdebug build 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv, mixed mode)
>>
>> Ran mach5 hs-tier1,2 and 3.
>>
>> Thanks,
>> Coleen
> Comment in arguments.cpp says
>
> As "deprecated" options age into "obsolete" options, move the entry into the
> "Obsolete Flags" section of the table.
>
> I don't need another webrev for that.

Okay, I'll make that change.

thanks,
Coleen
>
> I think there's another cleanup to do there, getting rid of the expired in 12 entries.
>

From david.holmes at oracle.com Fri Jul 6 00:51:45 2018
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Jul 2018 10:51:45 +1000
Subject: [12] RFR (S) 8202737: Obsolete AllowNonVirtualCalls option
In-Reply-To:
References:
Message-ID: <15e2a924-988b-99f0-c7f8-9b749fda078f@oracle.com>

On 6/07/2018 10:25 AM, Kim Barrett wrote:
>> On Jul 5, 2018, at 5:19 PM, coleen.phillimore at oracle.com wrote:
>>
>> Summary: obsolete option and remove support.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8202737.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8202737
>>
>> % java -XX:-AllowNonVirtualCalls -version
>> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option AllowNonVirtualCalls; support was removed in 12.0
>> java version "12-internal" 2019-03-19
>> Java(TM) SE Runtime Environment 19.3 (fastdebug build 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv)
>> Java HotSpot(TM) 64-Bit Server VM 19.3 (fastdebug build 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv, mixed mode)
>>
>> Ran mach5 hs-tier1,2 and 3.
>>
>> Thanks,
>> Coleen
>
> Comment in arguments.cpp says
>
> As "deprecated" options age into "obsolete" options, move the entry into the
> "Obsolete Flags" section of the table.
>
> I don't need another webrev for that.

Yes good point. The ordering of the obsolete section also needs fixing
to follow the comments (order by expired-in)

> I think there's another cleanup to do there, getting rid of the expired in 12 entries.

Yes - I thought there was already a placeholder issue filed for that but
there isn't that I can see. We do have:

https://bugs.openjdk.java.net/browse/JDK-8204591

for UseAppCDS which requires more than just a table update. We could
possibly adapt that issue to remove all expired options (which should
just mean updating the table for the rest).

Cheers,
David

From coleen.phillimore at oracle.com Fri Jul 6 01:01:42 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Jul 2018 21:01:42 -0400
Subject: [12] RFR (S) 8202737: Obsolete AllowNonVirtualCalls option
In-Reply-To: <15e2a924-988b-99f0-c7f8-9b749fda078f@oracle.com>
References: <15e2a924-988b-99f0-c7f8-9b749fda078f@oracle.com>
Message-ID:

On 7/5/18 8:51 PM, David Holmes wrote:
> On 6/07/2018 10:25 AM, Kim Barrett wrote:
>>> On Jul 5, 2018, at 5:19 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> Summary: obsolete option and remove support.
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8202737.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8202737
>>>
>>> % java -XX:-AllowNonVirtualCalls -version
>>> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option
>>> AllowNonVirtualCalls; support was removed in 12.0
>>> java version "12-internal" 2019-03-19
>>> Java(TM) SE Runtime Environment 19.3 (fastdebug build
>>> 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv)
>>> Java HotSpot(TM) 64-Bit Server VM 19.3 (fastdebug build
>>> 12-internal+0-2018-07-05-1741563.cphillim.12remove-allow-nv, mixed
>>> mode)
>>>
>>> Ran mach5 hs-tier1,2 and 3.
>>>
>>> Thanks,
>>> Coleen
>>
>> Comment in arguments.cpp says
>>
>>      As "deprecated" options age into "obsolete" options, move the
>> entry into the
>>      "Obsolete Flags" section of the table.
>>
>> I don't need another webrev for that.
>
> Yes good point. The ordering of the obsolete section also needs fixing
> to follow the comments (order by expired-in)
>
>> I think there's another cleanup to do there, getting rid of the
>> expired in 12 entries.
>
> Yes - I thought there was already a placeholder issue filed for that
> but there isn't that I can see. We do have:
>
> https://bugs.openjdk.java.net/browse/JDK-8204591
>
> for UseAppCDS which requires more than just a table update. We could
> possibly adapt that issue to remove all expired options (which should
> just mean updating the table for the rest).

Oh, yes, I forgot to add that removing all the expired options is a
different issue, and should be done as its own RFE.
thanks,
Coleen
>
> Cheers,
> David

From coleen.phillimore at oracle.com Fri Jul 6 01:03:03 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Jul 2018 21:03:03 -0400
Subject: Re: [12] RFR (S) 8205417: Obsolete UnlinkSymbolsALot debugging option
In-Reply-To:
References:
Message-ID:

As with the AllowNonVirtualCalls, I moved this option to the obsolete
part of the table.

thanks,
Coleen

On 7/5/18 7:04 PM, coleen.phillimore at oracle.com wrote:
>
> Thanks David and Harold for the reviews.
> Coleen
>
> On 7/5/18 6:57 PM, David Holmes wrote:
>> Looks good. Thanks Coleen.
>>
>> David
>>
>> On 6/07/2018 6:00 AM, coleen.phillimore at oracle.com wrote:
>>> Summary: Obsolete and remove support for UnlinkSymbolsALot
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8205417.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8205417
>>>
>>> Tested manually, with test/hotspot/jtreg/runtime/CommandLine, and
>>> mach5 hs-tier1,2.
>>>
>>> $ java -XX:+UnlinkSymbolsALot -version
>>> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option
>>> UnlinkSymbolsALot; support was removed in 12.0
>>> java version "12-internal" 2019-03-19
>>> Java(TM) SE Runtime Environment 19.3 (fastdebug build
>>> 12-internal+0-2018-07-05-1710252.coleen.12unlink-symbols)
>>> Java HotSpot(TM) 64-Bit Server VM 19.3 (fastdebug build
>>> 12-internal+0-2018-07-05-1710252.coleen.12unlink-symbols, mixed mode)
>>>
>>> Thanks,
>>> Coleen
>

From david.holmes at oracle.com Fri Jul 6 02:05:01 2018
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Jul 2018 12:05:01 +1000
Subject: RFR 8203911: Test runtime/modules/getModuleJNI/GetModule fails with -Xcheck:jni
In-Reply-To: <949fe16c-1271-fc83-abd1-c31f113bad96@oracle.com>
References: <949fe16c-1271-fc83-abd1-c31f113bad96@oracle.com>
Message-ID: <46906a86-aa82-baae-1314-c76c5d8deb8e@oracle.com>

Looks good to me! (Does that could as a self-review?
:) )

Thanks,
David

On 6/07/2018 3:44 AM, Harold David Seigel wrote:
> Hi,
>
> Please review this small JDK-12 fix for bug JDK-8203911. The change
> contains the fix suggested in the bug.
>
> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8203911/webrev/
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8203911
>
> This fix was tested by running the failing test with -Xcheck:jni.
> Regression testing included Mach5 tiers 1 and 2 tests and builds on
> Linux-X64, Windows, Solaris Sparc, and Mac OS X, with tiers 3-5 tests on
> Linux-x64, and with JCK-11 Lang and VM tests.
>
> Thanks, Harold
>

From david.holmes at oracle.com Fri Jul 6 02:16:41 2018
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Jul 2018 12:16:41 +1000
Subject: RFR 8203911: Test runtime/modules/getModuleJNI/GetModule fails with -Xcheck:jni
In-Reply-To: <46906a86-aa82-baae-1314-c76c5d8deb8e@oracle.com>
References: <949fe16c-1271-fc83-abd1-c31f113bad96@oracle.com> <46906a86-aa82-baae-1314-c76c5d8deb8e@oracle.com>
Message-ID:

On 6/07/2018 12:05 PM, David Holmes wrote:
> Looks good to me! (Does that could as a self-review? :) )

s/could/count/

> Thanks,
> David
>
> On 6/07/2018 3:44 AM, Harold David Seigel wrote:
>> Hi,
>>
>> Please review this small JDK-12 fix for bug JDK-8203911. The change
>> contains the fix suggested in the bug.
>>
>> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8203911/webrev/
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8203911
>>
>> This fix was tested by running the failing test with -Xcheck:jni.
>> Regression testing included Mach5 tiers 1 and 2 tests and builds on
>> Linux-X64, Windows, Solaris Sparc, and Mac OS X, with tiers 3-5 tests
>> on Linux-x64, and with JCK-11 Lang and VM tests.
>> >> Thanks, Harold >> From jiangli.zhou at Oracle.COM Fri Jul 6 02:38:38 2018 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Thu, 5 Jul 2018 19:38:38 -0700 Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules In-Reply-To: <64c6255c-7e5f-7688-6848-018504f479be@oracle.com> References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> <64c6255c-7e5f-7688-6848-018504f479be@oracle.com> Message-ID: Hi Ioi, Thanks for the review! > On Jul 5, 2018, at 5:45 PM, Ioi Lam wrote: > > Hi Jiangli, > > Thank you so much for working on this. I think it's great that we can get the > start-up improvement by archiving the ModuleDescriptor. > > I just have some coding style comments regarding heapShared.cpp. This file > contains the code for copying objects and relocating pointers. By its nature, > this kind of code is usually complicated, so I think we should try to make > it as easy to understand as possible. > > > [1] HeapShared::walk_from_field_and_archiving: > > This name is not grammatically correct. How about > HeapShared::archive_reachable_objects_from_static_field Sounds good. > > [2] How about changing the parameter field_offset -> static_field_offset > When I first read the code I was confused whether it's talking > about static or instance fields. Usually, "field" > implies instance field, so it's better to specifically > say "static field". Ok. > > [3] This code would fail if "f" is already archived. > > 473 // get the archived copy of the field referenced object > 474 oop af = MetaspaceShared::archive_heap_object(f, THREAD); > 475 WalkOopAndArchiveClosure walker(1, subgraph_info, f, af); > 476 f->oop_iterate(&walker); Hmmm, it's possible we might encounter an archived object during reference walking & archiving in future cases. I'll add a check. 
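The "already archived" check discussed in [3] can be pictured with a small stand-alone Java model. This is only a sketch under stated assumptions, not the heapShared.cpp code: `Node`, `archive()` and `archivedCount()` are hypothetical names, and an `IdentityHashMap` stands in for whatever table the VM uses to find an existing archived copy.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of "archive everything reachable from a starting object".
final class ArchiveWalkSketch {
    /** Stand-in for a heap object: a name plus its outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Original object -> archived copy, compared by identity rather than equals().
    private final Map<Node, Node> archived = new IdentityHashMap<>();

    /** Returns the archived copy of orig, creating it (and its sub-graph) if needed. */
    Node archive(Node orig) {
        if (orig == null) {
            return null;
        }
        Node existing = archived.get(orig);
        if (existing != null) {
            return existing;            // the check: reuse the existing archived copy
        }
        Node copy = new Node(orig.name);
        archived.put(orig, copy);       // record before recursing so cycles terminate
        for (Node ref : orig.refs) {
            copy.refs.add(archive(ref)); // the copy's references point at archived copies
        }
        return copy;
    }

    int archivedCount() {
        return archived.size();
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.refs.add(b);
        b.refs.add(a);                  // cycle: a -> b -> a
        ArchiveWalkSketch walker = new ArchiveWalkSketch();
        Node archivedA = walker.archive(a);
        System.out.println(walker.archivedCount());
        System.out.println(archivedA.refs.get(0).refs.get(0) == archivedA);
    }
}
```

Without the lookup at the top of `archive()`, a second walk that reaches an object archived earlier would copy it again (or never terminate on a cycle), which is the kind of failure point [3] is concerned with.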
> > [4] There's duplicated code between walk_from_field_and_archiving and > WalkOopAndArchiveClosure::do_oop_work > > 403 assert(relocated_k == MetaspaceShared::get_relocated_klass(orig_k), > 404 "must be the relocated Klass in the shared space"); > 405 _subgraph_info->add_subgraph_object_klass(orig_k, relocated_k); > > - vs - > > 484 assert(relocated_k == MetaspaceShared::get_relocated_klass(orig_k), > 485 "must be the relocated Klass in the shared space"); > 486 subgraph_info->add_subgraph_object_klass(orig_k, relocated_k); I'll move the assert into add_subgraph_object_klass(). > > [5] This code is also duplicated: > > 375 RawAccess::oop_store(new_p, archived); > 376 log.print("--- archived copy existing, store archived " PTR_FORMAT " in " PTR_FORMAT, > 377 p2i(archived), p2i(new_p)); > > - vs - > > 395 RawAccess::oop_store(new_p, archived); > 396 log.print("=== store archived " PTR_FORMAT " in " PTR_FORMAT, > 397 p2i(archived), p2i(new_p)); The first case is for existing archived copy and the second is for newly archived. The different logging messages are helpful for debugging. Not sure if using a function to encapsulate the store & log is worth it in this case. Any suggestion? > > [6] This code, even though it's correct, is hard to understand -- > why are we calculating the distance between the two objects? > > 368 size_t delta = pointer_delta((HeapWord*)_archived_referencing_obj, > 369 (HeapWord*)_orig_referencing_obj); > 370 T* new_p = (T*)((HeapWord*)p + delta); > > I think it would be easier to understand if we change the order of the > two arithmetic operations: > > // new_p is the address of the same field inside _archived_referencing_obj. > size_t field_offset_in_bytes = pointer_delta(p, _orig_referencing_obj, 1); > T* new_p = (T*)(address(_orig_referencing_obj) + field_offset_in_bytes); I think this works too. I'll change as you suggested. 
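For readers following [6], the arithmetic can be checked with plain numbers. The Java sketch below models addresses as `long` values; `relocateField` and the sample addresses are hypothetical and only illustrate the calculation. Note that, as quoted, the second suggested statement adds the offset to `_orig_referencing_obj`, which would simply give back `p`; going by the accompanying comment, the archived object's address was presumably intended, and that is what the sketch assumes.

```java
// Hypothetical model of "find the same field inside the archived copy".
final class FieldRelocationSketch {
    /**
     * fieldAddr is the address of a field inside the original object.
     * Returns the address of the same field inside the archived copy.
     */
    static long relocateField(long fieldAddr, long origObjAddr, long archivedObjAddr) {
        // Step 1: how far into its object does the field sit?
        long fieldOffsetInBytes = fieldAddr - origObjAddr;
        // Step 2: apply that offset to the archived copy (assumed intent of the suggestion).
        return archivedObjAddr + fieldOffsetInBytes;
    }

    public static void main(String[] args) {
        long orig = 0x1000L;       // made-up address of the original object
        long archived = 0x8000L;   // made-up address of its archived copy
        long p = orig + 24;        // a field 24 bytes into the original object
        long newP = relocateField(p, orig, archived);
        System.out.println(Long.toHexString(newP)); // 8018
        // Same result as the original "distance between the two objects" form:
        System.out.println(newP == p + (archived - orig)); // true
    }
}
```

Both orderings give the same address; the offset-first form just makes it more obvious that the result is a field inside the archived object.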
> > [7] I have a hard time understanding this log: > > 376 log.print("--- archived copy existing, store archived " PTR_FORMAT " in " PTR_FORMAT, > 377 p2i(archived), p2i(new_p)); > > How about this? > > log.print("--- updated embedded pointer @[" PTR_FORMAT "] => " PTR_FORMAT, > p2i(new_p), p2i(archived)); It is for the case where there is an existing copy of the archived object. Maybe "found existing archived copy" would help? > > > For your consideration, I've incorporated my comments above into heapShared.cpp. > I've not tested it so it most likely won't build :-( > > > http://cr.openjdk.java.net/~iklam/misc/heapShared.old.cpp [your version] > http://cr.openjdk.java.net/~iklam/misc/heapShared.new.cpp [my version] > > Please take a look and see if you like it. Thanks a lot! I'll take a look and incorporate your suggestions. Thanks again! Jiangli > > Thanks > - Ioi > > On 6/28/18 4:15 PM, Jiangli Zhou wrote: >> This is a follow-up RFE of JDK-8201650 (Move iteration order randomization of unmodifiable Set and Map to iterators), which was resolved to allow Set/Map objects being archived at CDS dump time (thanks Claes and Stuart Marks). In the current RFE, it archives the set of system ModuleReference and ModuleDescriptor objects (including their referenced objects) in 'open' archive heap region at CDS dump time. It allows reusing of the objects and bypassing the process of creating the system ModuleDescriptors and ModuleReferences at runtime for startup improvement. My preliminary measurements on linux-x64 showed ~5% startup improvement when running HelloWorld from -cp using archived module objects at runtime (without extra tuning). >> >> The library changes in the following webrev are contributed by Alan Bateman. Thanks Alan and Mandy for discussions and help. Thanks Karen, Lois and Ioi for discussion and suggestions on initialization ordering. >> >> The majority of the module object archiving code are in heapShared.hpp and heapShared.cpp. 
Thanks Coleen for pre-review and Eric Caspole for helping performance tests. >> >> webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/ >> RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921 >> >> Tested using tier1 - tier6 via mach5 including all new test cases added in the webrev. >> >> Following are the details of system module archiving, which are duplicated in above bug report. >> --------------------------------------------------------------------------------------------------------------------------- >> Support archiving system module graph when the initial module is unnamed module from -cp currently. >> >> Support G1 GC, 64-bit (non-Windows). Requires UseCompressedOops and UseCompressedClassPointers. >> >> Dump time system module object archiving >> ================================= >> At dump time, the following fields in ArchivedModuleGraph are set to record the system module information created by ModuleBootstrap for archiving. >> >> private static SystemModules archivedSystemModules; >> private static ModuleFinder archivedSystemModuleFinder; >> private static String archivedMainModule; >> >> The archiving process starts from a given static field in ArchivedModuleGraph class instance (java mirror object). The process archives the complete network of java heap objects that are reachable directly or indirectly from the starting object by following references. >> >> 1. Starts from a given static field within the Class instance (java mirror). If the static field is a reference field and points to a non-null java object, proceed to the next step. The static field and its value are recorded and stored outside the archived mirror. >> 2. Archives the referenced java object. If an archived copy of the current object already exists, updates the pointer in the archived copy of the referencing object to point to the current archived object. Otherwise, proceed to the next step. >> 3. 
Follows all references within the current java object and recursively archives the sub-graph of objects starting from each reference encountered within the object. >> 4. Updates the pointer in the archived copy of the referencing object to point to the current archived object. >> 5. The Klass of the current java object is added to a list of Klasses for loading and initializing before any object in the archived graph can be accessed at runtime. >> >> Runtime initialization from archived system module objects >> ============================================ >> VM.initializeFromArchive() is called from ArchivedModuleGraph's static initializer to initialize from the archived module information. Klasses in the recorded list are loaded, linked and initialized. The static fields in ArchivedModuleGraph class instance are initialized using the archived field values. After initialization, the archived system module objects can be used directly. >> >> If the archived java heap data is not successfully mapped at runtime, or there is an error during VM.initializeFromArchive(), then none of the static fields in ArchivedModuleGraph are initialized. In that case, system ModuleDescriptor and ModuleReference objects are created as normal. >> >> In non-CDS mode, VM.initializeFromArchive() returns immediately with minimum added overhead for normal execution. >> >> Thanks, >> Jiangli >> >> > From david.holmes at oracle.com Fri Jul 6 08:07:37 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Jul 2018 18:07:37 +1000 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> Message-ID: The new test is hanging on Solaris. I just discovered we don't run these tests on Solaris until tier4. David On 6/07/2018 8:40 AM, David Holmes wrote: > Hi Chris, > > Thanks for looking at this. 
> > Updated webrev: > > http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ > > Only real changes in ji05t001.c. (And fixed typo in the new test) > > More below ... > > On 6/07/2018 7:55 AM, Chris Plummer wrote: >> Hi David, >> >> Solaris problems aside, overall it looks fine. Some minor things I noted: >> >> I noticed that exitCode is never modified in agentA() or agentB(), so >> there isn't much point to having it. If you reach the bottom of the >> function, it passed, so PASSED can be returned. The code would be more >> clear if it did this. As-is it is implied that you can reach the >> bottom when it fails. > > I resisted any and all urges to do any kind of unrelated code cleanup in > the tests - once you start you may end up doing a full rewrite. > >> Is detaching the threads along the failure paths really needed? exit() >> is called, so this would seem to make it unnecessary. > > You're right that isn't necessary. I'll remove the changes from before > the exits in ji05t001.c > >> I prefer assignments not to be embedded inside the "if" condition. The >> DetachCurrentThread code in THREAD_return() is much more readable than >> the similar code in agentA() and agentB(). > > It's an existing style already used in that test e.g. > > 287 if ((res = > 288 JNI_ENV_PTR(vm)->AttachCurrentThread( > 289 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) { > > and I don't mind it, so I'd prefer not to change it. > >> In the test: >> >> 54 // Generally as long as we don't crash of throw unexpected >> 55 // exceptions then the test passes. In some cases we >> know exactly >> >> "of" should be "or". > > Well spotted. Thanks. > >> Shouldn't you be catching exceptions for all the Thread methods you >> are calling? Otherwise the test will exit if one is thrown, and the >> above comment indicates that you don't want this. > > I'm not expecting there to be any exceptions from any of the called > methods. 
That would potentially indicate a problem in handling the > terminated native thread, so would indicate a test failure. > >> Don't we normally put these tests in a package? > > Doesn't seem to be any hard and fast rule. I only use packages when > they are important for the test. In runtime we have 905 java files and > only 116 have a package statement. It varies elsewhere. > > Thanks, > David > >> thanks, >> >> Chris >> >> On 7/5/18 2:58 AM, David Holmes wrote: >>> Solaris compiler complains about doing a return from inside a >>> do-while loop. I'll have to rework part of the fix tomorrow. >>> >>> David >>> >>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>> >>>> Problem: >>>> >>>> The tests create native threads that attach to the VM through >>>> JNI_AttachCurrentThread but which then terminate without detaching >>>> themselves. When the VM exits and we're using Flight Recorder >>>> "dumponexit" this leads to a call to VM_PrintThreads that in part >>>> wants to print the per-thread CPU usage. When we encounter the >>>> threads that have terminated already the low level >>>> pthread_getcpuclockid call returns ESRCH but the code doesn't >>>> expect that and so fails an assert in debug mode and can SEGV in >>>> product mode. >>>> >>>> Solution: >>>> >>>> Serviceability-side: fix the tests >>>> >>>> Change the tests so that the threads detach before terminating. The >>>> two tests are (surprisingly) written in completely different styles, >>>> so the solution also takes on two different styles. >>>> >>>> Runtime-side: make the VM more robust in the face of JNI attached >>>> threads that terminate before detaching, and add a regression test >>>> >>>> I took a good look at the low-level code for interacting with >>>> arbitrary threads and as far as I can see the problem only exists >>>> for this one case of pthread_getcpuclockid on Linux. 
Elsewhere the >>>> potential for a library call failure just reports an error value >>>> (such as -1 for the cpu time used). >>>> >>>> So the fix is simply to allow for ESRCH when calling >>>> pthread_getcpuclockid and return -1 for the cpu usage in that case. >>>> >>>> I created a new regression test to create a new native thread, >>>> attach it and then let it terminate while still attached. The java >>>> code then calls various Thread and ThreadMXBean functions on it to >>>> ensure there are no crashes or unexpected exceptions. >>>> >>>> Testing: >>>> - old tests with fixed run-time >>>> - old run-time with fixed tests >>>> - mach tier4 (which exposed the problem - that's where we enable >>>> Flight recorder for the tests) [in progress] >>>> - mach5 tier 1-3 for good measure [in progress] >>>> - new regression test >>>> >>>> Thanks, >>>> David >> >> >> From harold.seigel at oracle.com Fri Jul 6 11:48:14 2018 From: harold.seigel at oracle.com (Harold David Seigel) Date: Fri, 6 Jul 2018 07:48:14 -0400 Subject: RFR 8203911: Test runtime/modules/getModuleJNI/GetModule fails with -Xcheck:jni In-Reply-To: References: <949fe16c-1271-fc83-abd1-c31f113bad96@oracle.com> <46906a86-aa82-baae-1314-c76c5d8deb8e@oracle.com> Message-ID: <26e01f08-41bb-0709-aaa9-14b963d62b36@oracle.com> Thanks David! This is probably not your first self-review :). Harold On 7/5/2018 10:16 PM, David Holmes wrote: > On 6/07/2018 12:05 PM, David Holmes wrote: >> Looks good to me! (Does that could as a self-review? :) ) > > s/could/count/ > >> Thanks, >> David >> >> On 6/07/2018 3:44 AM, Harold David Seigel wrote: >>> Hi, >>> >>> Please review this small JDK-12 fix for bug JDK-8203911. The change >>> contains the fix suggested in the bug. >>> >>> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8203911/webrev/ >>> >>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8203911 >>> >>> This fix was tested by running the failing test with -Xcheck:jni. 
>>> Regression testing included Mach5 tiers 1 and 2 tests and builds on >>> Linux-X64, Windows, Solaris Sparc, and Mac OS X, with tiers 3-5 >>> tests on Linux-x64, and with JCK-11 Lang and VM tests. >>> >>> Thanks, Harold >>> From coleen.phillimore at oracle.com Fri Jul 6 12:39:48 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 6 Jul 2018 08:39:48 -0400 Subject: RFR 8203911: Test runtime/modules/getModuleJNI/GetModule fails with -Xcheck:jni In-Reply-To: <949fe16c-1271-fc83-abd1-c31f113bad96@oracle.com> References: <949fe16c-1271-fc83-abd1-c31f113bad96@oracle.com> Message-ID: <9fcae15c-2988-8a5b-99f7-5f288eddb736@oracle.com> This change looks good. Coleen On 7/5/18 1:44 PM, Harold David Seigel wrote: > Hi, > > Please review this small JDK-12 fix for bug JDK-8203911. The change > contains the fix suggested in the bug. > > Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8203911/webrev/ > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8203911 > > This fix was tested by running the failing test with -Xcheck:jni. > Regression testing included Mach5 tiers 1 and 2 tests and builds on > Linux-X64, Windows, Solaris Sparc, and Mac OS X, with tiers 3-5 tests > on Linux-x64, and with JCK-11 Lang and VM tests. > > Thanks, Harold > From harold.seigel at oracle.com Fri Jul 6 12:42:46 2018 From: harold.seigel at oracle.com (Harold David Seigel) Date: Fri, 6 Jul 2018 08:42:46 -0400 Subject: RFR 8203911: Test runtime/modules/getModuleJNI/GetModule fails with -Xcheck:jni In-Reply-To: <9fcae15c-2988-8a5b-99f7-5f288eddb736@oracle.com> References: <949fe16c-1271-fc83-abd1-c31f113bad96@oracle.com> <9fcae15c-2988-8a5b-99f7-5f288eddb736@oracle.com> Message-ID: <7d122589-5b6b-1af4-2814-ee8c6a17ba07@oracle.com> Thanks Coleen! Harold On 7/6/2018 8:39 AM, coleen.phillimore at oracle.com wrote: > > This change looks good. 
> Coleen > > On 7/5/18 1:44 PM, Harold David Seigel wrote: >> Hi, >> >> Please review this small JDK-12 fix for bug JDK-8203911. The change >> contains the fix suggested in the bug. >> >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8203911/webrev/ >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8203911 >> >> This fix was tested by running the failing test with -Xcheck:jni. >> Regression testing included Mach5 tiers 1 and 2 tests and builds on >> Linux-X64, Windows, Solaris Sparc, and Mac OS X, with tiers 3-5 tests >> on Linux-x64, and with JCK-11 Lang and VM tests. >> >> Thanks, Harold >> > From gunter.haug at sap.com Fri Jul 6 12:51:26 2018 From: gunter.haug at sap.com (Haug, Gunter) Date: Fri, 6 Jul 2018 12:51:26 +0000 Subject: RFR(S): 8206408: Add missing CPU/system info to vm_version_ext on PPC64 Message-ID: <7F040F83-7B83-493C-8DFB-059509A55272@sap.com> Hi all, can I please have reviews and a sponsor for the following tiny fix: https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8206408 http://cr.openjdk.java.net/~ghaug/webrevs/8206408 The solution is not really accurate as there is no obvious way to detect the number of cores/slots on a PPC64 system. Anyway, it would be better to have information on the virtualization of the system. We do have a solution for that at SAP and we would be happy to adopt it to JFR and contribute it if there is any interest. Thanks and best regards, Gunter From martin.doerr at sap.com Fri Jul 6 12:59:19 2018 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 6 Jul 2018 12:59:19 +0000 Subject: RFR(S): 8206459: [s390] Prevent restoring incorrect bcp and locals in interpreter and avoid incorrect size of partialSubtypeCheckNode in C2 Message-ID: Hi, TestInterfaceMethodSelection has shown a bug in the template interpreter on s390. Restore functions for locals (R12 = Z_tmp_3) and bcp (R13 = Z_tmp_4) are used without having saved the correct values. 
In addition, C2 currently uses a constant size for partialSubtypeCheckNode which uses load_const_optimized with variable size. We can simply preserve these 2 registers and remove the restore function calls. Webrev: http://cr.openjdk.java.net/~mdoerr/8206459_s390_fixes/webrev.00/ Please review. Best regards, Martin From martin.doerr at sap.com Fri Jul 6 13:22:30 2018 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 6 Jul 2018 13:22:30 +0000 Subject: RFR(S): 8206408: Add missing CPU/system info to vm_version_ext on PPC64 In-Reply-To: <7F040F83-7B83-493C-8DFB-059509A55272@sap.com> References: <7F040F83-7B83-493C-8DFB-059509A55272@sap.com> Message-ID: <1e2adbb18e36400398ab5c45441fd4bb@sap.com> Hi Gunter, thanks for adding the missing initialization of the values. I think that the unused local variables core_id, chip_id, len and src_string should better get removed. But I don't need a new webrev for that. Besides that, the change looks good and I can sponsor it. Best regards, Martin -----Original Message----- From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Haug, Gunter Sent: Freitag, 6. Juli 2018 14:51 To: hotspot-runtime-dev at openjdk.java.net Subject: [CAUTION] RFR(S): 8206408: Add missing CPU/system info to vm_version_ext on PPC64 Hi all, can I please have reviews and a sponsor for the following tiny fix: https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8206408 http://cr.openjdk.java.net/~ghaug/webrevs/8206408 The solution is not really accurate as there is no obvious way to detect the number of cores/slots on a PPC64 system. Anyway, it would be better to have information on the virtualization of the system. We do have a solution for that at SAP and we would be happy to adopt it to JFR and contribute it if there is any interest. 
Thanks and best regards, Gunter From volker.simonis at gmail.com Fri Jul 6 14:14:59 2018 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 6 Jul 2018 16:14:59 +0200 Subject: RFR(S): 8206408: Add missing CPU/system info to vm_version_ext on PPC64 In-Reply-To: <7F040F83-7B83-493C-8DFB-059509A55272@sap.com> References: <7F040F83-7B83-493C-8DFB-059509A55272@sap.com> Message-ID: Hi Gunter, in general, your change looks good! Is it guaranteed, that PowerArchitecturePPC64 and VM_Version::_features_strings will be always initialized before they are called from VM_Version_Ext::initialize_cpu_information ? And finally, I'm wondering why you are using "CPU_TYPE_DESC_BUF_SIZE - 1" as the length argument in the first snprintf() call. Wouldn't "CPU_TYPE_DESC_BUF_SIZE" be just fine like in the second call where you are using "CPU_DETAILED_DESC_BUF_SIZE". Thank you and best regards, Volker On Fri, Jul 6, 2018 at 2:51 PM, Haug, Gunter wrote: > Hi all, > > can I please have reviews and a sponsor for the following tiny fix: > > https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8206408 > http://cr.openjdk.java.net/~ghaug/webrevs/8206408 > > The solution is not really accurate as there is no obvious way to detect the number of cores/slots on a PPC64 system. Anyway, it would be better to have information on the virtualization of the system. We do have a solution for that at SAP and we would be happy to adopt it to JFR and contribute it if there is any interest. 
> > Thanks and best regards, > Gunter > From peter.levart at gmail.com Fri Jul 6 16:10:19 2018 From: peter.levart at gmail.com (Peter Levart) Date: Fri, 6 Jul 2018 18:10:19 +0200 Subject: RFR(M): 8203826: Chain class initialization exceptions into later NoClassDefFoundErrors In-Reply-To: References: <06ed3db2-e98c-014b-564a-6080dec06837@oracle.com> <75e66ebc9ebe475d8c8fbcdba4722138@sap.com> Message-ID: Hi, On 07/05/2018 01:01 AM, David Holmes wrote: > I dispute "they will understand this might have happened in another > thread". What if the stack trace was like the following...

Before patch:

1st attempt [ForkJoinPool.commonPool-worker-3]:

java.lang.ExceptionInInitializerError
        at ClinitFailure.lambda$main$0(ClinitFailure.java:20)
        at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736)
        at java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1728)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
        at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177)
Caused by: java.lang.RuntimeException: Can't get it!
        at ClinitFailure$Faulty.<clinit>(ClinitFailure.java:12)
        ... 8 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 0
        at ClinitFailure$Faulty.<clinit>(ClinitFailure.java:10)
        ... 8 more

2nd attempt [ForkJoinPool.commonPool-worker-5]:

java.lang.NoClassDefFoundError: Could not initialize class ClinitFailure$Faulty
        at ClinitFailure.lambda$main$1(ClinitFailure.java:28)
        at java.base/java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783)
        at java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
        at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177)

After patch:

1st attempt [ForkJoinPool.commonPool-worker-3]:

java.lang.ExceptionInInitializerError
        at ClinitFailure.lambda$main$0(ClinitFailure.java:18)
        at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736)
        at java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1728)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
        at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177)
Caused by: java.lang.RuntimeException: Can't get it!
        at ClinitFailure$Faulty.<clinit>(ClinitFailure.java:10)
        ... 8 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 0
        at ClinitFailure$Faulty.<clinit>(ClinitFailure.java:8)
        ... 8 more

2nd attempt [ForkJoinPool.commonPool-worker-5]:

java.lang.NoClassDefFoundError: Could not initialize class ClinitFailure$Faulty
        at java.base/java.lang.ClassLoader.throwReinitException(ClassLoader.java:3062)
        at ClinitFailure.lambda$main$1(ClinitFailure.java:25)
        at java.base/java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783)
        at java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
        at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177)
Caused by: java.lang.ExceptionInInitializerError: 11 ms ago in thread ForkJoinPool.commonPool-worker-3
        at ClinitFailure.lambda$main$0(ClinitFailure.java:18)
        at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736)
        at java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1728)
        ... 5 more
Caused by: java.lang.RuntimeException: Can't get it!
        at ClinitFailure$Faulty.<clinit>(ClinitFailure.java:10)
        ... 8 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 0
        at ClinitFailure$Faulty.<clinit>(ClinitFailure.java:8)
        ... 8 more

This is what gets printed by the sample program:

import java.util.concurrent.CompletableFuture;

public class ClinitFailure {
    static class Faulty {
        static {
            try {
                int i = (new int[0])[1];
            } catch (Exception e) {
                throw new RuntimeException("Can't get it!", e);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture.runAsync(() -> {
            try {
                new Faulty();
            } catch (Throwable e) {
                System.out.printf("\n1st attempt [%s]:\n\n", Thread.currentThread().getName());
                e.printStackTrace(System.out);
            }
        }).thenRunAsync(() -> {
            try {
                new Faulty();
            } catch (Throwable e) {
                System.out.printf("\n2nd attempt [%s]:\n\n", Thread.currentThread().getName());
                e.printStackTrace(System.out);
            }
        }).join();
    }
}

When the following patch is applied:

http://cr.openjdk.java.net/~plevart/jdk-dev/8203826_NoClassDefFoundError.cause/webrev.01/

I took Volker's patch and modified it a bit:

- The logic to construct and throw NoClassDefFoundError and to record the initial exception is in Java now. It uses the ClassLoaderValue internal API to save the chains of exception(s) for faulty classes. It is easier to do such logic in Java and less error prone.
- The chain of original exception(s) is replaced with substitutes that mimic the .toString() and .printStackTrace() methods of the original chain, but don't reference any classes outside the bootstrap class loader.
- The replacement chain of original exceptions adds a custom message insert into the top exception as a hint to the user:

        java.lang.ExceptionInInitializerError: 11 ms ago in thread ForkJoinPool.commonPool-worker-3

So, what do you think of this one?

Regards, Peter

From calvin.cheung at oracle.com Fri Jul 6 16:15:39 2018 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Fri, 06 Jul 2018 09:15:39 -0700 Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules In-Reply-To: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> Message-ID: <5B3F95AB.7060702@oracle.com> Hi Jiangli, Thanks for this start-up improvement. The changes look good overall. I've the following minor comments. 
1) make/hotspot/symbols/symbols-unix 134 JVM_InitializeFromArchive If you want the symbols to be in alphabetical order, the above should be moved after JVM_InitStackTraceElementArray. 2) metaspaceShared.cpp 1927 oop MetaspaceShared::materialize_archived_object(oop obj) { 1928 if (obj != NULL) { 1929 return G1CollectedHeap::heap()->materialize_archived_object(obj); 1930 } 1931 return NULL; 1932 } Instead of two return statements, how about replacing lines 1928 - 1931 with the following? return (obj != NULL) ? G1CollectedHeap::heap()->materialize_archived_object(obj) : NULL; 3) ArchivedModuleComboTest.java 55 Path moduleDir = Files.createTempDirectory(userDir, "mods"); I don't see anything got placed under the "mods" dir, is it by design? For the "dump with --module-path" cases, there seems to be a missing test case with "--show-module-resolution" (similar to Test case 2). 4) CheckArchivedModuleApp.java 53 if (expectArchived && wb.isShared(md)) { 54 System.out.println(name + " is archived. Expected."); 55 } else if (!expectArchived && !wb.isShared(md)) { 56 System.out.println(name + " is not archived. Expected."); 57 } else if (expectArchived) { 58 throw new RuntimeException( 59 "FAILED. " + name + " is not archived. Expect archived."); 60 } else { 61 throw new RuntimeException( 62 "FAILED. " + name + " is archived. Expect not archived."); 63 } I'd suggest the following so that the code is easier to understand: if (expectArchived) { if (wb.isShared(md)) { System.out.println(name + " is archived. Expected."); } else { throw new RuntimeException( "FAILED. " + name + " is not archived. Expect archived."); } } else { if (!wb.isShared(md)) { System.out.println(name + " is not archived. Expected."); } else { throw new RuntimeException( "FAILED. " + name + " is archived. 
Expect not archived."); } } 5) ArchivedModuleWithCustomImageTest.java 178 private static void printCommand(String opts[]) { 179 StringBuilder cmdLine = new StringBuilder(); 180 for (String cmd : opts) 181 cmdLine.append(cmd).append(' '); 182 System.out.println("Command line: [" + cmdLine.toString() + "]"); 183 } Consider putting the above method in ProcessTools.java so that ProcessTools.createJavaProcessBuilder() and the above test can call it and avoiding duplicate code. A separate follow-up bug to address this is fine. 6) PrintSystemModulesApp.java I don't think it is being used? thanks, Calvin On 6/28/18, 4:15 PM, Jiangli Zhou wrote: > This is a follow-up RFE of JDK-8201650 (Move iteration order randomization of unmodifiable Set and Map to iterators), which was resolved to allow Set/Map objects being archived at CDS dump time (thanks Claes and Stuart Marks). In the current RFE, it archives the set of system ModuleReference and ModuleDescriptor objects (including their referenced objects) in 'open' archive heap region at CDS dump time. It allows reusing of the objects and bypassing the process of creating the system ModuleDescriptors and ModuleReferences at runtime for startup improvement. My preliminary measurements on linux-x64 showed ~5% startup improvement when running HelloWorld from -cp using archived module objects at runtime (without extra tuning). > > The library changes in the following webrev are contributed by Alan Bateman. Thanks Alan and Mandy for discussions and help. Thanks Karen, Lois and Ioi for discussion and suggestions on initialization ordering. > > The majority of the module object archiving code are in heapShared.hpp and heapShared.cpp. Thanks Coleen for pre-review and Eric Caspole for helping performance tests. > > webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/ > RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921 > > Tested using tier1 - tier6 via mach5 including all new test cases added in the webrev. 
>
> Following are the details of system module archiving, which are duplicated in above bug report.
> ---------------------------------------------------------------------------------------------------------------------------
> Support archiving system module graph when the initial module is unnamed module from -cp currently.
>
> Support G1 GC, 64-bit (non-Windows). Requires UseCompressedOops and UseCompressedClassPointers.
>
> Dump time system module object archiving
> =================================
> At dump time, the following fields in ArchivedModuleGraph are set to record the system module information created by ModuleBootstrap for archiving.
>
> private static SystemModules archivedSystemModules;
> private static ModuleFinder archivedSystemModuleFinder;
> private static String archivedMainModule;
>
> The archiving process starts from a given static field in ArchivedModuleGraph class instance (java mirror object). The process archives the complete network of java heap objects that are reachable directly or indirectly from the starting object by following references.
>
> 1. Starts from a given static field within the Class instance (java mirror). If the static field is a reference field and points to a non-null java object, proceed to the next step. The static field and its value are recorded and stored outside the archived mirror.
> 2. Archives the referenced java object. If an archived copy of the current object already exists, updates the pointer in the archived copy of the referencing object to point to the current archived object. Otherwise, proceed to the next step.
> 3. Follows all references within the current java object and recursively archives the sub-graph of objects starting from each reference encountered within the object.
> 4. Updates the pointer in the archived copy of the referencing object to point to the current archived object.
> 5.
The Klass of the current java object is added to a list of Klasses for loading and initializing before any object in the archived graph can be accessed at runtime. > > Runtime initialization from archived system module objects > ============================================ > VM.initializeFromArchive() is called from ArchivedModuleGraph's static initializer to initialize from the archived module information. Klasses in the recorded list are loaded, linked and initialized. The static fields in ArchivedModuleGraph class instance are initialized using the archived field values. After initialization, the archived system module objects can be used directly. > > If the archived java heap data is not successfully mapped at runtime, or there is an error during VM.initializeFromArchive(), then all static fields in ArchivedModuleGraph are not initialized. In that case, system ModuleDescriptor and ModuleReference objects are created as normal. > > In non-CDS mode, VM.initializeFromArchive() returns immediately with minimum added overhead for normal execution. > > Thanks, > Jiangli > > From jiangli.zhou at Oracle.COM Fri Jul 6 19:34:59 2018 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Fri, 6 Jul 2018 12:34:59 -0700 Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules In-Reply-To: <5B3F95AB.7060702@oracle.com> References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> <5B3F95AB.7060702@oracle.com> Message-ID: Hi Calvin, Thanks for the review! Here is the updated webrevs that address the feedbacks from you and Ioi: http://cr.openjdk.java.net/~jiangli/8202035/webrev_inc.01/ Full webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev_full.01/ > On Jul 6, 2018, at 9:15 AM, Calvin Cheung wrote: > > Hi Jiangli, > > Thanks for this start-up improvement. The changes look good overall. I've the following minor comments. 
>
> 1) make/hotspot/symbols/symbols-unix
>
> 134 JVM_InitializeFromArchive
>
> If you want the symbols to be in alphabetical order, the above should be moved after JVM_InitStackTraceElementArray.

Fixed.

>
> 2) metaspaceShared.cpp
>
> 1927 oop MetaspaceShared::materialize_archived_object(oop obj) {
> 1928   if (obj != NULL) {
> 1929     return G1CollectedHeap::heap()->materialize_archived_object(obj);
> 1930   }
> 1931   return NULL;
> 1932 }
>
> Instead of two return statements, how about replacing lines 1928 - 1931 with the following?
>
> return (obj != NULL) ? G1CollectedHeap::heap()->materialize_archived_object(obj) : NULL;

The original format is probably slightly easier to read, so I left it unchanged. Hope that's okay with you.

>
> 3) ArchivedModuleComboTest.java
>
> 55 Path moduleDir = Files.createTempDirectory(userDir, "mods");
>
> I don't see anything got placed under the "mods" dir, is it by design?

Yes.

>
> For the "dump with --module-path" cases, there seems to be a missing test case with "--show-module-resolution" (similar to Test case 2).

When --module-path is specified at dump time, the system module graph is currently not archived. There is no need for an additional test case with --show-module-resolution in this case, since all module objects are created as normal.

>
>
> 4) CheckArchivedModuleApp.java
>
> 53 if (expectArchived && wb.isShared(md)) {
> 54     System.out.println(name + " is archived. Expected.");
> 55 } else if (!expectArchived && !wb.isShared(md)) {
> 56     System.out.println(name + " is not archived. Expected.");
> 57 } else if (expectArchived) {
> 58     throw new RuntimeException(
> 59         "FAILED. " + name + " is not archived. Expect archived.");
> 60 } else {
> 61     throw new RuntimeException(
> 62         "FAILED. " + name + " is archived. Expect not archived.");
> 63 }
>
> I'd suggest the following so that the code is easier to understand:
>
> if (expectArchived) {
>     if (wb.isShared(md)) {
>         System.out.println(name + " is archived.
Expected.");
>     } else {
>         throw new RuntimeException(
>             "FAILED. " + name + " is not archived. Expect archived.");
>     }
> } else {
>     if (!wb.isShared(md)) {
>         System.out.println(name + " is not archived. Expected.");
>     } else {
>         throw new RuntimeException(
>             "FAILED. " + name + " is archived. Expect not archived.");
>     }
> }

Reformatted as suggested.

>
> 5) ArchivedModuleWithCustomImageTest.java
>
> 178 private static void printCommand(String opts[]) {
> 179     StringBuilder cmdLine = new StringBuilder();
> 180     for (String cmd : opts)
> 181         cmdLine.append(cmd).append(' ');
> 182     System.out.println("Command line: [" + cmdLine.toString() + "]");
> 183 }
>
> Consider putting the above method in ProcessTools.java so that ProcessTools.createJavaProcessBuilder() and the above test can call it and avoiding duplicate code.
> A separate follow-up bug to address this is fine.

That sounds good to me. We might need some reformatting for consolidation. I will file a follow-up RFE.

>
> 6) PrintSystemModulesApp.java
>
> I don't think it is being used?

It's used by ArchivedModuleCompareTest.java. Looks like it was missing from the earlier webrev. Thanks for catching that. The file is included in the updated webrev.

Thanks!
Jiangli

>
> thanks,
> Calvin
>
> On 6/28/18, 4:15 PM, Jiangli Zhou wrote:
>> This is a follow-up RFE of JDK-8201650 (Move iteration order randomization of unmodifiable Set and Map to iterators), which was resolved to allow Set/Map objects being archived at CDS dump time (thanks Claes and Stuart Marks). In the current RFE, it archives the set of system ModuleReference and ModuleDescriptor objects (including their referenced objects) in 'open' archive heap region at CDS dump time. It allows reusing of the objects and bypassing the process of creating the system ModuleDescriptors and ModuleReferences at runtime for startup improvement.
My preliminary measurements on linux-x64 showed ~5% startup improvement when running HelloWorld from -cp using archived module objects at runtime (without extra tuning). >> >> The library changes in the following webrev are contributed by Alan Bateman. Thanks Alan and Mandy for discussions and help. Thanks Karen, Lois and Ioi for discussion and suggestions on initialization ordering. >> >> The majority of the module object archiving code are in heapShared.hpp and heapShared.cpp. Thanks Coleen for pre-review and Eric Caspole for helping performance tests. >> >> webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/ >> RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921 >> >> Tested using tier1 - tier6 via mach5 including all new test cases added in the webrev. >> >> Following are the details of system module archiving, which are duplicated in above bug report. >> --------------------------------------------------------------------------------------------------------------------------- >> Support archiving system module graph when the initial module is unnamed module from -cp currently. >> >> Support G1 GC, 64-bit (non-Windows). Requires UseCompressedOops and UseCompressedClassPointers. >> >> Dump time system module object archiving >> ================================= >> At dump time, the following fields in ArchivedModuleGraph are set to record the system module information created by ModuleBootstrap for archiving. >> >> private static SystemModules archivedSystemModules; >> private static ModuleFinder archivedSystemModuleFinder; >> private static String archivedMainModule; >> >> The archiving process starts from a given static field in ArchivedModuleGraph class instance (java mirror object). The process archives the complete network of java heap objects that are reachable directly or indirectly from the starting object by following references. >> >> 1. Starts from a given static field within the Class instance (java mirror). 
If the static field is a reference field and points to a non-null java object, proceed to the next step. The static field and its value are recorded and stored outside the archived mirror.
>> 2. Archives the referenced java object. If an archived copy of the current object already exists, updates the pointer in the archived copy of the referencing object to point to the current archived object. Otherwise, proceed to the next step.
>> 3. Follows all references within the current java object and recursively archives the sub-graph of objects starting from each reference encountered within the object.
>> 4. Updates the pointer in the archived copy of the referencing object to point to the current archived object.
>> 5. The Klass of the current java object is added to a list of Klasses for loading and initializing before any object in the archived graph can be accessed at runtime.
>>
>> Runtime initialization from archived system module objects
>> ============================================
>> VM.initializeFromArchive() is called from ArchivedModuleGraph's static initializer to initialize from the archived module information. Klasses in the recorded list are loaded, linked and initialized. The static fields in ArchivedModuleGraph class instance are initialized using the archived field values. After initialization, the archived system module objects can be used directly.
>>
>> If the archived java heap data is not successfully mapped at runtime, or there is an error during VM.initializeFromArchive(), then all static fields in ArchivedModuleGraph are not initialized. In that case, system ModuleDescriptor and ModuleReference objects are created as normal.
>>
>> In non-CDS mode, VM.initializeFromArchive() returns immediately with minimum added overhead for normal execution.
>>
>> Thanks,
>> Jiangli
>>
>>

From coleen.phillimore at oracle.com Fri Jul 6 19:41:32 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 6 Jul 2018 15:41:32 -0400
Subject: RFR (S) 8206471: Race with ConcurrentHashTable deleting items on insert with cleanup thread
Message-ID:

Summary: Only fetch Node::next once and use that result.

A racing thread could NULL next->next()->next(). The Node itself is stable until the write_synchronize() but the pointers may be updated. See bug for more detail.

open webrev at http://cr.openjdk.java.net/~coleenp/8206471.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8206471

Tested with SymbolTable changes and tests that failed. Also tested with mach5 hs-tier1-5 (in progress).

This is actually Robbin's fix, and my review is that it looks good.

Thanks,
Coleen

From harold.seigel at oracle.com Fri Jul 6 19:57:28 2018
From: harold.seigel at oracle.com (Harold David Seigel)
Date: Fri, 6 Jul 2018 15:57:28 -0400
Subject: RFR (S) 8206471: Race with ConcurrentHashTable deleting items on insert with cleanup thread
In-Reply-To:
References:
Message-ID: <3c2f3431-3621-97c8-db28-1bc1d35fb366@oracle.com>

Looks good.

Thanks, Harold

On 7/6/2018 3:41 PM, coleen.phillimore at oracle.com wrote:
> Summary: Only fetch Node::next once and use that result.
>
> A racing thread could NULL next->next()->next(). The Node itself is
> stable until the write_synchronize() but the pointers may be updated.
> See bug for more detail.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8206471.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8206471
>
> Tested with SymbolTable changes and tests that failed. Also tested
> with mach5 hs-tier1-5 (in progress).
>
> This is actually Robbin's fix, and my review is that it looks good.
>
> Thanks,
> Coleen

From coleen.phillimore at oracle.com Fri Jul 6 20:02:46 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 6 Jul 2018 16:02:46 -0400
Subject: RFR (S) 8206471: Race with ConcurrentHashTable deleting items on insert with cleanup thread
In-Reply-To: <3c2f3431-3621-97c8-db28-1bc1d35fb366@oracle.com>
References: <3c2f3431-3621-97c8-db28-1bc1d35fb366@oracle.com>
Message-ID: <7e5d543f-aa42-d98e-1255-3a0e443181f4@oracle.com>

Thanks Harold!
Coleen

On 7/6/18 3:57 PM, Harold David Seigel wrote:
> Looks good.
>
> Thanks, Harold
>
>
> On 7/6/2018 3:41 PM, coleen.phillimore at oracle.com wrote:
>> Summary: Only fetch Node::next once and use that result.
>>
>> A racing thread could NULL next->next()->next(). The Node itself is
>> stable until the write_synchronize() but the pointers may be
>> updated. See bug for more detail.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8206471.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8206471
>>
>> Tested with SymbolTable changes and tests that failed. Also tested
>> with mach5 hs-tier1-5 (in progress).
>>
>> This is actually Robbin's fix, and my review is that it looks good.
>>
>> Thanks,
>> Coleen
>

From coleen.phillimore at oracle.com Fri Jul 6 20:31:26 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 6 Jul 2018 16:31:26 -0400
Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules
In-Reply-To:
References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> <5B3F95AB.7060702@oracle.com>
Message-ID:

Hi Jiangli,

I've reviewed much of the runtime part of this code. It looks really good! It's great to have more archived objects for startup improvement, and this seems like a good foundation to build upon.

Thanks,
Coleen

On 7/6/18 3:34 PM, Jiangli Zhou wrote:
> Hi Calvin,
>
> Thanks for the review!
Here is the updated webrevs that address the feedbacks from you and Ioi: > > http://cr.openjdk.java.net/~jiangli/8202035/webrev_inc.01/ > > Full webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev_full.01/ > >> On Jul 6, 2018, at 9:15 AM, Calvin Cheung wrote: >> >> Hi Jiangli, >> >> Thanks for this start-up improvement. The changes look good overall. I've the following minor comments. >> >> 1) make/hotspot/symbols/symbols-unix >> >> 134 JVM_InitializeFromArchive >> >> If you want the symbols to be in alphabetical order, the above should be moved after JVM_InitStackTraceElementArray. > Fixed. > >> 2) metaspaceShared.cpp >> >> 1927 oop MetaspaceShared::materialize_archived_object(oop obj) { >> 1928 if (obj != NULL) { >> 1929 return G1CollectedHeap::heap()->materialize_archived_object(obj); >> 1930 } >> 1931 return NULL; >> 1932 } >> >> Instead of two return statements, how about replacing lines 1928 - 1931 with the following? >> >> return (obj != NULL) ? G1CollectedHeap::heap()->materialize_archived_object(obj) : NULL; > The original format probably is slightly easier to read, so I left it unchanged. Hope that?s okay with you. > >> 3) ArchivedModuleComboTest.java >> >> 55 Path moduleDir = Files.createTempDirectory(userDir, "mods"); >> >> I don't see anything got placed under the "mods" dir, is it by design? > Yes. > >> For the "dump with --module-path" cases, there seems to be a missing test case with "--show-module-resolution" (similar to Test case 2). > When --module-path is specified at dump time, system module graph is not archived currently. There is no need for additional test case with --show-module-resolution in this case since all module objects are created as normal. > >> >> 4) CheckArchivedModuleApp.java >> >> 53 if (expectArchived && wb.isShared(md)) { >> 54 System.out.println(name + " is archived. Expected."); >> 55 } else if (!expectArchived && !wb.isShared(md)) { >> 56 System.out.println(name + " is not archived. 
Expected."); >> 57 } else if (expectArchived) { >> 58 throw new RuntimeException( >> 59 "FAILED. " + name + " is not archived. Expect archived."); >> 60 } else { >> 61 throw new RuntimeException( >> 62 "FAILED. " + name + " is archived. Expect not archived."); >> 63 } >> >> I'd suggest the following so that the code is easier to understand: >> >> if (expectArchived) { >> if (wb.isShared(md)) { >> System.out.println(name + " is archived. Expected."); >> } else { >> throw new RuntimeException( >> "FAILED. " + name + " is not archived. Expect archived."); >> } >> } else { >> if (!wb.isShared(md)) { >> System.out.println(name + " is not archived. Expected."); >> } else { >> throw new RuntimeException( >> "FAILED. " + name + " is archived. Expect not archived."); >> } >> } > Reformatted as suggested. > >> 5) ArchivedModuleWithCustomImageTest.java >> >> 178 private static void printCommand(String opts[]) { >> 179 StringBuilder cmdLine = new StringBuilder(); >> 180 for (String cmd : opts) >> 181 cmdLine.append(cmd).append(' '); >> 182 System.out.println("Command line: [" + cmdLine.toString() + "]"); >> 183 } >> >> Consider putting the above method in ProcessTools.java so that ProcessTools.createJavaProcessBuilder() and the above test can call it and avoiding duplicate code. >> A separate follow-up bug to address this is fine. > That sounds good to me. We might need some reformatting for consolidation. I will file a follow-up RFE. > >> 6) PrintSystemModulesApp.java >> >> I don't think it is being used? > It?s used by ArchivedModuleCompareTest.java. Looks like it was missing from the earlier webrev. Thanks for catching that. The file is included in the updated webrev. > > Thanks! 
> Jiangli > >> thanks, >> Calvin >> >> On 6/28/18, 4:15 PM, Jiangli Zhou wrote: >>> This is a follow-up RFE of JDK-8201650 (Move iteration order randomization of unmodifiable Set and Map to iterators), which was resolved to allow Set/Map objects being archived at CDS dump time (thanks Claes and Stuart Marks). In the current RFE, it archives the set of system ModuleReference and ModuleDescriptor objects (including their referenced objects) in 'open' archive heap region at CDS dump time. It allows reusing of the objects and bypassing the process of creating the system ModuleDescriptors and ModuleReferences at runtime for startup improvement. My preliminary measurements on linux-x64 showed ~5% startup improvement when running HelloWorld from -cp using archived module objects at runtime (without extra tuning). >>> >>> The library changes in the following webrev are contributed by Alan Bateman. Thanks Alan and Mandy for discussions and help. Thanks Karen, Lois and Ioi for discussion and suggestions on initialization ordering. >>> >>> The majority of the module object archiving code are in heapShared.hpp and heapShared.cpp. Thanks Coleen for pre-review and Eric Caspole for helping performance tests. >>> >>> webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/ >>> RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921 >>> >>> Tested using tier1 - tier6 via mach5 including all new test cases added in the webrev. >>> >>> Following are the details of system module archiving, which are duplicated in above bug report. >>> --------------------------------------------------------------------------------------------------------------------------- >>> Support archiving system module graph when the initial module is unnamed module from -cp currently. >>> >>> Support G1 GC, 64-bit (non-Windows). Requires UseCompressedOops and UseCompressedClassPointers. 
>>> >>> Dump time system module object archiving >>> ================================= >>> At dump time, the following fields in ArchivedModuleGraph are set to record the system module information created by ModuleBootstrap for archiving. >>> >>> private static SystemModules archivedSystemModules; >>> private static ModuleFinder archivedSystemModuleFinder; >>> private static String archivedMainModule; >>> >>> The archiving process starts from a given static field in ArchivedModuleGraph class instance (java mirror object). The process archives the complete network of java heap objects that are reachable directly or indirectly from the starting object by following references. >>> >>> 1. Starts from a given static field within the Class instance (java mirror). If the static field is a refererence field and points to a non-null java object, proceed to the next step. The static field and it's value is recorded and stored outside the archived mirror. >>> 2. Archives the referenced java object. If an archived copy of the current object already exists, updates the pointer in the archived copy of the referencing object to point to the current archived object. Otherwise, proceed to the next step. >>> 3. Follows all references within the current java object and recursively archive the sub-graph of objects starting from each reference encountered within the object. >>> 4. Updates the pointer in the archived copy of referecing object to point to the current archived object. >>> 5. The Klass of the current java object is added to a list of Klasses for loading and initializing before any object in the archived graph can be accessed at runtime. >>> >>> Runtime initialization from archived system module objects >>> ============================================ >>> VM.initializeFromArchive() is called from ArchivedModuleGraph's static initializer to initialize from the archived module information. Klasses in the recorded list are loaded, linked and initialized. 
The static fields in ArchivedModuleGraph class instance are initialized using the archived field values. After initialization, the archived system module objects can be used directly. >>> >>> If the archived java heap data is not successfully mapped at runtime, or there is an error during VM.initializeFromArchive(), then all static fields in ArchivedModuleGraph are not initialized. In that case, system ModuleDescriptor and ModuleReference objects are created as normal. >>> >>> In non-CDS mode, VM.initializeFromArchive() returns immediately with minimum added overhead for normal execution. >>> >>> Thanks, >>> Jiangli >>> >>> From jiangli.zhou at oracle.com Fri Jul 6 20:39:07 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 6 Jul 2018 13:39:07 -0700 Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules In-Reply-To: References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> <5B3F95AB.7060702@oracle.com> Message-ID: <9D547BFA-1B98-40DF-B5E7-C63A0057D6F8@oracle.com> Hi Coleen, Thanks a lot for reviewing! Jiangli > On Jul 6, 2018, at 1:31 PM, coleen.phillimore at oracle.com wrote: > > > Hi Jiangli, > > I've reviewed much of the runtime part of this code. It looks really good! It's great to have more archived objects for startup improvement, and this seems like a good foundation to build upon. > > Thanks, > Coleen > > On 7/6/18 3:34 PM, Jiangli Zhou wrote: >> Hi Calvin, >> >> Thanks for the review! Here is the updated webrevs that address the feedbacks from you and Ioi: >> >> http://cr.openjdk.java.net/~jiangli/8202035/webrev_inc.01/ >> >> Full webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev_full.01/ >> >>> On Jul 6, 2018, at 9:15 AM, Calvin Cheung wrote: >>> >>> Hi Jiangli, >>> >>> Thanks for this start-up improvement. The changes look good overall. I've the following minor comments. 
>>> >>> 1) make/hotspot/symbols/symbols-unix >>> >>> 134 JVM_InitializeFromArchive >>> >>> If you want the symbols to be in alphabetical order, the above should be moved after JVM_InitStackTraceElementArray. >> Fixed. >> >>> 2) metaspaceShared.cpp >>> >>> 1927 oop MetaspaceShared::materialize_archived_object(oop obj) { >>> 1928 if (obj != NULL) { >>> 1929 return G1CollectedHeap::heap()->materialize_archived_object(obj); >>> 1930 } >>> 1931 return NULL; >>> 1932 } >>> >>> Instead of two return statements, how about replacing lines 1928 - 1931 with the following? >>> >>> return (obj != NULL) ? G1CollectedHeap::heap()->materialize_archived_object(obj) : NULL; >> The original format probably is slightly easier to read, so I left it unchanged. Hope that?s okay with you. >> >>> 3) ArchivedModuleComboTest.java >>> >>> 55 Path moduleDir = Files.createTempDirectory(userDir, "mods"); >>> >>> I don't see anything got placed under the "mods" dir, is it by design? >> Yes. >> >>> For the "dump with --module-path" cases, there seems to be a missing test case with "--show-module-resolution" (similar to Test case 2). >> When --module-path is specified at dump time, system module graph is not archived currently. There is no need for additional test case with --show-module-resolution in this case since all module objects are created as normal. >> >>> >>> 4) CheckArchivedModuleApp.java >>> >>> 53 if (expectArchived && wb.isShared(md)) { >>> 54 System.out.println(name + " is archived. Expected."); >>> 55 } else if (!expectArchived && !wb.isShared(md)) { >>> 56 System.out.println(name + " is not archived. Expected."); >>> 57 } else if (expectArchived) { >>> 58 throw new RuntimeException( >>> 59 "FAILED. " + name + " is not archived. Expect archived."); >>> 60 } else { >>> 61 throw new RuntimeException( >>> 62 "FAILED. " + name + " is archived. 
Expect not archived."); >>> 63 } >>> >>> I'd suggest the following so that the code is easier to understand: >>> >>> if (expectArchived) { >>> if (wb.isShared(md)) { >>> System.out.println(name + " is archived. Expected."); >>> } else { >>> throw new RuntimeException( >>> "FAILED. " + name + " is not archived. Expect archived."); >>> } >>> } else { >>> if (!wb.isShared(md)) { >>> System.out.println(name + " is not archived. Expected."); >>> } else { >>> throw new RuntimeException( >>> "FAILED. " + name + " is archived. Expect not archived."); >>> } >>> } >> Reformatted as suggested. >> >>> 5) ArchivedModuleWithCustomImageTest.java >>> >>> 178 private static void printCommand(String opts[]) { >>> 179 StringBuilder cmdLine = new StringBuilder(); >>> 180 for (String cmd : opts) >>> 181 cmdLine.append(cmd).append(' '); >>> 182 System.out.println("Command line: [" + cmdLine.toString() + "]"); >>> 183 } >>> >>> Consider putting the above method in ProcessTools.java so that ProcessTools.createJavaProcessBuilder() and the above test can call it and avoiding duplicate code. >>> A separate follow-up bug to address this is fine. >> That sounds good to me. We might need some reformatting for consolidation. I will file a follow-up RFE. >> >>> 6) PrintSystemModulesApp.java >>> >>> I don't think it is being used? >> It?s used by ArchivedModuleCompareTest.java. Looks like it was missing from the earlier webrev. Thanks for catching that. The file is included in the updated webrev. >> >> Thanks! >> Jiangli >> >>> thanks, >>> Calvin >>> >>> On 6/28/18, 4:15 PM, Jiangli Zhou wrote: >>>> This is a follow-up RFE of JDK-8201650 (Move iteration order randomization of unmodifiable Set and Map to iterators), which was resolved to allow Set/Map objects being archived at CDS dump time (thanks Claes and Stuart Marks). 
In the current RFE, it archives the set of system ModuleReference and ModuleDescriptor objects (including their referenced objects) in 'open' archive heap region at CDS dump time. It allows reusing of the objects and bypassing the process of creating the system ModuleDescriptors and ModuleReferences at runtime for startup improvement. My preliminary measurements on linux-x64 showed ~5% startup improvement when running HelloWorld from -cp using archived module objects at runtime (without extra tuning). >>>> >>>> The library changes in the following webrev are contributed by Alan Bateman. Thanks Alan and Mandy for discussions and help. Thanks Karen, Lois and Ioi for discussion and suggestions on initialization ordering. >>>> >>>> The majority of the module object archiving code are in heapShared.hpp and heapShared.cpp. Thanks Coleen for pre-review and Eric Caspole for helping performance tests. >>>> >>>> webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/ >>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921 >>>> >>>> Tested using tier1 - tier6 via mach5 including all new test cases added in the webrev. >>>> >>>> Following are the details of system module archiving, which are duplicated in above bug report. >>>> --------------------------------------------------------------------------------------------------------------------------- >>>> Support archiving system module graph when the initial module is unnamed module from -cp currently. >>>> >>>> Support G1 GC, 64-bit (non-Windows). Requires UseCompressedOops and UseCompressedClassPointers. >>>> >>>> Dump time system module object archiving >>>> ================================= >>>> At dump time, the following fields in ArchivedModuleGraph are set to record the system module information created by ModuleBootstrap for archiving. 
>>>> >>>> private static SystemModules archivedSystemModules; >>>> private static ModuleFinder archivedSystemModuleFinder; >>>> private static String archivedMainModule; >>>> >>>> The archiving process starts from a given static field in ArchivedModuleGraph class instance (java mirror object). The process archives the complete network of java heap objects that are reachable directly or indirectly from the starting object by following references. >>>> >>>> 1. Starts from a given static field within the Class instance (java mirror). If the static field is a reference field and points to a non-null java object, proceed to the next step. The static field and its value are recorded and stored outside the archived mirror. >>>> 2. Archives the referenced java object. If an archived copy of the current object already exists, updates the pointer in the archived copy of the referencing object to point to the current archived object. Otherwise, proceed to the next step. >>>> 3. Follows all references within the current java object and recursively archive the sub-graph of objects starting from each reference encountered within the object. >>>> 4. Updates the pointer in the archived copy of the referencing object to point to the current archived object. >>>> 5. The Klass of the current java object is added to a list of Klasses for loading and initializing before any object in the archived graph can be accessed at runtime. >>>> >>>> Runtime initialization from archived system module objects >>>> ============================================ >>>> VM.initializeFromArchive() is called from ArchivedModuleGraph's static initializer to initialize from the archived module information. Klasses in the recorded list are loaded, linked and initialized. The static fields in ArchivedModuleGraph class instance are initialized using the archived field values. After initialization, the archived system module objects can be used directly.
>>>> >>>> If the archived java heap data is not successfully mapped at runtime, or there is an error during VM.initializeFromArchive(), then all static fields in ArchivedModuleGraph are not initialized. In that case, system ModuleDescriptor and ModuleReference objects are created as normal. >>>> >>>> In non-CDS mode, VM.initializeFromArchive() returns immediately with minimum added overhead for normal execution. >>>> >>>> Thanks, >>>> Jiangli >>>> >>>> > From mandy.chung at oracle.com Fri Jul 6 20:40:03 2018 From: mandy.chung at oracle.com (mandy chung) Date: Fri, 6 Jul 2018 13:40:03 -0700 Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules In-Reply-To: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> Message-ID: <9aafedcf-4abf-77fd-e455-823c9e11c8b0@oracle.com> Hi Jiangli, On 6/28/18 4:15 PM, Jiangli Zhou wrote:> webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/ > RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921 Good work. I'm glad to see a pretty good startup improvement. I reviewed java.base change that looks good. Mandy From jiangli.zhou at oracle.com Fri Jul 6 20:41:30 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 6 Jul 2018 13:41:30 -0700 Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules In-Reply-To: <9aafedcf-4abf-77fd-e455-823c9e11c8b0@oracle.com> References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> <9aafedcf-4abf-77fd-e455-823c9e11c8b0@oracle.com> Message-ID: <39F0EBB2-3721-4A08-9661-955E4B5E6920@oracle.com> Thanks a lot for reviewing, Mandy! Jiangli > On Jul 6, 2018, at 1:40 PM, mandy chung wrote: > > Hi Jiangli, > > On 6/28/18 4:15 PM, Jiangli Zhou wrote:> webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/ >> RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921 > > Good work. 
I'm glad to see a pretty good startup improvement. > > I reviewed java.base change that looks good. > > Mandy From zgu at redhat.com Sat Jul 7 11:36:57 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Sat, 7 Jul 2018 07:36:57 -0400 Subject: RFR(S) 8206183: Possible construct EMPTY_STACK and allocation stack, etc. on first use Message-ID: Hi, NMT has to work around static initialization order issues: some static objects that allocate memory inside their constructors may be initialized ahead of NMT, so NMT is forced to initialize itself early and risks having its own static objects reinitialized by the C runtime. The workaround was to declare storage for the static objects as primitive arrays and then use the placement new operator to initialize them, or just to initialize them eagerly if the results are constants. But that solution is not elegant and could break with some compilers. A better solution is to use the "Construct On First Use Idiom" (https://isocpp.org/wiki/faq/ctors#static-init-order). Because we only have initialization order problems, and those static objects do not depend on other static objects, we don't suffer from static deinitialization problems. Bug: https://bugs.openjdk.java.net/browse/JDK-8206183 Webrev: http://cr.openjdk.java.net/~zgu/8206183/webrev.00/ Test: hotspot_nmt on Linux 64 (fastdebug and release) Submit-test.
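For readers unfamiliar with the idiom, here is a minimal sketch of Construct On First Use in C++. It is not the code from the webrev: CallStack, empty_stack() and EarlyUser are hypothetical names made up for illustration. The static object lives inside a function, so it is constructed the first time the function is called, even when that call comes from another static object's constructor that runs before main().

```cpp
#include <vector>

// Hypothetical stand-in for an NMT-style static object (not HotSpot code).
class CallStack {
 public:
  CallStack() : frames_(4, nullptr) { ++constructions; }
  bool is_empty() const { return frames_[0] == nullptr; }
  static int constructions;  // counts how often the constructor ran
 private:
  std::vector<const void*> frames_;  // allocates memory in the constructor
};
int CallStack::constructions = 0;  // constant-initialized, safe to use early

// Construct On First Use: the function-local static is constructed on the
// first call, so callers never observe it before its constructor has run.
const CallStack& empty_stack() {
  static CallStack stack;
  return stack;
}

// Hypothetical early user: its constructor runs during static
// initialization and needs the stack to exist already.
struct EarlyUser {
  EarlyUser() : saw_empty(empty_stack().is_empty()) {}
  bool saw_empty;
};
static EarlyUser early_user;
```

Every caller goes through empty_stack(), so the object is constructed exactly once no matter which caller gets there first. Since C++11 the initialization of a function-local static is also required to be thread-safe.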
Thanks, -Zhengyu From david.holmes at oracle.com Sun Jul 8 23:58:32 2018 From: david.holmes at oracle.com (David Holmes) Date: Mon, 9 Jul 2018 09:58:32 +1000 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> Message-ID: <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> tl;dr skip the new regression test on Solaris New webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ This excludes the test from running on Solaris, so the makefile doesn't bother compiling this native test and the Java part of the test adds: ! * @requires os.family != "windows" & os.family != "solaris" * @summary Basic test of Thread and ThreadMXBean queries on a natively * attached thread that has failed to detach before terminating. + * @comment The native code only supports POSIX so no windows testing; also + * we have to skip solaris as a terminating thread that fails to + * detach will hit an infinite loop due to TLS destructor issues - see + * comments in JDK-8156708 Note this means that Solaris is not affected by the original issue because a still-attached native thread can't actually terminate due to the TLS destructor infinite-loop issue. Thanks, David On 6/07/2018 6:07 PM, David Holmes wrote: > The new test is hanging on Solaris. I just discovered we don't > run these tests on Solaris until tier4. > > David > > On 6/07/2018 8:40 AM, David Holmes wrote: >> Hi Chris, >> >> Thanks for looking at this. >> >> Updated webrev: >> >> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >> >> Only real changes in ji05t001.c. (And fixed typo in the new test) >> >> More below ... >> >> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>> Hi David, >>> >>> Solaris problems aside, overall it looks fine. 
Some minor things I >>> noted: >>> >>> I noticed that exitCode is never modified in agentA() or agentB(), so >>> there isn't much point to having it. If you reach the bottom of the >>> function, it passed, so PASSED can be returned. The code would be >>> more clear if it did this. As-is it is implied that you can reach the >>> bottom when it fails. >> >> I resisted any and all urges to do any kind of unrelated code cleanup >> in the tests - once you start you may end up doing a full rewrite. >> >>> Is detaching the threads along the failure paths really needed? >>> exit() is called, so this would seem to make it unnecessary. >> >> You're right that isn't necessary. I'll remove the changes from before >> the exits in ji05t001.c >> >>> I prefer assignments not to be embedded inside the "if" condition. >>> The DetachCurrentThread code in THREAD_return() is much more readable >>> than the similar code in agentA() and agentB(). >> >> It's an existing style already used in that test e.g. >> >> 287 if ((res = >> 288 JNI_ENV_PTR(vm)->AttachCurrentThread( >> 289 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != >> 0) { >> >> and I don't mind it, so I'd prefer not to change it. >> >>> In the test: >>> >>> 54 // Generally as long as we don't crash of throw unexpected >>> 55 // exceptions then the test passes. In some cases we >>> know exactly >>> >>> "of" should be "or". >> >> Well spotted. Thanks. >> >>> Shouldn't you be catching exceptions for all the Thread methods you >>> are calling? Otherwise the test will exit if one is thrown, and the >>> above comment indicates that you don't want this. >> >> I'm not expecting there to be any exceptions from any of the called >> methods. That would potentially indicate a problem in handling the >> terminated native thread, so would indicate a test failure. >> >>> Don't we normally put these tests in a package? >> >> Doesn't seem to be any hard and fast rule.
I only uses packages when >> they are important for the test. In runtime we have 905 java files and >> only 116 have a package statement. It varies elsewhere. >> >> Thanks, >> David >> >>> thanks, >>> >>> Chris >>> >>> On 7/5/18 2:58 AM, David Holmes wrote: >>>> Solaris compiler complains about doing a return from inside a >>>> do-while loop. I'll have to rework part of the fix tomorrow. >>>> >>>> David >>>> >>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>> >>>>> Problem: >>>>> >>>>> The tests create native threads that attach to the VM through >>>>> JNI_AttachCurrentThread but which then terminate without detaching >>>>> themselves. When the VM exits and we're using Flight Recorder >>>>> "dumponexit" this leads to a call to VM_PrintThreads that in part >>>>> wants to print the per-thread CPU usage. When we encounter the >>>>> threads that have terminated already the low level >>>>> pthread_getcpuclockid calls returns ESRCH but the code doesn't >>>>> expect that and so fails an assert in debug mode and can SEGV in >>>>> product mode. >>>>> >>>>> Solution: >>>>> >>>>> Serviceability-side: fix the tests >>>>> >>>>> Change the tests so that the threads detach before terminating. The >>>>> two tests are (surprisingly) written in completely different >>>>> styles, so the solution also takes on two different styles. >>>>> >>>>> Runtime-side: make the VM more robust in the fact of JNI attached >>>>> threads that terminate before detaching, and add a regression test >>>>> >>>>> I took a good look at the low-level code for interacting with >>>>> arbitrary threads and as far as I can see the problem only exists >>>>> for this one case of pthread_getcpuclockid on Linux. Elsewhere the >>>>> potential for a library call failure just reports an error value >>>>> (such as -1 for the cpu time used). 
>>>>> >>>>> So the fix is simply to allow for ESRCH when calling >>>>> pthread_getcpuclockid and return -1 for the cpu usage in that case. >>>>> >>>>> I created a new regression test to create a new native thread, >>>>> attach it and then let it terminate while still attached. The java >>>>> code then calls various Thread and ThreadMXBean functions on it to >>>>> ensure there are no crashes or unexpected exceptions. >>>>> >>>>> Testing: >>>>> - old tests with fixed run-time >>>>> - old run-time with fixed tests >>>>> - mach tier4 (which exposed the problem - that's where we enable >>>>> Flight recorder for the tests) [in progress] >>>>> - mach5 tier 1-3 for good measure [in progress] >>>>> - new regression test >>>>> >>>>> Thanks, >>>>> David >>> >>> >>> From david.holmes at oracle.com Mon Jul 9 01:20:49 2018 From: david.holmes at oracle.com (David Holmes) Date: Mon, 9 Jul 2018 11:20:49 +1000 Subject: RFR (S) 8206471: Race with ConcurrentHashTable deleting items on insert with cleanup thread In-Reply-To: References: Message-ID: Hi Coleen, On 7/07/2018 5:41 AM, coleen.phillimore at oracle.com wrote: > Summary: Only fetch Node::next once and use that result. > > A racing thread could NULL next->next()->next(). The Node itself is > stable until the write_synchronize() but the pointers may be updated. > See bug for more detail. > > open webrev at http://cr.openjdk.java.net/~coleenp/8206471.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8206471 The change looks good. Could there be a similar race at: 552 bucket->release_assign_node_ptr(rem_n_prev, rem_n->next()); 553 rem_n = rem_n->next(); Even if not, it is marginally more performant to only do the load-acquire once.
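The single-load pattern suggested here can be sketched as follows. This is only an illustration built on std::atomic, not the ConcurrentHashTable code: Node, next() and unlink_and_advance() are hypothetical stand-ins for the HotSpot types and primitives. The successor is loaded once, and that one value is both published to the predecessor and used to continue the walk, so a racing update of the next pointer between two loads cannot make the two uses disagree.

```cpp
#include <atomic>

// Hypothetical list node; next() performs a load-acquire like Node::next().
struct Node {
  int value;
  std::atomic<Node*> next_;
  Node(int v, Node* n) : value(v), next_(n) {}
  Node* next() const { return next_.load(std::memory_order_acquire); }
};

// Unlinks rem_n from the chain and returns the node to continue from.
// prev_link is the pointer slot that currently points at rem_n.
Node* unlink_and_advance(std::atomic<Node*>& prev_link, Node* rem_n) {
  Node* next = rem_n->next();                        // one load-acquire only
  prev_link.store(next, std::memory_order_release);  // publish that value
  return next;                                       // and keep walking from it
}
```

Besides closing the window for a race, this saves one acquire load per removed node.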
Similarly: 663 new_table->get_bucket(odd_index)->release_assign_node_ptr(odd, 664 aux->next()); 665 new_table->get_bucket(even_index)->release_assign_node_ptr(even, 666 aux->next()); combined with: 685 aux = aux->next(); makes for 3 load-acquire (and 2 if we take the else at line #675). And again: 982 bucket->release_assign_node_ptr(rem_n_prev, rem_n->next()); 983 rem_n = rem_n->next(); Thanks, David ----- > Tested with SymbolTable changes and tests that failed. Also tested with > mach5 hs-tier1-5 (in progress). > > This is actually Robbin's fix, and my review is that it looks good. > > Thanks, > Coleen From david.holmes at oracle.com Mon Jul 9 01:37:12 2018 From: david.holmes at oracle.com (David Holmes) Date: Mon, 9 Jul 2018 11:37:12 +1000 Subject: RFR(M): 8203826: Chain class initialization exceptions into later NoClassDefFoundErrors In-Reply-To: References: <06ed3db2-e98c-014b-564a-6080dec06837@oracle.com> <75e66ebc9ebe475d8c8fbcdba4722138@sap.com> Message-ID: <6bccaebc-9a3d-ef01-a8bf-58613fcfdd9c@oracle.com> Hi Peter, On 7/07/2018 2:10 AM, Peter Levart wrote: > Hi, > > On 07/05/2018 01:01 AM, David Holmes wrote: >> I dispute "they will understand this might have happened in another >> thread". > > What if the stack trace was like the following... Yes your suggestion makes it much clearer. But ... my whole objection here is doing all this extraneous execution of Java code in response to the initial exception. The more Java code we execute the more likely we will hit secondary exceptions and the greater the possibility of unintended interactions that might lead back to the class that can't be initialized. I just don't think this level of effort is warranted. Cheers, David ----- > Before patch: > > 1st attempt [ForkJoinPool.commonPool-worker-3]: > > java.lang.ExceptionInInitializerError > at ClinitFailure.lambda$main$0(ClinitFailure.java:20) > 
at > java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) > > at > java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1728) > > at > java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) > > at > java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) > at > java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) > > at > java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) > > Caused by: java.lang.RuntimeException: Can't get it! > at ClinitFailure$Faulty.<clinit>(ClinitFailure.java:12) > ... 8 more > Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 1 out of > bounds for length 0 > at ClinitFailure$Faulty.<clinit>(ClinitFailure.java:10) > ... 8 more > > 2nd attempt [ForkJoinPool.commonPool-worker-5]: > > java.lang.NoClassDefFoundError: Could not initialize class > ClinitFailure$Faulty > at ClinitFailure.lambda$main$1(ClinitFailure.java:28) > at > java.base/java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783) > > at > java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479) > > at > java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) > > at > java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) > at > java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) > > 
at > java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) > > > > After patch: > > 1st attempt [ForkJoinPool.commonPool-worker-3]: > > java.lang.ExceptionInInitializerError > at ClinitFailure.lambda$main$0(ClinitFailure.java:18) > at > java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) > > at > java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1728) > > at > java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) > > at > java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) > at > java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) > > at > java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) > > Caused by: java.lang.RuntimeException: Can't get it! > at ClinitFailure$Faulty.<clinit>(ClinitFailure.java:10) > ... 8 more > Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 1 out of > bounds for length 0 > at ClinitFailure$Faulty.<clinit>(ClinitFailure.java:8) > ... 8 more > > 2nd attempt [ForkJoinPool.commonPool-worker-5]: > > java.lang.NoClassDefFoundError: Could not initialize class > ClinitFailure$Faulty > at > java.base/java.lang.ClassLoader.throwReinitException(ClassLoader.java:3062) > at ClinitFailure.lambda$main$1(ClinitFailure.java:25) > at > java.base/java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783) > > at > java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479) > > at > java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) > > 
at > java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) > at > java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) > > at > java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) > > Caused by: java.lang.ExceptionInInitializerError: 11 ms ago in thread > ForkJoinPool.commonPool-worker-3 > at ClinitFailure.lambda$main$0(ClinitFailure.java:18) > at > java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) > > at > java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1728) > > ... 5 more > Caused by: java.lang.RuntimeException: Can't get it! > at ClinitFailure$Faulty.<clinit>(ClinitFailure.java:10) > ... 8 more > Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 1 out of > bounds for length 0 > at ClinitFailure$Faulty.<clinit>(ClinitFailure.java:8) > ... 8 more > > > > This is what gets printed by the sample program: > > import java.util.concurrent.CompletableFuture; > > public class ClinitFailure { > > static class Faulty { > static { > try { > int i = (new int[0])[1]; > } catch (Exception e) { > throw new RuntimeException("Can't get it!", e); > } > } > } > > public static void main(String[] args) throws Exception { > CompletableFuture.runAsync(() -> { > try { > new Faulty(); > } catch (Throwable e) { > System.out.printf("\n1st attempt [%s]:\n\n", > Thread.currentThread().getName()); > e.printStackTrace(System.out); > } > }).thenRunAsync(() -> { > try { > new Faulty(); > } catch (Throwable e) { > System.out.printf("\n2nd attempt [%s]:\n\n", > Thread.currentThread().getName()); > e.printStackTrace(System.out); > } > 
}).join(); > } > } > > > When the following patch is applied: > > http://cr.openjdk.java.net/~plevart/jdk-dev/8203826_NoClassDefFoundError.cause/webrev.01/ > > > > I took Volker's patch and modified it a bit: > > - The logic to construct and throw NoClassDefFoundError and to record > initial exception is in java now. It uses ClassLoaderValue > internal API to save the chains of exception(s) for faulty classes. It > is easier to do such logic in Java and less error prone. > - The chain of original exception(s) is replaced with > substitutes that mimic .toString() and .printStackTrace() methods of > original chain, but don't reference any classes outside bootstrap class > loader > - The replacement chain of original exceptions adds a custom message > insert into the top exception as a hint to the user: > > java.lang.ExceptionInInitializerError: 11 ms ago in thread > ForkJoinPool.commonPool-worker-3 > > > So, what do you think of this one? > > Regards, Peter > > From david.holmes at oracle.com Mon Jul 9 04:37:42 2018 From: david.holmes at oracle.com (David Holmes) Date: Mon, 9 Jul 2018 14:37:42 +1000 Subject: RFR(S) 8206183: Possible construct EMPTY_STACK and allocation stack, etc. on first use In-Reply-To: References: Message-ID: <2cd7efd8-d6d2-0a65-d87b-b264d8bd3970@oracle.com> Hi Zhengyu, On 7/07/2018 9:36 PM, Zhengyu Gu wrote: > Hi, > > NMT has to workaround static initialization order issues: some of static > objects, who allocate memory inside their constructors, may be > initialized ahead of NMT, so NMT is forced to initialize itself early > and risks its static objects may be reinitialized by C runtime. > > The workaround was to declare storage for the static objects as > primitive arrays, then use placement new operator to initialize them, or > just initialize them eagerly, if the results are constants. > > But the solution is not elegant, could break with some compilers.
> A better solution is to use "construct on First Use Idiom" pattern > (https://isocpp.org/wiki/faq/ctors#static-init-order), cause we only > have initialization order problems, those static objects do not have > dependencies on other static objects, so we don't suffer from static > deinitialization problems. Okay but this relies on C++11 thread-safe static initialization. That's only available in VS2015 and above (which should be okay for JDK 12+). What about other compilers? Does it have to be enabled via any compilation flags? I'm currently running this through some additional internal build/tests. Thanks, David ----- > > Bug: https://bugs.openjdk.java.net/browse/JDK-8206183 > Webrev: http://cr.openjdk.java.net/~zgu/8206183/webrev.00/ > > Test: > > hotspot_nmt on Linux 64 (fastdebug and release) > Submit-test. > > > Thanks, > > -Zhengyu From peter.levart at gmail.com Mon Jul 9 07:22:30 2018 From: peter.levart at gmail.com (Peter Levart) Date: Mon, 9 Jul 2018 09:22:30 +0200 Subject: RFR(M): 8203826: Chain class initialization exceptions into later NoClassDefFoundErrors In-Reply-To: <6bccaebc-9a3d-ef01-a8bf-58613fcfdd9c@oracle.com> References: <06ed3db2-e98c-014b-564a-6080dec06837@oracle.com> <75e66ebc9ebe475d8c8fbcdba4722138@sap.com> <6bccaebc-9a3d-ef01-a8bf-58613fcfdd9c@oracle.com> Message-ID: Hi David, On 07/09/2018 03:37 AM, David Holmes wrote: > Hi Peter, > > On 7/07/2018 2:10 AM, Peter Levart wrote: >> Hi, >> >> On 07/05/2018 01:01 AM, David Holmes wrote: >>> I dispute "they will understand this might have happened in another >>> thread". >> >> What if the stack trace was like the following... > > Yes your suggestion makes it much clearer. > > But ... my whole objection here is doing all this extraneous execution > of Java code in response to the initial exception.
The more Java code > we execute the more likely we will hit secondary exceptions and the > greater the possibility of unintended interactions that might lead > back to the class that can't be initialized. I just don't think this > level of effort is warranted. I agree that more classes are involved, but they are all JDK classes and their number is constant. Meaning, if they are OK and don't fail when initializing, there's no danger of unintended interactions that would be caused by initialization errors in other classes. And even if those additional needed classes had problems in their initialization, I think that the consequences would be under control. Let's see what additional classes are needed when the presented patch is used as opposed to classes needed in current logic: In step 7, when super class/interface initialization fails and in steps 10/11 when the class initialization fails, we record the error thrown (record_init_exception). In addition to previously needed classes we also need: - ClassLoader, - ClassLoader$InitExceptionSubst (with dependencies: RuntimeException, Exception), - ClassLoaderValue (with dependencies: AbstractClassLoaderValue, AbstractClassLoaderValue$Sub, AbstractClassLoaderValue$Memoizer, ConcurrentHashMap + deps) When we throw NoClassDefFoundError, we don't need any other additional classes that wouldn't already be needed originally. So I can see that when above additional classes had problems initializing themselves, there would be errors thrown from their usage when recording initial initialization exception of some unrelated class, but such errors would be ignored (step 7): 979 // Record the exception thrown from super class/interface initialization so that 980 // it can be chained into potential later NoClassDefFoundErrors. 981 class_loader_data()->record_init_exception(java_mirror_handle(), e, THREAD); 982 
// Locks object, set state, and notify all waiting threads 983 set_initialization_state_and_notify(initialization_error, THREAD); 984 *CLEAR_PENDING_EXCEPTION*; steps 10/11: 1037 // Record the exception that originally caused <clinit> to fail so 1038 // it can be chained into potential later NoClassDefFoundErrors. 1039 class_loader_data()->record_init_exception(java_mirror_handle(), e, THREAD); 1040 // Locks object, set state, and notify all waiting threads 1041 set_initialization_state_and_notify(initialization_error, THREAD); 1042 *CLEAR_PENDING_EXCEPTION*; It might be that those ignored exceptions would cause later use of those additional classes to throw NoClassDefFoundError instead of ExceptionInInitializerError (depending on whether it was an initial attempt to initialize those additional classes or not), but I can't see any other undesirable consequence. Do you? I'll try to provoke initialization errors in those additional classes to see what happens. Will get back when I have results of the experiment... Regards, Peter P.S. Executing java code as part of VM logic plays well in Jigsaw for example. If there is an acceptable fallback in case of java logic failure, everything seems to be OK. > Cheers, > David > ----- From david.holmes at oracle.com Mon Jul 9 07:33:07 2018 From: david.holmes at oracle.com (David Holmes) Date: Mon, 9 Jul 2018 17:33:07 +1000 Subject: RFR(M): 8203826: Chain class initialization exceptions into later NoClassDefFoundErrors In-Reply-To: References: <06ed3db2-e98c-014b-564a-6080dec06837@oracle.com> <75e66ebc9ebe475d8c8fbcdba4722138@sap.com> <6bccaebc-9a3d-ef01-a8bf-58613fcfdd9c@oracle.com> Message-ID: On 9/07/2018 5:22 PM, Peter Levart wrote: > Hi David, > > On 07/09/2018 03:37 AM, David Holmes wrote: >> Hi Peter, >> >> On 7/07/2018 2:10 AM, Peter Levart wrote: >>> Hi, >>> >>> On 07/05/2018 01:01 AM, David Holmes wrote: >>>> I dispute "they will understand this might have happened in another >>>> thread". >>> >>> What if the stack trace was like the following... >> >> Yes your suggestion makes it much clearer. >> >> But ... my whole objection here is doing all this extraneous execution >> of Java code in response to the initial exception. The more Java code >> we execute the more likely we will hit secondary exceptions and the >> greater the possibility of unintended interactions that might lead >> back to the class that can't be initialized. I just don't think this >> level of effort is warranted.
I
>
> I agree that more classes are involved, but they are all JDK classes and
> their number is constant. Meaning, if they are OK and don't fail when
> initializing, there's no danger of unintended interactions that would be
> caused by initialization errors in other classes. And even if those
> additional needed classes had problems in their initialization, I think
> that the consequences would be under control.

That's not what I mean. I'm not concerned about circular initialization failures due to failing to initialize the classes used in this "hook". I'm concerned about the overall amount of Java code execution that this involves, which may trigger other exceptions (e.g. OOME) and which may incur additional logging or event generation that may in turn interact in some way with the original class being initialized.

I just think this is complete overkill for addressing the perceived problem.

David
-----

>
> Let's see what additional classes are needed when the presented patch is
> used as opposed to classes needed in current logic:
>
> In step 7, when super class/interface initialization fails and in steps
> 10/11 when the class initialization fails, we record the error thrown
> (record_init_exception). In addition to previously needed classes we
> also need:
> - ClassLoader,
> - ClassLoader$InitExceptionSubst (with dependencies: RuntimeException,
> Exception),
> - ClassLoaderValue (with dependencies: AbstractClassLoaderValue,
> AbstractClassLoaderValue$Sub, AbstractClassLoaderValue$Memoizer,
> ConcurrentHashMap + deps)
>
> When we throw NoClassDefFoundError, we don't need any other additional
> classes that wouldn't already be needed originally.
>
> So I can see that when above additional classes had problems
> initializing themselves, there would be errors thrown from their usage
> when recording initial initialization exception of some unrelated class,
> but such errors would be ignored (step 7):
>
>  979
// Record the exception thrown from super class/interface > initialization so that > ?980???????? // it can be chained into potential later > NoClassDefFoundErrors. > ?981 class_loader_data()->record_init_exception(java_mirror_handle(), > e, THREAD); > ?982???????? // Locks object, set state, and notify all waiting threads > ?983 set_initialization_state_and_notify(initialization_error, THREAD); > ?984 *CLEAR_PENDING_EXCEPTION*; > > steps 10/11: > > 1037?????? // Record the exception that originally caused to > fail so > 1038?????? // it can be chained into potential later NoClassDefFoundErrors. > 1039 class_loader_data()->record_init_exception(java_mirror_handle(), e, > THREAD); > 1040?????? // Locks object, set state, and notify all waiting threads > 1041?????? set_initialization_state_and_notify(initialization_error, > THREAD); > 1042 *CLEAR_PENDING_EXCEPTION*; > > > It might be that those ignored exceptions would cause later use of those > additional classes to throw NoClassDefFoundError instead of > ExceptionInInitializerError (depending on whether it was an initial > attempt to initialize those additional classes or not), but I can't see > any other undesirable consequence. Do you? > > I'll try to provoke initialization errors in those additional classes to > see what happens. Will get back when I have results of the experiment... > > Regards, Peter > > > P.S. > > Executing java code as part of VM logic plays well in Jigsaw for > example. If there is an acceptable fallback in case of java logic > failure, everything seems to be OK. > >> Cheers, >> David >> ----- >> >>> Before patch: >>> >>> 1st attempt [ForkJoinPool.commonPool-worker-3]: >>> >>> java.lang.ExceptionInInitializerError >>> ???????? at ClinitFailure.lambda$main$0(ClinitFailure.java:20) >>> ???????? at >>> java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) >>> >>> ???????? 
at >>> java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1728) >>> >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) >>> >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) >>> >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) >>> >>> Caused by: java.lang.RuntimeException: Can't get it! >>> ???????? at ClinitFailure$Faulty.(ClinitFailure.java:12) >>> ???????? ... 8 more >>> Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 1 out of >>> bounds for length 0 >>> ???????? at ClinitFailure$Faulty.(ClinitFailure.java:10) >>> ???????? ... 8 more >>> >>> 2nd attempt [ForkJoinPool.commonPool-worker-5]: >>> >>> java.lang.NoClassDefFoundError: Could not initialize class >>> ClinitFailure$Faulty >>> ???????? at ClinitFailure.lambda$main$1(ClinitFailure.java:28) >>> ???????? at >>> java.base/java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783) >>> >>> ???????? at >>> java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479) >>> >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) >>> >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) >>> >>> ???????? 
at >>> java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) >>> >>> >>> >>> After patch: >>> >>> 1st attempt [ForkJoinPool.commonPool-worker-3]: >>> >>> java.lang.ExceptionInInitializerError >>> ???????? at ClinitFailure.lambda$main$0(ClinitFailure.java:18) >>> ???????? at >>> java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) >>> >>> ???????? at >>> java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1728) >>> >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) >>> >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) >>> >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) >>> >>> Caused by: java.lang.RuntimeException: Can't get it! >>> ???????? at ClinitFailure$Faulty.(ClinitFailure.java:10) >>> ???????? ... 8 more >>> Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 1 out of >>> bounds for length 0 >>> ???????? at ClinitFailure$Faulty.(ClinitFailure.java:8) >>> ???????? ... 8 more >>> >>> 2nd attempt [ForkJoinPool.commonPool-worker-5]: >>> >>> java.lang.NoClassDefFoundError: Could not initialize class >>> ClinitFailure$Faulty >>> ???????? at >>> java.base/java.lang.ClassLoader.throwReinitException(ClassLoader.java:3062) >>> ???????? at ClinitFailure.lambda$main$1(ClinitFailure.java:25) >>> ???????? at >>> java.base/java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783) >>> >>> ???????? at >>> java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479) >>> >>> ???????? 
at >>> java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) >>> >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) >>> >>> ???????? at >>> java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) >>> >>> Caused by: java.lang.ExceptionInInitializerError: 11 ms ago in thread >>> ForkJoinPool.commonPool-worker-3 >>> ???????? at ClinitFailure.lambda$main$0(ClinitFailure.java:18) >>> ???????? at >>> java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) >>> >>> ???????? at >>> java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1728) >>> >>> ???????? ... 5 more >>> Caused by: java.lang.RuntimeException: Can't get it! >>> ???????? at ClinitFailure$Faulty.(ClinitFailure.java:10) >>> ???????? ... 8 more >>> Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 1 out of >>> bounds for length 0 >>> ???????? at ClinitFailure$Faulty.(ClinitFailure.java:8) >>> ???????? ... 8 more >>> >>> >>> >>> This is what gets printed by the sample program: >>> >>> public class ClinitFailure { >>> >>> ???? static class Faulty { >>> ???????? static { >>> ???????????? try { >>> ???????????????? int i = (new int[0])[1]; >>> ???????????? } catch (Exception e) { >>> ???????????????? throw new RuntimeException("Can't get it!", e); >>> ???????????? } >>> ???????? } >>> ???? } >>> >>> ???? public static void main(String[] args) throws Exception { >>> ???????? CompletableFuture.runAsync(() -> { >>> ???????????? try { >>> ???????????????? new Faulty(); >>> ???????????? } catch (Throwable e) { >>> ???????????????? System.out.printf("\n1st attempt [%s]:\n\n", >>> Thread.currentThread().getName()); >>> ???????????????? 
e.printStackTrace(System.out); >>> ???????????? } >>> ???????? }).thenRunAsync(() -> { >>> ???????????? try { >>> ???????????????? new Faulty(); >>> ???????????? } catch (Throwable e) { >>> ???????????????? System.out.printf("\n2nd attempt [%s]:\n\n", >>> Thread.currentThread().getName()); >>> ???????????????? e.printStackTrace(System.out); >>> ???????????? } >>> ???????? }).join(); >>> ???? } >>> } >>> >>> >>> When the following patch is applied: >>> >>> http://cr.openjdk.java.net/~plevart/jdk-dev/8203826_NoClassDefFoundError.cause/webrev.01/ >>> >>> >>> >>> I took Volker's patch and modified it a bit: >>> >>> - The logic to construct and throw NoClassDefFoundError and to record >>> initial exception is in java now. It uses ClassLoaderValue >>> internal API to save the chains of exception(s) for faulty classes. >>> It is easier to do such logic in Java and less error prone. >>> - The chain of original exception(s) is replaced with >>> substitutes that mimic .toString() and .printStackTrace() methods of >>> original chain, but don't reference any classes outside bootstrap >>> class loader >>> - The replacement chain of original exceptions adds a custom message >>> insert into the top exception as a hint to the user: >>> >>> ???? ??? java.lang.ExceptionInInitializerError: 11 ms ago in thread >>> ForkJoinPool.commonPool-worker-3 >>> >>> >>> So, what do you think of this one? 
>>> >>> Regards, Peter >>> >>> > From peter.levart at gmail.com Mon Jul 9 07:51:41 2018 From: peter.levart at gmail.com (Peter Levart) Date: Mon, 9 Jul 2018 09:51:41 +0200 Subject: RFR(M): 8203826: Chain class initialization exceptions into later NoClassDefFoundErrors In-Reply-To: References: <06ed3db2-e98c-014b-564a-6080dec06837@oracle.com> <75e66ebc9ebe475d8c8fbcdba4722138@sap.com> <6bccaebc-9a3d-ef01-a8bf-58613fcfdd9c@oracle.com> Message-ID: <55f3a8df-3a9e-84e1-f729-ec5a0d119360@gmail.com> Hi David, On 07/09/2018 09:33 AM, David Holmes wrote: > On 9/07/2018 5:22 PM, Peter Levart wrote: >> Hi David, >> >> On 07/09/2018 03:37 AM, David Holmes wrote: >>> Hi Peter, >>> >>> On 7/07/2018 2:10 AM, Peter Levart wrote: >>>> Hi, >>>> >>>> On 07/05/2018 01:01 AM, David Holmes wrote: >>>>> I dispute "they will understand this might have happened in >>>>> another thread". >>>> >>>> What if the stack trace was like the following... >>> >>> Yes your suggestion makes it much clearer. >>> >>> But ... my whole objection here is doing all this extraneous >>> execution of Java code in response to the initial exception. The >>> more Java code we execute the more likely we will hit secondary >>> exceptions and the greater the possibility of unintended >>> interactions that might lead back to the class that can't be >>> initialized. I just don't think this level of effort is warranted. I >> >> I agree that more classes are involved, but they are all JDK classes >> and their number is constant. Meaning, if they are OK and don't fail >> when initializing, there's no danger of unintended interactions that >> would be caused by initialization errors in other classes. And even >> if those additional needed classes had problems in their >> initialization, I think that the consequences would be under control. > > That's not what I mean. I'm not concerned about circular > initialization failures due to failing to initialize the classes used > in this "hook". 
I'm concerned about the overall amount of Java code
> execution that this involves, which may trigger other exceptions (e.g.
> OOME) and which may incur additional logging or event generation that
> may in turn interact in some way with the original class being
> initialized.
>

I still can't see what you see. If the Java code that records the initial initialization error throws OOME, it will be ignored and the initial exception will not be recorded. A later NoClassDefFoundError would not contain the nice chain of causes, but that's the only undesirable consequence. The code doesn't log the OOME; it simply ignores it. If those additional exceptions cause any events at the VM level and those events interact in some way with the original class being initialized, this interaction will fail, but such an interaction with the original class would fail even if it was caused by anything else, because it is the original class that had problems initializing itself in the first place.

> I just think this is complete overkill for addressing the perceived
> problem.

Perhaps. But it would be nice to have.

Regards, Peter

>
> David
> -----
>
>>
>> Let's see what additional classes are needed when the presented patch
>> is used as opposed to classes needed in current logic:
>>
>> In step 7, when super class/interface initialization fails and in
>> steps 10/11 when the class initialization fails, we record the error
>> thrown (record_init_exception). In addition to previously needed
>> classes we also need:
>> - ClassLoader,
>> - ClassLoader$InitExceptionSubst (with dependencies:
>> RuntimeException, Exception),
>> - ClassLoaderValue (with dependencies: AbstractClassLoaderValue,
>> AbstractClassLoaderValue$Sub, AbstractClassLoaderValue$Memoizer,
>> ConcurrentHashMap + deps)
>>
>> When we throw NoClassDefFoundError, we don't need any other
>> additional classes that wouldn't already be needed originally.
>> >> So I can see that when above additional classes had problems >> initializing themselves, there would be errors thrown from their >> usage when recording initial initialization exception of some >> unrelated class, but such errors would be ignored (step 7): >> >> ??979???????? // Record the exception thrown from super >> class/interface initialization so that >> ??980???????? // it can be chained into potential later >> NoClassDefFoundErrors. >> ??981 >> class_loader_data()->record_init_exception(java_mirror_handle(), e, >> THREAD); >> ??982???????? // Locks object, set state, and notify all waiting threads >> ??983 set_initialization_state_and_notify(initialization_error, THREAD); >> ??984 *CLEAR_PENDING_EXCEPTION*; >> >> steps 10/11: >> >> 1037?????? // Record the exception that originally caused to >> fail so >> 1038?????? // it can be chained into potential later >> NoClassDefFoundErrors. >> 1039 class_loader_data()->record_init_exception(java_mirror_handle(), >> e, THREAD); >> 1040?????? // Locks object, set state, and notify all waiting threads >> 1041 set_initialization_state_and_notify(initialization_error, THREAD); >> 1042 *CLEAR_PENDING_EXCEPTION*; >> >> >> It might be that those ignored exceptions would cause later use of >> those additional classes to throw NoClassDefFoundError instead of >> ExceptionInInitializerError (depending on whether it was an initial >> attempt to initialize those additional classes or not), but I can't >> see any other undesirable consequence. Do you? >> >> I'll try to provoke initialization errors in those additional classes >> to see what happens. Will get back when I have results of the >> experiment... >> >> Regards, Peter >> >> >> P.S. >> >> Executing java code as part of VM logic plays well in Jigsaw for >> example. If there is an acceptable fallback in case of java logic >> failure, everything seems to be OK. 
>> >>> Cheers, >>> David >>> ----- >>> >>>> Before patch: >>>> >>>> 1st attempt [ForkJoinPool.commonPool-worker-3]: >>>> >>>> java.lang.ExceptionInInitializerError >>>> ???????? at ClinitFailure.lambda$main$0(ClinitFailure.java:20) >>>> ???????? at >>>> java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1728) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) >>>> >>>> Caused by: java.lang.RuntimeException: Can't get it! >>>> ???????? at ClinitFailure$Faulty.(ClinitFailure.java:12) >>>> ???????? ... 8 more >>>> Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 1 out of >>>> bounds for length 0 >>>> ???????? at ClinitFailure$Faulty.(ClinitFailure.java:10) >>>> ???????? ... 8 more >>>> >>>> 2nd attempt [ForkJoinPool.commonPool-worker-5]: >>>> >>>> java.lang.NoClassDefFoundError: Could not initialize class >>>> ClinitFailure$Faulty >>>> ???????? at ClinitFailure.lambda$main$1(ClinitFailure.java:28) >>>> ???????? at >>>> java.base/java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) >>>> ???????? 
at >>>> java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) >>>> >>>> >>>> >>>> After patch: >>>> >>>> 1st attempt [ForkJoinPool.commonPool-worker-3]: >>>> >>>> java.lang.ExceptionInInitializerError >>>> ???????? at ClinitFailure.lambda$main$0(ClinitFailure.java:18) >>>> ???????? at >>>> java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1728) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) >>>> >>>> Caused by: java.lang.RuntimeException: Can't get it! >>>> ???????? at ClinitFailure$Faulty.(ClinitFailure.java:10) >>>> ???????? ... 8 more >>>> Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 1 out of >>>> bounds for length 0 >>>> ???????? at ClinitFailure$Faulty.(ClinitFailure.java:8) >>>> ???????? ... 8 more >>>> >>>> 2nd attempt [ForkJoinPool.commonPool-worker-5]: >>>> >>>> java.lang.NoClassDefFoundError: Could not initialize class >>>> ClinitFailure$Faulty >>>> ???????? at >>>> java.base/java.lang.ClassLoader.throwReinitException(ClassLoader.java:3062) >>>> ???????? 
at ClinitFailure.lambda$main$1(ClinitFailure.java:25) >>>> ???????? at >>>> java.base/java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) >>>> >>>> Caused by: java.lang.ExceptionInInitializerError: 11 ms ago in >>>> thread ForkJoinPool.commonPool-worker-3 >>>> ???????? at ClinitFailure.lambda$main$0(ClinitFailure.java:18) >>>> ???????? at >>>> java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) >>>> >>>> ???????? at >>>> java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1728) >>>> >>>> ???????? ... 5 more >>>> Caused by: java.lang.RuntimeException: Can't get it! >>>> ???????? at ClinitFailure$Faulty.(ClinitFailure.java:10) >>>> ???????? ... 8 more >>>> Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 1 out of >>>> bounds for length 0 >>>> ???????? at ClinitFailure$Faulty.(ClinitFailure.java:8) >>>> ???????? ... 8 more >>>> >>>> >>>> >>>> This is what gets printed by the sample program: >>>> >>>> public class ClinitFailure { >>>> >>>> ???? static class Faulty { >>>> ???????? static { >>>> ???????????? try { >>>> ???????????????? int i = (new int[0])[1]; >>>> ???????????? } catch (Exception e) { >>>> ???????????????? throw new RuntimeException("Can't get it!", e); >>>> ???????????? } >>>> ???????? } >>>> ???? 
} >>>> >>>> ???? public static void main(String[] args) throws Exception { >>>> ???????? CompletableFuture.runAsync(() -> { >>>> ???????????? try { >>>> ???????????????? new Faulty(); >>>> ???????????? } catch (Throwable e) { >>>> ???????????????? System.out.printf("\n1st attempt [%s]:\n\n", >>>> Thread.currentThread().getName()); >>>> ???????????????? e.printStackTrace(System.out); >>>> ???????????? } >>>> ???????? }).thenRunAsync(() -> { >>>> ???????????? try { >>>> ???????????????? new Faulty(); >>>> ???????????? } catch (Throwable e) { >>>> ???????????????? System.out.printf("\n2nd attempt [%s]:\n\n", >>>> Thread.currentThread().getName()); >>>> ???????????????? e.printStackTrace(System.out); >>>> ???????????? } >>>> ???????? }).join(); >>>> ???? } >>>> } >>>> >>>> >>>> When the following patch is applied: >>>> >>>> http://cr.openjdk.java.net/~plevart/jdk-dev/8203826_NoClassDefFoundError.cause/webrev.01/ >>>> >>>> >>>> >>>> I took Volker's patch and modified it a bit: >>>> >>>> - The logic to construct and throw NoClassDefFoundError and to >>>> record initial exception is in java now. It uses >>>> ClassLoaderValue internal API to save the chains of exception(s) >>>> for faulty classes. It is easier to do such logic in Java and less >>>> error prone. >>>> - The chain of original exception(s) is replaced with >>>> substitutes that mimic .toString() and .printStackTrace() methods >>>> of original chain, but don't reference any classes outside >>>> bootstrap class loader >>>> - The replacement chain of original exceptions adds a custom >>>> message insert into the top exception as a hint to the user: >>>> >>>> ???? ??? java.lang.ExceptionInInitializerError: 11 ms ago in thread >>>> ForkJoinPool.commonPool-worker-3 >>>> >>>> >>>> So, what do you think of this one? 
>>>>
>>>> Regards, Peter
>>>>
>>>>
>>

From gunter.haug at sap.com Mon Jul 9 10:31:28 2018
From: gunter.haug at sap.com (Haug, Gunter)
Date: Mon, 9 Jul 2018 10:31:28 +0000
Subject: RFR(S): 8206408: Add missing CPU/system info to vm_version_ext on PPC64
In-Reply-To:
References: <7F040F83-7B83-493C-8DFB-059509A55272@sap.com>
Message-ID: <74DCB0F2-22CC-4753-AA38-D7CD0E2ECEFA@sap.com>

Hi Martin and Volker,

Thanks for your reviews! I've incorporated your suggestions in an updated version.
To answer your question:

    Is it guaranteed, that PowerArchitecturePPC64 and
    VM_Version::_features_strings will be always initialized before they
    are called from VM_Version_Ext::initialize_cpu_information ?

Yes, it is. This is done at the very beginning of the initialization of the VM.

Here is the updated webrev:

http://cr.openjdk.java.net/~ghaug/webrevs/8206408.v1

Maybe one of you could be so kind and push the change?

Thanks,
Gunter


On 06.07.18, 16:15, "Volker Simonis" wrote:

    Hi Gunter,

    in general, your change looks good!

    Is it guaranteed, that PowerArchitecturePPC64 and
    VM_Version::_features_strings will be always initialized before they
    are called from VM_Version_Ext::initialize_cpu_information ?

    And finally, I'm wondering why you are using "CPU_TYPE_DESC_BUF_SIZE -
    1" as the length argument in the first snprintf() call. Wouldn't
    "CPU_TYPE_DESC_BUF_SIZE" be just fine like in the second call where
    you are using "CPU_DETAILED_DESC_BUF_SIZE".

    Thank you and best regards,
    Volker

    On Fri, Jul 6, 2018 at 2:51 PM, Haug, Gunter wrote:
    > Hi all,
    >
    > can I please have reviews and a sponsor for the following tiny fix:
    >
    > https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8206408
    > http://cr.openjdk.java.net/~ghaug/webrevs/8206408
    >
    > The solution is not really accurate as there is no obvious way to detect the number of cores/slots on a PPC64 system. Anyway, it would be better to have information on the virtualization of the system.
We do have a solution for that at SAP and we would be happy to adopt it to JFR and contribute it if there is any interest.
    >
    > Thanks and best regards,
    > Gunter
    >

From coleen.phillimore at oracle.com Mon Jul 9 11:32:25 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 9 Jul 2018 07:32:25 -0400
Subject: RFR (S) 8206471: Race with ConcurrentHashTable deleting items on insert with cleanup thread
In-Reply-To:
References:
Message-ID: <2a218f81-4ecc-ad6a-3110-80978e17473d@oracle.com>

On 7/8/18 9:20 PM, David Holmes wrote:
> Hi Coleen,
>
> On 7/07/2018 5:41 AM, coleen.phillimore at oracle.com wrote:
>> Summary: Only fetch Node::next once and use that result.
>>
>> A racing thread could NULL next->next()->next().  The Node itself is
>> stable until the write_synchronize() but the pointers may be updated.
>> See bug for more detail.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8206471.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8206471
>
> The change looks good.
>
> Could there be a similar race at:
>
>  552       bucket->release_assign_node_ptr(rem_n_prev, rem_n->next());
>  553       rem_n = rem_n->next();
>
> Even if not, it is marginally more performant to only do the
> load-acquire once.
>
> Similarly:
>
>  663 new_table->get_bucket(odd_index)->release_assign_node_ptr(odd,
>  664 aux->next());
>  665 new_table->get_bucket(even_index)->release_assign_node_ptr(even,
>  666 aux->next());
>
> combined with:
>
>  685     aux = aux->next();
>
> makes for 3 load-acquire (and 2 if we take the else at line #675).
>
> And again:
>
>  982       bucket->release_assign_node_ptr(rem_n_prev, rem_n->next());
>  983       rem_n = rem_n->next();

Thank you for noticing these other double loads.  I'll study them to see if there's a race or see if there's a reason they have double loads, but I'll change them unless there is a reason not to.

Thanks!
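
For illustration only, the "fetch next once" idea discussed above can be sketched as follows. This is not the actual patch and not HotSpot's ConcurrentHashTable code: `Node`, `next_ptr` and `next_value_or` are hypothetical names, and the sketch assumes, as stated in the thread, that the node itself stays valid while only its pointers may be updated by another thread.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical, simplified node (an assumption for this sketch only).
struct Node {
  int value;
  std::atomic<Node*> next_ptr{nullptr};

  explicit Node(int v) : value(v) {}

  // Every call performs a fresh load-acquire of the link.
  Node* next() const { return next_ptr.load(std::memory_order_acquire); }
};

// Racy shape (shown only as a comment): next() is loaded twice, so a
// concurrent writer may change the link between the check and the use.
//   if (n->next() != nullptr) return n->next()->value;

// Single-load shape: read the link once and use only the local copy.
int next_value_or(const Node* n, int fallback) {
  Node* next = n->next();   // one load-acquire
  if (next == nullptr) {
    return fallback;
  }
  return next->value;       // dereferences the pointer that was checked
}
```

Besides closing the window between the check and the use, the single load also saves a second load-acquire, which is the performance point raised in the review.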
Coleen
>
> Thanks,
> David
> -----
>
>> Tested with SymbolTable changes and tests that failed.  Also tested
>> with mach5 hs-tier1-5 (in progress).
>>
>> This is actually Robbin's fix, and my review is that it looks good.
>>
>> Thanks,
>> Coleen

From zgu at redhat.com Mon Jul 9 11:45:01 2018
From: zgu at redhat.com (Zhengyu Gu)
Date: Mon, 9 Jul 2018 07:45:01 -0400
Subject: RFR(S) 8206183: Possible construct EMPTY_STACK and allocation stack, etc. on first use
In-Reply-To: <2cd7efd8-d6d2-0a65-d87b-b264d8bd3970@oracle.com>
References: <2cd7efd8-d6d2-0a65-d87b-b264d8bd3970@oracle.com>
Message-ID: <9503ded0-bc68-543b-a1ab-6d884854dc9a@redhat.com>

Hi David,

On 07/09/2018 12:37 AM, David Holmes wrote:
> Hi Zhengyu,
>
> On 7/07/2018 9:36 PM, Zhengyu Gu wrote:
>> Hi,
>>
>> NMT has to workaround static initialization order issues: some of
>> static objects, who allocate memory inside their constructors, may be
>> initialized ahead of NMT, so NMT is forced to initialize itself early
>> and risks its static objects may be reinitialized by C runtime.
>>
>> The workaround was to declare storage for the static objects as
>> primitive arrays, then use placement new operator to initialize them,
>> or just initialize them eagerly, if the results are constants.
>>
>> But the solution is not elegant, could break with some compilers.
>> A better solution is to use "construct on First Use Idiom" pattern
>> (https://isocpp.org/wiki/faq/ctors#static-init-order), cause we only
>> have initialization order problems, those static objects do not have
>> dependencies on other static objects, so we don't suffer from static
>> deinitialization problems.
>
> Okay but this relies on C++11 thread-safe static initialization. That's
> only available in VS2015 and above (which should be okay for JDK 12+).
> What about other compilers? Does it have to be enabled via any
> compilation flags?

Thanks for pointing that out.
NMT is always initialized while JVM is still in single-thread mode, so I think it is safe even without language support, or am I missing something here?

-Zhengyu

>
> I'm currently running this through some additional internal build/tests.
>
> Thanks,
> David
> -----
>
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206183
>> Webrev: http://cr.openjdk.java.net/~zgu/8206183/webrev.00/
>>
>> Test:
>>
>>    hotspot_nmt on Linux 64 (fastdebug and release)
>>    Submit-test.
>>
>>
>> Thanks,
>>
>> -Zhengyu

From goetz.lindenmaier at sap.com Mon Jul 9 13:21:06 2018
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 9 Jul 2018 13:21:06 +0000
Subject: RFR(S): 8206459: [s390] Prevent restoring incorrect bcp and locals in interpreter and avoid incorrect size of partialSubtypeCheckNode in C2
In-Reply-To:
References:
Message-ID: <0bdefbf94a624e7abf88ee87a5503519@sap.com>

Hi Martin,

good catch, looks good.

Best regards,
Goetz.

> -----Original Message-----
> From: Doerr, Martin
> Sent: Friday, 6 July 2018 14:59
> To: hotspot-runtime-dev at openjdk.java.net; Lindenmaier, Goetz
>
> Subject: RFR(S): 8206459: [s390] Prevent restoring incorrect bcp and locals in
> interpreter and avoid incorrect size of partialSubtypeCheckNode in C2
>
> Hi,
>
> TestInterfaceMethodSelection has shown a bug in the template interpreter
> on s390. Restore functions for locals (R12 = Z_tmp_3) and bcp (R13 =
> Z_tmp_4) are used without having saved the correct values.
>
> In addition, C2 currently uses a constant size for partialSubtypeCheckNode
> which uses load_const_optimized with variable size.
>
> We can simply preserve these 2 registers and remove the restore function
> calls. Webrev:
>
> http://cr.openjdk.java.net/~mdoerr/8206459_s390_fixes/webrev.00/
>
> Please review.
>
> Best regards,
>
> Martin
>

From volker.simonis at gmail.com Mon Jul 9 14:06:00 2018
From: volker.simonis at gmail.com (Volker Simonis)
Date: Mon, 9 Jul 2018 16:06:00 +0200
Subject: RFR(S): 8206408: Add missing CPU/system info to vm_version_ext on PPC64
In-Reply-To: <74DCB0F2-22CC-4753-AA38-D7CD0E2ECEFA@sap.com>
References: <7F040F83-7B83-493C-8DFB-059509A55272@sap.com> <74DCB0F2-22CC-4753-AA38-D7CD0E2ECEFA@sap.com>
Message-ID:

Thanks, looks good now!

Regards,
Volker

On Mon, Jul 9, 2018 at 12:31 PM, Haug, Gunter wrote:
> Hi Martin and Volker,
>
> Thanks for your reviews! I've incorporated your suggestions in an updated version.
> To answer your question:
>
>     Is it guaranteed, that PowerArchitecturePPC64 and
>     VM_Version::_features_strings will be always initialized before they
>     are called from VM_Version_Ext::initialize_cpu_information ?
>
> Yes, it is. This is done at the very beginning of the initialization of the VM.
>
> Here is the updated webrev:
>
> http://cr.openjdk.java.net/~ghaug/webrevs/8206408.v1
>
> Maybe one of you could be so kind and push the change?
>
> Thanks,
> Gunter
>
>
> On 06.07.18, 16:15, "Volker Simonis" wrote:
>
>     Hi Gunter,
>
>     in general, your change looks good!
>
>     Is it guaranteed, that PowerArchitecturePPC64 and
>     VM_Version::_features_strings will be always initialized before they
>     are called from VM_Version_Ext::initialize_cpu_information ?
>
>     And finally, I'm wondering why you are using "CPU_TYPE_DESC_BUF_SIZE -
>     1" as the length argument in the first snprintf() call. Wouldn't
>     "CPU_TYPE_DESC_BUF_SIZE" be just fine like in the second call where
>     you are using "CPU_DETAILED_DESC_BUF_SIZE".
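On the length argument: standard snprintf writes at most size - 1 characters and always NUL-terminates when size is nonzero, so passing the full buffer size is safe. A small sketch of that behaviour follows; DESC_BUF_SIZE and format_desc are made-up names for illustration, not the HotSpot constants or code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical buffer size for illustration; not the HotSpot constant.
const std::size_t DESC_BUF_SIZE = 8;

// Formats `text` into `buf`, passing the full buffer size to snprintf.
// snprintf stores at most DESC_BUF_SIZE - 1 characters plus the NUL
// terminator, so no "- 1" is needed on the length argument.
// Returns true if the output had to be truncated.
inline bool format_desc(char (&buf)[DESC_BUF_SIZE], const char* text) {
  int needed = std::snprintf(buf, sizeof(buf), "%s", text);
  // snprintf returns the length the full output would have had.
  return needed < 0 || static_cast<std::size_t>(needed) >= sizeof(buf);
}
```

With an 8-byte buffer, a 10-character input is cut to 7 characters and still terminated, while a short input is copied unchanged.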
>
>     Thank you and best regards,
>     Volker
>
>
>     On Fri, Jul 6, 2018 at 2:51 PM, Haug, Gunter wrote:
>     > Hi all,
>     >
>     > can I please have reviews and a sponsor for the following tiny fix:
>     >
>     > https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8206408
>     > http://cr.openjdk.java.net/~ghaug/webrevs/8206408
>     >
>     > The solution is not really accurate as there is no obvious way to detect the number of cores/slots on a PPC64 system. Anyway, it would be better to have information on the virtualization of the system. We do have a solution for that at SAP and we would be happy to adopt it to JFR and contribute it if there is any interest.
>     >
>     > Thanks and best regards,
>     > Gunter
>     >
>

From coleen.phillimore at oracle.com Mon Jul 9 17:13:18 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 9 Jul 2018 13:13:18 -0400
Subject: RFR (S) 8206471: Race with ConcurrentHashTable deleting items on insert with cleanup thread
In-Reply-To: <2a218f81-4ecc-ad6a-3110-80978e17473d@oracle.com>
References: <2a218f81-4ecc-ad6a-3110-80978e17473d@oracle.com>
Message-ID: <515e0337-0500-5599-8940-d2b54d8c7b6e@oracle.com>

On 7/9/18 7:32 AM, coleen.phillimore at oracle.com wrote:
>
> On 7/8/18 9:20 PM, David Holmes wrote:
>> Hi Coleen,
>>
>> On 7/07/2018 5:41 AM, coleen.phillimore at oracle.com wrote:
>>> Summary: Only fetch Node::next once and use that result.
>>>
>>> A racing thread could NULL next->next()->next(). The Node itself is
>>> stable until the write_synchronize() but the pointers may be
>>> updated. See bug for more detail.
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8206471.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8206471
>>
>> The change looks good.
>>
>> Could there be a similar race at:
>>
>>  552       bucket->release_assign_node_ptr(rem_n_prev, rem_n->next());
>>  553       rem_n = rem_n->next();
>>

Probably not because this instance has the bucket lock, but I'll change it anyway.
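For illustration, the single-load pattern under discussion might look like the sketch below. It uses std::atomic rather than HotSpot's own primitives, and Node, next_ptr and unlink_and_advance are hypothetical names, not the ConcurrentHashTable code.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical node type; stands in for ConcurrentHashTable's Node.
struct Node {
  int value;
  std::atomic<Node*> next_ptr{nullptr};
  explicit Node(int v) : value(v) {}
  // Each call is a separate load-acquire and may observe a different value
  // if another thread updates the link in between.
  Node* next() const { return next_ptr.load(std::memory_order_acquire); }
};

// Unlinks rem_n by pointing its predecessor's link at rem_n's successor,
// then returns that successor so the caller can continue iterating.
// next() is read exactly once, so the value stored into prev_link and the
// value returned are guaranteed to be the same pointer.
inline Node* unlink_and_advance(std::atomic<Node*>& prev_link, Node* rem_n) {
  Node* const next = rem_n->next();                  // single load-acquire
  prev_link.store(next, std::memory_order_release);  // publish the new link
  return next;
}
```

Calling rem_n->next() twice instead would cost a second load-acquire and, without a lock, could yield two different pointers.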
>> Even if not, it is marginally more performant to only do the
>> load-acquire once.
>>
>> Similarly:
>>
>>  663 new_table->get_bucket(odd_index)->release_assign_node_ptr(odd,
>>  664 aux->next());
>>  665 new_table->get_bucket(even_index)->release_assign_node_ptr(even,
>>  666 aux->next());
>>
>> combined with:
>>
>>  685     aux = aux->next();
>>
>> makes for 3 load-acquire (and 2 if we take the else at line #675).
>>
>> And again:
>>
>>  982       bucket->release_assign_node_ptr(rem_n_prev, rem_n->next());
>>  983       rem_n = rem_n->next();
>

I believe the value of next() is stable in all these cases, and it's fine to only load once.

open webrev at http://cr.openjdk.java.net/~coleenp/8206471.02/webrev

I ran the extensive gtests that Robbin wrote to cover zipping and unzipping the hashtable, and I am rerunning hs-tier1,2.

Thanks,
Coleen

> Thank you for noticing these other double loads. I'll study them to
> see if there's a race or see if there's a reason they have double
> loads, but I'll change them unless there is a reason not to.
>
> Thanks!
> Coleen
>
>>
>> Thanks,
>> David
>> -----
>>
>>> Tested with SymbolTable changes and tests that failed. Also tested
>>> with mach5 hs-tier1-5 (in progress).
>>>
>>> This is actually Robbin's fix, and my review is that it looks good.
>>>
>>> Thanks,
>>> Coleen
>

From calvin.cheung at oracle.com Mon Jul 9 17:29:51 2018
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Mon, 09 Jul 2018 10:29:51 -0700
Subject: RFR(S): 8205946: JVM crash after call to ClassLoader::setup_bootstrap_search_path()
Message-ID: <5B439B8F.3020709@oracle.com>

bug: https://bugs.openjdk.java.net/browse/JDK-8205946

webrev: http://cr.openjdk.java.net/~ccheung/8205946/webrev.00/

The JVM crash could be simulated by renaming/removing the modules file under the jdk/lib directory.
The proposed simple fix is to perform a vm_exit_during_initialization().

Ran hs-tier{1,2,3} tests successfully including the new test case.

thanks,
Calvin

From coleen.phillimore at oracle.com Mon Jul 9 17:48:18 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 9 Jul 2018 13:48:18 -0400
Subject: RFR (M) 8198720: Obsolete PrintSafepointStatistics, PrintSafepointStatisticsTimeout and PrintSafepointStatisticsCount options
Message-ID: <9349e320-e39d-c5ee-5ebb-b93305fc03f5@oracle.com>

Summary: Convert PrintSafepointStatistics to UL

open webrev at http://cr.openjdk.java.net/~coleenp/8198720.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8198720

See bug and linked bug to deprecate the option for more information. Also see bug for output (it's too wide to print here).

Tested with mach5 hs-tier1-5.

Thanks,
Coleen

From shade at redhat.com Mon Jul 9 17:58:47 2018
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 9 Jul 2018 19:58:47 +0200
Subject: RFR (M) 8198720: Obsolete PrintSafepointStatistics, PrintSafepointStatisticsTimeout and PrintSafepointStatisticsCount options
In-Reply-To: <9349e320-e39d-c5ee-5ebb-b93305fc03f5@oracle.com>
References: <9349e320-e39d-c5ee-5ebb-b93305fc03f5@oracle.com>
Message-ID: <2a50a090-36df-433b-aa4a-6a7087a8e589@redhat.com>

On 07/09/2018 07:48 PM, coleen.phillimore at oracle.com wrote:
> Summary: Convert PrintSafepointStatistics to UL
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8198720.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8198720

The synopsis is misleading: it is not only obsoleting PrintSafepoint* options, it also reformats the output!

We did JDK-8180482 not that long ago, and the reason was that both people and machine tools are accustomed to the particular non-noisy format for that table. I am not at all convinced that proposed format [2] is better than current version [3]. Can we keep (at least some resemblance of) the old format, please?

-Aleksey

[1] https://bugs.openjdk.java.net/browse/JDK-8180482
[2] https://bugs.openjdk.java.net/secure/attachment/75330/out.safepoint-logging
[3] http://cr.openjdk.java.net/~shade/8180482/after.txt

From chris.plummer at oracle.com Mon Jul 9 18:22:46 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 9 Jul 2018 11:22:46 -0700
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code
In-Reply-To: <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com>
Message-ID: <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>

Hi David,

Would it be better to problem list this test on solaris using JDK-8156708? That way when JDK-8156708 is fixed it can come off the problem list and start executing on solaris.

thanks,

Chris

On 7/8/18 4:58 PM, David Holmes wrote:
> tl;dr skip the new regression test on Solaris
>
> New webrev:
>
> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/
>
> This excludes the test from running on Solaris, so the makefile
> doesn't bother compiling this native test and the Java part of the
> test adds:
>
> ! * @requires os.family != "windows" & os.family != "solaris"
>   * @summary Basic test of Thread and ThreadMXBean queries on a natively
>   *          attached thread that has failed to detach before terminating.
> + * @comment The native code only supports POSIX so no windows testing; also
> + *          we have to skip solaris as a terminating thread that fails to
> + *          detach will hit an infinite loop due to TLS destructor issues - see
> + *          comments in JDK-8156708
>
> Note this means that Solaris is not affected by the original issue
> because a still-attached native thread can't actually terminate due to
> the TLS destructor infinite-loop issue.
> > Thanks, > David > > On 6/07/2018 6:07 PM, David Holmes wrote: >> The new test is hanging on Solaris. I just discovered we don't >> run these tests on Solaris until tier4. >> >> David >> >> On 6/07/2018 8:40 AM, David Holmes wrote: >>> Hi Chris, >>> >>> Thanks for looking at this. >>> >>> Updated webrev: >>> >>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>> >>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>> >>> More below ... >>> >>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>> Hi David, >>>> >>>> Solaris problems aside, overall it looks fine. Some minor things I >>>> noted: >>>> >>>> I noticed that exitCode is never modified in agentA() or agentB(), >>>> so there isn't much point to having it. If you reach the bottom of >>>> the function, it passed, so PASSED can be returned. The code would >>>> be more clear if it did this. As-is it is implied that you can >>>> reach the bottom when it fails. >>> >>> I resisted any and all urges to do any kind of unrelated code >>> cleanup in the tests - once you start you may end up doing a full >>> rewrite. >>> >>>> Is detaching the threads along the failure paths really needed? >>>> exit() is called, so this would seem to make it unnecessary. >>> >>> You're right that isn't necessary. I'll remove the changes from >>> before the exits in ji05t001.c >>> >>>> I prefer assignments not to be embedded inside the "if" condition. >>>> The DetachCurrentThread code in THREAD_return() is much more >>>> readable than the similar code in agentA() and agentB(). >>> >>> It's an existing style already used in that test e.g. >>> >>> ??287???? if ((res = >>> ??288???????????? JNI_ENV_PTR(vm)->AttachCurrentThread( >>> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) >>> != 0) { >>> >>> and I don't mind it, so I'd prefer not to change it. >>> >>>> In the test: >>>> >>>> ?? 54???????? // Generally as long as we don't crash of throw >>>> unexpected >>>> ?? 55???????? 
// exceptions then the test passes. In some cases we >>>> know exactly >>>> >>>> "of" should be "or". >>> >>> Well spotted. Thanks. >>> >>>> Shouldn't you be catching exceptions for all the Thread methods you >>>> are calling? Otherwise the test will exit if one is thrown, and the >>>> above comment indicates that you don't want this. >>> >>> I'm not expecting there to be any exceptions from any of the called >>> methods. That would potentially indicate a problem in handling the >>> terminated native thread, so would indicate a test failure. >>> >>>> Don't we normally put these tests in a package? >>> >>> Doesn't seem to be any hard and fast rule. I only uses packages when >>> they are important for the test. In runtime we have 905 java files >>> and only 116 have a package statement. It varies elsewhere. >>> >>> Thanks, >>> David >>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>> Solaris compiler complains about doing a return from inside >>>>> a do-while loop. I'll have to rework part of the fix tomorrow. >>>>> >>>>> David >>>>> >>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>> >>>>>> Problem: >>>>>> >>>>>> The tests create native threads that attach to the VM through >>>>>> JNI_AttachCurrentThread but which then terminate without >>>>>> detaching themselves. When the VM exits and we're using Flight >>>>>> Recorder "dumponexit" this leads to a call to VM_PrintThreads >>>>>> that in part wants to print the per-thread CPU usage. When we >>>>>> encounter the threads that have terminated already the low level >>>>>> pthread_getcpuclockid calls returns ESRCH but the code doesn't >>>>>> expect that and so fails an assert in debug mode and can SEGV in >>>>>> product mode. 
>>>>>> >>>>>> Solution: >>>>>> >>>>>> Serviceability-side: fix the tests >>>>>> >>>>>> Change the tests so that the threads detach before terminating. >>>>>> The two tests are (surprisingly) written in completely different >>>>>> styles, so the solution also takes on two different styles. >>>>>> >>>>>> Runtime-side: make the VM more robust in the fact of JNI attached >>>>>> threads that terminate before detaching, and add a regression test >>>>>> >>>>>> I took a good look at the low-level code for interacting with >>>>>> arbitrary threads and as far as I can see the problem only exists >>>>>> for this one case of pthread_getcpuclockid on Linux. Elsewhere >>>>>> the potential for a library call failure just reports an error >>>>>> value (such as -1 for the cpu time used). >>>>>> >>>>>> So the fix is simply to allow for ESRCH when calling >>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that case. >>>>>> >>>>>> I created a new regression test to create a new native thread, >>>>>> attach it and then let it terminate while still attached. The >>>>>> java code then calls various Thread and ThreadMXBean functions on >>>>>> it to ensure there are no crashes or unexpected exceptions. 
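The runtime-side fix described above could be sketched roughly as follows, assuming Linux and glibc. thread_cpu_time_ns is a hypothetical helper, not the actual HotSpot function; it simply reports -1 when pthread_getcpuclockid fails (for example with ESRCH) instead of asserting.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <pthread.h>
#include <time.h>

// Hypothetical helper for illustration; not the HotSpot implementation.
// Returns the CPU time consumed by thread `tid` in nanoseconds, or -1 if
// the per-thread clock cannot be obtained or read.
inline int64_t thread_cpu_time_ns(pthread_t tid) {
  clockid_t clockid;
  int rc = pthread_getcpuclockid(tid, &clockid);
  if (rc != 0) {
    // rc == ESRCH means no thread with that ID could be found, which is
    // what the thread above reports for threads that already terminated.
    // Any failure is mapped to -1 rather than treated as fatal.
    return -1;
  }
  struct timespec ts;
  if (clock_gettime(clockid, &ts) != 0) {
    return -1;
  }
  return static_cast<int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}
```

A caller that prints per-thread CPU usage can then show the -1 as "unavailable" for such threads.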
>>>>>>
>>>>>> Testing:
>>>>>>   - old tests with fixed run-time
>>>>>>   - old run-time with fixed tests
>>>>>>   - mach tier4 (which exposed the problem - that's where we
>>>>>> enable Flight recorder for the tests) [in progress]
>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>   - new regression test
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>
>>>>
>>>>

From coleen.phillimore at oracle.com Mon Jul 9 18:35:47 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 9 Jul 2018 14:35:47 -0400
Subject: RFR (M) 8198720: Obsolete PrintSafepointStatistics, PrintSafepointStatisticsTimeout and PrintSafepointStatisticsCount options
In-Reply-To: <2a50a090-36df-433b-aa4a-6a7087a8e589@redhat.com>
References: <9349e320-e39d-c5ee-5ebb-b93305fc03f5@oracle.com> <2a50a090-36df-433b-aa4a-6a7087a8e589@redhat.com>
Message-ID: <05f84226-0825-896f-c1c3-a89f85338159@oracle.com>

Okay, somehow the columns of numbers didn't look very useful on my screen to me, and I wanted to convert this to UL (and straighten out the logic), so that's why I made this change. I asked around internally to see which people would care about the format change and didn't find anyone specific. Now I know!

Let me rework this to use UL but keep the table.

I'll withdraw this change for now.

Thank you for the quick feedback.
Coleen

On 7/9/18 1:58 PM, Aleksey Shipilev wrote:
> On 07/09/2018 07:48 PM, coleen.phillimore at oracle.com wrote:
>> Summary: Convert PrintSafepointStatistics to UL
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8198720.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8198720
> The synopsis is misleading: it is not only obsoleting PrintSafepoint* options, it also reformats the
> output!
>
> We did JDK-8180482 not that long ago, and the reason was that both people and machine tools are
> accustomed to the particular non-noisy format for that table.
I am not at all convinced that
> proposed format [2] is better than current version [3]. Can we keep (at least some resemblance of)
> the old format, please?
>
> -Aleksey
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8180482
> [2] https://bugs.openjdk.java.net/secure/attachment/75330/out.safepoint-logging
> [3] http://cr.openjdk.java.net/~shade/8180482/after.txt
>

From lois.foltan at oracle.com Mon Jul 9 18:58:48 2018
From: lois.foltan at oracle.com (Lois Foltan)
Date: Mon, 9 Jul 2018 14:58:48 -0400
Subject: RFR(S): 8205946: JVM crash after call to ClassLoader::setup_bootstrap_search_path()
In-Reply-To: <5B439B8F.3020709@oracle.com>
References: <5B439B8F.3020709@oracle.com>
Message-ID:

On 7/9/2018 1:29 PM, Calvin Cheung wrote:

> bug: https://bugs.openjdk.java.net/browse/JDK-8205946
>
> webrev: http://cr.openjdk.java.net/~ccheung/8205946/webrev.00/
>
> The JVM crash could be simulated by renaming/removing the modules file
> under the jdk/lib directory.
> The proposed simple fix is to perform a vm_exit_during_initialization().

Hi Calvin,

Some clarifying questions. Is this just an issue for exploded builds? I would prefer the exit to occur if the os::stat() fails for the system class path in os::set_boot_path(). With certainly an added assert later in ClassLoader::setup_bootstrap_search_path() to ensure that the system class path is never NULL.

Thanks,
Lois

>
> Ran hs-tier{1,2,3} tests successfully including the new test case.
>
> thanks,
> Calvin

From shade at redhat.com Mon Jul 9 20:08:22 2018
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 9 Jul 2018 22:08:22 +0200
Subject: RFR (M) 8198720: Obsolete PrintSafepointStatistics, PrintSafepointStatisticsTimeout and PrintSafepointStatisticsCount options
In-Reply-To: <05f84226-0825-896f-c1c3-a89f85338159@oracle.com>
References: <9349e320-e39d-c5ee-5ebb-b93305fc03f5@oracle.com> <2a50a090-36df-433b-aa4a-6a7087a8e589@redhat.com> <05f84226-0825-896f-c1c3-a89f85338159@oracle.com>
Message-ID: <1826f57f-fc8c-86b3-b3fa-65a1c81a9eff@redhat.com>

Thank you!

Most latency-savvy folks "out there" run with some sort of safepointing profiling, which in many cases includes PrintSafepointStatistics tables.

-Aleksey

On 07/09/2018 08:35 PM, coleen.phillimore at oracle.com wrote:
>
> Okay, somehow the columns of numbers didn't look very useful on my screen to me, and I wanted to
> convert this to UL (and straighten out the logic), so that's why I made this change. I asked
> around internally to see which people would care about the format change and didn't find anyone
> specific. Now I know!
>
> Let me rework this to use UL but keep the table.
>
> I'll withdraw this change for now.
>
> Thank you for the quick feedback.
> Coleen
>
> On 7/9/18 1:58 PM, Aleksey Shipilev wrote:
>> On 07/09/2018 07:48 PM, coleen.phillimore at oracle.com wrote:
>>> Summary: Convert PrintSafepointStatistics to UL
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8198720.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8198720
>> The synopsis is misleading: it is not only obsoleting PrintSafepoint* options, it also reformats the
>> output!
>>
>> We did JDK-8180482 not that long ago, and the reason was that both people and machine tools are
>> accustomed to the particular non-noisy format for that table. I am not at all convinced that
>> proposed format [2] is better than current version [3].
Can we keep (at least some resemblance of)
>> the old format, please?
>>
>> -Aleksey
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8180482
>> [2] https://bugs.openjdk.java.net/secure/attachment/75330/out.safepoint-logging
>> [3] http://cr.openjdk.java.net/~shade/8180482/after.txt
>>
>

From calvin.cheung at oracle.com Mon Jul 9 20:43:20 2018
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Mon, 09 Jul 2018 13:43:20 -0700
Subject: RFR(S): 8205946: JVM crash after call to ClassLoader::setup_bootstrap_search_path()
In-Reply-To:
References: <5B439B8F.3020709@oracle.com>
Message-ID: <5B43C8E8.2060206@oracle.com>

Hi Lois,

Thanks for your review.

On 7/9/18, 11:58 AM, Lois Foltan wrote:
> On 7/9/2018 1:29 PM, Calvin Cheung wrote:
>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8205946
>>
>> webrev: http://cr.openjdk.java.net/~ccheung/8205946/webrev.00/
>>
>> The JVM crash could be simulated by renaming/removing the modules
>> file under the jdk/lib directory.
>> The proposed simple fix is to perform a vm_exit_during_initialization().
>
> Hi Calvin,
>
> Some clarifying questions. Is this just an issue for exploded builds?

I don't think so. As mentioned above, I could reproduce the crash with a regular jdk image build by renaming the modules file under the jdk/lib directory.

> I would prefer the exit to occur if the os::stat() fails for the
> system class path in os::set_boot_path().

Instead of exiting in os::set_boot_path(), how about checking the return status of os::set_boot_path() in the caller and exiting there like the following:

bash-4.2$ hg diff os_linux.cpp
diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp
--- a/src/hotspot/os/linux/os_linux.cpp
+++ b/src/hotspot/os/linux/os_linux.cpp
@@ -367,7 +367,9 @@
       }
     }
     Arguments::set_java_home(buf);
-    set_boot_path('/', ':');
+    if (!set_boot_path('/', ':')) {
+      vm_exit_during_initialization("Failed setting boot class path.", NULL);
+    }
   }

Note that before the above change, the return status of set_boot_path() isn't checked.
The above would involve changing 5 of those os_*.cpp files, one for each O/S.

> With certainly an added assert later in
> ClassLoader::setup_bootstrap_search_path() to ensure that the system
> class path is never NULL.

Sure, I can add an assert there.

I'll post updated webrev once I've made the change and done testing.

thanks,
Calvin

>
> Thanks,
> Lois
>
>>
>> Ran hs-tier{1,2,3} tests successfully including the new test case.
>>
>> thanks,
>> Calvin
>

From david.holmes at oracle.com Mon Jul 9 21:41:02 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Jul 2018 07:41:02 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code
In-Reply-To: <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>
Message-ID: <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>

Hi Chris,

On 10/07/2018 4:22 AM, Chris Plummer wrote:
> Hi David,
>
> Would it be better to problem list this test on solaris using
> JDK-8156708. That way when JDK-8156708 is fixed it can come off the
> problem list and start executing on solaris.

JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could only fix this for VM created threads.
The general problem of TLS destructors looping if a thread terminates without detaching from the VM is not solvable - other than by not using TLS in the VM. Thanks, David > thanks, > > Chris > > On 7/8/18 4:58 PM, David Holmes wrote: >> tl;dr skip the new regression test on Solaris >> >> New webrev: >> >> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ >> >> This excludes the test from running on Solaris, so the makefile >> doesn't bother compiling this native test and the Java part of the >> test adds: >> >> ! * @requires os.family != "windows" & os.family != "solaris" >> ? * @summary Basic test of Thread and ThreadMXBean queries on a natively >> ? *????????? attached thread that has failed to detach before >> terminating. >> + * @comment The native code only supports POSIX so no windows >> testing; also >> + *????????? we have to skip solaris as a terminating thread that >> fails to >> + *????????? detach will hit an infinite loop due to TLS destructor >> issues - see >> + *????????? comments in JDK-8156708 >> >> Note this means that Solaris is not affected by the original issue >> because a still-attached native thread can't actually terminate due to >> the TLS destructor infinite-loop issue. >> >> Thanks, >> David >> >> On 6/07/2018 6:07 PM, David Holmes wrote: >>> The new test is hanging on Solaris. I just discovered we don't >>> run these tests on Solaris until tier4. >>> >>> David >>> >>> On 6/07/2018 8:40 AM, David Holmes wrote: >>>> Hi Chris, >>>> >>>> Thanks for looking at this. >>>> >>>> Updated webrev: >>>> >>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>>> >>>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>>> >>>> More below ... >>>> >>>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>>> Hi David, >>>>> >>>>> Solaris problems aside, overall it looks fine. Some minor things I >>>>> noted: >>>>> >>>>> I noticed that exitCode is never modified in agentA() or agentB(), >>>>> so there isn't much point to having it. 
If you reach the bottom of >>>>> the function, it passed, so PASSED can be returned. The code would >>>>> be more clear if it did this. As-is it is implied that you can >>>>> reach the bottom when it fails. >>>> >>>> I resisted any and all urges to do any kind of unrelated code >>>> cleanup in the tests - once you start you may end up doing a full >>>> rewrite. >>>> >>>>> Is detaching the threads along the failure paths really needed? >>>>> exit() is called, so this would seem to make it unnecessary. >>>> >>>> You're right that isn't necessary. I'll remove the changes from >>>> before the exits in ji05t001.c >>>> >>>>> I prefer assignments not to be embedded inside the "if" condition. >>>>> The DetachCurrentThread code in THREAD_return() is much more >>>>> readable than the similar code in agentA() and agentB(). >>>> >>>> It's an existing style already used in that test e.g. >>>> >>>> ??287???? if ((res = >>>> ??288???????????? JNI_ENV_PTR(vm)->AttachCurrentThread( >>>> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) >>>> != 0) { >>>> >>>> and I don't mind it, so I'd prefer not to change it. >>>> >>>>> In the test: >>>>> >>>>> ?? 54???????? // Generally as long as we don't crash of throw >>>>> unexpected >>>>> ?? 55???????? // exceptions then the test passes. In some cases we >>>>> know exactly >>>>> >>>>> "of" should be "or". >>>> >>>> Well spotted. Thanks. >>>> >>>>> Shouldn't you be catching exceptions for all the Thread methods you >>>>> are calling? Otherwise the test will exit if one is thrown, and the >>>>> above comment indicates that you don't want this. >>>> >>>> I'm not expecting there to be any exceptions from any of the called >>>> methods. That would potentially indicate a problem in handling the >>>> terminated native thread, so would indicate a test failure. >>>> >>>>> Don't we normally put these tests in a package? >>>> >>>> Doesn't seem to be any hard and fast rule. 
I only uses packages when >>>> they are important for the test. In runtime we have 905 java files >>>> and only 116 have a package statement. It varies elsewhere. >>>> >>>> Thanks, >>>> David >>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>>> Solaris compiler complains about doing a return from inside >>>>>> a do-while loop. I'll have to rework part of the fix tomorrow. >>>>>> >>>>>> David >>>>>> >>>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>>> >>>>>>> Problem: >>>>>>> >>>>>>> The tests create native threads that attach to the VM through >>>>>>> JNI_AttachCurrentThread but which then terminate without >>>>>>> detaching themselves. When the VM exits and we're using Flight >>>>>>> Recorder "dumponexit" this leads to a call to VM_PrintThreads >>>>>>> that in part wants to print the per-thread CPU usage. When we >>>>>>> encounter the threads that have terminated already the low level >>>>>>> pthread_getcpuclockid calls returns ESRCH but the code doesn't >>>>>>> expect that and so fails an assert in debug mode and can SEGV in >>>>>>> product mode. >>>>>>> >>>>>>> Solution: >>>>>>> >>>>>>> Serviceability-side: fix the tests >>>>>>> >>>>>>> Change the tests so that the threads detach before terminating. >>>>>>> The two tests are (surprisingly) written in completely different >>>>>>> styles, so the solution also takes on two different styles. >>>>>>> >>>>>>> Runtime-side: make the VM more robust in the fact of JNI attached >>>>>>> threads that terminate before detaching, and add a regression test >>>>>>> >>>>>>> I took a good look at the low-level code for interacting with >>>>>>> arbitrary threads and as far as I can see the problem only exists >>>>>>> for this one case of pthread_getcpuclockid on Linux. 
Elsewhere
>>>>>>> the potential for a library call failure just reports an error
>>>>>>> value (such as -1 for the cpu time used).
>>>>>>>
>>>>>>> So the fix is simply to allow for ESRCH when calling
>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that case.
>>>>>>>
>>>>>>> I created a new regression test to create a new native thread,
>>>>>>> attach it and then let it terminate while still attached. The
>>>>>>> java code then calls various Thread and ThreadMXBean functions on
>>>>>>> it to ensure there are no crashes or unexpected exceptions.
>>>>>>>
>>>>>>> Testing:
>>>>>>>   - old tests with fixed run-time
>>>>>>>   - old run-time with fixed tests
>>>>>>>   - mach tier4 (which exposed the problem - that's where we
>>>>>>> enable Flight recorder for the tests) [in progress]
>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>   - new regression test
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>
>>>>>
>>>>>
>
>

From coleen.phillimore at oracle.com Mon Jul 9 21:42:43 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 9 Jul 2018 17:42:43 -0400
Subject: RFR (M) 8198720: Obsolete PrintSafepointStatistics, PrintSafepointStatisticsTimeout and PrintSafepointStatisticsCount options
In-Reply-To: <1826f57f-fc8c-86b3-b3fa-65a1c81a9eff@redhat.com>
References: <9349e320-e39d-c5ee-5ebb-b93305fc03f5@oracle.com> <2a50a090-36df-433b-aa4a-6a7087a8e589@redhat.com> <05f84226-0825-896f-c1c3-a89f85338159@oracle.com> <1826f57f-fc8c-86b3-b3fa-65a1c81a9eff@redhat.com>
Message-ID:

On 7/9/18 4:08 PM, Aleksey Shipilev wrote:
> Thank you!
>
> Most latency-savvy folks "out there" run with some sort of safepointing profiling, which in many
> cases include PrintSafepointStatistics tables.

That was the original reason I was looking at this logging. I think the trouble with the times is that they are ms and mostly zero. I wonder if MILLIUNITS would be better for these times:
(int64_t)(sstats->_time_to_spin / MICROUNITS), ???????????? (int64_t)(sstats->_time_to_wait_to_block / MICROUNITS), ???????????? (int64_t)(sstats->_time_to_sync / MICROUNITS), ???????????? (int64_t)(sstats->_time_to_do_cleanups / MICROUNITS), ???????????? (int64_t)(sstats->_time_to_exec_vmop / MICROUNITS));?? <= this has nonzero values for GC pauses What do you think? thanks, Coleen > > -Aleksey > > On 07/09/2018 08:35 PM, coleen.phillimore at oracle.com wrote: >> Okay, somehow the columns of numbers didn't look very useful on my screen to me, and I wanted to >> convert this to UL (and straighten out the logic), so that's why I made this change.?? I asked >> around internally to see which people would care about the format change and didn't find anyone >> specific.? Now I know! >> >> Let me rework this to use UL but keep the table. >> >> I'll withdraw this change for now. >> >> Thank you for the quick feedback. >> Coleen >> >> On 7/9/18 1:58 PM, Aleksey Shipilev wrote: >>> On 07/09/2018 07:48 PM, coleen.phillimore at oracle.com wrote: >>>> Summary: Convert PrintSafepointStatistics to UL >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8198720.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8198720 >>> The synopsis is misleading: it is not only obsoleting PrintSafepoint* options, it also reformats the >>> output! >>> >>> We did JDK-8180482 not that long ago, and the reason was that both people and machine tools are >>> accustomed to the particular non-noisy format for that table. I am not at all convinced that >>> proposed format [2] is better than current version [3]. Can we keep (at least some resemblance of) >>> the old format, please? 
>>> >>> -Aleksey >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8180482 >>> [2] https://bugs.openjdk.java.net/secure/attachment/75330/out.safepoint-logging >>> [3] http://cr.openjdk.java.net/~shade/8180482/after.txt >>> > From chris.plummer at oracle.com Mon Jul 9 21:50:09 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 9 Jul 2018 14:50:09 -0700 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com> References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com> Message-ID: On 7/9/18 2:41 PM, David Holmes wrote: > Hi Chris, > > On 10/07/2018 4:22 AM, Chris Plummer wrote: >> Hi David, >> >> Would it be better to problem list this test on solaris using >> JDK-8156708. That way when JDK-8156708 is fixed it can come off the >> problem list and start executing on solaris. > > JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could > only fix this for VM created threads. The general problem of TLS > destructors looping if a thread terminates without detaching from the > VM is not solvable - other than by not using TLS in the VM. Ok, I misunderstood your comments in the test. Changes look fine. Chris > > Thanks, > David > >> thanks, >> >> Chris >> >> On 7/8/18 4:58 PM, David Holmes wrote: >>> tl;dr skip the new regression test on Solaris >>> >>> New webrev: >>> >>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ >>> >>> This excludes the test from running on Solaris, so the makefile >>> doesn't bother compiling this native test and the Java part of the >>> test adds: >>>
>>> ! * @requires os.family != "windows" & os.family != "solaris"
>>>   * @summary Basic test of Thread and ThreadMXBean queries on a natively
>>>   *          attached thread that has failed to detach before terminating.
>>> + * @comment The native code only supports POSIX so no windows testing; also
>>> + *          we have to skip solaris as a terminating thread that fails to
>>> + *          detach will hit an infinite loop due to TLS destructor issues - see
>>> + *          comments in JDK-8156708
>>> >>> Note this means that Solaris is not affected by the original issue >>> because a still-attached native thread can't actually terminate due >>> to the TLS destructor infinite-loop issue. >>> >>> Thanks, >>> David >>> >>> On 6/07/2018 6:07 PM, David Holmes wrote: >>>> The new test is hanging on Solaris. I just discovered we >>>> don't run these tests on Solaris until tier4. >>>> >>>> David >>>> >>>> On 6/07/2018 8:40 AM, David Holmes wrote: >>>>> Hi Chris, >>>>> >>>>> Thanks for looking at this. >>>>> >>>>> Updated webrev: >>>>> >>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>>>> >>>>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>>>> >>>>> More below ... >>>>> >>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>>>> Hi David, >>>>>> >>>>>> Solaris problems aside, overall it looks fine. Some minor things >>>>>> I noted: >>>>>> >>>>>> I noticed that exitCode is never modified in agentA() or >>>>>> agentB(), so there isn't much point to having it. If you reach >>>>>> the bottom of the function, it passed, so PASSED can be returned. >>>>>> The code would be more clear if it did this. As-is it is implied >>>>>> that you can reach the bottom when it fails. >>>>> >>>>> I resisted any and all urges to do any kind of unrelated code >>>>> cleanup in the tests - once you start you may end up doing a full >>>>> rewrite. >>>>> >>>>>> Is detaching the threads along the failure paths really needed? >>>>>> exit() is called, so this would seem to make it unnecessary. >>>>> >>>>> You're right that isn't necessary.
I'll remove the changes from >>>>> before the exits in ji05t001.c >>>>> >>>>>> I prefer assignments not to be embedded inside the "if" >>>>>> condition. The DetachCurrentThread code in THREAD_return() is >>>>>> much more readable than the similar code in agentA() and agentB(). >>>>> >>>>> It's an existing style already used in that test e.g. >>>>> >>>>> ??287???? if ((res = >>>>> ??288???????????? JNI_ENV_PTR(vm)->AttachCurrentThread( >>>>> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void *) >>>>> 0)) != 0) { >>>>> >>>>> and I don't mind it, so I'd prefer not to change it. >>>>> >>>>>> In the test: >>>>>> >>>>>> ?? 54???????? // Generally as long as we don't crash of throw >>>>>> unexpected >>>>>> ?? 55???????? // exceptions then the test passes. In some cases >>>>>> we know exactly >>>>>> >>>>>> "of" should be "or". >>>>> >>>>> Well spotted. Thanks. >>>>> >>>>>> Shouldn't you be catching exceptions for all the Thread methods >>>>>> you are calling? Otherwise the test will exit if one is thrown, >>>>>> and the above comment indicates that you don't want this. >>>>> >>>>> I'm not expecting there to be any exceptions from any of the >>>>> called methods. That would potentially indicate a problem in >>>>> handling the terminated native thread, so would indicate a test >>>>> failure. >>>>> >>>>>> Don't we normally put these tests in a package? >>>>> >>>>> Doesn't seem to be any hard and fast rule. I only uses packages >>>>> when they are important for the test. In runtime we have 905 java >>>>> files and only 116 have a package statement. It varies elsewhere. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>>>> Solaris compiler complains about doing a return from >>>>>>> inside a do-while loop. I'll have to rework part of the fix >>>>>>> tomorrow. 
>>>>>>> >>>>>>> David >>>>>>> >>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>>>> >>>>>>>> Problem: >>>>>>>> >>>>>>>> The tests create native threads that attach to the VM through >>>>>>>> JNI_AttachCurrentThread but which then terminate without >>>>>>>> detaching themselves. When the VM exits and we're using Flight >>>>>>>> Recorder "dumponexit" this leads to a call to VM_PrintThreads >>>>>>>> that in part wants to print the per-thread CPU usage. When we >>>>>>>> encounter the threads that have terminated already the low >>>>>>>> level pthread_getcpuclockid calls returns ESRCH but the code >>>>>>>> doesn't expect that and so fails an assert in debug mode and >>>>>>>> can SEGV in product mode. >>>>>>>> >>>>>>>> Solution: >>>>>>>> >>>>>>>> Serviceability-side: fix the tests >>>>>>>> >>>>>>>> Change the tests so that the threads detach before terminating. >>>>>>>> The two tests are (surprisingly) written in completely >>>>>>>> different styles, so the solution also takes on two different >>>>>>>> styles. >>>>>>>> >>>>>>>> Runtime-side: make the VM more robust in the fact of JNI >>>>>>>> attached threads that terminate before detaching, and add a >>>>>>>> regression test >>>>>>>> >>>>>>>> I took a good look at the low-level code for interacting with >>>>>>>> arbitrary threads and as far as I can see the problem only >>>>>>>> exists for this one case of pthread_getcpuclockid on Linux. >>>>>>>> Elsewhere the potential for a library call failure just reports >>>>>>>> an error value (such as -1 for the cpu time used). >>>>>>>> >>>>>>>> So the fix is simply to allow for ESRCH when calling >>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that >>>>>>>> case. >>>>>>>> >>>>>>>> I created a new regression test to create a new native thread, >>>>>>>> attach it and then let it terminate while still attached. 
The >>>>>>>> java code then calls various Thread and ThreadMXBean functions >>>>>>>> on it to ensure there are no crashes or unexpected exceptions. >>>>>>>> >>>>>>>> Testing: >>>>>>>> ??- old tests with fixed run-time >>>>>>>> ??- old run-time with fixed tests >>>>>>>> ??- mach tier4 (which exposed the problem - that's where we >>>>>>>> enable Flight recorder for the tests) [in progress] >>>>>>>> ??- mach5 tier 1-3 for good measure [in progress] >>>>>>>> ??- new regression test >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>> >>>>>> >>>>>> >> >> From david.holmes at oracle.com Mon Jul 9 22:17:13 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 10 Jul 2018 08:17:13 +1000 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com> Message-ID: Thanks Chris! Can I please get a second review. David On 10/07/2018 7:50 AM, Chris Plummer wrote: > On 7/9/18 2:41 PM, David Holmes wrote: >> Hi Chris, >> >> On 10/07/2018 4:22 AM, Chris Plummer wrote: >>> Hi David, >>> >>> Would it be better to problem list this test on solaris using >>> JDK-8156708. That way when JDK-8156708 is fixed it can come off the >>> problem list and start executing on solaris. >> >> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could >> only fix this for VM created threads. The general problem of TLS >> destructors looping if a thread terminates without detaching from the >> VM is not solvable - other than by not using TLS in the VM. > Ok, I misunderstood your comments in the test. > > Changes look fine. 
> > Chris From david.holmes at oracle.com Mon Jul 9 23:16:49 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 10 Jul 2018 09:16:49 +1000 Subject: RFR (S) 8206471: Race with ConcurrentHashTable deleting items on insert with cleanup thread In-Reply-To: <515e0337-0500-5599-8940-d2b54d8c7b6e@oracle.com> References: <2a218f81-4ecc-ad6a-3110-80978e17473d@oracle.com> <515e0337-0500-5599-8940-d2b54d8c7b6e@oracle.com> Message-ID: <89b5dca7-5e84-c91f-804f-7dc5858c70f8@oracle.com> Looks good Coleen - thanks!
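The idea behind the change under review in this thread (JDK-8206471), reading a node's next pointer once and reusing that value, can be sketched with C11 atomics. This is only an illustration, not the ConcurrentHashTable code: the `node` type and the `unlink_after` helper are invented for the example.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical singly linked node; not the HotSpot Node class. */
struct node {
  int value;
  _Atomic(struct node *) next;
};

/* Unlinks the node that follows `prev` and returns the new successor of
 * `prev` (NULL if there was nothing to unlink or the list ends there).
 * Each next pointer is loaded exactly once. Loading rem->next a second
 * time could observe a different value if another thread updated it in
 * between, so the stored link and the returned cursor could disagree. */
static struct node *unlink_after(struct node *prev) {
  struct node *rem = atomic_load_explicit(&prev->next, memory_order_acquire);
  if (rem == NULL) {
    return NULL;
  }
  struct node *rem_next =
      atomic_load_explicit(&rem->next, memory_order_acquire);
  atomic_store_explicit(&prev->next, rem_next, memory_order_release);
  return rem_next; /* reuse the loaded value instead of reloading */
}
```

Besides closing the window for a racing update, loading once also saves a load-acquire, which is the performance point raised in the review.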
David On 10/07/2018 3:13 AM, coleen.phillimore at oracle.com wrote: > > > On 7/9/18 7:32 AM, coleen.phillimore at oracle.com wrote: >> >> On 7/8/18 9:20 PM, David Holmes wrote: >>> Hi Coleen, >>> >>> On 7/07/2018 5:41 AM, coleen.phillimore at oracle.com wrote: >>>> Summary: Only fetch Node::next once and use that result. >>>> >>>> A racing thread could NULL next->next()->next(). The Node itself is >>>> stable until the write_synchronize() but the pointers may be >>>> updated. See bug for more detail. >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8206471.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8206471 >>> >>> The change looks good. >>> >>> Could there be a similar race at: >>>
>>>  552       bucket->release_assign_node_ptr(rem_n_prev, rem_n->next());
>>>  553       rem_n = rem_n->next();
>>> > > Probably not because this instance has the bucket lock, but I'll change > it anyway. > >>> Even if not, it is marginally more performant to only do the >>> load-acquire once. >>> >>> Similarly: >>>
>>>  663 new_table->get_bucket(odd_index)->release_assign_node_ptr(odd,
>>>  664 aux->next());
>>>  665 new_table->get_bucket(even_index)->release_assign_node_ptr(even,
>>>  666 aux->next());
>>> >>> combined with: >>>
>>>  685     aux = aux->next();
>>> >>> makes for 3 load-acquire (and 2 if we take the else at line #675). >>> >>> And again: >>>
>>>  982       bucket->release_assign_node_ptr(rem_n_prev, rem_n->next());
>>>  983       rem_n = rem_n->next();
>> > I believe the value of next() is stable in all these cases, and it's > fine to only load once. > > open webrev at http://cr.openjdk.java.net/~coleenp/8206471.02/webrev > > Ran the extensive gtests that Robbin wrote to cover zipping and > unzipping the hashtable and rerunning hs-tier1,2. > > Thanks, > Coleen > >> Thank you for noticing these other double loads. I'll study them to >> see if there's a race or see if there's a reason they have double >> loads, but I'll change them unless there is a reason not to. >> >> Thanks! >> Coleen >> >>> >>> Thanks, >>> David >>> ----- >>> >>>> Tested with SymbolTable changes and tests that failed. Also tested >>>> with mach5 hs-tier1-5 (in progress). >>>> >>>> This is actually Robbin's fix, and my review is that it looks good. >>>> >>>> Thanks, >>>> Coleen >> >
From david.holmes at oracle.com Mon Jul 9 23:22:21 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 10 Jul 2018 09:22:21 +1000 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com> Message-ID: <7c5dc8ba-aee8-a1d1-6821-6f9607a9e478@oracle.com>

Adding back runtime

On 10/07/2018 8:45 AM, Alex Menkov wrote:
> +1

Thanks for looking at this Alex!

> couple minor notes (no need to resend review)

Webrev updated in place (v3) for others to see.

> src/hotspot/os/linux/os_linux.cpp
> please replace
>
> 5581     }
> 5582     else {
>
> with
>     } else {

Done.

>
> test/hotspot/jtreg/runtime/jni/terminatedThread/libterminatedThread.c
> please fix error reporting (I suppose you mean "TEST ERROR:
> pthread_create failed"/"TEST ERROR: pthread_join failed"):
>
>   85   if ((res = pthread_create(&thread, NULL, thread_start, NULL)) != 0) {
>   86     fprintf(stderr, "TEST ERROR: pthread_created failed: %s (%d)\n", strerror(res), res);
>   87     exit(1);
>   88   }
>   89
>   90   if ((res = pthread_join(thread, NULL)) != 0) {
>   91     fprintf(stderr, "TEST ERROR: pthread_created failed: %s (%d)\n", strerror(res), res);
>   92     exit(1);
>   93   }

Fixed - well spotted!
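With the messages corrected along the lines Alex suggests, the create/join sequence would read roughly as follows. This is a sketch rather than the test's actual source: the `thread_start` body here is a placeholder, and `run_native_thread` is a name invented for the example. It returns the error code instead of calling `exit(1)` so that it can be exercised in isolation.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Placeholder thread body; the real test attaches to the VM here. */
static void *thread_start(void *arg) {
  (void)arg;
  return NULL;
}

/* Hypothetical wrapper: returns 0 on success, or the pthread error code. */
static int run_native_thread(void) {
  pthread_t thread;
  int res;

  if ((res = pthread_create(&thread, NULL, thread_start, NULL)) != 0) {
    fprintf(stderr, "TEST ERROR: pthread_create failed: %s (%d)\n",
            strerror(res), res);
    return res;
  }

  if ((res = pthread_join(thread, NULL)) != 0) {
    /* Name the call that actually failed, not pthread_create. */
    fprintf(stderr, "TEST ERROR: pthread_join failed: %s (%d)\n",
            strerror(res), res);
    return res;
  }
  return 0;
}
```

Note that the pthread functions return the error number directly rather than setting `errno`, which is why `res` is passed to `strerror`.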
Thanks, David > --alex > > On 07/09/2018 15:17, David Holmes wrote: >> Thanks Chris! >> >> Can I please get a second review. >> >> David >> >> On 10/07/2018 7:50 AM, Chris Plummer wrote: >>> On 7/9/18 2:41 PM, David Holmes wrote: >>>> Hi Chris, >>>> >>>> On 10/07/2018 4:22 AM, Chris Plummer wrote: >>>>> Hi David, >>>>> >>>>> Would it be better to problem list this test on solaris using >>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off the >>>>> problem list and start executing on solaris. >>>> >>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could >>>> only fix this for VM created threads. The general problem of TLS >>>> destructors looping if a thread terminates without detaching from >>>> the VM is not solvable - other than by not using TLS in the VM. >>> Ok, I misunderstood your comments in the test. >>> >>> Changes look fine. >>> >>> Chris >>>> >>>> Thanks, >>>> David >>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 7/8/18 4:58 PM, David Holmes wrote: >>>>>> tl;dr skip the new regression test on Solaris >>>>>> >>>>>> New webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ >>>>>> >>>>>> This excludes the test from running on Solaris, so the makefile >>>>>> doesn't bother compiling this native test and the Java part of the >>>>>> test adds: >>>>>> >>>>>> ! * @requires os.family != "windows" & os.family != "solaris" >>>>>> ? * @summary Basic test of Thread and ThreadMXBean queries on a >>>>>> natively >>>>>> ? *????????? attached thread that has failed to detach before >>>>>> terminating. >>>>>> + * @comment The native code only supports POSIX so no windows >>>>>> testing; also >>>>>> + *????????? we have to skip solaris as a terminating thread that >>>>>> fails to >>>>>> + *????????? detach will hit an infinite loop due to TLS >>>>>> destructor issues - see >>>>>> + *????????? 
comments in JDK-8156708 >>>>>> >>>>>> Note this means that Solaris is not affected by the original issue >>>>>> because a still-attached native thread can't actually terminate >>>>>> due to the TLS destructor infinite-loop issue. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> On 6/07/2018 6:07 PM, David Holmes wrote: >>>>>>> The new test is hanging on Solaris. I just discovered we >>>>>>> don't run these tests on Solaris until tier4. >>>>>>> >>>>>>> David >>>>>>> >>>>>>> On 6/07/2018 8:40 AM, David Holmes wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> Thanks for looking at this. >>>>>>>> >>>>>>>> Updated webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>>>>>>> >>>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>>>>>>> >>>>>>>> More below ... >>>>>>>> >>>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>>>>>>> Hi David, >>>>>>>>> >>>>>>>>> Solaris problems aside, overall it looks fine. Some minor >>>>>>>>> things I noted: >>>>>>>>> >>>>>>>>> I noticed that exitCode is never modified in agentA() or >>>>>>>>> agentB(), so there isn't much point to having it. If you reach >>>>>>>>> the bottom of the function, it passed, so PASSED can be >>>>>>>>> returned. The code would be more clear if it did this. As-is it >>>>>>>>> is implied that you can reach the bottom when it fails. >>>>>>>> >>>>>>>> I resisted any and all urges to do any kind of unrelated code >>>>>>>> cleanup in the tests - once you start you may end up doing a >>>>>>>> full rewrite. >>>>>>>> >>>>>>>>> Is detaching the threads along the failure paths really needed? >>>>>>>>> exit() is called, so this would seem to make it unnecessary. >>>>>>>> >>>>>>>> You're right that isn't necessary. I'll remove the changes from >>>>>>>> before the exits in ji05t001.c >>>>>>>> >>>>>>>>> I prefer assignments not to be embedded inside the "if" >>>>>>>>> condition. 
The DetachCurrentThread code in THREAD_return() is >>>>>>>>> much more readable than the similar code in agentA() and agentB(). >>>>>>>> >>>>>>>> It's an existing style already used in that test e.g. >>>>>>>> >>>>>>>> ??287???? if ((res = >>>>>>>> ??288???????????? JNI_ENV_PTR(vm)->AttachCurrentThread( >>>>>>>> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void *) >>>>>>>> 0)) != 0) { >>>>>>>> >>>>>>>> and I don't mind it, so I'd prefer not to change it. >>>>>>>> >>>>>>>>> In the test: >>>>>>>>> >>>>>>>>> ?? 54???????? // Generally as long as we don't crash of throw >>>>>>>>> unexpected >>>>>>>>> ?? 55???????? // exceptions then the test passes. In some cases >>>>>>>>> we know exactly >>>>>>>>> >>>>>>>>> "of" should be "or". >>>>>>>> >>>>>>>> Well spotted. Thanks. >>>>>>>> >>>>>>>>> Shouldn't you be catching exceptions for all the Thread methods >>>>>>>>> you are calling? Otherwise the test will exit if one is thrown, >>>>>>>>> and the above comment indicates that you don't want this. >>>>>>>> >>>>>>>> I'm not expecting there to be any exceptions from any of the >>>>>>>> called methods. That would potentially indicate a problem in >>>>>>>> handling the terminated native thread, so would indicate a test >>>>>>>> failure. >>>>>>>> >>>>>>>>> Don't we normally put these tests in a package? >>>>>>>> >>>>>>>> Doesn't seem to be any hard and fast rule. I only uses packages >>>>>>>> when they are important for the test. In runtime we have 905 >>>>>>>> java files and only 116 have a package statement. It varies >>>>>>>> elsewhere. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>>>>>>> Solaris compiler complains about doing a return from >>>>>>>>>> inside a do-while loop. I'll have to rework part of the fix >>>>>>>>>> tomorrow. 
>>>>>>>>>> >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>>>>>>> >>>>>>>>>>> Problem: >>>>>>>>>>> >>>>>>>>>>> The tests create native threads that attach to the VM through >>>>>>>>>>> JNI_AttachCurrentThread but which then terminate without >>>>>>>>>>> detaching themselves. When the VM exits and we're using >>>>>>>>>>> Flight Recorder "dumponexit" this leads to a call to >>>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread >>>>>>>>>>> CPU usage. When we encounter the threads that have terminated >>>>>>>>>>> already the low level pthread_getcpuclockid calls returns >>>>>>>>>>> ESRCH but the code doesn't expect that and so fails an assert >>>>>>>>>>> in debug mode and can SEGV in product mode. >>>>>>>>>>> >>>>>>>>>>> Solution: >>>>>>>>>>> >>>>>>>>>>> Serviceability-side: fix the tests >>>>>>>>>>> >>>>>>>>>>> Change the tests so that the threads detach before >>>>>>>>>>> terminating. The two tests are (surprisingly) written in >>>>>>>>>>> completely different styles, so the solution also takes on >>>>>>>>>>> two different styles. >>>>>>>>>>> >>>>>>>>>>> Runtime-side: make the VM more robust in the fact of JNI >>>>>>>>>>> attached threads that terminate before detaching, and add a >>>>>>>>>>> regression test >>>>>>>>>>> >>>>>>>>>>> I took a good look at the low-level code for interacting with >>>>>>>>>>> arbitrary threads and as far as I can see the problem only >>>>>>>>>>> exists for this one case of pthread_getcpuclockid on Linux. >>>>>>>>>>> Elsewhere the potential for a library call failure just >>>>>>>>>>> reports an error value (such as -1 for the cpu time used). >>>>>>>>>>> >>>>>>>>>>> So the fix is simply to allow for ESRCH when calling >>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that >>>>>>>>>>> case. 
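As a rough illustration of the fix described in the quoted text above, the sketch below tolerates a failure such as ESRCH from `pthread_getcpuclockid` and reports -1 instead of asserting. It is not the HotSpot code: the helper name `thread_cpu_time_ns` is invented for this example, and a POSIX system that provides per-thread CPU-time clocks (such as Linux with glibc) is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper, not the HotSpot function: returns the CPU time
 * consumed by `thread` in nanoseconds, or -1 if it cannot be obtained. */
static int64_t thread_cpu_time_ns(pthread_t thread) {
  clockid_t clock_id;
  int rc = pthread_getcpuclockid(thread, &clock_id);
  if (rc != 0) {
    /* rc may be ESRCH when the thread has already terminated, for
     * example a JNI-attached thread that exited without detaching.
     * Report "no data" rather than treating this as a fatal error. */
    return -1;
  }
  struct timespec ts;
  if (clock_gettime(clock_id, &ts) != 0) {
    return -1;
  }
  return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}
```

Callers then have to treat -1 as "CPU time unavailable", which matches how the other library-call failures mentioned in the message are already reported.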
>>>>>>>>>>> >>>>>>>>>>> I created a new regression test to create a new native >>>>>>>>>>> thread, attach it and then let it terminate while still >>>>>>>>>>> attached. The java code then calls various Thread and >>>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes >>>>>>>>>>> or unexpected exceptions. >>>>>>>>>>> >>>>>>>>>>> Testing: >>>>>>>>>>> - old tests with fixed run-time >>>>>>>>>>> - old run-time with fixed tests >>>>>>>>>>> - mach tier4 (which exposed the problem - that's where we >>>>>>>>>>> enable Flight recorder for the tests) [in progress] >>>>>>>>>>> - mach5 tier 1-3 for good measure [in progress] >>>>>>>>>>> - new regression test >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>> >>>>> >>> >>> From coleen.phillimore at oracle.com Tue Jul 10 00:11:34 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 9 Jul 2018 20:11:34 -0400 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: <7c5dc8ba-aee8-a1d1-6821-6f9607a9e478@oracle.com> References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com> <7c5dc8ba-aee8-a1d1-6821-6f9607a9e478@oracle.com> Message-ID: <95f95813-52e7-aebd-10e4-306ecb534fc3@oracle.com> This looks good! Thank you for fixing these failures. Coleen On 7/9/18 7:22 PM, David Holmes wrote: > Adding back runtime > > On 10/07/2018 8:45 AM, Alex Menkov wrote: >> +1 > > Thanks for looking at this Alex! > >> couple minor notes (no need to resend review) > > Webrev updated in place (v3) for others to see. >
>> src/hotspot/os/linux/os_linux.cpp
>> please replace
>>
>> 5581     }
>> 5582     else {
>>
>> with
>>      } else {
>
> Done.
> >> >> test/hotspot/jtreg/runtime/jni/terminatedThread/libterminatedThread.c >> please fix error reporting (I suppose you mean "TEST ERROR: >> pthread_create failed"/"TEST ERROR: pthread_join failed"): >> >> ?? 85?? if ((res = pthread_create(&thread, NULL, thread_start, NULL)) >> != 0) { >> ?? 86???? fprintf(stderr, "TEST ERROR: pthread_created failed: %s >> (%d)\n", strerror(res), res); >> ?? 87???? exit(1); >> ?? 88?? } >> ?? 89 >> ?? 90?? if ((res = pthread_join(thread, NULL)) != 0) { >> ?? 91???? fprintf(stderr, "TEST ERROR: pthread_created failed: %s >> (%d)\n", strerror(res), res); >> ?? 92???? exit(1); >> ?? 93?? } > > Fixed - well spotted! > > Thanks, > David > >> --alex >> >> On 07/09/2018 15:17, David Holmes wrote: >>> Thanks Chris! >>> >>> Can I please get a second review. >>> >>> David >>> >>> On 10/07/2018 7:50 AM, Chris Plummer wrote: >>>> On 7/9/18 2:41 PM, David Holmes wrote: >>>>> Hi Chris, >>>>> >>>>> On 10/07/2018 4:22 AM, Chris Plummer wrote: >>>>>> Hi David, >>>>>> >>>>>> Would it be better to problem list this test on solaris using >>>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off >>>>>> the problem list and start executing on solaris. >>>>> >>>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could >>>>> only fix this for VM created threads. The general problem of TLS >>>>> destructors looping if a thread terminates without detaching from >>>>> the VM is not solvable - other than by not using TLS in the VM. >>>> Ok, I misunderstood your comments in the test. >>>> >>>> Changes look fine. 
>>>> >>>> Chris >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 7/8/18 4:58 PM, David Holmes wrote: >>>>>>> tl;dr skip the new regression test on Solaris >>>>>>> >>>>>>> New webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ >>>>>>> >>>>>>> This excludes the test from running on Solaris, so the makefile >>>>>>> doesn't bother compiling this native test and the Java part of >>>>>>> the test adds: >>>>>>> >>>>>>> ! * @requires os.family != "windows" & os.family != "solaris" >>>>>>> ? * @summary Basic test of Thread and ThreadMXBean queries on a >>>>>>> natively >>>>>>> ? *????????? attached thread that has failed to detach before >>>>>>> terminating. >>>>>>> + * @comment The native code only supports POSIX so no windows >>>>>>> testing; also >>>>>>> + *????????? we have to skip solaris as a terminating thread >>>>>>> that fails to >>>>>>> + *????????? detach will hit an infinite loop due to TLS >>>>>>> destructor issues - see >>>>>>> + *????????? comments in JDK-8156708 >>>>>>> >>>>>>> Note this means that Solaris is not affected by the original >>>>>>> issue because a still-attached native thread can't actually >>>>>>> terminate due to the TLS destructor infinite-loop issue. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> On 6/07/2018 6:07 PM, David Holmes wrote: >>>>>>>> The new test is hanging on Solaris. I just discovered we >>>>>>>> don't run these tests on Solaris until tier4. >>>>>>>> >>>>>>>> David >>>>>>>> >>>>>>>> On 6/07/2018 8:40 AM, David Holmes wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> Thanks for looking at this. >>>>>>>>> >>>>>>>>> Updated webrev: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>>>>>>>> >>>>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>>>>>>>> >>>>>>>>> More below ... 
>>>>>>>>> >>>>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>> Solaris problems aside, overall it looks fine. Some minor >>>>>>>>>> things I noted: >>>>>>>>>> >>>>>>>>>> I noticed that exitCode is never modified in agentA() or >>>>>>>>>> agentB(), so there isn't much point to having it. If you >>>>>>>>>> reach the bottom of the function, it passed, so PASSED can be >>>>>>>>>> returned. The code would be more clear if it did this. As-is >>>>>>>>>> it is implied that you can reach the bottom when it fails. >>>>>>>>> >>>>>>>>> I resisted any and all urges to do any kind of unrelated code >>>>>>>>> cleanup in the tests - once you start you may end up doing a >>>>>>>>> full rewrite. >>>>>>>>> >>>>>>>>>> Is detaching the threads along the failure paths really >>>>>>>>>> needed? exit() is called, so this would seem to make it >>>>>>>>>> unnecessary. >>>>>>>>> >>>>>>>>> You're right that isn't necessary. I'll remove the changes >>>>>>>>> from before the exits in ji05t001.c >>>>>>>>> >>>>>>>>>> I prefer assignments not to be embedded inside the "if" >>>>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is >>>>>>>>>> much more readable than the similar code in agentA() and >>>>>>>>>> agentB(). >>>>>>>>> >>>>>>>>> It's an existing style already used in that test e.g. >>>>>>>>> >>>>>>>>> ??287???? if ((res = >>>>>>>>> ??288 JNI_ENV_PTR(vm)->AttachCurrentThread( >>>>>>>>> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void >>>>>>>>> *) 0)) != 0) { >>>>>>>>> >>>>>>>>> and I don't mind it, so I'd prefer not to change it. >>>>>>>>> >>>>>>>>>> In the test: >>>>>>>>>> >>>>>>>>>> ?? 54???????? // Generally as long as we don't crash of throw >>>>>>>>>> unexpected >>>>>>>>>> ?? 55???????? // exceptions then the test passes. In some >>>>>>>>>> cases we know exactly >>>>>>>>>> >>>>>>>>>> "of" should be "or". >>>>>>>>> >>>>>>>>> Well spotted. Thanks. 
>>>>>>>>> >>>>>>>>>> Shouldn't you be catching exceptions for all the Thread >>>>>>>>>> methods you are calling? Otherwise the test will exit if one >>>>>>>>>> is thrown, and the above comment indicates that you don't >>>>>>>>>> want this. >>>>>>>>> >>>>>>>>> I'm not expecting there to be any exceptions from any of the >>>>>>>>> called methods. That would potentially indicate a problem in >>>>>>>>> handling the terminated native thread, so would indicate a >>>>>>>>> test failure. >>>>>>>>> >>>>>>>>>> Don't we normally put these tests in a package? >>>>>>>>> >>>>>>>>> Doesn't seem to be any hard and fast rule. I only uses >>>>>>>>> packages when they are important for the test. In runtime we >>>>>>>>> have 905 java files and only 116 have a package statement. It >>>>>>>>> varies elsewhere. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>>>>>>>> Solaris compiler complains about doing a return from >>>>>>>>>>> inside a do-while loop. I'll have to rework part of the fix >>>>>>>>>>> tomorrow. >>>>>>>>>>> >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>>>>>>>> >>>>>>>>>>>> Problem: >>>>>>>>>>>> >>>>>>>>>>>> The tests create native threads that attach to the VM >>>>>>>>>>>> through JNI_AttachCurrentThread but which then terminate >>>>>>>>>>>> without detaching themselves. When the VM exits and we're >>>>>>>>>>>> using Flight Recorder "dumponexit" this leads to a call to >>>>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread >>>>>>>>>>>> CPU usage. 
When we encounter the threads that have >>>>>>>>>>>> terminated already the low level pthread_getcpuclockid >>>>>>>>>>>> calls returns ESRCH but the code doesn't expect that and so >>>>>>>>>>>> fails an assert in debug mode and can SEGV in product mode. >>>>>>>>>>>> >>>>>>>>>>>> Solution: >>>>>>>>>>>> >>>>>>>>>>>> Serviceability-side: fix the tests >>>>>>>>>>>> >>>>>>>>>>>> Change the tests so that the threads detach before >>>>>>>>>>>> terminating. The two tests are (surprisingly) written in >>>>>>>>>>>> completely different styles, so the solution also takes on >>>>>>>>>>>> two different styles. >>>>>>>>>>>> >>>>>>>>>>>> Runtime-side: make the VM more robust in the fact of JNI >>>>>>>>>>>> attached threads that terminate before detaching, and add a >>>>>>>>>>>> regression test >>>>>>>>>>>> >>>>>>>>>>>> I took a good look at the low-level code for interacting >>>>>>>>>>>> with arbitrary threads and as far as I can see the problem >>>>>>>>>>>> only exists for this one case of pthread_getcpuclockid on >>>>>>>>>>>> Linux. Elsewhere the potential for a library call failure >>>>>>>>>>>> just reports an error value (such as -1 for the cpu time >>>>>>>>>>>> used). >>>>>>>>>>>> >>>>>>>>>>>> So the fix is simply to allow for ESRCH when calling >>>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in >>>>>>>>>>>> that case. >>>>>>>>>>>> >>>>>>>>>>>> I created a new regression test to create a new native >>>>>>>>>>>> thread, attach it and then let it terminate while still >>>>>>>>>>>> attached. The java code then calls various Thread and >>>>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes >>>>>>>>>>>> or unexpected exceptions. 
>>>>>>>>>>>> >>>>>>>>>>>> Testing: >>>>>>>>>>>> - old tests with fixed run-time >>>>>>>>>>>> - old run-time with fixed tests >>>>>>>>>>>> - mach tier4 (which exposed the problem - that's where we >>>>>>>>>>>> enable Flight recorder for the tests) [in progress] >>>>>>>>>>>> - mach5 tier 1-3 for good measure [in progress] >>>>>>>>>>>> - new regression test >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>> >>>>>> >>>> >>>> From david.holmes at oracle.com Tue Jul 10 00:14:44 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 10 Jul 2018 10:14:44 +1000 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: <95f95813-52e7-aebd-10e4-306ecb534fc3@oracle.com> References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com> <7c5dc8ba-aee8-a1d1-6821-6f9607a9e478@oracle.com> <95f95813-52e7-aebd-10e4-306ecb534fc3@oracle.com> Message-ID: Thanks for looking at this Coleen. David On 10/07/2018 10:11 AM, coleen.phillimore at oracle.com wrote: > > This looks good! Thank you for fixing these failures. > Coleen > > On 7/9/18 7:22 PM, David Holmes wrote: >> Adding back runtime >> >> On 10/07/2018 8:45 AM, Alex Menkov wrote: >>> +1 >> >> Thanks for looking at this Alex! >> >>> couple minor notes (no need to resend review) >> >> Webrev updated in place (v3) for others to see. >> >>> src/hotspot/os/linux/os_linux.cpp >>> please replace >>> >>> 5581 } >>> 5582 else { >>> >>> with >>> } else { >> >> Done. >> >>> >>> test/hotspot/jtreg/runtime/jni/terminatedThread/libterminatedThread.c >>> please fix error reporting (I suppose you mean "TEST ERROR: >>> pthread_create failed"/"TEST ERROR: pthread_join failed"): >>> >>> 85 
if ((res = pthread_create(&thread, NULL, thread_start, NULL)) >>> != 0) { >>> ?? 86???? fprintf(stderr, "TEST ERROR: pthread_created failed: %s >>> (%d)\n", strerror(res), res); >>> ?? 87???? exit(1); >>> ?? 88?? } >>> ?? 89 >>> ?? 90?? if ((res = pthread_join(thread, NULL)) != 0) { >>> ?? 91???? fprintf(stderr, "TEST ERROR: pthread_created failed: %s >>> (%d)\n", strerror(res), res); >>> ?? 92???? exit(1); >>> ?? 93?? } >> >> Fixed - well spotted! >> >> Thanks, >> David >> >>> --alex >>> >>> On 07/09/2018 15:17, David Holmes wrote: >>>> Thanks Chris! >>>> >>>> Can I please get a second review. >>>> >>>> David >>>> >>>> On 10/07/2018 7:50 AM, Chris Plummer wrote: >>>>> On 7/9/18 2:41 PM, David Holmes wrote: >>>>>> Hi Chris, >>>>>> >>>>>> On 10/07/2018 4:22 AM, Chris Plummer wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> Would it be better to problem list this test on solaris using >>>>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off >>>>>>> the problem list and start executing on solaris. >>>>>> >>>>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could >>>>>> only fix this for VM created threads. The general problem of TLS >>>>>> destructors looping if a thread terminates without detaching from >>>>>> the VM is not solvable - other than by not using TLS in the VM. >>>>> Ok, I misunderstood your comments in the test. >>>>> >>>>> Changes look fine. >>>>> >>>>> Chris >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 7/8/18 4:58 PM, David Holmes wrote: >>>>>>>> tl;dr skip the new regression test on Solaris >>>>>>>> >>>>>>>> New webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ >>>>>>>> >>>>>>>> This excludes the test from running on Solaris, so the makefile >>>>>>>> doesn't bother compiling this native test and the Java part of >>>>>>>> the test adds: >>>>>>>> >>>>>>>> ! * @requires os.family != "windows" & os.family != "solaris" >>>>>>>> ? 
* @summary Basic test of Thread and ThreadMXBean queries on a >>>>>>>> natively >>>>>>>> ? *????????? attached thread that has failed to detach before >>>>>>>> terminating. >>>>>>>> + * @comment The native code only supports POSIX so no windows >>>>>>>> testing; also >>>>>>>> + *????????? we have to skip solaris as a terminating thread >>>>>>>> that fails to >>>>>>>> + *????????? detach will hit an infinite loop due to TLS >>>>>>>> destructor issues - see >>>>>>>> + *????????? comments in JDK-8156708 >>>>>>>> >>>>>>>> Note this means that Solaris is not affected by the original >>>>>>>> issue because a still-attached native thread can't actually >>>>>>>> terminate due to the TLS destructor infinite-loop issue. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>> On 6/07/2018 6:07 PM, David Holmes wrote: >>>>>>>>> The new test is hanging on Solaris. I just discovered we >>>>>>>>> don't run these tests on Solaris until tier4. >>>>>>>>> >>>>>>>>> David >>>>>>>>> >>>>>>>>> On 6/07/2018 8:40 AM, David Holmes wrote: >>>>>>>>>> Hi Chris, >>>>>>>>>> >>>>>>>>>> Thanks for looking at this. >>>>>>>>>> >>>>>>>>>> Updated webrev: >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>>>>>>>>> >>>>>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>>>>>>>>> >>>>>>>>>> More below ... >>>>>>>>>> >>>>>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>>>>>>>>> Hi David, >>>>>>>>>>> >>>>>>>>>>> Solaris problems aside, overall it looks fine. Some minor >>>>>>>>>>> things I noted: >>>>>>>>>>> >>>>>>>>>>> I noticed that exitCode is never modified in agentA() or >>>>>>>>>>> agentB(), so there isn't much point to having it. If you >>>>>>>>>>> reach the bottom of the function, it passed, so PASSED can be >>>>>>>>>>> returned. The code would be more clear if it did this. As-is >>>>>>>>>>> it is implied that you can reach the bottom when it fails. 
>>>>>>>>>> >>>>>>>>>> I resisted any and all urges to do any kind of unrelated code >>>>>>>>>> cleanup in the tests - once you start you may end up doing a >>>>>>>>>> full rewrite. >>>>>>>>>> >>>>>>>>>>> Is detaching the threads along the failure paths really >>>>>>>>>>> needed? exit() is called, so this would seem to make it >>>>>>>>>>> unnecessary. >>>>>>>>>> >>>>>>>>>> You're right that isn't necessary. I'll remove the changes >>>>>>>>>> from before the exits in ji05t001.c >>>>>>>>>> >>>>>>>>>>> I prefer assignments not to be embedded inside the "if" >>>>>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is >>>>>>>>>>> much more readable than the similar code in agentA() and >>>>>>>>>>> agentB(). >>>>>>>>>> >>>>>>>>>> It's an existing style already used in that test e.g. >>>>>>>>>> >>>>>>>>>> ??287???? if ((res = >>>>>>>>>> ??288 JNI_ENV_PTR(vm)->AttachCurrentThread( >>>>>>>>>> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void >>>>>>>>>> *) 0)) != 0) { >>>>>>>>>> >>>>>>>>>> and I don't mind it, so I'd prefer not to change it. >>>>>>>>>> >>>>>>>>>>> In the test: >>>>>>>>>>> >>>>>>>>>>> ?? 54???????? // Generally as long as we don't crash of throw >>>>>>>>>>> unexpected >>>>>>>>>>> ?? 55???????? // exceptions then the test passes. In some >>>>>>>>>>> cases we know exactly >>>>>>>>>>> >>>>>>>>>>> "of" should be "or". >>>>>>>>>> >>>>>>>>>> Well spotted. Thanks. >>>>>>>>>> >>>>>>>>>>> Shouldn't you be catching exceptions for all the Thread >>>>>>>>>>> methods you are calling? Otherwise the test will exit if one >>>>>>>>>>> is thrown, and the above comment indicates that you don't >>>>>>>>>>> want this. >>>>>>>>>> >>>>>>>>>> I'm not expecting there to be any exceptions from any of the >>>>>>>>>> called methods. That would potentially indicate a problem in >>>>>>>>>> handling the terminated native thread, so would indicate a >>>>>>>>>> test failure. >>>>>>>>>> >>>>>>>>>>> Don't we normally put these tests in a package? 
>>>>>>>>>> >>>>>>>>>> Doesn't seem to be any hard and fast rule. I only uses >>>>>>>>>> packages when they are important for the test. In runtime we >>>>>>>>>> have 905 java files and only 116 have a package statement. It >>>>>>>>>> varies elsewhere. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>>> thanks, >>>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>>>>>>>>> Solaris compiler complains about doing a return from >>>>>>>>>>>> inside a do-while loop. I'll have to rework part of the fix >>>>>>>>>>>> tomorrow. >>>>>>>>>>>> >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>>>>>>>>> >>>>>>>>>>>>> Problem: >>>>>>>>>>>>> >>>>>>>>>>>>> The tests create native threads that attach to the VM >>>>>>>>>>>>> through JNI_AttachCurrentThread but which then terminate >>>>>>>>>>>>> without detaching themselves. When the VM exits and we're >>>>>>>>>>>>> using Flight Recorder "dumponexit" this leads to a call to >>>>>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread >>>>>>>>>>>>> CPU usage. When we encounter the threads that have >>>>>>>>>>>>> terminated already the low level pthread_getcpuclockid >>>>>>>>>>>>> calls returns ESRCH but the code doesn't expect that and so >>>>>>>>>>>>> fails an assert in debug mode and can SEGV in product mode. >>>>>>>>>>>>> >>>>>>>>>>>>> Solution: >>>>>>>>>>>>> >>>>>>>>>>>>> Serviceability-side: fix the tests >>>>>>>>>>>>> >>>>>>>>>>>>> Change the tests so that the threads detach before >>>>>>>>>>>>> terminating. The two tests are (surprisingly) written in >>>>>>>>>>>>> completely different styles, so the solution also takes on >>>>>>>>>>>>> two different styles. 
>>>>>>>>>>>>> >>>>>>>>>>>>> Runtime-side: make the VM more robust in the fact of JNI >>>>>>>>>>>>> attached threads that terminate before detaching, and add a >>>>>>>>>>>>> regression test >>>>>>>>>>>>> >>>>>>>>>>>>> I took a good look at the low-level code for interacting >>>>>>>>>>>>> with arbitrary threads and as far as I can see the problem >>>>>>>>>>>>> only exists for this one case of pthread_getcpuclockid on >>>>>>>>>>>>> Linux. Elsewhere the potential for a library call failure >>>>>>>>>>>>> just reports an error value (such as -1 for the cpu time >>>>>>>>>>>>> used). >>>>>>>>>>>>> >>>>>>>>>>>>> So the fix is simply to allow for ESRCH when calling >>>>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in >>>>>>>>>>>>> that case. >>>>>>>>>>>>> >>>>>>>>>>>>> I created a new regression test to create a new native >>>>>>>>>>>>> thread, attach it and then let it terminate while still >>>>>>>>>>>>> attached. The java code then calls various Thread and >>>>>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes >>>>>>>>>>>>> or unexpected exceptions. 
>>>>>>>>>>>>> >>>>>>>>>>>>> Testing: >>>>>>>>>>>>> - old tests with fixed run-time >>>>>>>>>>>>> - old run-time with fixed tests >>>>>>>>>>>>> - mach tier4 (which exposed the problem - that's where we >>>>>>>>>>>>> enable Flight recorder for the tests) [in progress] >>>>>>>>>>>>> - mach5 tier 1-3 for good measure [in progress] >>>>>>>>>>>>> - new regression test >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>> >>>>> >>>>> > From coleen.phillimore at oracle.com Tue Jul 10 00:55:25 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 9 Jul 2018 20:55:25 -0400 Subject: RFR (S) 8206471: Race with ConcurrentHashTable deleting items on insert with cleanup thread In-Reply-To: <89b5dca7-5e84-c91f-804f-7dc5858c70f8@oracle.com> References: <2a218f81-4ecc-ad6a-3110-80978e17473d@oracle.com> <515e0337-0500-5599-8940-d2b54d8c7b6e@oracle.com> <89b5dca7-5e84-c91f-804f-7dc5858c70f8@oracle.com> Message-ID: Thanks for the code review and suggestions for improvement. Coleen On 7/9/18 7:16 PM, David Holmes wrote: > Looks good Coleen - thanks! > > David > > On 10/07/2018 3:13 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 7/9/18 7:32 AM, coleen.phillimore at oracle.com wrote: >>> >>> On 7/8/18 9:20 PM, David Holmes wrote: >>>> Hi Coleen, >>>> >>>> On 7/07/2018 5:41 AM, coleen.phillimore at oracle.com wrote: >>>>> Summary: Only fetch Node::next once and use that result. >>>>> >>>>> A racing thread could NULL next->next()->next(). The Node itself >>>>> is stable until the write_synchronize() but the pointers may be >>>>> updated. See bug for more detail. >>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8206471.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8206471 >>>> >>>> The change looks good. >>>> >>>> Could there be a similar race at: >>>> >>>> 552 bucket->release_assign_node_ptr(rem_n_prev, rem_n->next()); >>>> 553 
rem_n = rem_n->next(); >>>> >> >> Probably not because this instance has the bucket lock, but I'll >> change it anyway. >> >>>> Even if not, it is marginally more performant to only do the >>>> load-acquire once. >>>> >>>> Similarly: >>>> >>>> 663 new_table->get_bucket(odd_index)->release_assign_node_ptr(odd, >>>> 664 aux->next()); >>>> 665 new_table->get_bucket(even_index)->release_assign_node_ptr(even, >>>> 666 aux->next()); >>>> >>>> combined with: >>>> >>>> 685 aux = aux->next(); >>>> >>>> makes for 3 load-acquire (and 2 if we take the else at line #675). >>>> >>>> And again: >>>> >>>> 982 bucket->release_assign_node_ptr(rem_n_prev, rem_n->next()); >>>> 983 rem_n = rem_n->next(); >>> >> I believe the value of next() is stable in all these cases, and it's >> fine to only load once. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8206471.02/webrev >> >> Ran the extensive gtests that Robbin wrote to cover zipping and >> unzipping the hashtable and rerunning hs-tier1,2. >> >> Thanks, >> Coleen >> >>> Thank you for noticing these other double loads. I'll study them to >>> see if there's a race or see if there's a reason they have double >>> loads, but I'll change them unless there is a reason not to. >>> >>> Thanks! >>> Coleen >>> >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> Tested with SymbolTable changes and tests that failed. Also >>>>> tested with mach5 hs-tier1-5 (in progress). >>>>> >>>>> This is actually Robbin's fix, and my review is that it looks good. 
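The "fetch next once and use that result" idea from this thread can be illustrated with a small C11 sketch. It is not the ConcurrentHashTable code: node_t, node_next and next_value_or are hypothetical names, and the sketch assumes (as the thread states for Node) that a node's memory stays valid while a reader holds a pointer to it, so only the link itself can change underneath the reader.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical list node whose link may be updated by another thread. */
typedef struct node {
    _Atomic(struct node *) next;
    int value;
} node_t;

/* One load-acquire of the link, comparable in spirit to Node::next(). */
static node_t *node_next(node_t *n) {
    return atomic_load_explicit(&n->next, memory_order_acquire);
}

/* Racy shape to avoid: the link is loaded twice, and a concurrent writer may
 * clear it between the NULL check and the dereference:
 *
 *     if (node_next(n) != NULL) { return node_next(n)->value; }
 */

/* Safer shape: load the link once and reuse the local copy for both the
 * NULL check and the dereference. */
static int next_value_or(node_t *n, int fallback) {
    assert(n != NULL);
    node_t *next = node_next(n);   /* single load-acquire */
    if (next == NULL) {
        return fallback;
    }
    return next->value;
}
```

Besides closing the window between the two loads, the single-load form performs one load-acquire instead of two, which is the small performance point raised in the review.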
>>>>> >>>>> Thanks, >>>>> Coleen >>> >> From serguei.spitsyn at oracle.com Tue Jul 10 02:07:39 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 9 Jul 2018 19:07:39 -0700 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com> Message-ID: <416fc226-e389-df65-9487-736efc9e7528@oracle.com> Hi David, It looks good modulo the minor comments that others have already found. Could I ask you to fix a couple of really minor issues in new test? Unneeded spaces are at lines 84 and 51 in .java and .c files: 83 if (mbean.isThreadCpuTimeSupported() && 84 mbean.isThreadCpuTimeEnabled() ) { . . . 51 class_id = (*env)->FindClass (env, "java/lang/Thread"); Thanks, Serguei On 7/9/18 15:17, David Holmes wrote: > Thanks Chris! > > Can I please get a second review. > > David > > On 10/07/2018 7:50 AM, Chris Plummer wrote: >> On 7/9/18 2:41 PM, David Holmes wrote: >>> Hi Chris, >>> >>> On 10/07/2018 4:22 AM, Chris Plummer wrote: >>>> Hi David, >>>> >>>> Would it be better to problem list this test on solaris using >>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off the >>>> problem list and start executing on solaris. >>> >>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could >>> only fix this for VM created threads. The general problem of TLS >>> destructors looping if a thread terminates without detaching from >>> the VM is not solvable - other than by not using TLS in the VM. >> Ok, I misunderstood your comments in the test. >> >> Changes look fine. 
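How a thread-local-storage destructor can lead to repeated destructor rounds can be shown with a short C sketch. This only illustrates the POSIX mechanism and is an assumption about what "TLS destructors looping" refers to; the thread does not show the VM's destructor. POSIX re-runs destructors while any key still holds a non-NULL value, and after PTHREAD_DESTRUCTOR_ITERATIONS rounds an implementation may either stop or keep going indefinitely. All names below are hypothetical, and the destructor deliberately re-installs its value only a bounded number of times so the sketch terminates everywhere.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_key_t key;
static int destructor_calls;  /* written by the exiting thread, read after join */

/* Destructor that re-installs the value on its first two invocations. */
static void tls_destructor(void *value) {
    destructor_calls++;
    if (destructor_calls < 3) {
        /* The key is non-NULL again, so another destructor round is needed. */
        pthread_setspecific(key, value);
    }
}

static void *thread_body(void *arg) {
    pthread_setspecific(key, arg);
    return NULL;  /* thread exits while the key still holds a non-NULL value */
}

/* Runs one thread to completion and returns how often the destructor ran. */
static int count_destructor_rounds(void) {
    static int dummy;
    pthread_t t;
    int rc;

    destructor_calls = 0;
    rc = pthread_key_create(&key, tls_destructor);
    assert(rc == 0);
    rc = pthread_create(&t, NULL, thread_body, &dummy);
    assert(rc == 0);
    rc = pthread_join(t, NULL);  /* destructors have finished once join returns */
    assert(rc == 0);
    pthread_key_delete(key);
    return destructor_calls;
}
```

A destructor that always re-installed its value would be cut off after a fixed number of rounds on implementations that enforce the iteration limit, but could spin forever on one that does not, which is consistent with the hang described for Solaris.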
>> >> Chris >>> >>> Thanks, >>> David >>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 7/8/18 4:58 PM, David Holmes wrote: >>>>> tl;dr skip the new regression test on Solaris >>>>> >>>>> New webrev: >>>>> >>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ >>>>> >>>>> This excludes the test from running on Solaris, so the makefile >>>>> doesn't bother compiling this native test and the Java part of the >>>>> test adds: >>>>> >>>>> ! * @requires os.family != "windows" & os.family != "solaris" >>>>> ? * @summary Basic test of Thread and ThreadMXBean queries on a >>>>> natively >>>>> ? *????????? attached thread that has failed to detach before >>>>> terminating. >>>>> + * @comment The native code only supports POSIX so no windows >>>>> testing; also >>>>> + *????????? we have to skip solaris as a terminating thread that >>>>> fails to >>>>> + *????????? detach will hit an infinite loop due to TLS >>>>> destructor issues - see >>>>> + *????????? comments in JDK-8156708 >>>>> >>>>> Note this means that Solaris is not affected by the original issue >>>>> because a still-attached native thread can't actually terminate >>>>> due to the TLS destructor infinite-loop issue. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> On 6/07/2018 6:07 PM, David Holmes wrote: >>>>>> The new test is hanging on Solaris. I just discovered we >>>>>> don't run these tests on Solaris until tier4. >>>>>> >>>>>> David >>>>>> >>>>>> On 6/07/2018 8:40 AM, David Holmes wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> Thanks for looking at this. >>>>>>> >>>>>>> Updated webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>>>>>> >>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>>>>>> >>>>>>> More below ... >>>>>>> >>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> Solaris problems aside, overall it looks fine. 
Some minor >>>>>>>> things I noted: >>>>>>>> >>>>>>>> I noticed that exitCode is never modified in agentA() or >>>>>>>> agentB(), so there isn't much point to having it. If you reach >>>>>>>> the bottom of the function, it passed, so PASSED can be >>>>>>>> returned. The code would be more clear if it did this. As-is it >>>>>>>> is implied that you can reach the bottom when it fails. >>>>>>> >>>>>>> I resisted any and all urges to do any kind of unrelated code >>>>>>> cleanup in the tests - once you start you may end up doing a >>>>>>> full rewrite. >>>>>>> >>>>>>>> Is detaching the threads along the failure paths really needed? >>>>>>>> exit() is called, so this would seem to make it unnecessary. >>>>>>> >>>>>>> You're right that isn't necessary. I'll remove the changes from >>>>>>> before the exits in ji05t001.c >>>>>>> >>>>>>>> I prefer assignments not to be embedded inside the "if" >>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is >>>>>>>> much more readable than the similar code in agentA() and agentB(). >>>>>>> >>>>>>> It's an existing style already used in that test e.g. >>>>>>> >>>>>>> ??287???? if ((res = >>>>>>> ??288 JNI_ENV_PTR(vm)->AttachCurrentThread( >>>>>>> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void *) >>>>>>> 0)) != 0) { >>>>>>> >>>>>>> and I don't mind it, so I'd prefer not to change it. >>>>>>> >>>>>>>> In the test: >>>>>>>> >>>>>>>> ?? 54???????? // Generally as long as we don't crash of throw >>>>>>>> unexpected >>>>>>>> ?? 55???????? // exceptions then the test passes. In some cases >>>>>>>> we know exactly >>>>>>>> >>>>>>>> "of" should be "or". >>>>>>> >>>>>>> Well spotted. Thanks. >>>>>>> >>>>>>>> Shouldn't you be catching exceptions for all the Thread methods >>>>>>>> you are calling? Otherwise the test will exit if one is thrown, >>>>>>>> and the above comment indicates that you don't want this. >>>>>>> >>>>>>> I'm not expecting there to be any exceptions from any of the >>>>>>> called methods. 
That would potentially indicate a problem in >>>>>>> handling the terminated native thread, so would indicate a test >>>>>>> failure. >>>>>>> >>>>>>>> Don't we normally put these tests in a package? >>>>>>> >>>>>>> Doesn't seem to be any hard and fast rule. I only uses packages >>>>>>> when they are important for the test. In runtime we have 905 >>>>>>> java files and only 116 have a package statement. It varies >>>>>>> elsewhere. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>>>>>> Solaris compiler complains about doing a return from >>>>>>>>> inside a do-while loop. I'll have to rework part of the fix >>>>>>>>> tomorrow. >>>>>>>>> >>>>>>>>> David >>>>>>>>> >>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>>>>>> >>>>>>>>>> Problem: >>>>>>>>>> >>>>>>>>>> The tests create native threads that attach to the VM through >>>>>>>>>> JNI_AttachCurrentThread but which then terminate without >>>>>>>>>> detaching themselves. When the VM exits and we're using >>>>>>>>>> Flight Recorder "dumponexit" this leads to a call to >>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread >>>>>>>>>> CPU usage. When we encounter the threads that have terminated >>>>>>>>>> already the low level pthread_getcpuclockid calls returns >>>>>>>>>> ESRCH but the code doesn't expect that and so fails an assert >>>>>>>>>> in debug mode and can SEGV in product mode. >>>>>>>>>> >>>>>>>>>> Solution: >>>>>>>>>> >>>>>>>>>> Serviceability-side: fix the tests >>>>>>>>>> >>>>>>>>>> Change the tests so that the threads detach before >>>>>>>>>> terminating. The two tests are (surprisingly) written in >>>>>>>>>> completely different styles, so the solution also takes on >>>>>>>>>> two different styles. 
>>>>>>>>>> >>>>>>>>>> Runtime-side: make the VM more robust in the fact of JNI >>>>>>>>>> attached threads that terminate before detaching, and add a >>>>>>>>>> regression test >>>>>>>>>> >>>>>>>>>> I took a good look at the low-level code for interacting with >>>>>>>>>> arbitrary threads and as far as I can see the problem only >>>>>>>>>> exists for this one case of pthread_getcpuclockid on Linux. >>>>>>>>>> Elsewhere the potential for a library call failure just >>>>>>>>>> reports an error value (such as -1 for the cpu time used). >>>>>>>>>> >>>>>>>>>> So the fix is simply to allow for ESRCH when calling >>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that >>>>>>>>>> case. >>>>>>>>>> >>>>>>>>>> I created a new regression test to create a new native >>>>>>>>>> thread, attach it and then let it terminate while still >>>>>>>>>> attached. The java code then calls various Thread and >>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes >>>>>>>>>> or unexpected exceptions. 
>>>>>>>>>> >>>>>>>>>> Testing: >>>>>>>>>> - old tests with fixed run-time >>>>>>>>>> - old run-time with fixed tests >>>>>>>>>> - mach tier4 (which exposed the problem - that's where we >>>>>>>>>> enable Flight recorder for the tests) [in progress] >>>>>>>>>> - mach5 tier 1-3 for good measure [in progress] >>>>>>>>>> - new regression test >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>> >>>>>>>> >>>>>>>> >>>> >>>> >> >> From david.holmes at oracle.com Tue Jul 10 02:35:41 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 10 Jul 2018 12:35:41 +1000 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: <416fc226-e389-df65-9487-736efc9e7528@oracle.com> References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com> <416fc226-e389-df65-9487-736efc9e7528@oracle.com> Message-ID: <864787be-ca72-2422-0399-716a1abe7d27@oracle.com> On 10/07/2018 12:07 PM, serguei.spitsyn at oracle.com wrote: > Hi David, > > It looks good modulo the minor comments that others have already found. Thanks for taking a look. > Could I ask you to fix a couple of really minor issues in new test? > > Unneeded spaces are at lines 84 and 51 in .java and .c files: > > 83 if (mbean.isThreadCpuTimeSupported() && > 84 mbean.isThreadCpuTimeEnabled() ) { > . . . > > 51 class_id = (*env)->FindClass (env, "java/lang/Thread"); Sorry Serguei, too late. David > Thanks, > Serguei > > > On 7/9/18 15:17, David Holmes wrote: >> Thanks Chris! >> >> Can I please get a second review. 
>> >> David >> >> On 10/07/2018 7:50 AM, Chris Plummer wrote: >>> On 7/9/18 2:41 PM, David Holmes wrote: >>>> Hi Chris, >>>> >>>> On 10/07/2018 4:22 AM, Chris Plummer wrote: >>>>> Hi David, >>>>> >>>>> Would it be better to problem list this test on solaris using >>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off the >>>>> problem list and start executing on solaris. >>>> >>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could >>>> only fix this for VM created threads. The general problem of TLS >>>> destructors looping if a thread terminates without detaching from >>>> the VM is not solvable - other than by not using TLS in the VM. >>> Ok, I misunderstood your comments in the test. >>> >>> Changes look fine. >>> >>> Chris >>>> >>>> Thanks, >>>> David >>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 7/8/18 4:58 PM, David Holmes wrote: >>>>>> tl;dr skip the new regression test on Solaris >>>>>> >>>>>> New webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ >>>>>> >>>>>> This excludes the test from running on Solaris, so the makefile >>>>>> doesn't bother compiling this native test and the Java part of the >>>>>> test adds: >>>>>> >>>>>> ! * @requires os.family != "windows" & os.family != "solaris" >>>>>> ? * @summary Basic test of Thread and ThreadMXBean queries on a >>>>>> natively >>>>>> ? *????????? attached thread that has failed to detach before >>>>>> terminating. >>>>>> + * @comment The native code only supports POSIX so no windows >>>>>> testing; also >>>>>> + *????????? we have to skip solaris as a terminating thread that >>>>>> fails to >>>>>> + *????????? detach will hit an infinite loop due to TLS >>>>>> destructor issues - see >>>>>> + *????????? comments in JDK-8156708 >>>>>> >>>>>> Note this means that Solaris is not affected by the original issue >>>>>> because a still-attached native thread can't actually terminate >>>>>> due to the TLS destructor infinite-loop issue. 
>>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> On 6/07/2018 6:07 PM, David Holmes wrote: >>>>>>> The new test is hanging on Solaris. I just discovered we >>>>>>> don't run these tests on Solaris until tier4. >>>>>>> >>>>>>> David >>>>>>> >>>>>>> On 6/07/2018 8:40 AM, David Holmes wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> Thanks for looking at this. >>>>>>>> >>>>>>>> Updated webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>>>>>>> >>>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>>>>>>> >>>>>>>> More below ... >>>>>>>> >>>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>>>>>>> Hi David, >>>>>>>>> >>>>>>>>> Solaris problems aside, overall it looks fine. Some minor >>>>>>>>> things I noted: >>>>>>>>> >>>>>>>>> I noticed that exitCode is never modified in agentA() or >>>>>>>>> agentB(), so there isn't much point to having it. If you reach >>>>>>>>> the bottom of the function, it passed, so PASSED can be >>>>>>>>> returned. The code would be more clear if it did this. As-is it >>>>>>>>> is implied that you can reach the bottom when it fails. >>>>>>>> >>>>>>>> I resisted any and all urges to do any kind of unrelated code >>>>>>>> cleanup in the tests - once you start you may end up doing a >>>>>>>> full rewrite. >>>>>>>> >>>>>>>>> Is detaching the threads along the failure paths really needed? >>>>>>>>> exit() is called, so this would seem to make it unnecessary. >>>>>>>> >>>>>>>> You're right that isn't necessary. I'll remove the changes from >>>>>>>> before the exits in ji05t001.c >>>>>>>> >>>>>>>>> I prefer assignments not to be embedded inside the "if" >>>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is >>>>>>>>> much more readable than the similar code in agentA() and agentB(). >>>>>>>> >>>>>>>> It's an existing style already used in that test e.g. >>>>>>>> >>>>>>>> ??287???? if ((res = >>>>>>>> ??288 JNI_ENV_PTR(vm)->AttachCurrentThread( >>>>>>>> ??289???????????????? 
JNI_ENV_ARG(vm, (void **) &env), (void *) >>>>>>>> 0)) != 0) { >>>>>>>> >>>>>>>> and I don't mind it, so I'd prefer not to change it. >>>>>>>> >>>>>>>>> In the test: >>>>>>>>> >>>>>>>>> ?? 54???????? // Generally as long as we don't crash of throw >>>>>>>>> unexpected >>>>>>>>> ?? 55???????? // exceptions then the test passes. In some cases >>>>>>>>> we know exactly >>>>>>>>> >>>>>>>>> "of" should be "or". >>>>>>>> >>>>>>>> Well spotted. Thanks. >>>>>>>> >>>>>>>>> Shouldn't you be catching exceptions for all the Thread methods >>>>>>>>> you are calling? Otherwise the test will exit if one is thrown, >>>>>>>>> and the above comment indicates that you don't want this. >>>>>>>> >>>>>>>> I'm not expecting there to be any exceptions from any of the >>>>>>>> called methods. That would potentially indicate a problem in >>>>>>>> handling the terminated native thread, so would indicate a test >>>>>>>> failure. >>>>>>>> >>>>>>>>> Don't we normally put these tests in a package? >>>>>>>> >>>>>>>> Doesn't seem to be any hard and fast rule. I only uses packages >>>>>>>> when they are important for the test. In runtime we have 905 >>>>>>>> java files and only 116 have a package statement. It varies >>>>>>>> elsewhere. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>>>>>>> Solaris compiler complains about doing a return from >>>>>>>>>> inside a do-while loop. I'll have to rework part of the fix >>>>>>>>>> tomorrow. >>>>>>>>>> >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>>>>>>> >>>>>>>>>>> Problem: >>>>>>>>>>> >>>>>>>>>>> The tests create native threads that attach to the VM through >>>>>>>>>>> JNI_AttachCurrentThread but which then terminate without >>>>>>>>>>> detaching themselves. 
When the VM exits and we're using >>>>>>>>>>> Flight Recorder "dumponexit" this leads to a call to >>>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread >>>>>>>>>>> CPU usage. When we encounter the threads that have terminated >>>>>>>>>>> already the low level pthread_getcpuclockid calls returns >>>>>>>>>>> ESRCH but the code doesn't expect that and so fails an assert >>>>>>>>>>> in debug mode and can SEGV in product mode. >>>>>>>>>>> >>>>>>>>>>> Solution: >>>>>>>>>>> >>>>>>>>>>> Serviceability-side: fix the tests >>>>>>>>>>> >>>>>>>>>>> Change the tests so that the threads detach before >>>>>>>>>>> terminating. The two tests are (surprisingly) written in >>>>>>>>>>> completely different styles, so the solution also takes on >>>>>>>>>>> two different styles. >>>>>>>>>>> >>>>>>>>>>> Runtime-side: make the VM more robust in the fact of JNI >>>>>>>>>>> attached threads that terminate before detaching, and add a >>>>>>>>>>> regression test >>>>>>>>>>> >>>>>>>>>>> I took a good look at the low-level code for interacting with >>>>>>>>>>> arbitrary threads and as far as I can see the problem only >>>>>>>>>>> exists for this one case of pthread_getcpuclockid on Linux. >>>>>>>>>>> Elsewhere the potential for a library call failure just >>>>>>>>>>> reports an error value (such as -1 for the cpu time used). >>>>>>>>>>> >>>>>>>>>>> So the fix is simply to allow for ESRCH when calling >>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that >>>>>>>>>>> case. >>>>>>>>>>> >>>>>>>>>>> I created a new regression test to create a new native >>>>>>>>>>> thread, attach it and then let it terminate while still >>>>>>>>>>> attached. The java code then calls various Thread and >>>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes >>>>>>>>>>> or unexpected exceptions. 
>>>>>>>>>>>
>>>>>>>>>>> Testing:
>>>>>>>>>>>   - old tests with fixed run-time
>>>>>>>>>>>   - old run-time with fixed tests
>>>>>>>>>>>   - mach tier4 (which exposed the problem - that's where we
>>>>>>>>>>> enable Flight recorder for the tests) [in progress]
>>>>>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>>>>>   - new regression test
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>
>>>>>
>>>
>>>
>

From serguei.spitsyn at oracle.com Tue Jul 10 02:42:57 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 9 Jul 2018 19:42:57 -0700
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code
In-Reply-To: <864787be-ca72-2422-0399-716a1abe7d27@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com> <416fc226-e389-df65-9487-736efc9e7528@oracle.com> <864787be-ca72-2422-0399-716a1abe7d27@oracle.com>
Message-ID: <5d5822c1-e5fb-f38f-f44d-26086a9ff3b8@oracle.com>

On 7/9/18 19:35, David Holmes wrote:
> On 10/07/2018 12:07 PM, serguei.spitsyn at oracle.com wrote:
>> Hi David,
>>
>> It looks good modulo the minor comments that others have already found.
>
> Thanks for taking a look.
>
>> Could I ask you to fix a couple of really minor issues in new test?
>>
>> Unneeded spaces are at lines 84 and 51 in .java and .c files:
>>
>>    83         if (mbean.isThreadCpuTimeSupported() &&
>>    84             mbean.isThreadCpuTimeEnabled() ) {
>>    . . .
>>
>>    51   class_id = (*env)->FindClass (env, "java/lang/Thread");
>
> Sorry Serguei, too late.

Not a problem, David.
Sorry for being late.

Thanks,
Serguei

> David
>
>> Thanks,
>> Serguei
>>
>>
>> On 7/9/18 15:17, David Holmes wrote:
>>> Thanks Chris!
>>>
>>> Can I please get a second review.
>>> >>> David >>> >>> On 10/07/2018 7:50 AM, Chris Plummer wrote: >>>> On 7/9/18 2:41 PM, David Holmes wrote: >>>>> Hi Chris, >>>>> >>>>> On 10/07/2018 4:22 AM, Chris Plummer wrote: >>>>>> Hi David, >>>>>> >>>>>> Would it be better to problem list this test on solaris using >>>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off >>>>>> the problem list and start executing on solaris. >>>>> >>>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could >>>>> only fix this for VM created threads. The general problem of TLS >>>>> destructors looping if a thread terminates without detaching from >>>>> the VM is not solvable - other than by not using TLS in the VM. >>>> Ok, I misunderstood your comments in the test. >>>> >>>> Changes look fine. >>>> >>>> Chris >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 7/8/18 4:58 PM, David Holmes wrote: >>>>>>> tl;dr skip the new regression test on Solaris >>>>>>> >>>>>>> New webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ >>>>>>> >>>>>>> This excludes the test from running on Solaris, so the makefile >>>>>>> doesn't bother compiling this native test and the Java part of >>>>>>> the test adds: >>>>>>> >>>>>>> ! * @requires os.family != "windows" & os.family != "solaris" >>>>>>> ? * @summary Basic test of Thread and ThreadMXBean queries on a >>>>>>> natively >>>>>>> ? *????????? attached thread that has failed to detach before >>>>>>> terminating. >>>>>>> + * @comment The native code only supports POSIX so no windows >>>>>>> testing; also >>>>>>> + *????????? we have to skip solaris as a terminating thread >>>>>>> that fails to >>>>>>> + *????????? detach will hit an infinite loop due to TLS >>>>>>> destructor issues - see >>>>>>> + *????????? 
comments in JDK-8156708 >>>>>>> >>>>>>> Note this means that Solaris is not affected by the original >>>>>>> issue because a still-attached native thread can't actually >>>>>>> terminate due to the TLS destructor infinite-loop issue. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> On 6/07/2018 6:07 PM, David Holmes wrote: >>>>>>>> The new test is hanging on Solaris. I just discovered we >>>>>>>> don't run these tests on Solaris until tier4. >>>>>>>> >>>>>>>> David >>>>>>>> >>>>>>>> On 6/07/2018 8:40 AM, David Holmes wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> Thanks for looking at this. >>>>>>>>> >>>>>>>>> Updated webrev: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>>>>>>>> >>>>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>>>>>>>> >>>>>>>>> More below ... >>>>>>>>> >>>>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>> Solaris problems aside, overall it looks fine. Some minor >>>>>>>>>> things I noted: >>>>>>>>>> >>>>>>>>>> I noticed that exitCode is never modified in agentA() or >>>>>>>>>> agentB(), so there isn't much point to having it. If you >>>>>>>>>> reach the bottom of the function, it passed, so PASSED can be >>>>>>>>>> returned. The code would be more clear if it did this. As-is >>>>>>>>>> it is implied that you can reach the bottom when it fails. >>>>>>>>> >>>>>>>>> I resisted any and all urges to do any kind of unrelated code >>>>>>>>> cleanup in the tests - once you start you may end up doing a >>>>>>>>> full rewrite. >>>>>>>>> >>>>>>>>>> Is detaching the threads along the failure paths really >>>>>>>>>> needed? exit() is called, so this would seem to make it >>>>>>>>>> unnecessary. >>>>>>>>> >>>>>>>>> You're right that isn't necessary. I'll remove the changes >>>>>>>>> from before the exits in ji05t001.c >>>>>>>>> >>>>>>>>>> I prefer assignments not to be embedded inside the "if" >>>>>>>>>> condition. 
The DetachCurrentThread code in THREAD_return() is >>>>>>>>>> much more readable than the similar code in agentA() and >>>>>>>>>> agentB(). >>>>>>>>> >>>>>>>>> It's an existing style already used in that test e.g. >>>>>>>>> >>>>>>>>> ??287???? if ((res = >>>>>>>>> ??288 JNI_ENV_PTR(vm)->AttachCurrentThread( >>>>>>>>> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void >>>>>>>>> *) 0)) != 0) { >>>>>>>>> >>>>>>>>> and I don't mind it, so I'd prefer not to change it. >>>>>>>>> >>>>>>>>>> In the test: >>>>>>>>>> >>>>>>>>>> ?? 54???????? // Generally as long as we don't crash of throw >>>>>>>>>> unexpected >>>>>>>>>> ?? 55???????? // exceptions then the test passes. In some >>>>>>>>>> cases we know exactly >>>>>>>>>> >>>>>>>>>> "of" should be "or". >>>>>>>>> >>>>>>>>> Well spotted. Thanks. >>>>>>>>> >>>>>>>>>> Shouldn't you be catching exceptions for all the Thread >>>>>>>>>> methods you are calling? Otherwise the test will exit if one >>>>>>>>>> is thrown, and the above comment indicates that you don't >>>>>>>>>> want this. >>>>>>>>> >>>>>>>>> I'm not expecting there to be any exceptions from any of the >>>>>>>>> called methods. That would potentially indicate a problem in >>>>>>>>> handling the terminated native thread, so would indicate a >>>>>>>>> test failure. >>>>>>>>> >>>>>>>>>> Don't we normally put these tests in a package? >>>>>>>>> >>>>>>>>> Doesn't seem to be any hard and fast rule. I only uses >>>>>>>>> packages when they are important for the test. In runtime we >>>>>>>>> have 905 java files and only 116 have a package statement. It >>>>>>>>> varies elsewhere. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>>>>>>>> Solaris compiler complains about doing a return from >>>>>>>>>>> inside a do-while loop. I'll have to rework part of the fix >>>>>>>>>>> tomorrow. 
>>>>>>>>>>> >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>>>>>>>> >>>>>>>>>>>> Problem: >>>>>>>>>>>> >>>>>>>>>>>> The tests create native threads that attach to the VM >>>>>>>>>>>> through JNI_AttachCurrentThread but which then terminate >>>>>>>>>>>> without detaching themselves. When the VM exits and we're >>>>>>>>>>>> using Flight Recorder "dumponexit" this leads to a call to >>>>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread >>>>>>>>>>>> CPU usage. When we encounter the threads that have >>>>>>>>>>>> terminated already the low level pthread_getcpuclockid >>>>>>>>>>>> calls returns ESRCH but the code doesn't expect that and so >>>>>>>>>>>> fails an assert in debug mode and can SEGV in product mode. >>>>>>>>>>>> >>>>>>>>>>>> Solution: >>>>>>>>>>>> >>>>>>>>>>>> Serviceability-side: fix the tests >>>>>>>>>>>> >>>>>>>>>>>> Change the tests so that the threads detach before >>>>>>>>>>>> terminating. The two tests are (surprisingly) written in >>>>>>>>>>>> completely different styles, so the solution also takes on >>>>>>>>>>>> two different styles. >>>>>>>>>>>> >>>>>>>>>>>> Runtime-side: make the VM more robust in the fact of JNI >>>>>>>>>>>> attached threads that terminate before detaching, and add a >>>>>>>>>>>> regression test >>>>>>>>>>>> >>>>>>>>>>>> I took a good look at the low-level code for interacting >>>>>>>>>>>> with arbitrary threads and as far as I can see the problem >>>>>>>>>>>> only exists for this one case of pthread_getcpuclockid on >>>>>>>>>>>> Linux. Elsewhere the potential for a library call failure >>>>>>>>>>>> just reports an error value (such as -1 for the cpu time >>>>>>>>>>>> used). 
>>>>>>>>>>>> >>>>>>>>>>>> So the fix is simply to allow for ESRCH when calling >>>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in >>>>>>>>>>>> that case. >>>>>>>>>>>> >>>>>>>>>>>> I created a new regression test to create a new native >>>>>>>>>>>> thread, attach it and then let it terminate while still >>>>>>>>>>>> attached. The java code then calls various Thread and >>>>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes >>>>>>>>>>>> or unexpected exceptions. >>>>>>>>>>>> >>>>>>>>>>>> Testing: >>>>>>>>>>>> ??- old tests with fixed run-time >>>>>>>>>>>> ??- old run-time with fixed tests >>>>>>>>>>>> ??- mach tier4 (which exposed the problem - that's where we >>>>>>>>>>>> enable Flight recorder for the tests) [in progress] >>>>>>>>>>>> ??- mach5 tier 1-3 for good measure [in progress] >>>>>>>>>>>> ??- new regression test >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>> >>>>>> >>>> >>>> >> From martinrb at google.com Tue Jul 10 02:53:48 2018 From: martinrb at google.com (Martin Buchholz) Date: Mon, 9 Jul 2018 19:53:48 -0700 Subject: What to do: clang-4.0 fastdebug assertion failure in os_linux_x86:os::verify_stack_alignment() Message-ID: There's only one remaining problem building latest jdk with latest clang on Linux preventing it from working out of the box. It seems likely macosx has the same problem. https://bugs.openjdk.java.net/browse/JDK-8186780 clang-4.0 fastdebug assertion failure in os_linux_x86:os::verify_stack_alignment() Verifying stack alignment seems rather fragile, especially in the presence of inlining. There are various things we can do: - making os::verify_stack_alignment NOINLINE and/or moving os::verify_stack_alignment to its own translation unit. - simply disabling the stack alignment check for clang - I don't see any reason why esp should be aligned even if stack frames are. (Maybe ebp is better? 
I'm not an x86 assembly programmer) More principled seems invoking functions recursively and disabling inlining and checking that the difference between addresses of a local is a multiple of the alignment, but that will get complicated.
- why does stack alignment even matter? Isn't it the alignment of C++ objects on the stack that matters?

From martinrb at google.com Tue Jul 10 03:07:28 2018
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 9 Jul 2018 20:07:28 -0700
Subject: What to do: clang-4.0 fastdebug assertion failure in os_linux_x86:os::verify_stack_alignment()
In-Reply-To:
References:
Message-ID:

clang and gcc both have __builtin_frame_address(0); we could simply check the alignment of that.

On Mon, Jul 9, 2018 at 7:53 PM, Martin Buchholz wrote:
> There's only one remaining problem building latest jdk with latest clang
> on Linux preventing it from working out of the box. It seems likely macosx
> has the same problem.
>
> https://bugs.openjdk.java.net/browse/JDK-8186780
> clang-4.0 fastdebug assertion failure in os_linux_x86:os::verify_stack_
> alignment()
>
> Verifying stack alignment seems rather fragile, especially in the presence
> of inlining.
>
> There are various things we can do:
> - making os::verify_stack_alignment NOINLINE and/or moving
> os::verify_stack_alignment to its own translation unit.
> - simply disabling the stack alignment check for clang
> - I don't see any reason why esp should be aligned even if stack frames
> are. (Maybe ebp is better? I'm not an x86 assembly programmer) More
> principled seems invoking functions recursively and disabling inlining and
> checking that the difference between addresses of a local is a multiple of
> the alignment, but that will get complicated.
> - why does stack alignment even matter? Isn't it the alignment of C++
> objects on the stack that matters?
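
As a rough illustration of the `__builtin_frame_address(0)` suggestion, the sketch below checks the alignment of the current frame address. It is not the HotSpot `os::verify_stack_alignment()` code: the helper name `frame_misalignment` is an assumption made for this example, and what the frame address points at (and how strongly it is aligned) depends on the ABI and compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of HotSpot): returns how many bytes the
 * current frame address lies past the previous multiple of `alignment`.
 * A result of 0 means the frame address is aligned.
 * `alignment` must be a non-zero power of two.
 */
__attribute__((noinline))
static size_t frame_misalignment(size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    /* GCC/clang builtin; level 0 asks for this function's own frame. */
    uintptr_t frame = (uintptr_t)__builtin_frame_address(0);
    return (size_t)(frame & (alignment - 1));
}
```

A caller on a platform whose ABI promises 16-byte aligned frames could assert that `frame_misalignment(16)` is 0. Keeping the helper out of line is one way to tie the check to a real frame rather than to whatever the inliner produces, which is the fragility raised in the message above.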
>

From coleen.phillimore at oracle.com Tue Jul 10 03:26:28 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 9 Jul 2018 23:26:28 -0400
Subject: RFR (M) 8198720: Obsolete PrintSafepointStatistics, PrintSafepointStatisticsTimeout and PrintSafepointStatisticsCount options
In-Reply-To:
References: <9349e320-e39d-c5ee-5ebb-b93305fc03f5@oracle.com> <2a50a090-36df-433b-aa4a-6a7087a8e589@redhat.com> <05f84226-0825-896f-c1c3-a89f85338159@oracle.com> <1826f57f-fc8c-86b3-b3fa-65a1c81a9eff@redhat.com>
Message-ID:

Hi Aleksey,

I rewrote the logging to use UL and to keep the old format: see

http://cr.openjdk.java.net/~coleenp/gc.log

It does shift when the time in the logging adds another digit. I don't know how to fix that.

Does this look ok otherwise?

thanks,
Coleen

On 7/9/18 5:42 PM, coleen.phillimore at oracle.com wrote:
>
>
> On 7/9/18 4:08 PM, Aleksey Shipilev wrote:
>> Thank you!
>>
>> Most latency-savvy folks "out there" run with some sort of
>> safepointing profiling, which in many
>> cases include PrintSafepointStatistics tables.
>
> That was the original reason I was looking at this logging. I think
> the trouble with the times is that they are ms and mostly zero. I
> wonder if MILLIUNITS would be better for these times:
>
>             (int64_t)(sstats->_time_to_spin / MICROUNITS),
>             (int64_t)(sstats->_time_to_wait_to_block / MICROUNITS),
>             (int64_t)(sstats->_time_to_sync / MICROUNITS),
>             (int64_t)(sstats->_time_to_do_cleanups / MICROUNITS),
>             (int64_t)(sstats->_time_to_exec_vmop / MICROUNITS));   <=
> this has nonzero values for GC pauses
>
> What do you think?
>
> thanks,
> Coleen
>>
>> -Aleksey
>>
>> On 07/09/2018 08:35 PM, coleen.phillimore at oracle.com wrote:
>>> Okay, somehow the columns of numbers didn't look very useful on my
>>> screen to me, and I wanted to
>>> convert this to UL (and straighten out the logic), so that's why I
>>> made this change. I asked
>>> around internally to see which people would care about the format
>>> change and didn't find anyone
>>> specific. Now I know!
>>>
>>> Let me rework this to use UL but keep the table.
>>>
>>> I'll withdraw this change for now.
>>>
>>> Thank you for the quick feedback.
>>> Coleen
>>>
>>> On 7/9/18 1:58 PM, Aleksey Shipilev wrote:
>>>> On 07/09/2018 07:48 PM, coleen.phillimore at oracle.com wrote:
>>>>> Summary: Convert PrintSafepointStatistics to UL
>>>>>
>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8198720.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8198720
>>>> The synopsis is misleading: it is not only obsoleting
>>>> PrintSafepoint* options, it also reformats the
>>>> output!
>>>>
>>>> We did JDK-8180482 not that long ago, and the reason was that both
>>>> people and machine tools are
>>>> accustomed to the particular non-noisy format for that table. I am
>>>> not at all convinced that
>>>> proposed format [2] is better than current version [3]. Can we keep
>>>> (at least some resemblance of)
>>>> the old format, please?
>>>>
>>>> -Aleksey
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8180482
>>>> [2]
>>>> https://bugs.openjdk.java.net/secure/attachment/75330/out.safepoint-logging
>>>> [3] http://cr.openjdk.java.net/~shade/8180482/after.txt
>>>>
>>
>

From david.holmes at oracle.com Tue Jul 10 07:00:09 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Jul 2018 17:00:09 +1000
Subject: URGENT trivial XS RFR: 8206954: Test runtime/Thread/ThreadPriorities.java crashes with SEGV in pthread_getcpuclockid
Message-ID:

Bug: https://bugs.openjdk.java.net/browse/JDK-8206954
webrev: http://cr.openjdk.java.net/~dholmes/8206954/webrev/

The new test I added for JDK-8205878 needs to run in othervm mode, otherwise the terminated-but-still-attached JavaThread will be encountered by the ThreadPriorities test that issues a jstack command and tries to print all the thread information.
When the bad thread is fresh this would result in ESRCH from pthread_getcpuclockid, but later the pthread_t is referring to unmapped memory and so we just get a SEGV.

--- old/test/hotspot/jtreg/runtime/jni/terminatedThread/TestTerminatedThread.java 2018-07-10 02:47:54.031499137 -0400
+++ new/test/hotspot/jtreg/runtime/jni/terminatedThread/TestTerminatedThread.java 2018-07-10 02:47:52.471409012 -0400
@@ -24,7 +24,7 @@

 /*
  * @test
- * @bug 8205878
+ * @bug 8205878 8206954
  * @requires os.family != "windows" & os.family != "solaris"
  * @summary Basic test of Thread and ThreadMXBean queries on a natively
  *          attached thread that has failed to detach before terminating.
@@ -32,7 +32,7 @@
  *          we have to skip solaris as a terminating thread that fails to
  *          detach will hit an infinite loop due to TLS destructor issues - see
  *          comments in JDK-8156708
- * @run main/native TestTerminatedThread
+ * @run main/othervm/native TestTerminatedThread
  */

 public class TestTerminatedThread {

---

Thanks,
David

From mikael.vidstedt at oracle.com Tue Jul 10 07:02:26 2018
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Tue, 10 Jul 2018 00:02:26 -0700
Subject: URGENT trivial XS RFR: 8206954: Test runtime/Thread/ThreadPriorities.java crashes with SEGV in pthread_getcpuclockid
In-Reply-To:
References:
Message-ID: <68953202-A509-450F-B247-DDD736D92594@oracle.com>

Looks good, thanks for fixing!

Cheers,
Mikael

> On Jul 10, 2018, at 12:00 AM, David Holmes wrote:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8206954
> webrev: http://cr.openjdk.java.net/~dholmes/8206954/webrev/
>
> The new test I added for JDK-8205878 needs to run in othervm mode, otherwise the terminated-but-still-attached JavaThread will be encountered by the ThreadPriorities test that issues a jstack command and tries to print all the thread information. When the bad thread is fresh this would result in ESRCH from pthread_getcpuclockid, but later the pthread_t is referring to unmapped memory and so we just get a SEGV.
>
> --- old/test/hotspot/jtreg/runtime/jni/terminatedThread/TestTerminatedThread.java 2018-07-10 02:47:54.031499137 -0400
> +++ new/test/hotspot/jtreg/runtime/jni/terminatedThread/TestTerminatedThread.java 2018-07-10 02:47:52.471409012 -0400
> @@ -24,7 +24,7 @@
>
>  /*
>   * @test
> - * @bug 8205878
> + * @bug 8205878 8206954
>   * @requires os.family != "windows" & os.family != "solaris"
>   * @summary Basic test of Thread and ThreadMXBean queries on a natively
>   *          attached thread that has failed to detach before terminating.
> @@ -32,7 +32,7 @@
>   *          we have to skip solaris as a terminating thread that fails to
>   *          detach will hit an infinite loop due to TLS destructor issues - see
>   *          comments in JDK-8156708
> - * @run main/native TestTerminatedThread
> + * @run main/othervm/native TestTerminatedThread
>   */
>
>  public class TestTerminatedThread {
>
> ---
>
> Thanks,
> David

From Alan.Bateman at oracle.com Tue Jul 10 07:03:46 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 10 Jul 2018 08:03:46 +0100
Subject: URGENT trivial XS RFR: 8206954: Test runtime/Thread/ThreadPriorities.java crashes with SEGV in pthread_getcpuclockid
In-Reply-To:
References:
Message-ID: <1f7f73e5-9638-c312-8f31-f47e7aa42706@oracle.com>

On 10/07/2018 08:00, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8206954
> webrev: http://cr.openjdk.java.net/~dholmes/8206954/webrev/
>
> The new test I added for JDK-8205878 needs to run in othervm mode,
> otherwise the terminated-but-still-attached JavaThread will be
> encountered by the ThreadPriorities test that issues a jstack command
> and tries to print all the thread information. When the bad thread
> is fresh this would result in ESRCH from pthread_getcpuclockid, but later
> the pthread_t is referring to unmapped memory and so we just get a SEGV.

Changing the test to run in othervm mode looks okay to me.
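
For readers following the ESRCH discussion, here is a minimal sketch of the approach described for JDK-8205878: treat a failing `pthread_getcpuclockid` as "no CPU time available" and report -1 rather than asserting. This is not the HotSpot source; `thread_cpu_time_ns` is a hypothetical name chosen for the example. As the thread above points out, this only helps while the thread ID is still recognisable to the library; once the `pthread_t` refers to unmapped memory the call itself may crash, which is why the test was moved to othervm mode.

```c
#define _POSIX_C_SOURCE 200809L

#include <pthread.h>
#include <time.h>

/*
 * Hypothetical helper (not the HotSpot implementation): returns the CPU
 * time consumed by `thread` in nanoseconds, or -1 if it cannot be read.
 */
static long long thread_cpu_time_ns(pthread_t thread) {
    clockid_t clock_id;

    /* Returns an error number directly (for example ESRCH), not via errno. */
    int rc = pthread_getcpuclockid(thread, &clock_id);
    if (rc != 0) {
        return -1;  /* e.g. ESRCH: report "unknown" instead of asserting */
    }

    struct timespec ts;
    if (clock_gettime(clock_id, &ts) != 0) {
        return -1;
    }
    return (long long)ts.tv_sec * 1000000000LL + (long long)ts.tv_nsec;
}
```

Calling `thread_cpu_time_ns(pthread_self())` on a live thread should yield a non-negative value; callers are expected to treat -1 as "not available".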
-Alan

From david.holmes at oracle.com Tue Jul 10 07:08:35 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Jul 2018 17:08:35 +1000
Subject: URGENT trivial XS RFR: 8206954: Test runtime/Thread/ThreadPriorities.java crashes with SEGV in pthread_getcpuclockid
In-Reply-To: <1f7f73e5-9638-c312-8f31-f47e7aa42706@oracle.com>
References: <1f7f73e5-9638-c312-8f31-f47e7aa42706@oracle.com>
Message-ID: <926e18e0-880a-5dfb-af8e-e0db3de00f93@oracle.com>

Thanks Alan and Mikael!

The unexpected side-effects of AgentVM ;-)

David

On 10/07/2018 5:03 PM, Alan Bateman wrote:
> On 10/07/2018 08:00, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206954
>> webrev: http://cr.openjdk.java.net/~dholmes/8206954/webrev/
>>
>> The new test I added for JDK-8205878 needs to run in othervm mode,
>> otherwise the terminated-but-still-attached JavaThread will be
>> encountered by the ThreadPriorities test that issues a jstack command
>> and tries to print all the thread information. When the bad thread
>> is fresh this would result in ESRCH from pthread_getcpuclockid, but later
>> the pthread_t is referring to unmapped memory and so we just get a SEGV.
> Changing the test to run in othervm mode looks okay to me.
>
> -Alan

From gunter.haug at sap.com Tue Jul 10 10:32:16 2018
From: gunter.haug at sap.com (Haug, Gunter)
Date: Tue, 10 Jul 2018 10:32:16 +0000
Subject: RFR(S): 8206919: Add missing CPU/system info to vm_version_ext on s390
Message-ID: <2B2AFB03-2EA6-4418-B13B-CB025638EEC5@sap.com>

Hi all,

can I please have reviews and a sponsor for the following tiny fix:

https://bugs.openjdk.java.net/browse/JDK-8206919
http://cr.openjdk.java.net/~ghaug/webrevs/8206919/

The solution is the same as the one we have on ppc and it suffers from the same shortcoming: there is no obvious way to detect the number of cores/slots on an s390 system. Anyway, it would be better to have information on the virtualization of the system.
We do have a solution for that at SAP and we would be happy to adapt it to JFR and contribute it if there is any interest.

Thanks and best regards,
Gunter

From goetz.lindenmaier at sap.com Tue Jul 10 10:53:04 2018
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 10 Jul 2018 10:53:04 +0000
Subject: RFR(M): 8206977: Minor improvements of runtime code.
Message-ID: <571d727a270e47cb8230d8a88b58a2a1@sap.com>

Hi,

I ran coverity on the jdk11 hotspot sources and want to propose the
following fixes to the runtime code. I scanned the linux x86_64 build.
Some issues are similar to previous parfait fixes (check for NULL, add
guarantees etc.) I also identified some issues I consider real problems.
http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/

In detail:

Real issues:
------------

jvmtiEnvBase.cpp
|| should be &&.
Attention, this is the only change that really will change behaviour.
But if thr == NULL we will see a crash below.

perfMemory_linux.cpp:
Wrong buffer length used.

systemDictionary.cpp:
Move code dereferencing ik under if (ik != NULL).

virtualspace.cpp
Initialization is missing. Moved constructor up to the other
constructors.


Useful code improvements:
-------------------------

vm_version_ext_x86.cpp
Assure buffer is not accessed at offset -1.

os_linux.cpp
Numa_max_node returns int, and a -1 in some cases.

moduleEntry.cpp
name might be NULL. Just a fix for tracing.

systemDictionaryShared.cpp
clarify code.
It would be wrong if only entry == NULL would hold, one
would hit the assertion below.

verifier.cpp
Fix tracing.
Illegal opcode is -1 and should not be passed to name array.

logOutput.cpp
If n_selections == 0, best_selection would be NULL.
Move up the assertion and turn into a guarantee.

filemap.cpp
Either base can be NULL, or parts of the code before are dead.

metaspace.cpp
We know an exception is pending.

klassVtable.cpp
Coverity does not like the format in a variable.
Anyways this is quite rough coding, transformed to use stringStream as with other similar exceptions. jvmFlag.cpp match might be NULL. writableFlags.cpp name might be NULL. ostream.cpp If ftell returns error code -1, we need not continue. Especially we should not fseek(-1). logTestUtils.inline.hpp ftell returns -1. test_metachunk.cpp wrong datatype. From david.holmes at oracle.com Tue Jul 10 12:10:23 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 10 Jul 2018 22:10:23 +1000 Subject: RFR(M): 8206977: Minor improvements of runtime code. In-Reply-To: <571d727a270e47cb8230d8a88b58a2a1@sap.com> References: <571d727a270e47cb8230d8a88b58a2a1@sap.com> Message-ID: <864959d1-1e58-a91b-01ee-355178cae2db@oracle.com> Hi Goetz, On 10/07/2018 8:53 PM, Lindenmaier, Goetz wrote: > Hi, > > I ran coverity on the jdk11 hotspot sources and want to propose the > following fixes to the runtime code. I scanned the linux x86_64 build. > Some issues are similar to previous parfait fixes (check for NULL, add > guarantees etc.) I also identified some issues I consider real problems. > http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/ It will take a while to go through these. I see some false positives caused by too local an examination - which is typical of code checkers. For example in os_linux.cpp + if (buflen > 7) { we know the buffer coming in is O_BUFLEN in size. Add an assert if you like but no need for a hard-wired guard. I see some asserts changed to guarantees which is unnecessary in general - but again appeases static checkers looking at product builds. I also don't see this as a P3 bug, as there seems only 1 potential real bug there (you yourself call these "minor improvements"). So this seems unsuitable for JDK 11 now we are in RDP1. But fine for 12. Will try to go through in more detail tomorrow, but it is somewhat tedious to have to work through these in detail to refute the code checkers claims of incorrectness. 
Thanks, David ----- > In detail: > > Real issues: > ------------ > > jvmtiEnvBase.cpp > || should be &&. > Attention, this is the only change that really will change behaviour. > But if thr == NULL we will see a crash below. > > perfMemory_linux.cpp: > Wrong buffer length used. > > systemDictionary.cpp: > Move code dereferencing ik under if (ik != NULL). > > virtualspace.cpp > Initialization is missing. Moved constructor up to the other > constructors. > > > Useful code improvements: > ------------------------- > > vm_version_ext_x86.cpp > Assure buffer is not accessed at offset -1. > > os_linux.cpp > Numa_max_node returns int, and a -1 in some cases. > > moduleEntry.cpp > name might be NULL. Just a fix for tracing. > > systemDictionaryShared.cpp > clearify code. > It would be wrong if only entry == NULL would hold, one > would hit the assertion below. > > verifier.cpp > Fix tracing. > Illegal opcode is -1 and should not be passed to name array. > > logOutput.cpp > If n_selections == 0, best_selection would be NULL. > Move up the assertion and turn into a guarantee. > > filemap.cpp > Either base can be NULL, or parts of the code before are dead. > > metaspace.cpp > We now an exception is pending. > > klassVtable.cpp > Coverity does not like the format in a variable. > Anyways this is quite rough coding, transformed to use stringStream > as with other similar exceptions. > > jvmFlag.cpp > match might be NULL. > > writableFlags.cpp > name might be NULL. > > ostream.cpp > If ftell returns error code -1, we need not continue. > Especially we should not fseek(-1). > > logTestUtils.inline.hpp > ftell returns -1. > > test_metachunk.cpp > wrong datatype. > From goetz.lindenmaier at sap.com Tue Jul 10 12:32:41 2018 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 10 Jul 2018 12:32:41 +0000 Subject: RFR(M): 8206977: Minor improvements of runtime code. 
In-Reply-To: <864959d1-1e58-a91b-01ee-355178cae2db@oracle.com> References: <571d727a270e47cb8230d8a88b58a2a1@sap.com> <864959d1-1e58-a91b-01ee-355178cae2db@oracle.com> Message-ID: <0e37ae822a5845a1bee78f01f5325cd1@sap.com> Hi David, Take your time (within the RDP1 timeframe ??) to look at the issues themselves. Just for the basic comments on this: > I see some asserts changed to guarantees which is unnecessary in general > - but again appeases static checkers looking at product builds. This has been done to a large extend for parfait: http://hg.openjdk.java.net/jdk/jdk/search/?rev=parfait&revcount=40 > I also don't see this as a P3 bug, as there seems only 1 potential real > bug there (you yourself call these "minor improvements"). So this seems > unsuitable for JDK 11 now we are in RDP1. But fine for 12. When else should I do this? I can only do this when development is closed, else I have to re-run and do fixes again and again for incoming changes. We are required to run the checker and fix issues before releasing a VM. Best regards, Goetz. http://hg.openjdk.java.net/jdk/jdk/search/?rev=parfait&revcount=40 > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Dienstag, 10. Juli 2018 14:10 > To: Lindenmaier, Goetz ; hotspot-runtime- > dev at openjdk.java.net > Subject: Re: RFR(M): 8206977: Minor improvements of runtime code. > > Hi Goetz, > > On 10/07/2018 8:53 PM, Lindenmaier, Goetz wrote: > > Hi, > > > > I ran coverity on the jdk11 hotspot sources and want to propose the > > following fixes to the runtime code. I scanned the linux x86_64 build. > > Some issues are similar to previous parfait fixes (check for NULL, add > > guarantees etc.) I also identified some issues I consider real problems. > > http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/ > > It will take a while to go through these. I see some false positives > caused by too local an examination - which is typical of code checkers. 
> For example in os_linux.cpp > > + if (buflen > 7) { > > we know the buffer coming in is O_BUFLEN in size. Add an assert if you > like but no need for a hard-wired guard. > > I see some asserts changed to guarantees which is unnecessary in general > - but again appeases static checkers looking at product builds. > > I also don't see this as a P3 bug, as there seems only 1 potential real > bug there (you yourself call these "minor improvements"). So this seems > unsuitable for JDK 11 now we are in RDP1. But fine for 12. > > Will try to go through in more detail tomorrow, but it is somewhat > tedious to have to work through these in detail to refute the code > checkers claims of incorrectness. > > Thanks, > David > ----- > > > > In detail: > > > > Real issues: > > ------------ > > > > jvmtiEnvBase.cpp > > || should be &&. > > Attention, this is the only change that really will change behaviour. > > But if thr == NULL we will see a crash below. > > > > perfMemory_linux.cpp: > > Wrong buffer length used. > > > > systemDictionary.cpp: > > Move code dereferencing ik under if (ik != NULL). > > > > virtualspace.cpp > > Initialization is missing. Moved constructor up to the other > > constructors. > > > > > > Useful code improvements: > > ------------------------- > > > > vm_version_ext_x86.cpp > > Assure buffer is not accessed at offset -1. > > > > os_linux.cpp > > Numa_max_node returns int, and a -1 in some cases. > > > > moduleEntry.cpp > > name might be NULL. Just a fix for tracing. > > > > systemDictionaryShared.cpp > > clearify code. > > It would be wrong if only entry == NULL would hold, one > > would hit the assertion below. > > > > verifier.cpp > > Fix tracing. > > Illegal opcode is -1 and should not be passed to name array. > > > > logOutput.cpp > > If n_selections == 0, best_selection would be NULL. > > Move up the assertion and turn into a guarantee. > > > > filemap.cpp > > Either base can be NULL, or parts of the code before are dead. 
> > > > metaspace.cpp > > We now an exception is pending. > > > > klassVtable.cpp > > Coverity does not like the format in a variable. > > Anyways this is quite rough coding, transformed to use stringStream > > as with other similar exceptions. > > > > jvmFlag.cpp > > match might be NULL. > > > > writableFlags.cpp > > name might be NULL. > > > > ostream.cpp > > If ftell returns error code -1, we need not continue. > > Especially we should not fseek(-1). > > > > logTestUtils.inline.hpp > > ftell returns -1. > > > > test_metachunk.cpp > > wrong datatype. > > From coleen.phillimore at oracle.com Tue Jul 10 13:03:18 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 10 Jul 2018 09:03:18 -0400 Subject: RFR(M): 8206977: Minor improvements of runtime code. In-Reply-To: <571d727a270e47cb8230d8a88b58a2a1@sap.com> References: <571d727a270e47cb8230d8a88b58a2a1@sap.com> Message-ID: <3d2d5c75-a219-9153-33e3-57a77bf88d92@oracle.com> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/src/hotspot/share/classfile/moduleEntry.cpp.udiff.html + name ? name->as_C_string() : ""); Can you change to: + name != NULL ? name->as_C_string() : ""); http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/src/hotspot/share/oops/klassVtable.cpp.udiff.html This looks a lot nicer!?? Similar code is in linkResolver.cpp, can you look at changing it too? http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/src/hotspot/share/services/writeableFlags.cpp.udiff.html If name is null here, what would this do?? Should there be an 'else' to print something? I think this looks fine.? It doesn't look major to me.? The asserts turned to guarantees don't appear to be anywhere performance sensitive. Thanks, Coleen On 7/10/18 6:53 AM, Lindenmaier, Goetz wrote: > Hi, > > I ran coverity on the jdk11 hotspot sources and want to propose the > following fixes to the runtime code. I scanned the linux x86_64 build. 
> Some issues are similar to previous parfait fixes (check for NULL, add > guarantees etc.) I also identified some issues I consider real problems. > http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/ > > In detail: > > Real issues: > ------------ > > jvmtiEnvBase.cpp > || should be &&. > Attention, this is the only change that really will change behaviour. > But if thr == NULL we will see a crash below. > > perfMemory_linux.cpp: > Wrong buffer length used. > > systemDictionary.cpp: > Move code dereferencing ik under if (ik != NULL). > > virtualspace.cpp > Initialization is missing. Moved constructor up to the other > constructors. > > > Useful code improvements: > ------------------------- > > vm_version_ext_x86.cpp > Assure buffer is not accessed at offset -1. > > os_linux.cpp > Numa_max_node returns int, and a -1 in some cases. > > moduleEntry.cpp > name might be NULL. Just a fix for tracing. > > systemDictionaryShared.cpp > clearify code. > It would be wrong if only entry == NULL would hold, one > would hit the assertion below. > > verifier.cpp > Fix tracing. > Illegal opcode is -1 and should not be passed to name array. > > logOutput.cpp > If n_selections == 0, best_selection would be NULL. > Move up the assertion and turn into a guarantee. > > filemap.cpp > Either base can be NULL, or parts of the code before are dead. > > metaspace.cpp > We now an exception is pending. > > klassVtable.cpp > Coverity does not like the format in a variable. > Anyways this is quite rough coding, transformed to use stringStream > as with other similar exceptions. > > jvmFlag.cpp > match might be NULL. > > writableFlags.cpp > name might be NULL. > > ostream.cpp > If ftell returns error code -1, we need not continue. > Especially we should not fseek(-1). > > logTestUtils.inline.hpp > ftell returns -1. > > test_metachunk.cpp > wrong datatype. 
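
The ostream.cpp item in the list above (stop when ftell returns -1, and never pass that value on to fseek) can be sketched in isolation. This is a minimal illustration using plain C stdio, not the HotSpot change from the webrev; `stream_size` and `size_of_temp_file_with` are hypothetical helper names introduced only for this example.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper, not HotSpot code: returns the size of an open
// stream in bytes, or -1 if the position cannot be determined.
long stream_size(std::FILE* f) {
  if (f == nullptr) {
    return -1;
  }
  long saved = std::ftell(f);
  if (saved < 0) {
    // ftell failed; bail out instead of handing -1 to fseek later.
    return -1;
  }
  if (std::fseek(f, 0, SEEK_END) != 0) {
    return -1;
  }
  long size = std::ftell(f);
  // Restore the caller's position; 'saved' is known to be valid here.
  if (std::fseek(f, saved, SEEK_SET) != 0) {
    return -1;
  }
  return size;
}

// Hypothetical demo helper: writes 'text' to a temporary file and
// reports the size seen by stream_size(), or -1 on any failure.
long size_of_temp_file_with(const char* text) {
  std::FILE* f = std::tmpfile();
  if (f == nullptr) {
    return -1;
  }
  long result = -1;
  if (std::fputs(text, f) >= 0) {
    result = stream_size(f);
  }
  std::fclose(f);
  return result;
}
```

The point of the early return is that the saved position is only ever used for the restoring fseek once it is known to be non-negative, which is the property the static checker was asking for.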
From volker.simonis at gmail.com Tue Jul 10 13:23:15 2018 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 10 Jul 2018 15:23:15 +0200 Subject: RFR(S): 8206919: Add missing CPU/system info to vm_version_ext on s390 In-Reply-To: <2B2AFB03-2EA6-4418-B13B-CB025638EEC5@sap.com> References: <2B2AFB03-2EA6-4418-B13B-CB025638EEC5@sap.com> Message-ID: Hi Gunter, looks good! Can you please just update the copyrights for vm_version_s390.{hpp,cpp} (no need for a new webrev). Thank you and best regards, Volker On Tue, Jul 10, 2018 at 12:32 PM, Haug, Gunter wrote: > Hi all, > > can I please have reviews and a sponsor for the following tiny fix: > > https://bugs.openjdk.java.net/browse/JDK-8206919 > http://cr.openjdk.java.net/~ghaug/webrevs/8206919/ > > The solution is the same as the one we have on ppc and it suffers from the same shortcoming: there is no obvious way to detect the number of cores/slots on a s390 system. Anyway, it would be better to have information on the virtualization of the system. We do have a solution for that at SAP and we would be happy to adopt it to JFR and contribute it if there is any interest. > > Thanks and best regards, > Gunter > > > From martin.doerr at sap.com Tue Jul 10 15:04:43 2018 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 10 Jul 2018 15:04:43 +0000 Subject: RFR(S): 8206919: Add missing CPU/system info to vm_version_ext on s390 In-Reply-To: References: <2B2AFB03-2EA6-4418-B13B-CB025638EEC5@sap.com> Message-ID: <199b0efd294e4eebbae2980aac1111af@sap.com> Hi Gunter, looks good to me, too. Thanks, Martin -----Original Message----- From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Volker Simonis Sent: Dienstag, 10. Juli 2018 15:23 To: Haug, Gunter Cc: hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR(S): 8206919: Add missing CPU/system info to vm_version_ext on s390 Hi Gunter, looks good! 
Can you please just update the copyrights for vm_version_s390.{hpp,cpp} (no need for a new webrev).

Thank you and best regards,
Volker

On Tue, Jul 10, 2018 at 12:32 PM, Haug, Gunter wrote:
> Hi all,
>
> can I please have reviews and a sponsor for the following tiny fix:
>
> https://bugs.openjdk.java.net/browse/JDK-8206919
> http://cr.openjdk.java.net/~ghaug/webrevs/8206919/
>
> The solution is the same as the one we have on ppc and it suffers from the same shortcoming: there is no obvious way to detect the number of cores/slots on an s390 system. Anyway, it would be better to have information on the virtualization of the system. We do have a solution for that at SAP and we would be happy to adapt it to JFR and contribute it if there is any interest.
>
> Thanks and best regards,
> Gunter
>
>
>

From ioi.lam at oracle.com Tue Jul 10 17:16:10 2018
From: ioi.lam at oracle.com (Ioi Lam)
Date: Tue, 10 Jul 2018 10:16:10 -0700
Subject: Proposal for improving CDS archive creation
Message-ID:

I have a proposal for improving the process of creating the CDS
archive(s), so we can make CDS easier to use and support more use cases.

   - better support for custom loaders
   - remove explicit training run
   - support 2 levels of shared archives

I think the proposal is relatively straightforward to implement, as we
already have most of the required infrastructure:

   + the ability to use Java class loaders at archive creation time
   + the ability to relocate MetaspaceObjects

Parts of this proposal will also simplify the CDS code and make it more
maintainable.

Current process of creating the base archive - [C]
==================================================

Currently each JVM process can map at most one CDS archive. Let's call
this the "base archive". It is created by [ref1]:

 C1. Reserve a region R of 3GB at 0x800000000.
 C2. Load all classes specified in the class list. All data for these classes
     live outside of R.
     (E.g., the Klass objects are loaded into tmp_class_space, which is
      adjacent to R).
 C3. Copy the metadata of all archivable classes (e.g., excluding generated
     Lambda classes) into R. At this step, R is divided into several
     sections (RO, RW, etc).

  //  +-- SharedBaseAddress   (default = 0x800000000)
  //  +-- _narrow_klass._base
  //  |
  //  |                               +-tmp_class_space.base
  //  v                               V
  //  +----+----+----+----+----+-....-+-------------------+
  //  |<-           R               ->|
  //  | MC | RW | RO | MD | OD |unused| tmp_class_space   |
  //  +----+----+----+----+----+------+-------------------+
  //  |<--  3GB        -------------->|
  //  |<-- UnscaledClassSpaceMax = 4GB ------------------>|

New process for creating the base archive - [N]
===============================================

Currently we have a lot of "if (DumpSharedSpaces)" code for special-case
handling of the above scheme. We can improve it by

 N1. Remove all code for special memory layout initialization for -Xshare:dump.
     As a result, we will reserve a region R of 1GB at 0x800000000, which
     is used by Klass objects (this is the same as if -Xshare:off were
     specified.)
 N2. Load all classes in the class list.
 N3. Now R contains the Klass objects of all loaded classes.
     Allocate a temporary space T, and copy all contents of R into T.
 N4. Now R is empty. Copy the metadata of all archivable classes into R.

Dump-as-you-go for the base archive - [G]
=========================================

Note that the [N] scheme will work even if you're running an app with
-Xshare:off. At some point (e.g., when the VM is about to exit), you can:

 G1. Enter a safe point
 G2. Go to step [N3].

The benefit of [G] is you don't need a separate run to dump the archive,
and there's no need to use the class list. Instead, we can have an option
like:

   java -Xshare:autocreate -cp app.jar -XX:SharedArchiveFile=foo.jsa App

If foo.jsa is not available, we run in [G] mode. At VM exit, we dump into
foo.jsa.

This way, we don't need to have an explicit training run with
-XX:DumpLoadedClassList. Instead, the training run is

This also makes it easy to support the classes from custom loaders. There's
no need for special tooling to convert -Xlog:class+load=debug output into
a classlist. [ref2]

Dumping for second-level archive - [S]
======================================

 S1. Load the base archive
 S2. Run the app as normal
 S3. All Klass objects of the dynamically loaded classes will be loaded in
     the region R, which immediately follows the end of the base archive.

  //  +-- SharedBaseAddress
  //  |                          +--- dynamically loaded Klasses
  //  |                          |    start from here.
  //  v                          v
  //  +--------------------------+---------...-----------------|
  //  | base archive             | region R                    |
  //  +--------------------------+---------...-----------------|
  //  |<- size of base archive ->|
  //  |<--            1GB -->|

 S4. At some point (possibly when the VM is about to exit) we start
     dumping the second level archive
 S5. Enter safe point
 S6. Now R contains the Klass objects of all dynamically loaded classes.
     Allocate a temporary space T, and copy all contents of R into T.
 S7. Now R is empty. Copy the metadata of all archivable, dynamically loaded
     classes into R.
 S8. Create a new shared_dictionary (and shared_symbol_table) that contains
     all the Klasses (Symbols) from both the base and second-level archives.

References
==========

[ref1] Current initialization of memory space layout during -Xshare:dump
       http://hg.openjdk.java.net/jdk/jdk/file/e0028bb6dd3d/src/hotspot/share/memory/metaspaceShared.cpp#l250

[ref2] Volker Simonis's tool for supporting custom class loaders in CDS
       https://github.com/simonis/cl4cds

----------------------------------------------------------------------

Any thoughts?

Thanks
- Ioi

From lois.foltan at oracle.com Tue Jul 10 17:19:01 2018
From: lois.foltan at oracle.com (Lois Foltan)
Date: Tue, 10 Jul 2018 13:19:01 -0400
Subject: RFR (S) JDK-8178712: ResourceMark may be missing inside initialize_[vi]table
Message-ID: <3822eeeb-0a58-cdbc-46a3-ed6c02e365ab@oracle.com>

Please review this clean up change to correctly set ResourceMark from
within klassVtable::initialize_vtable() and
klassItable::initialize_itable() when applicable, instead of having
all instances of calls to these two methods establish a ResourceMark
unnecessarily prior to.

open webrev at http://cr.openjdk.java.net/~lfoltan/bug_jdk8178712/
bug link at https://bugs.openjdk.java.net/browse/JDK-8178712

Testing: hs-tier1-3, jdk-tier1-3 (complete)
         hs-tier4-5 (in progress)

Thanks,
Lois

From calvin.cheung at oracle.com Tue Jul 10 17:31:14 2018
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Tue, 10 Jul 2018 10:31:14 -0700
Subject: RFR(S): 8205946: JVM crash after call to ClassLoader::setup_bootstrap_search_path()
In-Reply-To: <5B43C8E8.2060206@oracle.com>
References: <5B439B8F.3020709@oracle.com> <5B43C8E8.2060206@oracle.com>
Message-ID: <5B44ED62.5030008@oracle.com>

Updated webrev with the changes mentioned below:
    http://cr.openjdk.java.net/~ccheung/8205946/webrev.01/

I've rerun hs-tier{1,2,3} tests.

thanks,
Calvin

On 7/9/18, 1:43 PM, Calvin Cheung wrote:
> Hi Lois,
>
> Thanks for your review.
>
> On 7/9/18, 11:58 AM, Lois Foltan wrote:
>> On 7/9/2018 1:29 PM, Calvin Cheung wrote:
>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8205946
>>>
>>> webrev: http://cr.openjdk.java.net/~ccheung/8205946/webrev.00/
>>>
>>> The JVM crash could be simulated by renaming/removing the modules
>>> file under the jdk/lib directory.
>>> The proposed simple fix is to perform a
>>> vm_exit_during_initialization().
>> >> Hi Calvin, >> >> Some clarifying questions. Is this just an issue for exploded builds? > I don't think so. As mentioned above, I could reproduce the crash with > a regular jdk image build by renaming the modules file under the > jdk/lib directory. >> I would prefer the exit to occur if the os::stat() fails for the >> system class path in os::set_boot_path(). > Instead of exiting in os::set_boot_path(), how about checking the > return status of os::set_boot_path() in the caller and exiting there > like the following: > bash-4.2$ hg diff os_linux.cpp > diff --git a/src/hotspot/os/linux/os_linux.cpp > b/src/hotspot/os/linux/os_linux.cpp > --- a/src/hotspot/os/linux/os_linux.cpp > +++ b/src/hotspot/os/linux/os_linux.cpp > @@ -367,7 +367,9 @@ > } > } > Arguments::set_java_home(buf); > - set_boot_path('/', ':'); > + if (!set_boot_path('/', ':')) { > + vm_exit_during_initialization("Failed setting boot class > path.", NULL); > + } > } > > Note that before the above change, the return status of > set_boot_path() isn't checked. > The above would involve changing 5 of those os_*.cpp files, one for > each O/S. > >> With certainly an added assert later in >> ClassLoader::setup_bootstrap_search_path() to ensure that the system >> class path is never NULL. > Sure, I can add an assert there. > I'll post updated webrev once I've made the change and done testing. > > thanks, > Calvin >> >> Thanks, >> Lois >> >>> >>> Ran hs-tier{1,2,3} tests successfully including the new test case. 
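
The approach discussed above, returning a status from the boot-path setup and letting each platform's caller decide whether to exit, can be sketched as follows. This is a simplified stand-alone illustration for POSIX systems, not the code from the webrev: `find_boot_path`, `init_boot_path_or_report`, and the single `lib/modules` lookup are assumptions made for the example, and the real VM would call vm_exit_during_initialization() rather than return a message.

```cpp
#include <cassert>
#include <string>
#include <sys/stat.h>

// Hypothetical stand-in for os::set_boot_path(): looks for a modules
// image under java_home and reports success through the return value
// instead of silently leaving the boot path unset.
bool find_boot_path(const std::string& java_home, std::string* boot_path) {
  const std::string candidate = java_home + "/lib/modules";
  struct stat st;
  if (stat(candidate.c_str(), &st) != 0) {
    return false;  // nothing usable found; the caller decides what to do
  }
  *boot_path = candidate;
  return true;
}

// Hypothetical caller, playing the role of the per-platform init code.
// Returns an empty string on success, or the error text that the real
// VM would pass to vm_exit_during_initialization().
std::string init_boot_path_or_report(const std::string& java_home,
                                     std::string* boot_path) {
  if (!find_boot_path(java_home, boot_path)) {
    return "Failed setting boot class path.";
  }
  return "";
}
```

Keeping the check in the caller mirrors the patch quoted above; moving it into the helper itself, as suggested in the review, would remove the per-platform duplication at the cost of fixing the failure policy in one place.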
>>>
>>> thanks,
>>> Calvin
>>

From lois.foltan at oracle.com Tue Jul 10 17:42:33 2018
From: lois.foltan at oracle.com (Lois Foltan)
Date: Tue, 10 Jul 2018 13:42:33 -0400
Subject: RFR(S): 8205946: JVM crash after call to ClassLoader::setup_bootstrap_search_path()
In-Reply-To: <5B44ED62.5030008@oracle.com>
References: <5B439B8F.3020709@oracle.com> <5B43C8E8.2060206@oracle.com> <5B44ED62.5030008@oracle.com>
Message-ID:

On 7/10/2018 1:31 PM, Calvin Cheung wrote:
> Updated webrev with the changes mentioned below:
>     http://cr.openjdk.java.net/~ccheung/8205946/webrev.01/

Hi Calvin,

Looks good, thanks for making the change. It does seem a bit odd that
the vm_exit_during_initialization is not just called within the method
os::set_boot_path(), however, it makes sense for each platform to decide
how to handle the failure.

Thanks,
Lois

>
> I've rerun hs-tier{1,2,3} tests.
>
> thanks,
> Calvin
>
> On 7/9/18, 1:43 PM, Calvin Cheung wrote:
>> Hi Lois,
>>
>> Thanks for your review.
>>
>> On 7/9/18, 11:58 AM, Lois Foltan wrote:
>>> On 7/9/2018 1:29 PM, Calvin Cheung wrote:
>>>
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8205946
>>>>
>>>> webrev: http://cr.openjdk.java.net/~ccheung/8205946/webrev.00/
>>>>
>>>> The JVM crash could be simulated by renaming/removing the modules
>>>> file under the jdk/lib directory.
>>>> The proposed simple fix is to perform a
>>>> vm_exit_during_initialization().
>>>
>>> Hi Calvin,
>>>
>>> Some clarifying questions. Is this just an issue for exploded builds?
>> I don't think so. As mentioned above, I could reproduce the crash
>> with a regular jdk image build by renaming the modules file under the
>> jdk/lib directory.
>>> I would prefer the exit to occur if the os::stat() fails for the
>>> system class path in os::set_boot_path().
>> Instead of exiting in os::set_boot_path(), how about checking the
>> return status of os::set_boot_path() in the caller and exiting there
>> like the following:
>> bash-4.2$ hg diff os_linux.cpp
>> diff --git a/src/hotspot/os/linux/os_linux.cpp
>> b/src/hotspot/os/linux/os_linux.cpp
>> --- a/src/hotspot/os/linux/os_linux.cpp
>> +++ b/src/hotspot/os/linux/os_linux.cpp
>> @@ -367,7 +367,9 @@
>>        }
>>      }
>>      Arguments::set_java_home(buf);
>> -    set_boot_path('/', ':');
>> +    if (!set_boot_path('/', ':')) {
>> +      vm_exit_during_initialization("Failed setting boot class
>> path.", NULL);
>> +    }
>>    }
>>
>> Note that before the above change, the return status of
>> set_boot_path() isn't checked.
>> The above would involve changing 5 of those os_*.cpp files, one for
>> each O/S.
>>
>>> With certainly an added assert later in
>>> ClassLoader::setup_bootstrap_search_path() to ensure that the system
>>> class path is never NULL.
>> Sure, I can add an assert there.
>> I'll post updated webrev once I've made the change and done testing.
>>
>> thanks,
>> Calvin
>>>
>>> Thanks,
>>> Lois
>>>
>>>>
>>>> Ran hs-tier{1,2,3} tests successfully including the new test case.
>>>>
>>>> thanks,
>>>> Calvin
>>>

From volker.simonis at gmail.com Tue Jul 10 17:52:39 2018
From: volker.simonis at gmail.com (Volker Simonis)
Date: Tue, 10 Jul 2018 19:52:39 +0200
Subject: [11] RFR(S): 8206998: [test] runtime/ElfDecoder/TestElfDirectRead.java requires longer timeout on ppc64
Message-ID:

Hi,

can I please get a review for the following test-only change:

http://cr.openjdk.java.net/~simonis/webrevs/2018/8206998/
https://bugs.openjdk.java.net/browse/JDK-8206998

The problem is that the test runtime/ElfDecoder/TestElfDirectRead.java intentionally disables caching of Elf sections during symbol lookup with WhiteBox.disableElfSectionCache().
On platforms which do not use file descriptors instead of plain function pointers this slows down the lookup just a little bit, because all the symbols from an Elf file are still read consecutively after one 'fseek()' call. But on platforms with file descriptors like ppc64 big-endian, we get two 'fseek()' calls for each symbol read from the Elf file because reading the file descriptor table is nested inside the loop which reads the symbols. This really trashes the I/O system and considerable slows down the test, so we need an extra long timeout setting. The fix is trivial - simply provide two test versions (i.e. comments): the first one for all Linux flavors which are not ppc64 and a second, new one for Linux/ppc64 which simply has a bigger timeout. Thank you and best regards, Volker From zgu at redhat.com Tue Jul 10 18:04:41 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 10 Jul 2018 14:04:41 -0400 Subject: [11] RFR(S): 8206998: [test] runtime/ElfDecoder/TestElfDirectRead.java requires longer timeout on ppc64 In-Reply-To: References: Message-ID: <0231b4f2-212d-ce1b-afd0-bc2a45ff5b54@redhat.com> Looks good and trivial to me. Thanks, -Zhengyu On 07/10/2018 01:52 PM, Volker Simonis wrote: > Hi, > > can I please get a review for the following test-only change: > > http://cr.openjdk.java.net/~simonis/webrevs/2018/8206998/ > https://bugs.openjdk.java.net/browse/JDK-8206998 > > The problem is that the test runtime/ElfDecoder/TestElfDirectRead.java > intentionally disables caching of Elf sections during symbol lookup > with WhiteBox.disableElfSectionCache(). On platforms which do not use > file descriptors instead of plain function pointers this slows down > the lookup just a little bit, because all the symbols from an Elf file > are still read consecutively after one 'fseek()' call. 
But on > platforms with file descriptors like ppc64 big-endian, we get two > 'fseek()' calls for each symbol read from the Elf file because reading > the file descriptor table is nested inside the loop which reads the > symbols. This really trashes the I/O system and considerable slows > down the test, so we need an extra long timeout setting. > > The fix is trivial - simply provide two test versions (i.e. comments): > the first one for all Linux flavors which are not ppc64 and a second, > new one for Linux/ppc64 which simply has a bigger timeout. > > Thank you and best regards, > Volker > From calvin.cheung at oracle.com Tue Jul 10 19:10:24 2018 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 10 Jul 2018 12:10:24 -0700 Subject: RFR (S) JDK-8178712: ResourceMark may be missing inside initialize_[vi]table In-Reply-To: <3822eeeb-0a58-cdbc-46a3-ed6c02e365ab@oracle.com> References: <3822eeeb-0a58-cdbc-46a3-ed6c02e365ab@oracle.com> Message-ID: <5B4504A0.5070607@oracle.com> Hi Lois, I'm wondering if the ResourceMark in the following function in universe.cpp could be removed? If I understand the code correctly, the ResourceMark is necessary for Universe::reinitialize_itables() which calls into klassItable::initialize_itable() where you've added ResourceMark with your change. bool universe_post_init() { assert(!is_init_completed(), "Error: initialization not yet completed!"); Universe::_fully_initialized = true; EXCEPTION_MARK; { ResourceMark rm; Interpreter::initialize(); // needed for interpreter entry points if (!UseSharedSpaces) { HandleMark hm(THREAD); Klass* ok = SystemDictionary::Object_klass(); Universe::reinitialize_vtable_of(ok, CHECK_false); Universe::reinitialize_itables(CHECK_false); } } It looks good otherwise. 
thanks, Calvin On 7/10/18, 10:19 AM, Lois Foltan wrote: > Please review this clean up change to correctly set ResourceMark from > within klassVtable::initialize_vtable() and > klassItable::initialize_itable() when applicable, instead of having > all instances of calls to these two methods establish a ResourceMark > unnecessarily prior to. > > open webrev at http://cr.openjdk.java.net/~lfoltan/bug_jdk8178712/ > bug link at https://bugs.openjdk.java.net/browse/JDK-8178712 > > Testing: hs-tier1-3, jdk-tier1-3 (complete) > hs-tier4-5 (in progress) > > Thanks, > Lois From lois.foltan at oracle.com Tue Jul 10 19:10:49 2018 From: lois.foltan at oracle.com (Lois Foltan) Date: Tue, 10 Jul 2018 15:10:49 -0400 Subject: RFR(M): 8206977: Minor improvements of runtime code. In-Reply-To: <3d2d5c75-a219-9153-33e3-57a77bf88d92@oracle.com> References: <571d727a270e47cb8230d8a88b58a2a1@sap.com> <3d2d5c75-a219-9153-33e3-57a77bf88d92@oracle.com> Message-ID: <977e9be8-ad4a-4ad3-c9e2-a5702cb03f9f@oracle.com> Hi Goetz, Just a couple of comments based on Coleen's review, see below. On 7/10/2018 9:03 AM, coleen.phillimore at oracle.com wrote: > http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/src/hotspot/share/classfile/moduleEntry.cpp.udiff.html > > > + name ? name->as_C_string() : ""); Instead of "" please use the UNNAMED_MODULE macro from moduleEntry.hpp. > > > Can you change to: > > + name != NULL ? name->as_C_string() : ""); > > > http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/src/hotspot/share/oops/klassVtable.cpp.udiff.html > > > This looks a lot nicer!?? Similar code is in linkResolver.cpp, can you > look at changing it too? I have an RFR out currently for JDK-8205611, (see http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-June/033325.html), which needs one more reviewer's okay.? It contains changes to reword the error messages for loader constraint violations in order to follow the new proposed format for module and class loader information.? 
So our two changes will conflict in this area.

Thanks,
Lois

>
> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/src/hotspot/share/services/writeableFlags.cpp.udiff.html
>
>
> If name is null here, what would this do?  Should there be an 'else'
> to print something?
>
> I think this looks fine.  It doesn't look major to me.  The asserts
> turned to guarantees don't appear to be anywhere performance sensitive.
>
> Thanks,
> Coleen
>
>
>
> On 7/10/18 6:53 AM, Lindenmaier, Goetz wrote:
>> Hi,
>>
>> I ran coverity on the jdk11 hotspot sources and want to propose the
>> following fixes to the runtime code. I scanned the linux x86_64 build.
>> Some issues are similar to previous parfait fixes (check for NULL, add
>> guarantees etc.) I also identified some issues I consider real problems.
>> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/
>>
>> In detail:
>>
>> Real issues:
>> ------------
>>
>> jvmtiEnvBase.cpp
>>    || should be &&.
>>    Attention, this is the only change that really will change behaviour.
>>    But if thr == NULL we will see a crash below.
>>
>> perfMemory_linux.cpp:
>>    Wrong buffer length used.
>>
>> systemDictionary.cpp:
>>    Move code dereferencing ik under if (ik != NULL).
>>
>> virtualspace.cpp
>>    Initialization is missing. Moved constructor up to the other
>>    constructors.
>>
>>
>> Useful code improvements:
>> -------------------------
>>
>> vm_version_ext_x86.cpp
>>    Assure buffer is not accessed at offset -1.
>>
>> os_linux.cpp
>>    Numa_max_node returns int, and a -1 in some cases.
>>
>> moduleEntry.cpp
>>    name might be NULL. Just a fix for tracing.
>>
>> systemDictionaryShared.cpp
>>    clarify code.
>>    It would be wrong if only entry == NULL would hold, one
>>    would hit the assertion below.
>>
>> verifier.cpp
>>    Fix tracing.
>>    Illegal opcode is -1 and should not be passed to name array.
>>
>> logOutput.cpp
>>    If n_selections == 0, best_selection would be NULL.
>>    Move up the assertion and turn into a guarantee.
>>
>> filemap.cpp
>>    Either base can be NULL, or parts of the code before are dead.
>>
>> metaspace.cpp
>>    We know an exception is pending.
>>
>> klassVtable.cpp
>>    Coverity does not like the format in a variable.
>>    Anyways this is quite rough coding, transformed to use stringStream
>>    as with other similar exceptions.
>>
>> jvmFlag.cpp
>>    match might be NULL.
>>
>> writeableFlags.cpp
>>    name might be NULL.
>>
>> ostream.cpp
>>    If ftell returns error code -1, we need not continue.
>>    Especially we should not fseek(-1).
>>
>> logTestUtils.inline.hpp
>>    ftell returns -1.
>>
>> test_metachunk.cpp
>>    wrong datatype.
>

From jiangli.zhou at oracle.com Tue Jul 10 19:18:31 2018
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Tue, 10 Jul 2018 12:18:31 -0700
Subject: RFR(S): 8205946: JVM crash after call to ClassLoader::setup_bootstrap_search_path()
In-Reply-To: <5B44ED62.5030008@oracle.com>
References: <5B439B8F.3020709@oracle.com> <5B43C8E8.2060206@oracle.com> <5B44ED62.5030008@oracle.com>
Message-ID:

Hi Calvin,

The error handling code in the platform specific code is identical. I like Lois' suggestion to check and exit in os::set_boot_path() to avoid duplicating the code.

Also, under low-memory conditions, set_value() might fail to allocate and not trigger any error with a release binary. os::set_boot_path() probably should also check and make sure sys path is not NULL after Arguments::set_sysclasspath().
bool PathString::set_value(const char *value) { if (_value != NULL) { FreeHeap(_value); } _value = AllocateHeap(strlen(value)+1, mtArguments); assert(_value != NULL, "Unable to allocate space for new path value"); if (_value != NULL) { strcpy(_value, value); } else { // not able to allocate return false; } return true; } Thanks, Jiangli > On Jul 10, 2018, at 10:31 AM, Calvin Cheung wrote: > > Updated webrev with the changes mentioned below: > http://cr.openjdk.java.net/~ccheung/8205946/webrev.01/ > > I've rerun hs-tier{1,2,3} tests. > > thanks, > Calvin > > On 7/9/18, 1:43 PM, Calvin Cheung wrote: >> Hi Lois, >> >> Thanks for your review. >> >> On 7/9/18, 11:58 AM, Lois Foltan wrote: >>> On 7/9/2018 1:29 PM, Calvin Cheung wrote: >>> >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8205946 >>>> >>>> webrev: http://cr.openjdk.java.net/~ccheung/8205946/webrev.00/ >>>> >>>> The JVM crash could be simulated by renaming/removing the modules file under the jdk/lib directory. >>>> The proposed simple fix is to perform a vm_exit_during_initialization(). >>> >>> Hi Calvin, >>> >>> Some clarifying questions. Is this just an issue for exploded builds? >> I don't think so. As mentioned above, I could reproduce the crash with a regular jdk image build by renaming the modules file under the jdk/lib directory. >>> I would prefer the exit to occur if the os::stat() fails for the system class path in os::set_boot_path(). 
>> Instead of exiting in os::set_boot_path(), how about checking the return status of os::set_boot_path() in the caller and exiting there like the following: >> bash-4.2$ hg diff os_linux.cpp >> diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp >> --- a/src/hotspot/os/linux/os_linux.cpp >> +++ b/src/hotspot/os/linux/os_linux.cpp >> @@ -367,7 +367,9 @@ >> } >> } >> Arguments::set_java_home(buf); >> - set_boot_path('/', ':'); >> + if (!set_boot_path('/', ':')) { >> + vm_exit_during_initialization("Failed setting boot class path.", NULL); >> + } >> } >> >> Note that before the above change, the return status of set_boot_path() isn't checked. >> The above would involve changing 5 of those os_*.cpp files, one for each O/S. >> >>> With certainly an added assert later in ClassLoader::setup_bootstrap_search_path() to ensure that the system class path is never NULL. >> Sure, I can add an assert there. >> I'll post updated webrev once I've made the change and done testing. >> >> thanks, >> Calvin >>> >>> Thanks, >>> Lois >>> >>>> >>>> Ran hs-tier{1,2,3} tests successfully including the new test case. >>>> >>>> thanks, >>>> Calvin >>> From lois.foltan at oracle.com Tue Jul 10 19:34:09 2018 From: lois.foltan at oracle.com (Lois Foltan) Date: Tue, 10 Jul 2018 15:34:09 -0400 Subject: RFR (S) JDK-8178712: ResourceMark may be missing inside initialize_[vi]table In-Reply-To: <5B4504A0.5070607@oracle.com> References: <3822eeeb-0a58-cdbc-46a3-ed6c02e365ab@oracle.com> <5B4504A0.5070607@oracle.com> Message-ID: On 7/10/2018 3:10 PM, Calvin Cheung wrote: > Hi Lois, > > I'm wondering if the ResourceMark in the following function in > universe.cpp could be removed? > If I understand the code correctly, the ResourceMark is necessary for > Universe::reinitialize_itables() which calls into > klassItable::initialize_itable() where you've added ResourceMark with > your change. > > bool universe_post_init() { > ? 
assert(!is_init_completed(), "Error: initialization not yet > completed!"); > ? Universe::_fully_initialized = true; > ? EXCEPTION_MARK; > ? { ResourceMark rm; > ??? Interpreter::initialize();????? // needed for interpreter entry > points > ??? if (!UseSharedSpaces) { > ????? HandleMark hm(THREAD); > ????? Klass* ok = SystemDictionary::Object_klass(); > ????? Universe::reinitialize_vtable_of(ok, CHECK_false); > ????? Universe::reinitialize_itables(CHECK_false); > ??? } > ? } Thanks Calvin for the review!? I wondered that as well, but I think the ResourceMark may be needed for the Interpreter::initialize(). For example, it calls TemplateTable::initialize() which logs timer information which I suspect may need a ResourceMark.? So, it wasn't clear that the ResourceMark in universe_post_init() was solely needed for the reinitialize_vtable and itables. Thanks, Lois > > It looks good otherwise. > > thanks, > Calvin > > On 7/10/18, 10:19 AM, Lois Foltan wrote: >> Please review this clean up change to correctly set ResourceMark from >> within klassVtable::initialize_vtable() and >> klassItable::initialize_itable() when applicable, instead of having >> all instances of calls to these two methods establish a ResourceMark >> unnecessarily prior to. >> >> open webrev at http://cr.openjdk.java.net/~lfoltan/bug_jdk8178712/ >> bug link at https://bugs.openjdk.java.net/browse/JDK-8178712 >> >> Testing: hs-tier1-3, jdk-tier1-3 (complete) >> ?????????????? hs-tier4-5 (in progress) >> >> Thanks, >> Lois From jiangli.zhou at oracle.com Tue Jul 10 19:50:20 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 10 Jul 2018 12:50:20 -0700 Subject: RFR (S) JDK-8178712: ResourceMark may be missing inside initialize_[vi]table In-Reply-To: <3822eeeb-0a58-cdbc-46a3-ed6c02e365ab@oracle.com> References: <3822eeeb-0a58-cdbc-46a3-ed6c02e365ab@oracle.com> Message-ID: <822BDE95-7902-4406-A5A2-549FD2120362@oracle.com> Looks good to me. 
Thanks, Jiangli > On Jul 10, 2018, at 10:19 AM, Lois Foltan wrote: > > Please review this clean up change to correctly set ResourceMark from within klassVtable::initialize_vtable() and klassItable::initialize_itable() when applicable, instead of having all instances of calls to these two methods establish a ResourceMark unnecessarily prior to. > > open webrev at http://cr.openjdk.java.net/~lfoltan/bug_jdk8178712/ > bug link at https://bugs.openjdk.java.net/browse/JDK-8178712 > > Testing: hs-tier1-3, jdk-tier1-3 (complete) > hs-tier4-5 (in progress) > > Thanks, > Lois From ioi.lam at oracle.com Tue Jul 10 19:50:19 2018 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 10 Jul 2018 12:50:19 -0700 Subject: Proposal for improving CDS archive creation In-Reply-To: References: Message-ID: Fixing some sloppy text below .... On 7/10/18 10:16 AM, Ioi Lam wrote: > I have a proposal for improving the process of creating of the CDS > archive(s), > so we can make CDS easier to use and support more use cases. > > ?? - better support for custom loaders > ?? - remove explicit training run > ?? - support 2 levels of shared archives > > I think the proposal is relatively straight-forward to implement, as > we already > have most of the required infrastructures: > > ?? + the ability to use Java class loaders at archive creation time > ?? + the ability to relocate MetaspaceObjects > > Parts of this proposal will also simplify the CDS code and make it more > maintainable. > > Current process of creating the base archive - [C] > ================================================== > > Currently each JVM process can map at most one CDS archive. Let's call > this > the "base archive". It is created by [ref1]: > > ?C1. Reserve a region R of 3GB at 0x800000000. > ?C2. Load all classes specified in the class list. All data for these > classes > ???? live outside of R. > ???? (E.g., the Klass objects are loaded into tmp_class_space, which is > ????? adjacent to R). > ?C3. 
Copy the metadata of all archivable classes (e.g., exclude generated
>      Lambda classes) into R. At this step, R is divided into several
>      sections (RO, RW, etc).
>
>
>   //  +-- SharedBaseAddress   (default = 0x800000000)
>   //  +-- _narrow_klass._base
>   //  |
>   //  |                               +-tmp_class_space.base
>   //  v                               V
>   //  +----+----+----+----+----+-....-+-------------------+
>   //  |<-           R               ->|
>   //  | MC | RW | RO | MD | OD |unused| tmp_class_space   |
>   //  +----+----+----+----+----+------+-------------------+
>   //  |<--  3GB        -------------->|
>   //  |<-- UnscaledClassSpaceMax = 4GB ------------------>|
>
>
> New process for creating the base archive - [N]
> ===============================================
>
> Currently we have a lot of "if (DumpSharedSpaces)" code for special-case
> handling of the above scheme. We can improve it by
>
>  N1. Remove all code for special memory layout initialization for -Xshare:dump.
>      As a result, we will reserve a region R of 1GB at 0x800000000, which
>      is used by Klass objects (this is the same as if -Xshare:off were
>      specified.)
>  N2. Load all classes in the class list.
>  N3. Now R contains the Klass objects of all loaded classes.
>      Allocate a temporary space T, and copy all contents of R into T.
>  N4. Now R is empty. Copy the metadata of all archivable classes into R.
>
>
> Dump-as-you-go for the base archive - [G]
> =========================================
>
> Note that the [N] scheme will work even if you're running an app with
> -Xshare:off. At some point (e.g., when the VM is about to exit), you can:
>
>  G1. Enter a safe point
>  G2. Go to step [N3].
>
> The benefit of [G] is you don't need a separate run to dump the archive, and
> there's no need to use the class list. Instead, we can have an option like:
>
>    java -Xshare:autocreate -cp app.jar -XX:SharedArchiveFile=foo.jsa App
>
> If foo.jsa is not available, we run in [G] mode. At VM exit, we dump into
> foo.jsa.
>
> This way, we don't need to have an explicit training run with
> -XX:DumpLoadedClassList. Instead, the training run is
>

I meant, "Instead, your first run, when the archive is not yet available, becomes the training run".

Thanks to Calvin and Dan for spotting this :-)

- Ioi

> This also makes it easy to support the classes from custom loaders. There's no
> need for special tooling to convert -Xlog:class+load=debug output into a
> classlist. [ref2]
>
>
> Dumping for second-level archive - [S]
> ======================================
>
>  S1. Load the base archive
>  S2. Run the app as normal
>  S3. All Klass objects of the dynamically loaded classes will be loaded in
>      the region R, which immediately follows the end of the base archive.
>
>   //  +-- SharedBaseAddress
>   //  |                          +--- dynamically loaded Klasses
>   //  |                          |    start from here.
>   //  v                          v
>   //  +--------------------------+---------...-----------------|
>   //  | base archive             | region R                    |
>   //  +--------------------------+---------...-----------------|
>   //  |<- size of base archive ->|
>   //  |<--            1GB -->|
>
>
>  S4. At some point (possibly when the VM is about to exit) we start
>      dumping the second level archive
>  S5. Enter safe point
>  S6. Now R contains the Klass objects of all dynamically loaded classes.
>      Allocate a temporary space T, and copy all contents of R into T.
>  S7. Now R is empty. Copy the metadata of all archivable, dynamically loaded
>      classes into R.
>  S8. Create a new shared_dictionary (and shared_symbol_table) that contains
>      all the Klasses (Symbols) from both the base and second-level archives.
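
The copy-out/copy-back sequence in steps N3-N4 (and again in S6-S7) can be modelled in a few lines. This is only an illustrative sketch under loud assumptions: `Meta`, `Region` and `dump_region` are hypothetical names invented here, a region is treated as a plain list of records, and nothing below is HotSpot code or the proposed implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a metadata object living in region R.
struct Meta {
  std::string name;
  bool archivable;  // e.g. false for generated lambda classes
};

// Hypothetical region: modelled as an ordered list of objects.
using Region = std::vector<Meta>;

// Steps N3/N4 (or S6/S7): move everything out of R into a temporary
// space T, then copy only the archivable objects back into the empty R.
Region dump_region(Region& r) {
  Region t;
  t.swap(r);                 // N3: T now holds the old contents of R
  assert(r.empty());         // "Now R is empty."
  for (const Meta& m : t) {  // N4: repopulate R with archivable metadata
    if (m.archivable) {
      r.push_back(m);
    }
  }
  return t;                  // T stays alive until the caller is done with it
}
```

In this model R ends up holding only what would be written to the archive, while T keeps the original objects; a real dump would additionally have to relocate pointers between the copied objects, which the sketch does not attempt.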
> > References > ========== > > [ref1] Current initialization of memory space layout during -Xshare:dump > http://hg.openjdk.java.net/jdk/jdk/file/e0028bb6dd3d/src/hotspot/share/memory/metaspaceShared.cpp#l250 > > [ref2] Volker Simonis's tool for support custom class loaders in CDS > ?????? https://github.com/simonis/cl4cds > ---------------------------------------------------------------------- > > > > Any thoughts? > > Thanks > - Ioi From ioi.lam at oracle.com Tue Jul 10 19:55:42 2018 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 10 Jul 2018 12:55:42 -0700 Subject: RFR (S) JDK-8178712: ResourceMark may be missing inside initialize_[vi]table In-Reply-To: <3822eeeb-0a58-cdbc-46a3-ed6c02e365ab@oracle.com> References: <3822eeeb-0a58-cdbc-46a3-ed6c02e365ab@oracle.com> Message-ID: Hi Lois, Looks good. ?905 int klassVtable::fill_in_mirandas(int initialized) { ?906?? ResourceMark rm(Thread::current()); maybe this function can have an addition THREAD parameter? That way you can avoid calling Thread::current(), which may be expensive. Thanks - Ioi On 7/10/18 10:19 AM, Lois Foltan wrote: > Please review this clean up change to correctly set ResourceMark from > within klassVtable::initialize_vtable() and > klassItable::initialize_itable() when applicable, instead of having > all instances of calls to these two methods establish a ResourceMark > unnecessarily prior to. > > open webrev at http://cr.openjdk.java.net/~lfoltan/bug_jdk8178712/ > bug link at https://bugs.openjdk.java.net/browse/JDK-8178712 > > Testing: hs-tier1-3, jdk-tier1-3 (complete) > ?????????????? 
hs-tier4-5 (in progress) > > Thanks, > Lois From lois.foltan at oracle.com Tue Jul 10 20:10:47 2018 From: lois.foltan at oracle.com (Lois Foltan) Date: Tue, 10 Jul 2018 16:10:47 -0400 Subject: RFR (S) JDK-8178712: ResourceMark may be missing inside initialize_[vi]table In-Reply-To: <822BDE95-7902-4406-A5A2-549FD2120362@oracle.com> References: <3822eeeb-0a58-cdbc-46a3-ed6c02e365ab@oracle.com> <822BDE95-7902-4406-A5A2-549FD2120362@oracle.com> Message-ID: <9af1ec55-fa19-673b-cd54-242b3def1b7f@oracle.com> Thanks for the review Jiangli! Lois On 7/10/2018 3:50 PM, Jiangli Zhou wrote: > Looks good to me. > > Thanks, > Jiangli > >> On Jul 10, 2018, at 10:19 AM, Lois Foltan wrote: >> >> Please review this clean up change to correctly set ResourceMark from within klassVtable::initialize_vtable() and klassItable::initialize_itable() when applicable, instead of having all instances of calls to these two methods establish a ResourceMark unnecessarily prior to. >> >> open webrev at http://cr.openjdk.java.net/~lfoltan/bug_jdk8178712/ >> bug link at https://bugs.openjdk.java.net/browse/JDK-8178712 >> >> Testing: hs-tier1-3, jdk-tier1-3 (complete) >> hs-tier4-5 (in progress) >> >> Thanks, >> Lois From lois.foltan at oracle.com Tue Jul 10 20:12:10 2018 From: lois.foltan at oracle.com (Lois Foltan) Date: Tue, 10 Jul 2018 16:12:10 -0400 Subject: RFR (S) JDK-8178712: ResourceMark may be missing inside initialize_[vi]table In-Reply-To: References: <3822eeeb-0a58-cdbc-46a3-ed6c02e365ab@oracle.com> Message-ID: <172838b3-c501-c2e3-75ef-e67eb67ce791@oracle.com> On 7/10/2018 3:55 PM, Ioi Lam wrote: > Hi Lois, > > Looks good. > > ?905 int klassVtable::fill_in_mirandas(int initialized) { > ?906?? ResourceMark rm(Thread::current()); > > maybe this function can have an addition THREAD parameter? That way > you can avoid calling Thread::current(), which may be expensive. Thanks Ioi!? 
Good point, new webrev in case you want to see it at http://cr.openjdk.java.net/~lfoltan/bug_jdk8178712.1/webrev/ Lois > > Thanks > > - Ioi > > > On 7/10/18 10:19 AM, Lois Foltan wrote: >> Please review this clean up change to correctly set ResourceMark from >> within klassVtable::initialize_vtable() and >> klassItable::initialize_itable() when applicable, instead of having >> all instances of calls to these two methods establish a ResourceMark >> unnecessarily prior to. >> >> open webrev at http://cr.openjdk.java.net/~lfoltan/bug_jdk8178712/ >> bug link at https://bugs.openjdk.java.net/browse/JDK-8178712 >> >> Testing: hs-tier1-3, jdk-tier1-3 (complete) >> ?????????????? hs-tier4-5 (in progress) >> >> Thanks, >> Lois > From ioi.lam at oracle.com Tue Jul 10 20:16:35 2018 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 10 Jul 2018 13:16:35 -0700 Subject: RFR (S) JDK-8178712: ResourceMark may be missing inside initialize_[vi]table In-Reply-To: <172838b3-c501-c2e3-75ef-e67eb67ce791@oracle.com> References: <3822eeeb-0a58-cdbc-46a3-ed6c02e365ab@oracle.com> <172838b3-c501-c2e3-75ef-e67eb67ce791@oracle.com> Message-ID: <8bddaa7d-7c11-4f48-be56-3d2a44e9ffdc@oracle.com> On 7/10/18 1:12 PM, Lois Foltan wrote: > On 7/10/2018 3:55 PM, Ioi Lam wrote: > >> Hi Lois, >> >> Looks good. >> >> ?905 int klassVtable::fill_in_mirandas(int initialized) { >> ?906?? ResourceMark rm(Thread::current()); >> >> maybe this function can have an addition THREAD parameter? That way >> you can avoid calling Thread::current(), which may be expensive. > > Thanks Ioi!? Good point, new webrev in case you want to see it at > http://cr.openjdk.java.net/~lfoltan/bug_jdk8178712.1/webrev/ > Lois > Looks good. Thanks! 
- Ioi >> >> Thanks >> >> - Ioi >> >> >> On 7/10/18 10:19 AM, Lois Foltan wrote: >>> Please review this clean up change to correctly set ResourceMark >>> from within klassVtable::initialize_vtable() and >>> klassItable::initialize_itable() when applicable, instead of having >>> all instances of calls to these two methods establish a ResourceMark >>> unnecessarily prior to. >>> >>> open webrev at http://cr.openjdk.java.net/~lfoltan/bug_jdk8178712/ >>> bug link at https://bugs.openjdk.java.net/browse/JDK-8178712 >>> >>> Testing: hs-tier1-3, jdk-tier1-3 (complete) >>> ?????????????? hs-tier4-5 (in progress) >>> >>> Thanks, >>> Lois >> > From calvin.cheung at oracle.com Tue Jul 10 20:17:10 2018 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 10 Jul 2018 13:17:10 -0700 Subject: RFR (S) JDK-8178712: ResourceMark may be missing inside initialize_[vi]table In-Reply-To: References: <3822eeeb-0a58-cdbc-46a3-ed6c02e365ab@oracle.com> <5B4504A0.5070607@oracle.com> Message-ID: <5B451446.6000905@oracle.com> On 7/10/18, 12:34 PM, Lois Foltan wrote: > On 7/10/2018 3:10 PM, Calvin Cheung wrote: > >> Hi Lois, >> >> I'm wondering if the ResourceMark in the following function in >> universe.cpp could be removed? >> If I understand the code correctly, the ResourceMark is necessary for >> Universe::reinitialize_itables() which calls into >> klassItable::initialize_itable() where you've added ResourceMark with >> your change. >> >> bool universe_post_init() { >> assert(!is_init_completed(), "Error: initialization not yet >> completed!"); >> Universe::_fully_initialized = true; >> EXCEPTION_MARK; >> { ResourceMark rm; >> Interpreter::initialize(); // needed for interpreter entry >> points >> if (!UseSharedSpaces) { >> HandleMark hm(THREAD); >> Klass* ok = SystemDictionary::Object_klass(); >> Universe::reinitialize_vtable_of(ok, CHECK_false); >> Universe::reinitialize_itables(CHECK_false); >> } >> } > > Thanks Calvin for the review! 
I wondered that as well, but I think > the ResourceMark may be needed for the Interpreter::initialize(). For > example, it calls TemplateTable::initialize() which logs timer > information which I suspect may need a ResourceMark. So, it wasn't > clear that the ResourceMark in universe_post_init() was solely needed > for the reinitialize_vtable and itables. In timerTrace.hpp: // TraceTime is used for tracing the execution time of a block // Usage: // { // TraceTime t("some timer", TIMERTRACE_LOG(Info, startuptime, tagX...)); // some_code(); // } // I looked at several usage of TraceTime and they all don't have ResourceMark before it. I'm fine with leaving the ResourceMark in universe_post_init() if you want to play it safe. thanks, Calvin > > Thanks, > Lois > >> >> It looks good otherwise. >> >> thanks, >> Calvin >> >> On 7/10/18, 10:19 AM, Lois Foltan wrote: >>> Please review this clean up change to correctly set ResourceMark >>> from within klassVtable::initialize_vtable() and >>> klassItable::initialize_itable() when applicable, instead of having >>> all instances of calls to these two methods establish a ResourceMark >>> unnecessarily prior to. >>> >>> open webrev at http://cr.openjdk.java.net/~lfoltan/bug_jdk8178712/ >>> bug link at https://bugs.openjdk.java.net/browse/JDK-8178712 >>> >>> Testing: hs-tier1-3, jdk-tier1-3 (complete) >>> hs-tier4-5 (in progress) >>> >>> Thanks, >>> Lois > From calvin.cheung at oracle.com Tue Jul 10 21:17:06 2018 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 10 Jul 2018 14:17:06 -0700 Subject: RFR(S): 8205946: JVM crash after call to ClassLoader::setup_bootstrap_search_path() In-Reply-To: References: <5B439B8F.3020709@oracle.com> <5B43C8E8.2060206@oracle.com> <5B44ED62.5030008@oracle.com> Message-ID: <5B452252.9010209@oracle.com> Hi Jiangli, Thanks for reviewing. On 7/10/18, 12:18 PM, Jiangli Zhou wrote: > Hi Calvin, > > The error handling code in platform specific code are identical. I > like Lois? 
suggestion to check and exit in os::set_boot_path() to
> avoid duplicating the code.

If you want changes in os::set_boot_path() instead of in platform specific code, I'm proposing the following:

diff --git a/src/hotspot/share/runtime/os.cpp b/src/hotspot/share/runtime/os.cpp
--- a/src/hotspot/share/runtime/os.cpp
+++ b/src/hotspot/share/runtime/os.cpp
@@ -1270,7 +1270,7 @@
   return file;
 }

-bool os::set_boot_path(char fileSep, char pathSep) {
+void os::set_boot_path(char fileSep, char pathSep) {
   const char* home = Arguments::get_java_home();
   int home_len = (int)strlen(home);

@@ -1278,26 +1278,30 @@

   // modular image if "modules" jimage exists
   char* jimage = format_boot_path("%/lib/" MODULES_IMAGE_NAME, home, home_len, fileSep, pathSep);
-  if (jimage == NULL) return false;
+  if (jimage == NULL) {
+    vm_exit_during_initialization("Failed setting boot class path.", NULL);
+  }
   bool has_jimage = (os::stat(jimage, &st) == 0);
   if (has_jimage) {
     Arguments::set_sysclasspath(jimage, true);
     FREE_C_HEAP_ARRAY(char, jimage);
-    return true;
+    return;
   }
   FREE_C_HEAP_ARRAY(char, jimage);

   // check if developer build with exploded modules
   char* base_classes = format_boot_path("%/modules/" JAVA_BASE_NAME, home, home_len, fileSep, pathSep);
-  if (base_classes == NULL) return false;
+  if (base_classes == NULL) {
+    vm_exit_during_initialization("Failed setting boot class path.", NULL);
+  }
   if (os::stat(base_classes, &st) == 0) {
     Arguments::set_sysclasspath(base_classes, false);
     FREE_C_HEAP_ARRAY(char, base_classes);
-    return true;
+    return;
   }
   FREE_C_HEAP_ARRAY(char, base_classes);
-
-  return false;
+  vm_exit_during_initialization("Failed setting boot class path.", NULL);
+  return;
 }

The function fails if os::stat() fails or allocation of buffer for "jimage" or "base_classes" fails.
Since vm should exit on failure in the above function, it is unnecessary for the function to return a bool.
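
For comparison, the status-returning shape (the function only reports failure and the caller decides whether it is fatal) might look roughly like the standalone sketch below. Everything in it is an assumption made for illustration: `find_boot_path` and `PathExists` are hypothetical names, the existence probe is injected in place of os::stat(), and the literal path components are assumed expansions of MODULES_IMAGE_NAME and JAVA_BASE_NAME. It is not the webrev code.

```cpp
#include <functional>
#include <string>

// Hypothetical probe standing in for os::stat(); injected so the sketch
// can be exercised without touching the file system.
using PathExists = std::function<bool(const std::string&)>;

// Status-returning variant: try the modular image first, then the
// exploded-modules layout. Returns false if neither exists and leaves
// the decision about exiting to the caller.
bool find_boot_path(const std::string& home,
                    const PathExists& path_exists,
                    std::string* out) {
  // Assumed expansion of "%/lib/" MODULES_IMAGE_NAME.
  const std::string jimage = home + "/lib/modules";
  if (path_exists(jimage)) {
    *out = jimage;
    return true;
  }
  // Assumed expansion of "%/modules/" JAVA_BASE_NAME.
  const std::string base_classes = home + "/modules/java.base";
  if (path_exists(base_classes)) {
    *out = base_classes;
    return true;
  }
  return false;  // caller decides whether this is fatal
}
```

With this shape a caller would check the result and, if it chooses, invoke something like vm_exit_during_initialization() itself; the diff above instead makes that call inside the function and so needs no return value.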
>
> Also, under low memory condition, set_value() might fail to allocate
> and not trigger any error with a release binary. os::set_boot_path()
> probably should also check and make sure sys path is not NULL after
> Arguments::set_sysclasspath().

The AllocateHeap() you listed below will call into the following which will exit on out of memory. So I don't think we need to do the NULL check again.

// allocate using malloc; will fail if no memory available
char* AllocateHeap(size_t size,
                   MEMFLAGS flags,
                   const NativeCallStack& stack,
                   AllocFailType alloc_failmode /* = AllocFailStrategy::EXIT_OOM*/) {
  char* p = (char*) os::malloc(size, flags, stack);
  if (p == NULL && alloc_failmode == AllocFailStrategy::EXIT_OOM) {
    vm_exit_out_of_memory(size, OOM_MALLOC_ERROR, "AllocateHeap");
  }
  return p;
}

thanks,
Calvin

>
> bool PathString::set_value(const char *value) {
>   if (_value != NULL) {
>     FreeHeap(_value);
>   }
>   _value = AllocateHeap(strlen(value)+1, mtArguments);
>   assert(_value != NULL, "Unable to allocate space for new path value");
>   if (_value != NULL) {
>     strcpy(_value, value);
>   } else {
>     // not able to allocate
>     return false;
>   }
>   return true;
> }
>
> Thanks,
> Jiangli
>
>> On Jul 10, 2018, at 10:31 AM, Calvin Cheung wrote:
>>
>> Updated webrev with the changes mentioned below:
>> http://cr.openjdk.java.net/~ccheung/8205946/webrev.01/
>>
>>
>> I've rerun hs-tier{1,2,3} tests.
>>
>> thanks,
>> Calvin
>>
>> On 7/9/18, 1:43 PM, Calvin Cheung wrote:
>>> Hi Lois,
>>>
>>> Thanks for your review.
>>>
>>> On 7/9/18, 11:58 AM, Lois Foltan wrote:
>>>> On 7/9/2018 1:29 PM, Calvin Cheung wrote:
>>>>
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8205946
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~ccheung/8205946/webrev.00/
>>>>>
>>>>>
>>>>> The JVM crash could be simulated by renaming/removing the modules
>>>>> file under the jdk/lib directory.
>>>>> The proposed simple fix is to perform a
>>>>> vm_exit_during_initialization().
>>>> >>>> Hi Calvin, >>>> >>>> Some clarifying questions. Is this just an issue for exploded builds? >>> I don't think so. As mentioned above, I could reproduce the crash >>> with a regular jdk image build by renaming the modules file under >>> the jdk/lib directory. >>>> I would prefer the exit to occur if the os::stat() fails for the >>>> system class path in os::set_boot_path(). >>> Instead of exiting in os::set_boot_path(), how about checking the >>> return status of os::set_boot_path() in the caller and exiting there >>> like the following: >>> bash-4.2$ hg diff os_linux.cpp >>> diff --git a/src/hotspot/os/linux/os_linux.cpp >>> b/src/hotspot/os/linux/os_linux.cpp >>> --- a/src/hotspot/os/linux/os_linux.cpp >>> +++ b/src/hotspot/os/linux/os_linux.cpp >>> @@ -367,7 +367,9 @@ >>> } >>> } >>> Arguments::set_java_home(buf); >>> - set_boot_path('/', ':'); >>> + if (!set_boot_path('/', ':')) { >>> + vm_exit_during_initialization("Failed setting boot class >>> path.", NULL); >>> + } >>> } >>> >>> Note that before the above change, the return status of >>> set_boot_path() isn't checked. >>> The above would involve changing 5 of those os_*.cpp files, one for >>> each O/S. >>> >>>> With certainly an added assert later in >>>> ClassLoader::setup_bootstrap_search_path() to ensure that the >>>> system class path is never NULL. >>> Sure, I can add an assert there. >>> I'll post updated webrev once I've made the change and done testing. >>> >>> thanks, >>> Calvin >>>> >>>> Thanks, >>>> Lois >>>> >>>>> >>>>> Ran hs-tier{1,2,3} tests successfully including the new test case. 
>>>>>
>>>>> thanks,
>>>>> Calvin
>>>>
>

From david.holmes at oracle.com Tue Jul 10 21:24:21 2018
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 11 Jul 2018 07:24:21 +1000
Subject: [11] RFR(S): 8206998: [test] runtime/ElfDecoder/TestElfDirectRead.java requires longer timeout on ppc64
In-Reply-To:
References:
Message-ID: <578bb78a-8d62-ebd9-84e7-8ce37da77fbe@oracle.com>

Hi Volker,

On 11/07/2018 3:52 AM, Volker Simonis wrote:
> Hi,
>
> can I please get a review for the following test-only change:
>
> http://cr.openjdk.java.net/~simonis/webrevs/2018/8206998/
> https://bugs.openjdk.java.net/browse/JDK-8206998
>
> The problem is that the test runtime/ElfDecoder/TestElfDirectRead.java
> intentionally disables caching of Elf sections during symbol lookup
> with WhiteBox.disableElfSectionCache(). On platforms which do not use
> file descriptors instead of plain function pointers this slows down
> the lookup just a little bit, because all the symbols from an Elf file
> are still read consecutively after one 'fseek()' call. But on
> platforms with file descriptors like ppc64 big-endian, we get two
> 'fseek()' calls for each symbol read from the Elf file because reading
> the file descriptor table is nested inside the loop which reads the
> symbols. This really thrashes the I/O system and considerably slows
> down the test, so we need an extra long timeout setting.
>
> The fix is trivial - simply provide two test versions (i.e. comments):
> the first one for all Linux flavors which are not ppc64 and a second,
> new one for Linux/ppc64 which simply has a bigger timeout.

I was not aware that this was a valid way of defining a test!
This suggests there can only be one "leading comment" per "defining file: http://openjdk.java.net/jtreg/tag-spec.html Need to verify this with the jtreg folk: jtreg-use at openjdk.java.net Thanks, David > Thank you and best regards, > Volker > From david.holmes at oracle.com Tue Jul 10 22:01:23 2018 From: david.holmes at oracle.com (David Holmes) Date: Wed, 11 Jul 2018 08:01:23 +1000 Subject: RFR(S): 8205946: JVM crash after call to ClassLoader::setup_bootstrap_search_path() In-Reply-To: <5B452252.9010209@oracle.com> References: <5B439B8F.3020709@oracle.com> <5B43C8E8.2060206@oracle.com> <5B44ED62.5030008@oracle.com> <5B452252.9010209@oracle.com> Message-ID: <31e1a39a-53b9-d923-5e86-c290827d9bdb@oracle.com> Calling vm_exit_during_initialization is very unfriendly for applications that host the JVM in-process directly. With that in mind decisions to call vm_exit_* should done at as a high a level as feasible in the code - ie. a low-level method like os::set_boot_path should never IMHO be making a decision as to whether an error it encounters is fatal to the whole VM initialization process. That's a decision to be made higher up. My 2c. David On 11/07/2018 7:17 AM, Calvin Cheung wrote: > Hi Jiangli, > > Thanks for reviewing. > > On 7/10/18, 12:18 PM, Jiangli Zhou wrote: >> Hi Calvin, >> >> The error handling code in platform specific code are identical. I >> like Lois? suggestion to check and exit in os::set_boot_path() to >> avoid duplicating the code. > If you want changes in os::set_boot_path() instead of in platform > specific code, I'm proposing the following: > diff --git a/src/hotspot/share/runtime/os.cpp > b/src/hotspot/share/runtime/os.cpp > --- a/src/hotspot/share/runtime/os.cpp > +++ b/src/hotspot/share/runtime/os.cpp > @@ -1270,7 +1270,7 @@ > ?? return file; > ?} > > -bool os::set_boot_path(char fileSep, char pathSep) { > +void os::set_boot_path(char fileSep, char pathSep) { > ?? const char* home = Arguments::get_java_home(); > ?? 
int home_len = (int)strlen(home); > > @@ -1278,26 +1278,30 @@ > > ?? // modular image if "modules" jimage exists > ?? char* jimage = format_boot_path("%/lib/" MODULES_IMAGE_NAME, home, > home_len, fileSep, pathSep); > -? if (jimage == NULL) return false; > +? if (jimage == NULL) { > +??? vm_exit_during_initialization("Failed setting boot class path.", > NULL); > +? } > ?? bool has_jimage = (os::stat(jimage, &st) == 0); > ?? if (has_jimage) { > ???? Arguments::set_sysclasspath(jimage, true); > ???? FREE_C_HEAP_ARRAY(char, jimage); > -??? return true; > +??? return; > ?? } > ?? FREE_C_HEAP_ARRAY(char, jimage); > > ?? // check if developer build with exploded modules > ?? char* base_classes = format_boot_path("%/modules/" JAVA_BASE_NAME, > home, home_len, fileSep, pathSep); > -? if (base_classes == NULL) return false; > +? if (base_classes == NULL) { > +??? vm_exit_during_initialization("Failed setting boot class path.", > NULL); > +? } > ?? if (os::stat(base_classes, &st) == 0) { > ???? Arguments::set_sysclasspath(base_classes, false); > ???? FREE_C_HEAP_ARRAY(char, base_classes); > -??? return true; > +??? return; > ?? } > ?? FREE_C_HEAP_ARRAY(char, base_classes); > - > -? return false; > +? vm_exit_during_initialization("Failed setting boot class path.", NULL); > +? return; > ?} > > The function fails if os::stat() fails or allocation of buffer for > "jimage" or "base_classes" fails. > Since vm should exit on failure in the above function, it is unnecessary > for the function to return a bool. > >> >> Also, under low memory condition, set_value() might fail to allocate >> and not trigger any error with a release binary.? os::set_boot_path() >> probably should also check and make sure sys path is not NULL after >> Arguments::set_sysclasspath(). > The AllocateHeap() you listed below will call into the following which > will exit on out of memory. So I don' think we need to do NULL check again. 
> > // allocate using malloc; will fail if no memory available > char* AllocateHeap(size_t size, > ?????????????????? MEMFLAGS flags, > ?????????????????? const NativeCallStack& stack, > ?????????????????? AllocFailType alloc_failmode /* = > AllocFailStrategy::EXIT_OOM*/) { > ? char* p = (char*) os::malloc(size, flags, stack); > ? if (p == NULL && alloc_failmode == AllocFailStrategy::EXIT_OOM) { > ??? vm_exit_out_of_memory(size, OOM_MALLOC_ERROR, "AllocateHeap"); > ? } > ? return p; > } > > thanks, > Calvin >> >> bool PathString::set_value(const char *value) { >> if (_value != NULL) { >> ??? FreeHeap(_value); >> ? } >> ? _value = AllocateHeap(strlen(value)+1, mtArguments); >> ? assert(_value != NULL, "Unable to allocate space for new path value"); >> if (_value != NULL) { >> ??? strcpy(_value, value); >> ? } else { >> // not able to allocate >> return false; >> ? } >> returntrue; >> } >> >> Thanks, >> Jiangli >> >>> On Jul 10, 2018, at 10:31 AM, Calvin Cheung >> > wrote: >>> >>> Updated webrev with the changes mentioned below: >>> http://cr.openjdk.java.net/~ccheung/8205946/webrev.01/ >>> >>> >>> I've rerun hs-tier{1,2,3} tests. >>> >>> thanks, >>> Calvin >>> >>> On 7/9/18, 1:43 PM, Calvin Cheung wrote: >>>> Hi Lois, >>>> >>>> Thanks for your review. >>>> >>>> On 7/9/18, 11:58 AM, Lois Foltan wrote: >>>>> On 7/9/2018 1:29 PM, Calvin Cheung wrote: >>>>> >>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8205946 >>>>>> >>>>>> webrev: http://cr.openjdk.java.net/~ccheung/8205946/webrev.00/ >>>>>> >>>>>> >>>>>> The JVM crash could be simulated by renaming/removing the modules >>>>>> file under the jdk/lib directory. >>>>>> The proposed simple fix is to perform a >>>>>> vm_exit_during_initialization(). >>>>> >>>>> Hi Calvin, >>>>> >>>>> Some clarifying questions.? Is this just an issue for exploded builds? >>>> I don't think so. 
As mentioned above, I could reproduce the crash >>>> with a regular jdk image build by renaming the modules file under >>>> the jdk/lib directory. >>>>> ?I would prefer the exit to occur if the os::stat() fails for the >>>>> system class path in os::set_boot_path(). >>>> Instead of exiting in os::set_boot_path(), how about checking the >>>> return status of os::set_boot_path() in the caller and exiting there >>>> like the following: >>>> bash-4.2$ hg diff os_linux.cpp >>>> diff --git a/src/hotspot/os/linux/os_linux.cpp >>>> b/src/hotspot/os/linux/os_linux.cpp >>>> --- a/src/hotspot/os/linux/os_linux.cpp >>>> +++ b/src/hotspot/os/linux/os_linux.cpp >>>> @@ -367,7 +367,9 @@ >>>> ????? } >>>> ??? } >>>> ??? Arguments::set_java_home(buf); >>>> -??? set_boot_path('/', ':'); >>>> +??? if (!set_boot_path('/', ':')) { >>>> +????? vm_exit_during_initialization("Failed setting boot class >>>> path.", NULL); >>>> +??? } >>>> ? } >>>> >>>> Note that before the above change, the return status of >>>> set_boot_path() isn't checked. >>>> The above would involve changing 5 of those os_*.cpp files, one for >>>> each O/S. >>>> >>>>> ?With certainly an added assert later in >>>>> ClassLoader::setup_bootstrap_search_path() to ensure that the >>>>> system class path is never NULL. >>>> Sure, I can add an assert there. >>>> I'll post updated webrev once I've made the change and done testing. >>>> >>>> thanks, >>>> Calvin >>>>> >>>>> Thanks, >>>>> Lois >>>>> >>>>>> >>>>>> Ran hs-tier{1,2,3} tests successfully including the new test case. 
>>>>>> >>>>>> thanks, >>>>>> Calvin >>>>> >> From calvin.cheung at oracle.com Tue Jul 10 22:27:16 2018 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 10 Jul 2018 15:27:16 -0700 Subject: RFR(S): 8205946: JVM crash after call to ClassLoader::setup_bootstrap_search_path() In-Reply-To: <31e1a39a-53b9-d923-5e86-c290827d9bdb@oracle.com> References: <5B439B8F.3020709@oracle.com> <5B43C8E8.2060206@oracle.com> <5B44ED62.5030008@oracle.com> <5B452252.9010209@oracle.com> <31e1a39a-53b9-d923-5e86-c290827d9bdb@oracle.com> Message-ID: <5B4532C4.6010904@oracle.com> Hi David, Thanks for chiming in. I take that I should go with webrev.01 of the change? thanks, Calvin On 7/10/18, 3:01 PM, David Holmes wrote: > Calling vm_exit_during_initialization is very unfriendly for > applications that host the JVM in-process directly. With that in mind > decisions to call vm_exit_* should done at as a high a level as > feasible in the code - ie. a low-level method like os::set_boot_path > should never IMHO be making a decision as to whether an error it > encounters is fatal to the whole VM initialization process. That's a > decision to be made higher up. > > My 2c. > > David > > On 11/07/2018 7:17 AM, Calvin Cheung wrote: >> Hi Jiangli, >> >> Thanks for reviewing. >> >> On 7/10/18, 12:18 PM, Jiangli Zhou wrote: >>> Hi Calvin, >>> >>> The error handling code in platform specific code are identical. I >>> like Lois? suggestion to check and exit in os::set_boot_path() to >>> avoid duplicating the code. 
>> If you want changes in os::set_boot_path() instead of in platform >> specific code, I'm proposing the following: >> diff --git a/src/hotspot/share/runtime/os.cpp >> b/src/hotspot/share/runtime/os.cpp >> --- a/src/hotspot/share/runtime/os.cpp >> +++ b/src/hotspot/share/runtime/os.cpp >> @@ -1270,7 +1270,7 @@ >> return file; >> } >> >> -bool os::set_boot_path(char fileSep, char pathSep) { >> +void os::set_boot_path(char fileSep, char pathSep) { >> const char* home = Arguments::get_java_home(); >> int home_len = (int)strlen(home); >> >> @@ -1278,26 +1278,30 @@ >> >> // modular image if "modules" jimage exists >> char* jimage = format_boot_path("%/lib/" MODULES_IMAGE_NAME, >> home, home_len, fileSep, pathSep); >> - if (jimage == NULL) return false; >> + if (jimage == NULL) { >> + vm_exit_during_initialization("Failed setting boot class path.", >> NULL); >> + } >> bool has_jimage = (os::stat(jimage, &st) == 0); >> if (has_jimage) { >> Arguments::set_sysclasspath(jimage, true); >> FREE_C_HEAP_ARRAY(char, jimage); >> - return true; >> + return; >> } >> FREE_C_HEAP_ARRAY(char, jimage); >> >> // check if developer build with exploded modules >> char* base_classes = format_boot_path("%/modules/" >> JAVA_BASE_NAME, home, home_len, fileSep, pathSep); >> - if (base_classes == NULL) return false; >> + if (base_classes == NULL) { >> + vm_exit_during_initialization("Failed setting boot class path.", >> NULL); >> + } >> if (os::stat(base_classes, &st) == 0) { >> Arguments::set_sysclasspath(base_classes, false); >> FREE_C_HEAP_ARRAY(char, base_classes); >> - return true; >> + return; >> } >> FREE_C_HEAP_ARRAY(char, base_classes); >> - >> - return false; >> + vm_exit_during_initialization("Failed setting boot class path.", >> NULL); >> + return; >> } >> >> The function fails if os::stat() fails or allocation of buffer for >> "jimage" or "base_classes" fails. >> Since vm should exit on failure in the above function, it is >> unnecessary for the function to return a bool. 
>> >>> >>> Also, under low memory condition, set_value() might fail to allocate >>> and not trigger any error with a release binary. >>> os::set_boot_path() probably should also check and make sure sys >>> path is not NULL after Arguments::set_sysclasspath(). >> The AllocateHeap() you listed below will call into the following >> which will exit on out of memory. So I don' think we need to do NULL >> check again. >> >> // allocate using malloc; will fail if no memory available >> char* AllocateHeap(size_t size, >> MEMFLAGS flags, >> const NativeCallStack& stack, >> AllocFailType alloc_failmode /* = >> AllocFailStrategy::EXIT_OOM*/) { >> char* p = (char*) os::malloc(size, flags, stack); >> if (p == NULL && alloc_failmode == AllocFailStrategy::EXIT_OOM) { >> vm_exit_out_of_memory(size, OOM_MALLOC_ERROR, "AllocateHeap"); >> } >> return p; >> } >> >> thanks, >> Calvin >>> >>> bool PathString::set_value(const char *value) { >>> if (_value != NULL) { >>> FreeHeap(_value); >>> } >>> _value = AllocateHeap(strlen(value)+1, mtArguments); >>> assert(_value != NULL, "Unable to allocate space for new path >>> value"); >>> if (_value != NULL) { >>> strcpy(_value, value); >>> } else { >>> // not able to allocate >>> return false; >>> } >>> returntrue; >>> } >>> >>> Thanks, >>> Jiangli >>> >>>> On Jul 10, 2018, at 10:31 AM, Calvin Cheung >>>> > wrote: >>>> >>>> Updated webrev with the changes mentioned below: >>>> http://cr.openjdk.java.net/~ccheung/8205946/webrev.01/ >>>> >>>> >>>> I've rerun hs-tier{1,2,3} tests. >>>> >>>> thanks, >>>> Calvin >>>> >>>> On 7/9/18, 1:43 PM, Calvin Cheung wrote: >>>>> Hi Lois, >>>>> >>>>> Thanks for your review. 
>>>>> >>>>> On 7/9/18, 11:58 AM, Lois Foltan wrote: >>>>>> On 7/9/2018 1:29 PM, Calvin Cheung wrote: >>>>>> >>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8205946 >>>>>>> >>>>>>> webrev: http://cr.openjdk.java.net/~ccheung/8205946/webrev.00/ >>>>>>> >>>>>>> >>>>>>> The JVM crash could be simulated by renaming/removing the >>>>>>> modules file under the jdk/lib directory. >>>>>>> The proposed simple fix is to perform a >>>>>>> vm_exit_during_initialization(). >>>>>> >>>>>> Hi Calvin, >>>>>> >>>>>> Some clarifying questions. Is this just an issue for exploded >>>>>> builds? >>>>> I don't think so. As mentioned above, I could reproduce the crash >>>>> with a regular jdk image build by renaming the modules file under >>>>> the jdk/lib directory. >>>>>> I would prefer the exit to occur if the os::stat() fails for the >>>>>> system class path in os::set_boot_path(). >>>>> Instead of exiting in os::set_boot_path(), how about checking the >>>>> return status of os::set_boot_path() in the caller and exiting >>>>> there like the following: >>>>> bash-4.2$ hg diff os_linux.cpp >>>>> diff --git a/src/hotspot/os/linux/os_linux.cpp >>>>> b/src/hotspot/os/linux/os_linux.cpp >>>>> --- a/src/hotspot/os/linux/os_linux.cpp >>>>> +++ b/src/hotspot/os/linux/os_linux.cpp >>>>> @@ -367,7 +367,9 @@ >>>>> } >>>>> } >>>>> Arguments::set_java_home(buf); >>>>> - set_boot_path('/', ':'); >>>>> + if (!set_boot_path('/', ':')) { >>>>> + vm_exit_during_initialization("Failed setting boot class >>>>> path.", NULL); >>>>> + } >>>>> } >>>>> >>>>> Note that before the above change, the return status of >>>>> set_boot_path() isn't checked. >>>>> The above would involve changing 5 of those os_*.cpp files, one >>>>> for each O/S. >>>>> >>>>>> With certainly an added assert later in >>>>>> ClassLoader::setup_bootstrap_search_path() to ensure that the >>>>>> system class path is never NULL. >>>>> Sure, I can add an assert there. 
>>>>> I'll post updated webrev once I've made the change and done testing.
>>>>>
>>>>> thanks,
>>>>> Calvin
>>>>>>
>>>>>> Thanks,
>>>>>> Lois
>>>>>>
>>>>>>>
>>>>>>> Ran hs-tier{1,2,3} tests successfully including the new test case.
>>>>>>>
>>>>>>> thanks,
>>>>>>> Calvin
>>>>>>
>>>

From xxinliu at amazon.com Mon Jul 9 23:31:45 2018
From: xxinliu at amazon.com (Liu, Xin)
Date: Mon, 9 Jul 2018 23:31:45 +0000
Subject: 8206075: add assertion for unbound assembler Labels
In-Reply-To:
References:
Message-ID: <42379DF9-CBC8-474E-8C8E-F70C455354FB@amazon.com>

Hi, Community,

Could you please review this small patch?

Bug: https://bugs.openjdk.java.net/browse/JDK-8206075
CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/

Problem:
X86-32/64 will leave an unbound label if UseOnStackReplacement is OFF.
This patch aligns x86 with other architectures.
Add an assertion to the destructor of Label. It won't add extra overhead because the C++ compiler will optimize away the empty destructor in release builds.

Previously, HotSpot could not pass this test on x86-64 with the assertion enabled.
make run-test TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java

If this CR is approved, Paul Hohensee will push it.

Thanks,
--lx

From navy.xliu at gmail.com Tue Jul 10 16:50:13 2018
From: navy.xliu at gmail.com (Liu Xin)
Date: Tue, 10 Jul 2018 09:50:13 -0700
Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86
Message-ID:

Hi, Community,

Could you please review this small patch?

Bug: https://bugs.openjdk.java.net/browse/JDK-8206075
CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/

Problem:
X86-32/64 will leave an unbound label if UseOnStackReplacement is OFF.
This patch aligns x86 with other architectures (ppc, arm).
Add an assertion to the destructor of Label. It will be compiled out in release builds.

Previously, HotSpot could not pass this test on x86-64 with the assertion enabled.
make run-test TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java

If this CR is approved, Paul Hohensee will push it.
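A minimal sketch of the destructor assertion described in this request, assuming a simplified label that tracks only a bound offset and a count of pending branches. LabelSketch and all of its members are hypothetical names used for illustration; the real HotSpot Label class differs, and HotSpot uses its own assert macro rather than <cassert>.

```cpp
#include <cassert>

// Hypothetical, simplified model of an assembler label (not HotSpot code).
class LabelSketch {
  int _loc;          // bound code offset, or -1 while unbound
  int _patch_count;  // branches emitted before the label was bound
 public:
  LabelSketch() : _loc(-1), _patch_count(0) {}

  bool is_bound()  const { return _loc >= 0; }
  bool is_unused() const { return _loc < 0 && _patch_count == 0; }

  // Record a branch that targets this label; only unbound labels
  // need to remember it for later patching.
  void add_branch() { if (!is_bound()) ++_patch_count; }

  // Fix the label to a code offset; pending branches are now resolved.
  void bind(int offset) { _loc = offset; _patch_count = 0; }

  ~LabelSketch() {
    // A label that was used as a branch target must be bound before
    // it goes out of scope. With NDEBUG defined this expands to nothing.
    assert(is_bound() || is_unused());
  }
};
```

With NDEBUG defined the destructor body is empty, which is the reason the check is expected to cost nothing in a release build.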
Thanks, --lx From jiangli.zhou at oracle.com Tue Jul 10 23:09:57 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 10 Jul 2018 16:09:57 -0700 Subject: RFR(S): 8205946: JVM crash after call to ClassLoader::setup_bootstrap_search_path() In-Reply-To: <5B452252.9010209@oracle.com> References: <5B439B8F.3020709@oracle.com> <5B43C8E8.2060206@oracle.com> <5B44ED62.5030008@oracle.com> <5B452252.9010209@oracle.com> Message-ID: <911C10C4-5078-4E9D-9D60-C8E971B50C89@oracle.com> Hi Calvin, > On Jul 10, 2018, at 2:17 PM, Calvin Cheung wrote: > > Hi Jiangli, > > Thanks for reviewing. > >> On 7/10/18, 12:18 PM, Jiangli Zhou wrote: >> Hi Calvin, >> >> The error handling code in platform specific code are identical. I like Lois? suggestion to check and exit in os::set_boot_path() to avoid duplicating the code. > If you want changes in os::set_boot_path() instead of in platform specific code, I'm proposing the following: > diff --git a/src/hotspot/share/runtime/os.cpp b/src/hotspot/share/runtime/os.cpp > --- a/src/hotspot/share/runtime/os.cpp > +++ b/src/hotspot/share/runtime/os.cpp > @@ -1270,7 +1270,7 @@ > return file; > } > > -bool os::set_boot_path(char fileSep, char pathSep) { > +void os::set_boot_path(char fileSep, char pathSep) { > const char* home = Arguments::get_java_home(); > int home_len = (int)strlen(home); > > @@ -1278,26 +1278,30 @@ > > // modular image if "modules" jimage exists > char* jimage = format_boot_path("%/lib/" MODULES_IMAGE_NAME, home, home_len, fileSep, pathSep); > - if (jimage == NULL) return false; > + if (jimage == NULL) { > + vm_exit_during_initialization("Failed setting boot class path.", NULL); > + } > bool has_jimage = (os::stat(jimage, &st) == 0); > if (has_jimage) { > Arguments::set_sysclasspath(jimage, true); > FREE_C_HEAP_ARRAY(char, jimage); > - return true; > + return; > } > FREE_C_HEAP_ARRAY(char, jimage); > > // check if developer build with exploded modules > char* base_classes = format_boot_path("%/modules/" JAVA_BASE_NAME, 
home, home_len, fileSep, pathSep);
> -  if (base_classes == NULL) return false;
> +  if (base_classes == NULL) {
> +    vm_exit_during_initialization("Failed setting boot class path.", NULL);
> +  }
>    if (os::stat(base_classes, &st) == 0) {
>      Arguments::set_sysclasspath(base_classes, false);
>      FREE_C_HEAP_ARRAY(char, base_classes);
> -    return true;
> +    return;
>    }
>    FREE_C_HEAP_ARRAY(char, base_classes);
> -
> -  return false;
> +  vm_exit_during_initialization("Failed setting boot class path.", NULL);
> +  return;
>  }
>
> The function fails if os::stat() fails or allocation of buffer for "jimage" or "base_classes" fails.
> Since vm should exit on failure in the above function, it is unnecessary for the function to return a bool.

That looks good. On the other hand, David's comments also sound reasonable to me. So it's your call.

>
>>
>> Also, under low memory condition, set_value() might fail to allocate and not trigger any error with a release binary. os::set_boot_path() probably should also check and make sure sys path is not NULL after Arguments::set_sysclasspath().
> The AllocateHeap() you listed below will call into the following which will exit on out of memory. So I don't think we need to do NULL check again.
>
> // allocate using malloc; will fail if no memory available
> char* AllocateHeap(size_t size,
>                    MEMFLAGS flags,
>                    const NativeCallStack& stack,
>                    AllocFailType alloc_failmode /* = AllocFailStrategy::EXIT_OOM*/) {
>   char* p = (char*) os::malloc(size, flags, stack);
>   if (p == NULL && alloc_failmode == AllocFailStrategy::EXIT_OOM) {
>     vm_exit_out_of_memory(size, OOM_MALLOC_ERROR, "AllocateHeap");
>   }
>   return p;
> }

Thanks for digging further down. So PathString::set_value() would never return false in this case. It seems we should add a comment in set_value() to avoid future confusion. Also the "else" case can be removed. That can be handled separately.
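The cleanup suggested here could look roughly like the sketch below. It assumes an allocator that terminates the process on failure, as AllocateHeap() does with EXIT_OOM; allocate_or_exit and PathStringSketch are hypothetical stand-ins for illustration, not HotSpot code.

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for AllocateHeap() with the EXIT_OOM strategy:
// it never returns NULL, it terminates the process instead.
static char* allocate_or_exit(size_t size) {
  char* p = static_cast<char*>(std::malloc(size));
  if (p == NULL) {
    std::fprintf(stderr, "out of memory allocating %zu bytes\n", size);
    std::exit(1);
  }
  return p;
}

// Illustrative model of a path holder (not the real PathString).
class PathStringSketch {
  char* _value;
 public:
  PathStringSketch() : _value(NULL) {}
  PathStringSketch(const PathStringSketch&) = delete;
  PathStringSketch& operator=(const PathStringSketch&) = delete;
  ~PathStringSketch() { std::free(_value); }

  const char* value() const { return _value; }

  // The allocator exits on failure, so there is no failure path here:
  // the "else" branch is gone and the function always returns true.
  bool set_value(const char* value) {
    // Copy first, then release the old buffer, so that passing the
    // current value back in is still safe.
    char* copy = allocate_or_exit(std::strlen(value) + 1);
    std::strcpy(copy, value);
    std::free(_value);
    _value = copy;
    return true;
  }
};
```

Keeping the bool return type leaves existing callers untouched; the comment inside set_value() is the part that documents why the result can no longer be false.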
Thanks, Jiangli > > thanks, > Calvin >> >> bool PathString::set_value(const char *value) { >> if (_value != NULL) { >> FreeHeap(_value); >> } >> _value = AllocateHeap(strlen(value)+1, mtArguments); >> assert(_value != NULL, "Unable to allocate space for new path value"); >> if (_value != NULL) { >> strcpy(_value, value); >> } else { >> // not able to allocate >> return false; >> } >> return true; >> } >> >> Thanks, >> Jiangli >> >>> On Jul 10, 2018, at 10:31 AM, Calvin Cheung wrote: >>> >>> Updated webrev with the changes mentioned below: >>> http://cr.openjdk.java.net/~ccheung/8205946/webrev.01/ >>> >>> I've rerun hs-tier{1,2,3} tests. >>> >>> thanks, >>> Calvin >>> >>>> On 7/9/18, 1:43 PM, Calvin Cheung wrote: >>>> Hi Lois, >>>> >>>> Thanks for your review. >>>> >>>>> On 7/9/18, 11:58 AM, Lois Foltan wrote: >>>>> On 7/9/2018 1:29 PM, Calvin Cheung wrote: >>>>> >>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8205946 >>>>>> >>>>>> webrev: http://cr.openjdk.java.net/~ccheung/8205946/webrev.00/ >>>>>> >>>>>> The JVM crash could be simulated by renaming/removing the modules file under the jdk/lib directory. >>>>>> The proposed simple fix is to perform a vm_exit_during_initialization(). >>>>> >>>>> Hi Calvin, >>>>> >>>>> Some clarifying questions. Is this just an issue for exploded builds? >>>> I don't think so. As mentioned above, I could reproduce the crash with a regular jdk image build by renaming the modules file under the jdk/lib directory. >>>>> I would prefer the exit to occur if the os::stat() fails for the system class path in os::set_boot_path(). 
>>>> Instead of exiting in os::set_boot_path(), how about checking the return status of os::set_boot_path() in the caller and exiting there like the following: >>>> bash-4.2$ hg diff os_linux.cpp >>>> diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp >>>> --- a/src/hotspot/os/linux/os_linux.cpp >>>> +++ b/src/hotspot/os/linux/os_linux.cpp >>>> @@ -367,7 +367,9 @@ >>>> } >>>> } >>>> Arguments::set_java_home(buf); >>>> - set_boot_path('/', ':'); >>>> + if (!set_boot_path('/', ':')) { >>>> + vm_exit_during_initialization("Failed setting boot class path.", NULL); >>>> + } >>>> } >>>> >>>> Note that before the above change, the return status of set_boot_path() isn't checked. >>>> The above would involve changing 5 of those os_*.cpp files, one for each O/S. >>>> >>>>> With certainly an added assert later in ClassLoader::setup_bootstrap_search_path() to ensure that the system class path is never NULL. >>>> Sure, I can add an assert there. >>>> I'll post updated webrev once I've made the change and done testing. >>>> >>>> thanks, >>>> Calvin >>>>> >>>>> Thanks, >>>>> Lois >>>>> >>>>>> >>>>>> Ran hs-tier{1,2,3} tests successfully including the new test case. >>>>>> >>>>>> thanks, >>>>>> Calvin >>>>> >> From david.holmes at oracle.com Tue Jul 10 23:13:11 2018 From: david.holmes at oracle.com (David Holmes) Date: Wed, 11 Jul 2018 09:13:11 +1000 Subject: RFR(S): 8205946: JVM crash after call to ClassLoader::setup_bootstrap_search_path() In-Reply-To: <5B4532C4.6010904@oracle.com> References: <5B439B8F.3020709@oracle.com> <5B43C8E8.2060206@oracle.com> <5B44ED62.5030008@oracle.com> <5B452252.9010209@oracle.com> <31e1a39a-53b9-d923-5e86-c290827d9bdb@oracle.com> <5B4532C4.6010904@oracle.com> Message-ID: On 11/07/2018 8:27 AM, Calvin Cheung wrote: > Hi David, > > Thanks for chiming in. > > I take that I should go with webrev.01 of the change? That's up to you and your reviewers. 
:) David > thanks, > Calvin > > On 7/10/18, 3:01 PM, David Holmes wrote: >> Calling vm_exit_during_initialization is very unfriendly for >> applications that host the JVM in-process directly. With that in mind >> decisions to call vm_exit_* should done at as a high a level as >> feasible in the code - ie. a low-level method like os::set_boot_path >> should never IMHO be making a decision as to whether an error it >> encounters is fatal to the whole VM initialization process. That's a >> decision to be made higher up. >> >> My 2c. >> >> David >> >> On 11/07/2018 7:17 AM, Calvin Cheung wrote: >>> Hi Jiangli, >>> >>> Thanks for reviewing. >>> >>> On 7/10/18, 12:18 PM, Jiangli Zhou wrote: >>>> Hi Calvin, >>>> >>>> The error handling code in platform specific code are identical. I >>>> like Lois? suggestion to check and exit in os::set_boot_path() to >>>> avoid duplicating the code. >>> If you want changes in os::set_boot_path() instead of in platform >>> specific code, I'm proposing the following: >>> diff --git a/src/hotspot/share/runtime/os.cpp >>> b/src/hotspot/share/runtime/os.cpp >>> --- a/src/hotspot/share/runtime/os.cpp >>> +++ b/src/hotspot/share/runtime/os.cpp >>> @@ -1270,7 +1270,7 @@ >>> ??? return file; >>> ? } >>> >>> -bool os::set_boot_path(char fileSep, char pathSep) { >>> +void os::set_boot_path(char fileSep, char pathSep) { >>> ??? const char* home = Arguments::get_java_home(); >>> ??? int home_len = (int)strlen(home); >>> >>> @@ -1278,26 +1278,30 @@ >>> >>> ??? // modular image if "modules" jimage exists >>> ??? char* jimage = format_boot_path("%/lib/" MODULES_IMAGE_NAME, >>> home, home_len, fileSep, pathSep); >>> -? if (jimage == NULL) return false; >>> +? if (jimage == NULL) { >>> +??? vm_exit_during_initialization("Failed setting boot class path.", >>> NULL); >>> +? } >>> ??? bool has_jimage = (os::stat(jimage, &st) == 0); >>> ??? if (has_jimage) { >>> ????? Arguments::set_sysclasspath(jimage, true); >>> ????? 
FREE_C_HEAP_ARRAY(char, jimage); >>> -??? return true; >>> +??? return; >>> ??? } >>> ??? FREE_C_HEAP_ARRAY(char, jimage); >>> >>> ??? // check if developer build with exploded modules >>> ??? char* base_classes = format_boot_path("%/modules/" >>> JAVA_BASE_NAME, home, home_len, fileSep, pathSep); >>> -? if (base_classes == NULL) return false; >>> +? if (base_classes == NULL) { >>> +??? vm_exit_during_initialization("Failed setting boot class path.", >>> NULL); >>> +? } >>> ??? if (os::stat(base_classes, &st) == 0) { >>> ????? Arguments::set_sysclasspath(base_classes, false); >>> ????? FREE_C_HEAP_ARRAY(char, base_classes); >>> -??? return true; >>> +??? return; >>> ??? } >>> ??? FREE_C_HEAP_ARRAY(char, base_classes); >>> - >>> -? return false; >>> +? vm_exit_during_initialization("Failed setting boot class path.", >>> NULL); >>> +? return; >>> ? } >>> >>> The function fails if os::stat() fails or allocation of buffer for >>> "jimage" or "base_classes" fails. >>> Since vm should exit on failure in the above function, it is >>> unnecessary for the function to return a bool. >>> >>>> >>>> Also, under low memory condition, set_value() might fail to allocate >>>> and not trigger any error with a release binary. os::set_boot_path() >>>> probably should also check and make sure sys path is not NULL after >>>> Arguments::set_sysclasspath(). >>> The AllocateHeap() you listed below will call into the following >>> which will exit on out of memory. So I don' think we need to do NULL >>> check again. >>> >>> // allocate using malloc; will fail if no memory available >>> char* AllocateHeap(size_t size, >>> ??????????????????? MEMFLAGS flags, >>> ??????????????????? const NativeCallStack& stack, >>> ??????????????????? AllocFailType alloc_failmode /* = >>> AllocFailStrategy::EXIT_OOM*/) { >>> ?? char* p = (char*) os::malloc(size, flags, stack); >>> ?? if (p == NULL && alloc_failmode == AllocFailStrategy::EXIT_OOM) { >>> ???? 
vm_exit_out_of_memory(size, OOM_MALLOC_ERROR, "AllocateHeap"); >>> ?? } >>> ?? return p; >>> } >>> >>> thanks, >>> Calvin >>>> >>>> bool PathString::set_value(const char *value) { >>>> if (_value != NULL) { >>>> ??? FreeHeap(_value); >>>> ? } >>>> ? _value = AllocateHeap(strlen(value)+1, mtArguments); >>>> ? assert(_value != NULL, "Unable to allocate space for new path >>>> value"); >>>> if (_value != NULL) { >>>> ??? strcpy(_value, value); >>>> ? } else { >>>> // not able to allocate >>>> return false; >>>> ? } >>>> returntrue; >>>> } >>>> >>>> Thanks, >>>> Jiangli >>>> >>>>> On Jul 10, 2018, at 10:31 AM, Calvin Cheung >>>>> > wrote: >>>>> >>>>> Updated webrev with the changes mentioned below: >>>>> http://cr.openjdk.java.net/~ccheung/8205946/webrev.01/ >>>>> >>>>> >>>>> I've rerun hs-tier{1,2,3} tests. >>>>> >>>>> thanks, >>>>> Calvin >>>>> >>>>> On 7/9/18, 1:43 PM, Calvin Cheung wrote: >>>>>> Hi Lois, >>>>>> >>>>>> Thanks for your review. >>>>>> >>>>>> On 7/9/18, 11:58 AM, Lois Foltan wrote: >>>>>>> On 7/9/2018 1:29 PM, Calvin Cheung wrote: >>>>>>> >>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8205946 >>>>>>>> >>>>>>>> webrev: http://cr.openjdk.java.net/~ccheung/8205946/webrev.00/ >>>>>>>> >>>>>>>> >>>>>>>> The JVM crash could be simulated by renaming/removing the >>>>>>>> modules file under the jdk/lib directory. >>>>>>>> The proposed simple fix is to perform a >>>>>>>> vm_exit_during_initialization(). >>>>>>> >>>>>>> Hi Calvin, >>>>>>> >>>>>>> Some clarifying questions.? Is this just an issue for exploded >>>>>>> builds? >>>>>> I don't think so. As mentioned above, I could reproduce the crash >>>>>> with a regular jdk image build by renaming the modules file under >>>>>> the jdk/lib directory. >>>>>>> ?I would prefer the exit to occur if the os::stat() fails for the >>>>>>> system class path in os::set_boot_path(). 
>>>>>> Instead of exiting in os::set_boot_path(), how about checking the >>>>>> return status of os::set_boot_path() in the caller and exiting >>>>>> there like the following: >>>>>> bash-4.2$ hg diff os_linux.cpp >>>>>> diff --git a/src/hotspot/os/linux/os_linux.cpp >>>>>> b/src/hotspot/os/linux/os_linux.cpp >>>>>> --- a/src/hotspot/os/linux/os_linux.cpp >>>>>> +++ b/src/hotspot/os/linux/os_linux.cpp >>>>>> @@ -367,7 +367,9 @@ >>>>>> ????? } >>>>>> ??? } >>>>>> ??? Arguments::set_java_home(buf); >>>>>> -??? set_boot_path('/', ':'); >>>>>> +??? if (!set_boot_path('/', ':')) { >>>>>> +????? vm_exit_during_initialization("Failed setting boot class >>>>>> path.", NULL); >>>>>> +??? } >>>>>> ? } >>>>>> >>>>>> Note that before the above change, the return status of >>>>>> set_boot_path() isn't checked. >>>>>> The above would involve changing 5 of those os_*.cpp files, one >>>>>> for each O/S. >>>>>> >>>>>>> ?With certainly an added assert later in >>>>>>> ClassLoader::setup_bootstrap_search_path() to ensure that the >>>>>>> system class path is never NULL. >>>>>> Sure, I can add an assert there. >>>>>> I'll post updated webrev once I've made the change and done testing. >>>>>> >>>>>> thanks, >>>>>> Calvin >>>>>>> >>>>>>> Thanks, >>>>>>> Lois >>>>>>> >>>>>>>> >>>>>>>> Ran hs-tier{1,2,3} tests successfully including the new test case. >>>>>>>> >>>>>>>> thanks, >>>>>>>> Calvin >>>>>>> >>>> From david.holmes at oracle.com Tue Jul 10 23:17:46 2018 From: david.holmes at oracle.com (David Holmes) Date: Wed, 11 Jul 2018 09:17:46 +1000 Subject: RFR(S) 8206183: Possible construct EMPTY_STACK and allocation stack, etc. 
on first use
In-Reply-To: <9503ded0-bc68-543b-a1ab-6d884854dc9a@redhat.com>
References: <2cd7efd8-d6d2-0a65-d87b-b264d8bd3970@oracle.com> <9503ded0-bc68-543b-a1ab-6d884854dc9a@redhat.com>
Message-ID: <580231be-537e-bafd-baf9-7562162d09f0@oracle.com>

On 9/07/2018 9:45 PM, Zhengyu Gu wrote:
> Hi David,
>
> On 07/09/2018 12:37 AM, David Holmes wrote:
>> Hi Zhengyu,
>>
>> On 7/07/2018 9:36 PM, Zhengyu Gu wrote:
>>> Hi,
>>>
>>> NMT has to work around static initialization order issues: some
>>> static objects that allocate memory inside their constructors may be
>>> initialized ahead of NMT, so NMT is forced to initialize itself early
>>> and risks having its static objects reinitialized by the C runtime.
>>>
>>> The workaround was to declare storage for the static objects as
>>> primitive arrays, then use the placement new operator to initialize
>>> them, or just initialize them eagerly if the results are constants.
>>>
>>> But that solution is not elegant and could break with some compilers.
>>> A better solution is to use the "Construct on First Use Idiom"
>>> (https://isocpp.org/wiki/faq/ctors#static-init-order). Because we only
>>> have initialization order problems and those static objects do not have
>>> dependencies on other static objects, we don't suffer from static
>>> deinitialization problems.
>>
>> Okay but this relies on C++11 thread-safe static initialization. That's
>> only available in VS2015 and above (which should be okay for JDK 12+).
>> What about other compilers? Does it have to be enabled via any
>> compilation flags?
>
> Thanks for pointing that out.
>
> NMT is always initialized while the JVM is still in single-threaded mode,
> so I think it is safe even without language support, or am I missing
> something here?

Okay - that sounds alright then.

Thanks,
David

> -Zhengyu
>
>
>>
>> I'm currently running this through some additional internal build/tests.
>> >> Thanks, >> David >> ----- >> >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206183 >>> Webrev: http://cr.openjdk.java.net/~zgu/8206183/webrev.00/ >>> >>> Test: >>> >>> ?? hotspot_nmt on Linux 64 (fastdebug and release) >>> ?? Submit-test. >>> >>> >>> Thanks, >>> >>> -Zhengyu From vladimir.kozlov at oracle.com Wed Jul 11 00:08:28 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 10 Jul 2018 17:08:28 -0700 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: References: Message-ID: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> Fix looks reasonable. I will test it in our framework. Thanks, Vladimir On 7/10/18 9:50 AM, Liu Xin wrote: > Hi, Community, > > Could you please review this small patch? > > Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 > CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/ > > Problem: > X86-32/64 will leave an unbound label if UseOnStackReplacement is OFF. > This patch align up x86 with other architectures(ppc, arm). > Add an assertion to the destructor of Label. It will be wiped out in release build. > > Previously, hotspot cannot pass this test with assertion on x86-64. > make run-test TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java > > If this CR is approved, Paul Hohensee will push it. > > Thanks, > --lx > > > From leonid.mesnik at oracle.com Wed Jul 11 00:13:03 2018 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Tue, 10 Jul 2018 17:13:03 -0700 Subject: RFR(XS) 8139876: Exclude hanging nsk/stress/stack from execution with deoptimization enabled Message-ID: <9F7B6107-AA4E-46AE-BB34-1930839C35A5@oracle.com> Hi Could you please review following fix which add run nsk/stress/stack/* tests only when DeoptimizeALot is disabled. The reason of exclusion is same as for JDK-8172854 [TESTBUG] Exclude runtime/ReservedStack/ReservedStackTest.java from being run with DeoptimizeALot option Tests create a lot of recursive calls to trigger stack overflow. 
Running tests with DeoptimizeALot increase time significantly. (Up to 10 hours...) Also fix slightly update tests runtime/ReservedStack/* to use correct option name. I occasionally found that test still executed after JDK-8172854 because of typo "Alot" instead of "ALot". webrev: http://cr.openjdk.java.net/~lmesnik/8139876/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8139876 Please note that bug JDK-819988 [TESTBUG] fromTonga/nsk/stress/stack tests fail by timeout when -Xcomp is used is different. The tests hang with Xcomp intermittently and this issue require additional investigation and fix. Leonid From vladimir.kozlov at oracle.com Wed Jul 11 00:27:25 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 10 Jul 2018 17:27:25 -0700 Subject: RFR(XS) 8139876: Exclude hanging nsk/stress/stack from execution with deoptimization enabled In-Reply-To: <9F7B6107-AA4E-46AE-BB34-1930839C35A5@oracle.com> References: <9F7B6107-AA4E-46AE-BB34-1930839C35A5@oracle.com> Message-ID: <8eb00ecb-9c89-ee85-59ec-6cb2a5cce334@oracle.com> Looks good. Thanks, Vladimir On 7/10/18 5:13 PM, Leonid Mesnik wrote: > Hi > > Could you please review following fix which add run nsk/stress/stack/* tests only when DeoptimizeALot is disabled. > The reason of exclusion is same as for > JDK-8172854 [TESTBUG] Exclude runtime/ReservedStack/ReservedStackTest.java from being run with DeoptimizeALot option > Tests create a lot of recursive calls to trigger stack overflow. Running tests with DeoptimizeALot increase time significantly. (Up to 10 hours...) > > Also fix slightly update tests runtime/ReservedStack/* to use correct option name. I occasionally found that test still executed after JDK-8172854 because of typo "Alot" instead of "ALot". 
> > webrev: http://cr.openjdk.java.net/~lmesnik/8139876/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8139876 > > Please note that bug > JDK-819988 [TESTBUG] fromTonga/nsk/stress/stack tests fail by timeout when -Xcomp is used > is different. The tests hang with Xcomp intermittently and this issue require additional investigation and fix. > > Leonid > From calvin.cheung at oracle.com Wed Jul 11 00:43:23 2018 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 10 Jul 2018 17:43:23 -0700 Subject: RFR(S): 8205946: JVM crash after call to ClassLoader::setup_bootstrap_search_path() In-Reply-To: <911C10C4-5078-4E9D-9D60-C8E971B50C89@oracle.com> References: <5B439B8F.3020709@oracle.com> <5B43C8E8.2060206@oracle.com> <5B44ED62.5030008@oracle.com> <5B452252.9010209@oracle.com> <911C10C4-5078-4E9D-9D60-C8E971B50C89@oracle.com> Message-ID: <5B4552AB.8000401@oracle.com> On 7/10/18, 4:09 PM, Jiangli Zhou wrote: > Hi Calvin, > > > On Jul 10, 2018, at 2:17 PM, Calvin Cheung > wrote: > >> Hi Jiangli, >> >> Thanks for reviewing. >> >> On 7/10/18, 12:18 PM, Jiangli Zhou wrote: >>> Hi Calvin, >>> >>> The error handling code in platform specific code are identical. I >>> like Lois? suggestion to check and exit in os::set_boot_path() to >>> avoid duplicating the code. 
>> If you want changes in os::set_boot_path() instead of in platform >> specific code, I'm proposing the following: >> diff --git a/src/hotspot/share/runtime/os.cpp >> b/src/hotspot/share/runtime/os.cpp >> --- a/src/hotspot/share/runtime/os.cpp >> +++ b/src/hotspot/share/runtime/os.cpp >> @@ -1270,7 +1270,7 @@ >> return file; >> } >> >> -bool os::set_boot_path(char fileSep, char pathSep) { >> +void os::set_boot_path(char fileSep, char pathSep) { >> const char* home = Arguments::get_java_home(); >> int home_len = (int)strlen(home); >> >> @@ -1278,26 +1278,30 @@ >> >> // modular image if "modules" jimage exists >> char* jimage = format_boot_path("%/lib/" MODULES_IMAGE_NAME, home, >> home_len, fileSep, pathSep); >> - if (jimage == NULL) return false; >> + if (jimage == NULL) { >> + vm_exit_during_initialization("Failed setting boot class path.", >> NULL); >> + } >> bool has_jimage = (os::stat(jimage, &st) == 0); >> if (has_jimage) { >> Arguments::set_sysclasspath(jimage, true); >> FREE_C_HEAP_ARRAY(char, jimage); >> - return true; >> + return; >> } >> FREE_C_HEAP_ARRAY(char, jimage); >> >> // check if developer build with exploded modules >> char* base_classes = format_boot_path("%/modules/" JAVA_BASE_NAME, >> home, home_len, fileSep, pathSep); >> - if (base_classes == NULL) return false; >> + if (base_classes == NULL) { >> + vm_exit_during_initialization("Failed setting boot class path.", >> NULL); >> + } >> if (os::stat(base_classes, &st) == 0) { >> Arguments::set_sysclasspath(base_classes, false); >> FREE_C_HEAP_ARRAY(char, base_classes); >> - return true; >> + return; >> } >> FREE_C_HEAP_ARRAY(char, base_classes); >> - >> - return false; >> + vm_exit_during_initialization("Failed setting boot class path.", >> NULL); >> + return; >> } >> >> The function fails if os::stat() fails or allocation of buffer for >> "jimage" or "base_classes" fails. >> Since vm should exit on failure in the above function, it is >> unnecessary for the function to return a bool. 
> > That looks good. On the other hand David's comments also sound > reasonable to me. So it's your call. Let's go with webrev.01. > >> >>> >>> Also, under low memory condition, set_value() might fail to allocate >>> and not trigger any error with a release binary. >>> os::set_boot_path() probably should also check and make sure sys >>> path is not NULL after Arguments::set_sysclasspath(). >> The AllocateHeap() you listed below will call into the following >> which will exit on out of memory. So I don't think we need to do NULL >> check again. >> >> // allocate using malloc; will fail if no memory available >> char* AllocateHeap(size_t size, >> MEMFLAGS flags, >> const NativeCallStack& stack, >> AllocFailType alloc_failmode /* = >> AllocFailStrategy::EXIT_OOM*/) { >> char* p = (char*) os::malloc(size, flags, stack); >> if (p == NULL && alloc_failmode == AllocFailStrategy::EXIT_OOM) { >> vm_exit_out_of_memory(size, OOM_MALLOC_ERROR, "AllocateHeap"); >> } >> return p; >> } > > Thanks for digging further down. So PathString::set_value() would > never return false in this case. It seems we should add a comment in > set_value() to avoid future confusion. Also the 'else' case can be > removed. That can be handled separately. The function is being called at eight different places and none of them checks the return value. Yes, let's clean it up in a separate bug/RFE.
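As a rough illustration of the cleanup proposed above, here is a minimal sketch of what set_value() could look like once the dead 'else' branch and the unused bool result are gone. It is not the HotSpot code: allocate_or_exit() is a hypothetical stand-in for AllocateHeap() with AllocFailStrategy::EXIT_OOM, and this PathString is a simplified stand-in class.

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for AllocateHeap() with AllocFailStrategy::EXIT_OOM:
// it either returns usable memory or terminates the process.
static char* allocate_or_exit(size_t size) {
  char* p = static_cast<char*>(std::malloc(size));
  if (p == nullptr) {
    std::fprintf(stderr, "out of memory\n");
    std::exit(1);
  }
  return p;
}

// Simplified stand-in for PathString (assumption, not the real class).
class PathString {
  char* _value = nullptr;
 public:
  ~PathString() { std::free(_value); }

  // Returns void: the allocator above never hands back NULL, so there is
  // no failure for callers to check and no 'else' branch to keep.
  void set_value(const char* value) {
    std::free(_value);
    _value = allocate_or_exit(std::strlen(value) + 1);
    std::strcpy(_value, value);
  }

  const char* value() const { return _value; }
};
```

With this shape, the callers that already ignore the result need no change.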
thanks, Calvin > > Thanks, > Jiangli > >> >> thanks, >> Calvin >>> >>> bool PathString::set_value(const char *value) { >>> if (_value != NULL) { >>> FreeHeap(_value); >>> } >>> _value = AllocateHeap(strlen(value)+1, mtArguments); >>> assert(_value != NULL, "Unable to allocate space for new path value"); >>> if (_value != NULL) { >>> strcpy(_value, value); >>> } else { >>> // not able to allocate >>> return false; >>> } >>> return true; >>> } >>> >>> Thanks, >>> Jiangli >>> >>>> On Jul 10, 2018, at 10:31 AM, Calvin Cheung wrote: >>>> >>>> Updated webrev with the changes mentioned below: >>>> http://cr.openjdk.java.net/~ccheung/8205946/webrev.01/ >>>> >>>> >>>> I've rerun hs-tier{1,2,3} tests. >>>> >>>> thanks, >>>> Calvin >>>> >>>> On 7/9/18, 1:43 PM, Calvin Cheung wrote: >>>>> Hi Lois, >>>>> >>>>> Thanks for your review. >>>>> >>>>> On 7/9/18, 11:58 AM, Lois Foltan wrote: >>>>>> On 7/9/2018 1:29 PM, Calvin Cheung wrote: >>>>>> >>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8205946 >>>>>>> >>>>>>> webrev: http://cr.openjdk.java.net/~ccheung/8205946/webrev.00/ >>>>>>> >>>>>>> >>>>>>> The JVM crash could be simulated by renaming/removing the >>>>>>> modules file under the jdk/lib directory. >>>>>>> The proposed simple fix is to perform a >>>>>>> vm_exit_during_initialization(). >>>>>> >>>>>> Hi Calvin, >>>>>> >>>>>> Some clarifying questions. Is this just an issue for exploded >>>>>> builds? >>>>> I don't think so. As mentioned above, I could reproduce the crash >>>>> with a regular jdk image build by renaming the modules file under >>>>> the jdk/lib directory. >>>>>> I would prefer the exit to occur if the os::stat() fails for the >>>>>> system class path in os::set_boot_path().
>>>>> Instead of exiting in os::set_boot_path(), how about checking the >>>>> return status of os::set_boot_path() in the caller and exiting >>>>> there like the following: >>>>> bash-4.2$ hg diff os_linux.cpp >>>>> diff --git a/src/hotspot/os/linux/os_linux.cpp >>>>> b/src/hotspot/os/linux/os_linux.cpp >>>>> --- a/src/hotspot/os/linux/os_linux.cpp >>>>> +++ b/src/hotspot/os/linux/os_linux.cpp >>>>> @@ -367,7 +367,9 @@ >>>>> } >>>>> } >>>>> Arguments::set_java_home(buf); >>>>> - set_boot_path('/', ':'); >>>>> + if (!set_boot_path('/', ':')) { >>>>> + vm_exit_during_initialization("Failed setting boot class >>>>> path.", NULL); >>>>> + } >>>>> } >>>>> >>>>> Note that before the above change, the return status of >>>>> set_boot_path() isn't checked. >>>>> The above would involve changing 5 of those os_*.cpp files, one >>>>> for each O/S. >>>>> >>>>>> With certainly an added assert later in >>>>>> ClassLoader::setup_bootstrap_search_path() to ensure that the >>>>>> system class path is never NULL. >>>>> Sure, I can add an assert there. >>>>> I'll post updated webrev once I've made the change and done testing. >>>>> >>>>> thanks, >>>>> Calvin >>>>>> >>>>>> Thanks, >>>>>> Lois >>>>>> >>>>>>> >>>>>>> Ran hs-tier{1,2,3} tests successfully including the new test case. 
>>>>>>> >>>>>>> thanks, >>>>>>> Calvin >>>>>> >>> From jiangli.zhou at oracle.com Wed Jul 11 01:20:55 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 10 Jul 2018 18:20:55 -0700 Subject: RFR(S): 8205946: JVM crash after call to ClassLoader::setup_bootstrap_search_path() In-Reply-To: <5B4552AB.8000401@oracle.com> References: <5B439B8F.3020709@oracle.com> <5B43C8E8.2060206@oracle.com> <5B44ED62.5030008@oracle.com> <5B452252.9010209@oracle.com> <911C10C4-5078-4E9D-9D60-C8E971B50C89@oracle.com> <5B4552AB.8000401@oracle.com> Message-ID: > On Jul 10, 2018, at 5:43 PM, Calvin Cheung wrote: > > > >> On 7/10/18, 4:09 PM, Jiangli Zhou wrote: >> Hi Calvin, >> >> >> On Jul 10, 2018, at 2:17 PM, Calvin Cheung wrote: >> >>> Hi Jiangli, >>> >>> Thanks for reviewing. >>> >>>> On 7/10/18, 12:18 PM, Jiangli Zhou wrote: >>>> Hi Calvin, >>>> >>>> The error handling code in platform specific code are identical. I like Lois? suggestion to check and exit in os::set_boot_path() to avoid duplicating the code. 
>>> If you want changes in os::set_boot_path() instead of in platform specific code, I'm proposing the following: >>> diff --git a/src/hotspot/share/runtime/os.cpp b/src/hotspot/share/runtime/os.cpp >>> --- a/src/hotspot/share/runtime/os.cpp >>> +++ b/src/hotspot/share/runtime/os.cpp >>> @@ -1270,7 +1270,7 @@ >>> return file; >>> } >>> >>> -bool os::set_boot_path(char fileSep, char pathSep) { >>> +void os::set_boot_path(char fileSep, char pathSep) { >>> const char* home = Arguments::get_java_home(); >>> int home_len = (int)strlen(home); >>> >>> @@ -1278,26 +1278,30 @@ >>> >>> // modular image if "modules" jimage exists >>> char* jimage = format_boot_path("%/lib/" MODULES_IMAGE_NAME, home, home_len, fileSep, pathSep); >>> - if (jimage == NULL) return false; >>> + if (jimage == NULL) { >>> + vm_exit_during_initialization("Failed setting boot class path.", NULL); >>> + } >>> bool has_jimage = (os::stat(jimage, &st) == 0); >>> if (has_jimage) { >>> Arguments::set_sysclasspath(jimage, true); >>> FREE_C_HEAP_ARRAY(char, jimage); >>> - return true; >>> + return; >>> } >>> FREE_C_HEAP_ARRAY(char, jimage); >>> >>> // check if developer build with exploded modules >>> char* base_classes = format_boot_path("%/modules/" JAVA_BASE_NAME, home, home_len, fileSep, pathSep); >>> - if (base_classes == NULL) return false; >>> + if (base_classes == NULL) { >>> + vm_exit_during_initialization("Failed setting boot class path.", NULL); >>> + } >>> if (os::stat(base_classes, &st) == 0) { >>> Arguments::set_sysclasspath(base_classes, false); >>> FREE_C_HEAP_ARRAY(char, base_classes); >>> - return true; >>> + return; >>> } >>> FREE_C_HEAP_ARRAY(char, base_classes); >>> - >>> - return false; >>> + vm_exit_during_initialization("Failed setting boot class path.", NULL); >>> + return; >>> } >>> >>> The function fails if os::stat() fails or allocation of buffer for "jimage" or "base_classes" fails. 
>>> Since vm should exit on failure in the above function, it is unnecessary for the function to return a bool. >> >> That looks good. On the other hand David?s comments also sound reasonable to me. So it?s your call. > Let's go with webrev.01. Ok. >> >>> >>>> >>>> Also, under low memory condition, set_value() might fail to allocate and not trigger any error with a release binary. os::set_boot_path() probably should also check and make sure sys path is not NULL after Arguments::set_sysclasspath(). >>> The AllocateHeap() you listed below will call into the following which will exit on out of memory. So I don' think we need to do NULL check again. >>> >>> // allocate using malloc; will fail if no memory available >>> char* AllocateHeap(size_t size, >>> MEMFLAGS flags, >>> const NativeCallStack& stack, >>> AllocFailType alloc_failmode /* = AllocFailStrategy::EXIT_OOM*/) { >>> char* p = (char*) os::malloc(size, flags, stack); >>> if (p == NULL && alloc_failmode == AllocFailStrategy::EXIT_OOM) { >>> vm_exit_out_of_memory(size, OOM_MALLOC_ERROR, "AllocateHeap"); >>> } >>> return p; >>> } >> >> Thanks for digging further down. So PathString::set_value() would never return false in this case. It seems we should add a comment in set_value() to avoid future confusion. Also the ?else? case can be removed. That can be handled separately. > The function is being called at eight different places and none of them checks the return value. > Yes, let's clean it up in a separate bug/RFE. Please file a new RFE. 
Thanks, Jiangli > > thanks, > Calvin >> >> Thanks, >> Jiangli >> >>> >>> thanks, >>> Calvin >>>> >>>> bool PathString::set_value(const char *value) { >>>> if (_value != NULL) { >>>> FreeHeap(_value); >>>> } >>>> _value = AllocateHeap(strlen(value)+1, mtArguments); >>>> assert(_value != NULL, "Unable to allocate space for new path value"); >>>> if (_value != NULL) { >>>> strcpy(_value, value); >>>> } else { >>>> // not able to allocate >>>> return false; >>>> } >>>> return true; >>>> } >>>> >>>> Thanks, >>>> Jiangli >>>> >>>>> On Jul 10, 2018, at 10:31 AM, Calvin Cheung wrote: >>>>> >>>>> Updated webrev with the changes mentioned below: >>>>> http://cr.openjdk.java.net/~ccheung/8205946/webrev.01/ >>>>> >>>>> I've rerun hs-tier{1,2,3} tests. >>>>> >>>>> thanks, >>>>> Calvin >>>>> >>>>>> On 7/9/18, 1:43 PM, Calvin Cheung wrote: >>>>>> Hi Lois, >>>>>> >>>>>> Thanks for your review. >>>>>> >>>>>>> On 7/9/18, 11:58 AM, Lois Foltan wrote: >>>>>>> On 7/9/2018 1:29 PM, Calvin Cheung wrote: >>>>>>> >>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8205946 >>>>>>>> >>>>>>>> webrev: http://cr.openjdk.java.net/~ccheung/8205946/webrev.00/ >>>>>>>> >>>>>>>> The JVM crash could be simulated by renaming/removing the modules file under the jdk/lib directory. >>>>>>>> The proposed simple fix is to perform a vm_exit_during_initialization(). >>>>>>> >>>>>>> Hi Calvin, >>>>>>> >>>>>>> Some clarifying questions. Is this just an issue for exploded builds? >>>>>> I don't think so. As mentioned above, I could reproduce the crash with a regular jdk image build by renaming the modules file under the jdk/lib directory. >>>>>>> I would prefer the exit to occur if the os::stat() fails for the system class path in os::set_boot_path(). 
>>>>>> Instead of exiting in os::set_boot_path(), how about checking the return status of os::set_boot_path() in the caller and exiting there like the following: >>>>>> bash-4.2$ hg diff os_linux.cpp >>>>>> diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp >>>>>> --- a/src/hotspot/os/linux/os_linux.cpp >>>>>> +++ b/src/hotspot/os/linux/os_linux.cpp >>>>>> @@ -367,7 +367,9 @@ >>>>>> } >>>>>> } >>>>>> Arguments::set_java_home(buf); >>>>>> - set_boot_path('/', ':'); >>>>>> + if (!set_boot_path('/', ':')) { >>>>>> + vm_exit_during_initialization("Failed setting boot class path.", NULL); >>>>>> + } >>>>>> } >>>>>> >>>>>> Note that before the above change, the return status of set_boot_path() isn't checked. >>>>>> The above would involve changing 5 of those os_*.cpp files, one for each O/S. >>>>>> >>>>>>> With certainly an added assert later in ClassLoader::setup_bootstrap_search_path() to ensure that the system class path is never NULL. >>>>>> Sure, I can add an assert there. >>>>>> I'll post updated webrev once I've made the change and done testing. >>>>>> >>>>>> thanks, >>>>>> Calvin >>>>>>> >>>>>>> Thanks, >>>>>>> Lois >>>>>>> >>>>>>>> >>>>>>>> Ran hs-tier{1,2,3} tests successfully including the new test case. >>>>>>>> >>>>>>>> thanks, >>>>>>>> Calvin >>>>>>> >>>> From vladimir.kozlov at oracle.com Wed Jul 11 01:33:58 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 10 Jul 2018 18:33:58 -0700 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> Message-ID: <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> I hit new assert in few other tests: compiler/codegen/TestCharVect2.java compiler/c2/cr6340864/* Regards, Vladimir On 7/10/18 5:08 PM, Vladimir Kozlov wrote: > Fix looks reasonable. I will test it in our framework. 
> > Thanks, > Vladimir > > On 7/10/18 9:50 AM, Liu Xin wrote: >> Hi, Community, >> Could you please review this small patch? >> Bug:? https://bugs.openjdk.java.net/browse/JDK-8206075 >> >> CR:? http://cr.openjdk.java.net/~phh/8206075/webrev.00/ >> >> Problem: >> X86-32/64 will leave an unbound label if UseOnStackReplacement is OFF. >> This patch align up x86 with other architectures(ppc, arm). >> Add an assertion to the destructor of Label. It? will be wiped out in release build. >> Previously, hotspot cannot pass this test with assertion on x86-64. >> make run-test TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java >> If this CR is approved, Paul Hohensee will push it. >> Thanks, >> --lx >> From goetz.lindenmaier at sap.com Wed Jul 11 06:22:22 2018 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 11 Jul 2018 06:22:22 +0000 Subject: [11] RFR(S): 8206998: [test] runtime/ElfDecoder/TestElfDirectRead.java requires longer timeout on ppc64 In-Reply-To: <578bb78a-8d62-ebd9-84e7-8ce37da77fbe@oracle.com> References: <578bb78a-8d62-ebd9-84e7-8ce37da77fbe@oracle.com> Message-ID: <8cc6fa5863da487f94b48d9e0987e845@sap.com> Hi David, I discovered this feature a while ago in test/hotspot/jtreg/serviceability/sa/TestUniverse.java You get result directories with .v1 and .v2 (or the like) appended to the directory name. I think this is very useful, because there are a row of tests where several files exist to run the tests with different flags. The test descriptions can be moved into one file with this feature. E.g., compiler/ciReplay/TestSAClient.java compiler/ciReplay/TestSAServer.java Best regards, Goetz. > -----Original Message----- > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > bounces at openjdk.java.net] On Behalf Of David Holmes > Sent: Dienstag, 10. 
Juli 2018 23:24 > To: Volker Simonis ; hotspot-runtime- > dev at openjdk.java.net runtime > Subject: Re: [11] RFR(S): 8206998: [test] > runtime/ElfDecoder/TestElfDirectRead.java requires longer timeout on > ppc64 > > Hi Volker, > > On 11/07/2018 3:52 AM, Volker Simonis wrote: > > Hi, > > > > can I please get a review for the following test-only change: > > > > http://cr.openjdk.java.net/~simonis/webrevs/2018/8206998/ > > https://bugs.openjdk.java.net/browse/JDK-8206998 > > > > The problem is that the test runtime/ElfDecoder/TestElfDirectRead.java > > intentionally disables caching of Elf sections during symbol lookup > > with WhiteBox.disableElfSectionCache(). On platforms which do not use > > file descriptors instead of plain function pointers this slows down > > the lookup just a little bit, because all the symbols from an Elf file > > are still read consecutively after one 'fseek()' call. But on > > platforms with file descriptors like ppc64 big-endian, we get two > > 'fseek()' calls for each symbol read from the Elf file because reading > > the file descriptor table is nested inside the loop which reads the > > symbols. This really trashes the I/O system and considerable slows > > down the test, so we need an extra long timeout setting. > > > > The fix is trivial - simply provide two test versions (i.e. comments): > > the first one for all Linux flavors which are not ppc64 and a second, > > new one for Linux/ppc64 which simply has a bigger timeout. > > I was not aware that this was a valid way of defining a test! 
This > suggests there can only be one "leading comment" per "defining file: > > http://openjdk.java.net/jtreg/tag-spec.html > > Need to verify this with the jtreg folk: jtreg-use at openjdk.java.net > > Thanks, > David > > > Thank you and best regards, > > Volker > > From goetz.lindenmaier at sap.com Wed Jul 11 06:55:44 2018 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 11 Jul 2018 06:55:44 +0000 Subject: RFR(M): 8206977: Minor improvements of runtime code. In-Reply-To: <977e9be8-ad4a-4ad3-c9e2-a5702cb03f9f@oracle.com> References: <571d727a270e47cb8230d8a88b58a2a1@sap.com> <3d2d5c75-a219-9153-33e3-57a77bf88d92@oracle.com> <977e9be8-ad4a-4ad3-c9e2-a5702cb03f9f@oracle.com> Message-ID: <51c5e470968144ffab18385eba7b8025@sap.com> Hi Lois, > > + name ? name->as_C_string() : ""); > Instead of "" please use the UNNAMED_MODULE macro from > moduleEntry.hpp. Oh yes, sure! > > This looks a lot nicer!?? Similar code is in linkResolver.cpp, can you > > look at changing it too? > > I have an RFR out currently for JDK-8205611, (see > http://mail.openjdk.java.net/pipermail/hotspot-dev/2018- > June/033325.html), > which needs one more reviewer's okay.? It contains changes to reword the > error messages for loader constraint violations in order to follow the > new proposed format for module and class loader information.? So our two > changes will conflict in this area. Your change is targeted at 12, mine to 11. Changing this to use stringStream will also simplify your change. Do you mind me fixing that in 11, and you waiting with pushing until it's merged to 12? I need to make a coverity change for the compiler sources, but after that I'll write a review for your change. Best regards, Goetz. > Thanks, > Lois > > > > > http://cr.openjdk.java.net/~goetz/wr18/8206977- > covRuntime/01/src/hotspot/share/services/writeableFlags.cpp.udiff.html > > > > > > If name is null here, what would this do?? Should there be an 'else' > > to print something? 
> > > > I think this looks fine. It doesn't look major to me. The asserts > > turned to guarantees don't appear to be anywhere performance sensitive. > > > > Thanks, > > Coleen > > > > > > > > On 7/10/18 6:53 AM, Lindenmaier, Goetz wrote: > >> Hi, > >> > >> I ran coverity on the jdk11 hotspot sources and want to propose the > >> following fixes to the runtime code. I scanned the linux x86_64 build. > >> Some issues are similar to previous parfait fixes (check for NULL, add > >> guarantees etc.) I also identified some issues I consider real problems. > >> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/ > >> > >> In detail: > >> > >> Real issues: > >> ------------ > >> > >> jvmtiEnvBase.cpp > >> || should be &&. > >> Attention, this is the only change that really will change behaviour. > >> But if thr == NULL we will see a crash below. > >> > >> perfMemory_linux.cpp: > >> Wrong buffer length used. > >> > >> systemDictionary.cpp: > >> Move code dereferencing ik under if (ik != NULL). > >> > >> virtualspace.cpp > >> Initialization is missing. Moved constructor up to the other > >> constructors. > >> > >> > >> Useful code improvements: > >> ------------------------- > >> > >> vm_version_ext_x86.cpp > >> Assure buffer is not accessed at offset -1. > >> > >> os_linux.cpp > >> Numa_max_node returns int, and a -1 in some cases. > >> > >> moduleEntry.cpp > >> name might be NULL. Just a fix for tracing. > >> > >> systemDictionaryShared.cpp > >> clarify code. > >> It would be wrong if only entry == NULL would hold, one > >> would hit the assertion below. > >> > >> verifier.cpp > >> Fix tracing. > >> Illegal opcode is -1 and should not be passed to name array. > >> > >> logOutput.cpp > >> If n_selections == 0, best_selection would be NULL. > >> Move up the assertion and turn into a guarantee. > >> > >> filemap.cpp > >> Either base can be NULL, or parts of the code before are dead.
> >> > >> metaspace.cpp > >> We know an exception is pending. > >> > >> klassVtable.cpp > >> Coverity does not like the format in a variable. > >> Anyways this is quite rough coding, transformed to use stringStream > >> as with other similar exceptions. > >> > >> jvmFlag.cpp > >> match might be NULL. > >> > >> writeableFlags.cpp > >> name might be NULL. > >> > >> ostream.cpp > >> If ftell returns error code -1, we need not continue. > >> Especially we should not fseek(-1). > >> > >> logTestUtils.inline.hpp > >> ftell returns -1. > >> > >> test_metachunk.cpp > >> wrong datatype. > > From volker.simonis at gmail.com Wed Jul 11 07:34:32 2018 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 11 Jul 2018 09:34:32 +0200 Subject: [11] RFR(S): 8206998: [test] runtime/ElfDecoder/TestElfDirectRead.java requires longer timeout on ppc64 In-Reply-To: <578bb78a-8d62-ebd9-84e7-8ce37da77fbe@oracle.com> References: <578bb78a-8d62-ebd9-84e7-8ce37da77fbe@oracle.com> Message-ID: Hi David, so it obviously works and as Goetz mentioned there are already other, existing tests which use this feature. Do you want me to get a formal review which confirms this from somebody from the JTreg team? I've CC-ed jtreg-use and Jonathan in the hope that they can confirm this. Regards, Volker On Tue, Jul 10, 2018 at 11:24 PM, David Holmes wrote: > Hi Volker, > > On 11/07/2018 3:52 AM, Volker Simonis wrote: >> >> Hi, >> >> can I please get a review for the following test-only change: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2018/8206998/ >> https://bugs.openjdk.java.net/browse/JDK-8206998 >> >> The problem is that the test runtime/ElfDecoder/TestElfDirectRead.java >> intentionally disables caching of Elf sections during symbol lookup >> with WhiteBox.disableElfSectionCache().
On platforms which do not use >> file descriptors instead of plain function pointers this slows down >> the lookup just a little bit, because all the symbols from an Elf file >> are still read consecutively after one 'fseek()' call. But on >> platforms with file descriptors like ppc64 big-endian, we get two >> 'fseek()' calls for each symbol read from the Elf file because reading >> the file descriptor table is nested inside the loop which reads the >> symbols. This really trashes the I/O system and considerable slows >> down the test, so we need an extra long timeout setting. >> >> The fix is trivial - simply provide two test versions (i.e. comments): >> the first one for all Linux flavors which are not ppc64 and a second, >> new one for Linux/ppc64 which simply has a bigger timeout. > > > I was not aware that this was a valid way of defining a test! This suggests > there can only be one "leading comment" per "defining file: > > http://openjdk.java.net/jtreg/tag-spec.html > > Need to verify this with the jtreg folk: jtreg-use at openjdk.java.net > > Thanks, > David > > >> Thank you and best regards, >> Volker >> > From david.holmes at oracle.com Wed Jul 11 07:39:08 2018 From: david.holmes at oracle.com (David Holmes) Date: Wed, 11 Jul 2018 17:39:08 +1000 Subject: [11] RFR(S): 8206998: [test] runtime/ElfDecoder/TestElfDirectRead.java requires longer timeout on ppc64 In-Reply-To: <8cc6fa5863da487f94b48d9e0987e845@sap.com> References: <578bb78a-8d62-ebd9-84e7-8ce37da77fbe@oracle.com> <8cc6fa5863da487f94b48d9e0987e845@sap.com> Message-ID: <4bd8bab8-beb9-bfa3-2fbf-bc443b5a295e@oracle.com> Hi Goetz, On 11/07/2018 4:22 PM, Lindenmaier, Goetz wrote: > Hi David, > > I discovered this feature a while ago in > test/hotspot/jtreg/serviceability/sa/TestUniverse.java > > You get result directories with .v1 and .v2 (or the like) > appended to the directory name. 
> > I think this is very useful, because there are a row > of tests where several files exist to run the tests with > different flags. The test descriptions can be moved into > one file with this feature. E.g., > compiler/ciReplay/TestSAClient.java > compiler/ciReplay/TestSAServer.java Yes very useful feature. Just wish it was documented. :) Cheers, David > Best regards, > Goetz. > > >> -----Original Message----- >> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >> bounces at openjdk.java.net] On Behalf Of David Holmes >> Sent: Dienstag, 10. Juli 2018 23:24 >> To: Volker Simonis ; hotspot-runtime- >> dev at openjdk.java.net runtime >> Subject: Re: [11] RFR(S): 8206998: [test] >> runtime/ElfDecoder/TestElfDirectRead.java requires longer timeout on >> ppc64 >> >> Hi Volker, >> >> On 11/07/2018 3:52 AM, Volker Simonis wrote: >>> Hi, >>> >>> can I please get a review for the following test-only change: >>> >>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8206998/ >>> https://bugs.openjdk.java.net/browse/JDK-8206998 >>> >>> The problem is that the test runtime/ElfDecoder/TestElfDirectRead.java >>> intentionally disables caching of Elf sections during symbol lookup >>> with WhiteBox.disableElfSectionCache(). On platforms which do not use >>> file descriptors instead of plain function pointers this slows down >>> the lookup just a little bit, because all the symbols from an Elf file >>> are still read consecutively after one 'fseek()' call. But on >>> platforms with file descriptors like ppc64 big-endian, we get two >>> 'fseek()' calls for each symbol read from the Elf file because reading >>> the file descriptor table is nested inside the loop which reads the >>> symbols. This really trashes the I/O system and considerable slows >>> down the test, so we need an extra long timeout setting. >>> >>> The fix is trivial - simply provide two test versions (i.e. 
comments): >>> the first one for all Linux flavors which are not ppc64 and a second, >>> new one for Linux/ppc64 which simply has a bigger timeout. >> >> I was not aware that this was a valid way of defining a test! This >> suggests there can only be one "leading comment" per "defining file: >> >> http://openjdk.java.net/jtreg/tag-spec.html >> >> Need to verify this with the jtreg folk: jtreg-use at openjdk.java.net >> >> Thanks, >> David >> >>> Thank you and best regards, >>> Volker >>> From david.holmes at oracle.com Wed Jul 11 07:41:54 2018 From: david.holmes at oracle.com (David Holmes) Date: Wed, 11 Jul 2018 17:41:54 +1000 Subject: [11] RFR(S): 8206998: [test] runtime/ElfDecoder/TestElfDirectRead.java requires longer timeout on ppc64 In-Reply-To: References: <578bb78a-8d62-ebd9-84e7-8ce37da77fbe@oracle.com> Message-ID: <5ffa742c-d0eb-f94b-448a-dd45674b334a@oracle.com> Hi Volker, On 11/07/2018 5:34 PM, Volker Simonis wrote: > Hi David, > > so it obviously works and as Goetz mentioned there are already other, > existing tests which use this feature. > > Do you want me to get a formal review which confirms this from > somebody from the JTreg team? > > I've CC-ed jtreg-use and Jonathan in the hope that they can confirm this. No that's fine - just surprised to see this (and couldn't find any documentation for it!). Thanks, David > Regards, > Volker > > On Tue, Jul 10, 2018 at 11:24 PM, David Holmes wrote: >> Hi Volker, >> >> On 11/07/2018 3:52 AM, Volker Simonis wrote: >>> >>> Hi, >>> >>> can I please get a review for the following test-only change: >>> >>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8206998/ >>> https://bugs.openjdk.java.net/browse/JDK-8206998 >>> >>> The problem is that the test runtime/ElfDecoder/TestElfDirectRead.java >>> intentionally disables caching of Elf sections during symbol lookup >>> with WhiteBox.disableElfSectionCache(). 
On platforms which do not use >>> file descriptors instead of plain function pointers this slows down >>> the lookup just a little bit, because all the symbols from an Elf file >>> are still read consecutively after one 'fseek()' call. But on >>> platforms with file descriptors like ppc64 big-endian, we get two >>> 'fseek()' calls for each symbol read from the Elf file because reading >>> the file descriptor table is nested inside the loop which reads the >>> symbols. This really trashes the I/O system and considerable slows >>> down the test, so we need an extra long timeout setting. >>> >>> The fix is trivial - simply provide two test versions (i.e. comments): >>> the first one for all Linux flavors which are not ppc64 and a second, >>> new one for Linux/ppc64 which simply has a bigger timeout. >> >> >> I was not aware that this was a valid way of defining a test! This suggests >> there can only be one "leading comment" per "defining file: >> >> http://openjdk.java.net/jtreg/tag-spec.html >> >> Need to verify this with the jtreg folk: jtreg-use at openjdk.java.net >> >> Thanks, >> David >> >> >>> Thank you and best regards, >>> Volker >>> >> From martin.doerr at sap.com Wed Jul 11 10:40:09 2018 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 11 Jul 2018 10:40:09 +0000 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> Message-ID: Hi, I think the idea is good, but doesn't work in all cases. We may bail out from code generation and discard the generated code leaving the label unbound. We also may generate code with the purpose to determine its size. We don't need to bind labels because the code will never get executed. 
Best regards, Martin -----Original Message----- From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov Sent: Mittwoch, 11. Juli 2018 03:34 To: Liu Xin ; hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 I hit the new assert in a few other tests: compiler/codegen/TestCharVect2.java compiler/c2/cr6340864/* Regards, Vladimir On 7/10/18 5:08 PM, Vladimir Kozlov wrote: > Fix looks reasonable. I will test it in our framework. > > Thanks, > Vladimir > > On 7/10/18 9:50 AM, Liu Xin wrote: >> Hi, Community, >> Could you please review this small patch? >> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 >> >> CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/ >> >> Problem: >> X86-32/64 will leave an unbound label if UseOnStackReplacement is OFF. >> This patch aligns x86 with other architectures (ppc, arm). >> Add an assertion to the destructor of Label. It will be wiped out in release build. >> Previously, hotspot could not pass this test with assertion on x86-64. >> make run-test TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java >> If this CR is approved, Paul Hohensee will push it. >> Thanks, >> --lx >> From zgu at redhat.com Wed Jul 11 11:43:02 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 11 Jul 2018 07:43:02 -0400 Subject: RFR(S) 8206183: Possible construct EMPTY_STACK and allocation stack, etc. on first use In-Reply-To: <580231be-537e-bafd-baf9-7562162d09f0@oracle.com> References: <2cd7efd8-d6d2-0a65-d87b-b264d8bd3970@oracle.com> <9503ded0-bc68-543b-a1ab-6d884854dc9a@redhat.com> <580231be-537e-bafd-baf9-7562162d09f0@oracle.com> Message-ID: <3b3b6712-f9c2-3b25-9a39-b2a6b75164f4@redhat.com> >>> Okay but this relies on C++11 thread-safe static initialization. >>> That's only available in VS2015 and above (which should be okay for >>> JDK 12+). What about other compilers? Does it have to be enabled via >>> any compilation flags?
>> >> Thanks for pointing out. >> >> NMT is always initialized while JVM is still in single-thread mode, so >> I think it is safe even without language support, or I miss something >> here? > > Okay - that sounds alright then. Thanks, David. -Zhengyu > > Thanks, > David > >> -Zhengyu >> >> >>> >>> I'm currently running this through some additional internal build/tests. >>> >>> Thanks, >>> David >>> ----- >>> >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206183 >>>> Webrev: http://cr.openjdk.java.net/~zgu/8206183/webrev.00/ >>>> >>>> Test: >>>> >>>> hotspot_nmt on Linux 64 (fastdebug and release) >>>> Submit-test. >>>> >>>> >>>> Thanks, >>>> >>>> -Zhengyu From coleen.phillimore at oracle.com Wed Jul 11 13:15:01 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 11 Jul 2018 09:15:01 -0400 Subject: RFR (M) 8198720: Obsolete PrintSafepointStatistics, PrintSafepointStatisticsTimeout and PrintSafepointStatisticsCount options In-Reply-To: References: <9349e320-e39d-c5ee-5ebb-b93305fc03f5@oracle.com> <2a50a090-36df-433b-aa4a-6a7087a8e589@redhat.com> <05f84226-0825-896f-c1c3-a89f85338159@oracle.com> <1826f57f-fc8c-86b3-b3fa-65a1c81a9eff@redhat.com> Message-ID: I've kept the output the same and converted to UL. To get the lines not to shift due to uptime printing, you can use the option: -Xlog:safepoint+stats=debug::tags or none instead of tags. I could alias PrintSafepointStatistics to this: -Xlog:safepoint+stats=debug::none as this option gets verbose. Having the ability to send the output to a gc.log file is pretty nice though so worth using all the logging options. Please review: open webrev at http://cr.openjdk.java.net/~coleenp/8198720.02/webrev Tested with tier1-3. Thanks, Coleen On 7/9/18 11:26 PM, coleen.phillimore at oracle.com wrote: > > Hi Aleksey, > > I rewrote the logging to use UL and to keep the old format:
see > http://cr.openjdk.java.net/~coleenp/gc.log > It does shift when the time in the logging adds another digit. I > don't know how to fix that. Does this look ok otherwise? > > thanks, > Coleen > > > On 7/9/18 5:42 PM, coleen.phillimore at oracle.com wrote: >> >> >> On 7/9/18 4:08 PM, Aleksey Shipilev wrote: >>> Thank you! >>> >>> Most latency-savvy folks "out there" run with some sort of >>> safepointing profiling, which in many >>> cases include PrintSafepointStatistics tables. >> >> That was the original reason I was looking at this logging. I think >> the trouble with the times is that they are ms and mostly zero. I >> wonder if MILLIUNITS would be better for these times: >> >> (int64_t)(sstats->_time_to_spin / MICROUNITS), >> (int64_t)(sstats->_time_to_wait_to_block / MICROUNITS), >> (int64_t)(sstats->_time_to_sync / MICROUNITS), >> (int64_t)(sstats->_time_to_do_cleanups / MICROUNITS), >> (int64_t)(sstats->_time_to_exec_vmop / MICROUNITS)); >> <= this has nonzero values for GC pauses >> >> What do you think? >> >> thanks, >> Coleen >>> >>> -Aleksey >>> >>> On 07/09/2018 08:35 PM, coleen.phillimore at oracle.com wrote: >>>> Okay, somehow the columns of numbers didn't look very useful on my >>>> screen to me, and I wanted to >>>> convert this to UL (and straighten out the logic), so that's why I >>>> made this change. I asked >>>> around internally to see which people would care about the format >>>> change and didn't find anyone >>>> specific. Now I know! >>>> >>>> Let me rework this to use UL but keep the table. >>>> >>>> I'll withdraw this change for now. >>>> >>>> Thank you for the quick feedback.
>>>> Coleen >>>> >>>> On 7/9/18 1:58 PM, Aleksey Shipilev wrote: >>>>> On 07/09/2018 07:48 PM, coleen.phillimore at oracle.com wrote: >>>>>> Summary: Convert PrintSafepointStatistics to UL >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8198720.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8198720 >>>>> The synopsis is misleading: it is not only obsoleting >>>>> PrintSafepoint* options, it also reformats the >>>>> output! >>>>> >>>>> We did JDK-8180482 not that long ago, and the reason was that both >>>>> people and machine tools are >>>>> accustomed to the particular non-noisy format for that table. I am >>>>> not at all convinced that >>>>> proposed format [2] is better than current version [3]. Can we >>>>> keep (at least some resemblance of) >>>>> the old format, please? >>>>> >>>>> -Aleksey >>>>> >>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8180482 >>>>> [2] >>>>> https://bugs.openjdk.java.net/secure/attachment/75330/out.safepoint-logging >>>>> [3] http://cr.openjdk.java.net/~shade/8180482/after.txt >>>>> >>> >> > From shade at redhat.com Wed Jul 11 13:22:11 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 11 Jul 2018 15:22:11 +0200 Subject: RFR (M) 8198720: Obsolete PrintSafepointStatistics, PrintSafepointStatisticsTimeout and PrintSafepointStatisticsCount options In-Reply-To: References: <9349e320-e39d-c5ee-5ebb-b93305fc03f5@oracle.com> <2a50a090-36df-433b-aa4a-6a7087a8e589@redhat.com> <05f84226-0825-896f-c1c3-a89f85338159@oracle.com> <1826f57f-fc8c-86b3-b3fa-65a1c81a9eff@redhat.com> Message-ID: <3413f132-1677-8339-efde-ef6e3ce27e42@redhat.com> On 07/11/2018 03:15 PM, coleen.phillimore at oracle.com wrote: > Please review: > > open webrev at http://cr.openjdk.java.net/~coleenp/8198720.02/webrev New output looks good to me, Coleen. 
Thanks for not breaking it :) -Aleksey From cthalinger at twitter.com Wed Jul 11 14:34:20 2018 From: cthalinger at twitter.com (Christian Thalinger) Date: Wed, 11 Jul 2018 10:34:20 -0400 Subject: What to do: clang-4.0 fastdebug assertion failure in os_linux_x86:os::verify_stack_alignment() In-Reply-To: References: Message-ID: <58385C26-3644-447C-8E25-80C23355F83E@twitter.com> > On Jul 9, 2018, at 10:53 PM, Martin Buchholz wrote: > > There's only one remaining problem building latest jdk with latest clang on > Linux preventing it from working out of the box. It seems likely macosx > has the same problem. > > https://bugs.openjdk.java.net/browse/JDK-8186780 > clang-4.0 fastdebug assertion failure in > os_linux_x86:os::verify_stack_alignment() > > Verifying stack alignment seems rather fragile, especially in the presence > of inlining. > > There are various things we can do: > - making os::verify_stack_alignment NOINLINE and/or moving > os::verify_stack_alignment > to its own translation unit. > - simply disabling the stack alignment check for clang > - I don't see any reason why esp should be aligned even if stack frames > are. (Maybe ebp is better? I'm not a x86 assembly programmer) More > principled seems invoking functions recursively and disabling inlining and > checking that the difference between addresses of a local is a multiple of > the alignment, but that will get complicated. > - why does stack alignment even matter? Isn't it the alignment of c++ > objects on the stack that matter? C++ compilers can (and actually do) emit instructions that need alignment, like movaps. I've seen many crashes in the past with movaps and an unaligned stack coming from JIT or stub code.
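The alignment test at issue in this thread can be illustrated with a minimal sketch. This is not HotSpot code: `is_aligned` and `approximate_stack_address` are hypothetical names chosen for illustration, and the noinline attribute is the GCC/clang spelling, assumed here only to show why inlining matters. Taking the address of a local merely approximates the stack position of the probe's own frame; that address is not itself expected to be 16-byte aligned, so the sketch shows the mask test and the inlining hazard rather than a real verifier.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not HotSpot code): true if addr is a multiple of
// alignment. The alignment must be a non-zero power of two.
static bool is_aligned(std::uintptr_t addr, std::uintptr_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (addr & (alignment - 1)) == 0;
}

// Hypothetical probe: approximate the current stack position by taking the
// address of a local variable. If the compiler inlines such a probe into its
// caller, the value describes the caller's frame layout instead of a frame
// set up according to the calling convention, which is the hazard discussed
// in this thread. The attribute below asks GCC/clang not to inline it.
#if defined(__GNUC__)
__attribute__((noinline))
#endif
static std::uintptr_t approximate_stack_address() {
  volatile char marker = 0;
  return reinterpret_cast<std::uintptr_t>(&marker);
}
```

A caller could evaluate `is_aligned(some_frame_address, 16)` for an address it knows to be a frame boundary; instructions such as movaps fault when their memory operand fails that test.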
From hohensee at amazon.com Wed Jul 11 14:49:52 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Wed, 11 Jul 2018 14:49:52 +0000 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> Message-ID: Imo it's still good hygiene to require that Labels be bound if they're used, even if the generated code will never be executed. E.g., code that generates code for sizing purposes may be repurposed to generate executable code, in which case an unbound label may be a lurking bug. Also, I'm unaware (I may be corrected!) of any situation where bailing out happens in such a way as to both leave a Label unbound and execute its destructor. Even if there are, I'd say that'd be indicative of another real problem, such as code buffer overflow, so no harm would result. Thanks, Paul ?On 7/11/18, 3:41 AM, "hotspot-runtime-dev on behalf of Doerr, Martin" wrote: Hi, I think the idea is good, but doesn't work in all cases. We may bail out from code generation and discard the generated code leaving the label unbound. We also may generate code with the purpose to determine its size. We don't need to bind labels because the code will never get executed. Best regards, Martin -----Original Message----- From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov Sent: Mittwoch, 11. Juli 2018 03:34 To: Liu Xin ; hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 I hit new assert in few other tests: compiler/codegen/TestCharVect2.java compiler/c2/cr6340864/* Regards, Vladimir On 7/10/18 5:08 PM, Vladimir Kozlov wrote: > Fix looks reasonable. I will test it in our framework. > > Thanks, > Vladimir > > On 7/10/18 9:50 AM, Liu Xin wrote: >> Hi, Community, >> Could you please review this small patch? 
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 >> >> CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/ >> >> Problem: >> X86-32/64 will leave an unbound label if UseOnStackReplacement is OFF. >> This patch align up x86 with other architectures(ppc, arm). >> Add an assertion to the destructor of Label. It will be wiped out in release build. >> Previously, hotspot cannot pass this test with assertion on x86-64. >> make run-test TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java >> If this CR is approved, Paul Hohensee will push it. >> Thanks, >> --lx >> From coleen.phillimore at oracle.com Wed Jul 11 14:54:13 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 11 Jul 2018 10:54:13 -0400 Subject: RFR (M) 8198720: Obsolete PrintSafepointStatistics, PrintSafepointStatisticsTimeout and PrintSafepointStatisticsCount options In-Reply-To: <3413f132-1677-8339-efde-ef6e3ce27e42@redhat.com> References: <9349e320-e39d-c5ee-5ebb-b93305fc03f5@oracle.com> <2a50a090-36df-433b-aa4a-6a7087a8e589@redhat.com> <05f84226-0825-896f-c1c3-a89f85338159@oracle.com> <1826f57f-fc8c-86b3-b3fa-65a1c81a9eff@redhat.com> <3413f132-1677-8339-efde-ef6e3ce27e42@redhat.com> Message-ID: <08849120-0afe-7c54-49b0-7048b973d38c@oracle.com> On 7/11/18 9:22 AM, Aleksey Shipilev wrote: > On 07/11/2018 03:15 PM, coleen.phillimore at oracle.com wrote: >> Please review: >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8198720.02/webrev > New output looks good to me, Coleen. Thanks for not breaking it :) Thank you for the feedback and code review! 
Coleen > > -Aleksey > From martinrb at google.com Wed Jul 11 14:56:47 2018 From: martinrb at google.com (Martin Buchholz) Date: Wed, 11 Jul 2018 07:56:47 -0700 Subject: What to do: clang-4.0 fastdebug assertion failure in os_linux_x86:os::verify_stack_alignment() In-Reply-To: <58385C26-3644-447C-8E25-80C23355F83E@twitter.com> References: <58385C26-3644-447C-8E25-80C23355F83E@twitter.com> Message-ID: On Wed, Jul 11, 2018 at 7:34 AM, Christian Thalinger wrote: > > > - why does stack alignment even matter? Isn't it the alignment of c++ > > objects on the stack that matter? > > C++ compilers can (and actually do) emit instructions that need alignment, > like movaps. I?ve seen many crashes in the past with movaps and an > unaligned stack coming from JIT or stub code. Sure, actual instructions need alignment. But os::verify_stack_alignment isn't doing a good job of finding misaligned instructions, while causing trouble for clang builds. Christian, what do you suggest we do to fix the failing assertion in os::verify_stack_alignment ? I don't know what movaps does, but perhaps os::verify_stack_alignment could simply use that instruction via inline asm? From cthalinger at twitter.com Wed Jul 11 15:18:05 2018 From: cthalinger at twitter.com (Christian Thalinger) Date: Wed, 11 Jul 2018 11:18:05 -0400 Subject: What to do: clang-4.0 fastdebug assertion failure in os_linux_x86:os::verify_stack_alignment() In-Reply-To: References: <58385C26-3644-447C-8E25-80C23355F83E@twitter.com> Message-ID: <21F9626A-F919-41CA-804A-276EE71490A6@twitter.com> > On Jul 11, 2018, at 10:56 AM, Martin Buchholz wrote: > > > > On Wed, Jul 11, 2018 at 7:34 AM, Christian Thalinger > wrote: > > > - why does stack alignment even matter? Isn't it the alignment of c++ > > objects on the stack that matter? > > C++ compilers can (and actually do) emit instructions that need alignment, like movaps. I?ve seen many crashes in the past with movaps and an unaligned stack coming from JIT or stub code. 
> > Sure, actual instructions need alignment. But os::verify_stack_alignment isn't doing a good job of finding misaligned instructions, while causing trouble for clang builds. Christian, what do you suggest we do to fix the failing assertion in os::verify_stack_alignment ? > > I don't know what movaps does, but perhaps os::verify_stack_alignment could simply use that instruction via inline asm? movaps is used in the prologue to spill to the stack. If the stack isn't 16-byte aligned it crashes. os::verify_stack_alignment just makes it easier to find these bugs. I'm not exactly sure what to do. I think the best solution is to prevent os::verify_stack_alignment from being inlined. From martinrb at google.com Wed Jul 11 16:04:13 2018 From: martinrb at google.com (Martin Buchholz) Date: Wed, 11 Jul 2018 09:04:13 -0700 Subject: What to do: clang-4.0 fastdebug assertion failure in os_linux_x86:os::verify_stack_alignment() In-Reply-To: <21F9626A-F919-41CA-804A-276EE71490A6@twitter.com> References: <58385C26-3644-447C-8E25-80C23355F83E@twitter.com> <21F9626A-F919-41CA-804A-276EE71490A6@twitter.com> Message-ID: On Wed, Jul 11, 2018 at 8:18 AM, Christian Thalinger wrote: > > > On Jul 11, 2018, at 10:56 AM, Martin Buchholz wrote: > > > > On Wed, Jul 11, 2018 at 7:34 AM, Christian Thalinger < > cthalinger at twitter.com> wrote: > >> >> > - why does stack alignment even matter? Isn't it the alignment of c++ >> > objects on the stack that matter? >> >> C++ compilers can (and actually do) emit instructions that need >> alignment, like movaps. I've seen many crashes in the past with movaps and >> an unaligned stack coming from JIT or stub code. > > > Sure, actual instructions need alignment. But os::verify_stack_alignment > isn't doing a good job of finding misaligned instructions, while causing > trouble for clang builds. Christian, what do you suggest we do to fix the > failing assertion in os::verify_stack_alignment ?
> > I don't know what movaps does, but perhaps os::verify_stack_alignment > could simply use that instruction via inline asm? > > > movaps is used in the prologue to spill to the stack. If the stack isn't > 16-byte aligned it crashes. os::verify_stack_alignment just makes it > easier to find these bugs. > > I'm not exactly sure what to do. I think the best solution is to avoid > that os::verify_stack_alignment is being inlined. > > I just commented on the bug: clang inlined os::current_stack_pointer into its caller __in the same translation unit__ (that could be fixed in a separate change) so of course in this case it didn't have to follow the ABI. One possible fix is obvious in hindsight: -address os::current_stack_pointer() { +NOINLINE address os::current_stack_pointer() { BUT logically a call like current_stack_pointer should return the stack pointer of the __current__ frame, so should probably be a macro that does inline assembly instead of doing a function call. - From lois.foltan at oracle.com Wed Jul 11 16:22:17 2018 From: lois.foltan at oracle.com (Lois Foltan) Date: Wed, 11 Jul 2018 12:22:17 -0400 Subject: RFR (M) 8198720: Obsolete PrintSafepointStatistics, PrintSafepointStatisticsTimeout and PrintSafepointStatisticsCount options In-Reply-To: References: <9349e320-e39d-c5ee-5ebb-b93305fc03f5@oracle.com> <2a50a090-36df-433b-aa4a-6a7087a8e589@redhat.com> <05f84226-0825-896f-c1c3-a89f85338159@oracle.com> <1826f57f-fc8c-86b3-b3fa-65a1c81a9eff@redhat.com> Message-ID: <37a0b154-981a-ab68-e067-a5a2ab38369d@oracle.com> On 7/11/2018 9:15 AM, coleen.phillimore at oracle.com wrote: > > I've kept the output the same and converted to UL. To get the lines > not to shift due to uptime printing, you can use the option: > > -Xlog:safepoint+stats=debug::tags or none instead of tags. > > I could alias PrintSafepointStatistics to this: > -Xlog:safepoint+stats=debug::none as this option gets verbose.
Having > the ability to send the output to a gc.log file is pretty nice though > so worth using all the logging options. > > Please review: > > open webrev at http://cr.openjdk.java.net/~coleenp/8198720.02/webrev Hi Coleen, This looks good. One minor comment: share/runtime/safepoint.hpp line #111 - can you clarify why the "#if 0/#endif" code shouldn't just be removed? Thanks, Lois > > Tested with tier1-3. > > Thanks, > Coleen > > On 7/9/18 11:26 PM, coleen.phillimore at oracle.com wrote: >> >> Hi Aleksey, >> >> I rewrote the logging to use UL and to keep the old format: see >> http://cr.openjdk.java.net/~coleenp/gc.log >> It does shift when the time in the logging adds another digit. I >> don't know how to fix that. Does this look ok otherwise? >> >> thanks, >> Coleen >> >> >> On 7/9/18 5:42 PM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 7/9/18 4:08 PM, Aleksey Shipilev wrote: >>>> Thank you! >>>> >>>> Most latency-savvy folks "out there" run with some sort of >>>> safepointing profiling, which in many >>>> cases include PrintSafepointStatistics tables. >>> >>> That was the original reason I was looking at this logging. I think >>> the trouble with the times is that they are ms and mostly zero. I >>> wonder if MILLIUNITS would be better for these times: >>> >>> (int64_t)(sstats->_time_to_spin / MICROUNITS), >>> (int64_t)(sstats->_time_to_wait_to_block / MICROUNITS), >>> (int64_t)(sstats->_time_to_sync / MICROUNITS), >>> (int64_t)(sstats->_time_to_do_cleanups / MICROUNITS), >>> (int64_t)(sstats->_time_to_exec_vmop / MICROUNITS)); >>> <= this has nonzero values for GC pauses >>> >>> What do you think?
>>> >>> thanks, >>> Coleen >>>> >>>> -Aleksey >>>> >>>> On 07/09/2018 08:35 PM, coleen.phillimore at oracle.com wrote: >>>>> Okay, somehow the columns of numbers didn't look very useful on my >>>>> screen to me, and I wanted to >>>>> convert this to UL (and straighten out the logic), so that's why I >>>>> made this change.?? I asked >>>>> around internally to see which people would care about the format >>>>> change and didn't find anyone >>>>> specific.? Now I know! >>>>> >>>>> Let me rework this to use UL but keep the table. >>>>> >>>>> I'll withdraw this change for now. >>>>> >>>>> Thank you for the quick feedback. >>>>> Coleen >>>>> >>>>> On 7/9/18 1:58 PM, Aleksey Shipilev wrote: >>>>>> On 07/09/2018 07:48 PM, coleen.phillimore at oracle.com wrote: >>>>>>> Summary: Convert PrintSafepointStatistics to UL >>>>>>> >>>>>>> open webrev at >>>>>>> http://cr.openjdk.java.net/~coleenp/8198720.01/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8198720 >>>>>> The synopsis is misleading: it is not only obsoleting >>>>>> PrintSafepoint* options, it also reformats the >>>>>> output! >>>>>> >>>>>> We did JDK-8180482 not that long ago, and the reason was that >>>>>> both people and machine tools are >>>>>> accustomed to the particular non-noisy format for that table. I >>>>>> am not at all convinced that >>>>>> proposed format [2] is better than current version [3]. Can we >>>>>> keep (at least some resemblance of) >>>>>> the old format, please? >>>>>> >>>>>> -Aleksey >>>>>> >>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8180482 >>>>>> [2] >>>>>> https://bugs.openjdk.java.net/secure/attachment/75330/out.safepoint-logging >>>>>> [3] http://cr.openjdk.java.net/~shade/8180482/after.txt >>>>>> >>>> >>> >> > From coleen.phillimore at oracle.com Wed Jul 11 16:44:34 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 11 Jul 2018 12:44:34 -0400 Subject: RFR(M): 8206977: Minor improvements of runtime code. 
In-Reply-To: <977e9be8-ad4a-4ad3-c9e2-a5702cb03f9f@oracle.com> References: <571d727a270e47cb8230d8a88b58a2a1@sap.com> <3d2d5c75-a219-9153-33e3-57a77bf88d92@oracle.com> <977e9be8-ad4a-4ad3-c9e2-a5702cb03f9f@oracle.com> Message-ID: On 7/10/18 3:10 PM, Lois Foltan wrote: > Hi Goetz, > > Just a couple of comments based on Coleen's review, see below. > > On 7/10/2018 9:03 AM, coleen.phillimore at oracle.com wrote: > >> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/src/hotspot/share/classfile/moduleEntry.cpp.udiff.html >> >> >> + name ? name->as_C_string() : ""); > > Instead of "" please use the UNNAMED_MODULE macro from > moduleEntry.hpp. > >> >> >> Can you change to: >> >> + name != NULL ? name->as_C_string() : ""); >> >> >> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/src/hotspot/share/oops/klassVtable.cpp.udiff.html >> >> >> This looks a lot nicer! Similar code is in linkResolver.cpp, can >> you look at changing it too? > > I have an RFR out currently for JDK-8205611, (see > http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-June/033325.html), > which needs one more reviewer's okay. It contains changes to reword > the error messages for loader constraint violations in order to follow > the new proposed format for module and class loader information. So > our two changes will conflict in this area. Goetz, if you want to check in this to 11, I think Lois can change it similarly in the linkResolver code with her change for 12, once it migrates down. Thanks, Coleen > > Thanks, > Lois > >> >> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/src/hotspot/share/services/writeableFlags.cpp.udiff.html >> >> >> If name is null here, what would this do? Should there be an 'else' >> to print something? >> >> I think this looks fine. It doesn't look major to me. The asserts >> turned to guarantees don't appear to be anywhere performance sensitive.
>> >> Thanks, >> Coleen >> >> >> >> On 7/10/18 6:53 AM, Lindenmaier, Goetz wrote: >>> Hi, >>> >>> I ran coverity on the jdk11 hotspot sources and want to propose the >>> following fixes to the runtime code. I scanned the linux x86_64 build. >>> Some issues are similar to previous parfait fixes (check for NULL, add >>> guarantees etc.) I also identified some issues I consider real >>> problems. >>> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/ >>> >>> In detail: >>> >>> Real issues: >>> ------------ >>> >>> jvmtiEnvBase.cpp >>> || should be &&. >>> Attention, this is the only change that really will change >>> behaviour. >>> But if thr == NULL we will see a crash below. >>> >>> perfMemory_linux.cpp: >>> Wrong buffer length used. >>> >>> systemDictionary.cpp: >>> Move code dereferencing ik under if (ik != NULL). >>> >>> virtualspace.cpp >>> Initialization is missing. Moved constructor up to the other >>> constructors. >>> >>> >>> Useful code improvements: >>> ------------------------- >>> >>> vm_version_ext_x86.cpp >>> Assure buffer is not accessed at offset -1. >>> >>> os_linux.cpp >>> Numa_max_node returns int, and a -1 in some cases. >>> >>> moduleEntry.cpp >>> name might be NULL. Just a fix for tracing. >>> >>> systemDictionaryShared.cpp >>> clarify code. >>> It would be wrong if only entry == NULL would hold, one >>> would hit the assertion below. >>> >>> verifier.cpp >>> Fix tracing. >>> Illegal opcode is -1 and should not be passed to name array. >>> >>> logOutput.cpp >>> If n_selections == 0, best_selection would be NULL. >>> Move up the assertion and turn into a guarantee. >>> >>> filemap.cpp >>> Either base can be NULL, or parts of the code before are dead. >>> >>> metaspace.cpp >>> We know an exception is pending. >>> >>> klassVtable.cpp >>> Coverity does not like the format in a variable. >>> Anyways this is quite rough coding, transformed to use stringStream >>>
as with other similar exceptions. >>> >>> jvmFlag.cpp >>> match might be NULL. >>> >>> writeableFlags.cpp >>> name might be NULL. >>> >>> ostream.cpp >>> If ftell returns error code -1, we need not continue. >>> Especially we should not fseek(-1). >>> >>> logTestUtils.inline.hpp >>> ftell returns -1. >>> >>> test_metachunk.cpp >>> wrong datatype. >> > From yumin.qi at gmail.com Wed Jul 11 17:18:33 2018 From: yumin.qi at gmail.com (yumin qi) Date: Wed, 11 Jul 2018 10:18:33 -0700 Subject: RFR(S) 8206183: Possible construct EMPTY_STACK and allocation stack, etc. on first use In-Reply-To: References: Message-ID: The changes look good to me. Thanks Yumin On Sat, Jul 7, 2018 at 4:37 AM Zhengyu Gu wrote: > Hi, > > NMT has to workaround static initialization order issues: some of static > objects, who allocate memory inside their constructors, may be > initialized ahead of NMT, so NMT is forced to initialize itself early > and risks its static objects may be reinitialized by C runtime. > > The workaround was to declare storage for the static objects as > primitive arrays, then use placement new operator to initialize them, or > just initialize them eagerly, if the results are constants. > > But the solution is not elegant, could break with some compilers. > A better solution is to use "construct on First Use Idiom" pattern > (https://isocpp.org/wiki/faq/ctors#static-init-order), cause we only > have initialization order problems, those static objects do not have > dependencies on other static objects, so we don't suffer from static > deinitialization problems. > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8206183 > Webrev: http://cr.openjdk.java.net/~zgu/8206183/webrev.00/ > > Test: > > hotspot_nmt on Linux 64 (fastdebug and release) > Submit-test.
> > > Thanks, > > -Zhengyu > From zgu at redhat.com Wed Jul 11 17:19:54 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 11 Jul 2018 13:19:54 -0400 Subject: RFR(S) 8206183: Possible construct EMPTY_STACK and allocation stack, etc. on first use In-Reply-To: References: Message-ID: Thanks, Yumin. -Zhengyu On 07/11/2018 01:18 PM, yumin qi wrote: > The changes look good to me. > > Thanks > Yumin > > On Sat, Jul 7, 2018 at 4:37 AM Zhengyu Gu > wrote: > > Hi, > > NMT has to workaround static initialization order issues: some of > static > objects, who allocate memory inside their constructors, may be > initialized ahead of NMT, so NMT is forced to initialize itself early > and risks its static objects may be reinitialized by C runtime. > > The workaround was to declare storage for the static objects as > primitive arrays, then use placement new operator to initialize > them, or > just initialize them eagerly, if the results are constants. > > But the solution is not elegant, could break with some compilers. > A better solution is to use "construct on First Use Idiom" pattern > (https://isocpp.org/wiki/faq/ctors#static-init-order), cause we only > have initialization order problems, those static objects do not have > dependencies on other static objects, so we don't suffer from static > deinitialization problems. > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8206183 > Webrev: http://cr.openjdk.java.net/~zgu/8206183/webrev.00/ > > Test: > > ? ?hotspot_nmt on Linux 64 (fastdebug and release) > ? ?Submit-test. 
> > > Thanks, > > -Zhengyu > From volker.simonis at gmail.com Wed Jul 11 17:30:09 2018 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 11 Jul 2018 19:30:09 +0200 Subject: [11] RFR(S): 8207067: [test] prevent timeouts in serviceability/tmtools/jstat/{GcTest02, GcCauseTest02}.java Message-ID: Hi, can I please have a review for the following test fix which prevents eventual test timeouts: https://bugs.openjdk.java.net/browse/JDK-8207067 http://cr.openjdk.java.net/~simonis/webrevs/2018/8207067/ The two tests hotspot/jtreg/serviceability/tmtools/jstat/{GcTest02,GcCauseTest02}.java produce more than 90_000 classes until they eat up ~70% of the 128M meta space they run with. The loading of each of these classes triggers a full dependency check for ALL nmethods in debug/fastdebug builds because 'VerifyDependencies' is 'true' there. This slows down the tests from about 3 sec. in the opt build to about 88 sec. in the fastdebug build on x86_64 and from about 4 sec. to about 560 sec. on ppc64. Because the tests are not about dependency checking, it makes sense to switch off 'VerifyDependencies' if they are run inside a debug/fastdebug VM and decrease the execution time down to about 6 sec. on both x86_64 and ppc64. Thank you and best regards, Volker From navy.xliu at gmail.com Wed Jul 11 17:35:00 2018 From: navy.xliu at gmail.com (Liu Xin) Date: Wed, 11 Jul 2018 10:35:00 -0700 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> Message-ID: <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> Thank you for your reviews. Indeed, I didn't deal with the bailout situation. "compiler/codegen/TestCharVect2.java" is a case of codeBuffer overflow that leaves an unbound label behind. I made another revision. I will run tests thoroughly.
Thanks, --lx > On Jul 11, 2018, at 7:49 AM, Hohensee, Paul wrote: > > Imo it's still good hygiene to require that Labels be bound if they're used, even if the generated code will never be executed. E.g., code that generates code for sizing purposes may be repurposed to generate executable code, in which case an unbound label may be a lurking bug. Also, I'm unaware (I may be corrected!) of any situation where bailing out happens in such a way as to both leave a Label unbound and execute its destructor. Even if there are, I'd say that'd be indicative of another real problem, such as code buffer overflow, so no harm would result. > > Thanks, > > Paul > > On 7/11/18, 3:41 AM, "hotspot-runtime-dev on behalf of Doerr, Martin" wrote: > > Hi, > > I think the idea is good, but doesn't work in all cases. > We may bail out from code generation and discard the generated code leaving the label unbound. > We also may generate code with the purpose to determine its size. We don't need to bind labels because the code will never get executed. > > Best regards, > Martin > > > -----Original Message----- > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov > Sent: Mittwoch, 11. Juli 2018 03:34 > To: Liu Xin ; hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 > > I hit new assert in few other tests: > > compiler/codegen/TestCharVect2.java > compiler/c2/cr6340864/* > > Regards, > Vladimir > > On 7/10/18 5:08 PM, Vladimir Kozlov wrote: >> Fix looks reasonable. I will test it in our framework. >> >> Thanks, >> Vladimir >> >> On 7/10/18 9:50 AM, Liu Xin wrote: >>> Hi, Community, >>> Could you please review this small patch? >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 >>> >>> CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/ >>> >>> Problem: >>> X86-32/64 will leave an unbound label if UseOnStackReplacement is OFF.
>>> This patch align up x86 with other architectures(ppc, arm). >>> Add an assertion to the destructor of Label. It will be wiped out in release build. >>> Previously, hotspot cannot pass this test with assertion on x86-64. >>> make run-test TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java >>> If this CR is approved, Paul Hohensee will push it. >>> Thanks, >>> --lx >>> > > From coleen.phillimore at oracle.com Wed Jul 11 18:17:32 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 11 Jul 2018 14:17:32 -0400 Subject: RFR (M) 8198720: Obsolete PrintSafepointStatistics, PrintSafepointStatisticsTimeout and PrintSafepointStatisticsCount options In-Reply-To: <37a0b154-981a-ab68-e067-a5a2ab38369d@oracle.com> References: <9349e320-e39d-c5ee-5ebb-b93305fc03f5@oracle.com> <2a50a090-36df-433b-aa4a-6a7087a8e589@redhat.com> <05f84226-0825-896f-c1c3-a89f85338159@oracle.com> <1826f57f-fc8c-86b3-b3fa-65a1c81a9eff@redhat.com> <37a0b154-981a-ab68-e067-a5a2ab38369d@oracle.com> Message-ID: On 7/11/18 12:22 PM, Lois Foltan wrote: > On 7/11/2018 9:15 AM, coleen.phillimore at oracle.com wrote: >> >> I've kept the output the same and converted to UL.? To get the lines >> not to shift due to uptime printing, you can use the option: >> >> -Xlog:safepoint+stats=debug::tags or none instead of tags. >> >> I could alias PrintSafepointStatistics to this: >> -Xlog:safepoint+stats=debug::none as this option gets verbose. Having >> the ability to send the output to a gc.log file is pretty nice though >> so worth using all the logging options. >> >> Please review: >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8198720.02/webrev > > Hi Coleen, > This looks good.? One minor comment: > > share/runtime/safepoint.hpp line #111 - can you clarify why the "#if > 0/#endif code shouldn't just be removed? Hi Lois, Thanks for the review and thank you for noticing this. Yes, this code should be removed, which I will do before checking it in. thanks! 
Coleen > > Thanks, > Lois > >> >> Tested with tier1-3. >> >> Thanks, >> Coleen >> >> On 7/9/18 11:26 PM, coleen.phillimore at oracle.com wrote: >>> >>> Hi Aleksey, >>> >>> I rewrote the logging to use UL and to keep the old format: see >>> http://cr.openjdk.java.net/~coleenp/gc.log >>> It does shift when the time in the logging adds another digit. I >>> don't know how to fix that. ? Does this look ok otherwise? >>> >>> thanks, >>> Coleen >>> >>> >>> On 7/9/18 5:42 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> On 7/9/18 4:08 PM, Aleksey Shipilev wrote: >>>>> Thank you! >>>>> >>>>> Most latency-savvy folks "out there" run with some sort of >>>>> safepointing profiling, which in many >>>>> cases include PrintSafepointStatistics tables. >>>> >>>> That was the original reason I was looking at this logging. I think >>>> the trouble with the times is that they are ms and mostly zero.? I >>>> wonder if MILLIUNITS would be better for these times: >>>> >>>> ???????????? (int64_t)(sstats->_time_to_spin / MICROUNITS), >>>> ???????????? (int64_t)(sstats->_time_to_wait_to_block / MICROUNITS), >>>> ???????????? (int64_t)(sstats->_time_to_sync / MICROUNITS), >>>> ???????????? (int64_t)(sstats->_time_to_do_cleanups / MICROUNITS), >>>> ???????????? (int64_t)(sstats->_time_to_exec_vmop / MICROUNITS));?? >>>> <= this has nonzero values for GC pauses >>>> >>>> What do you think? >>>> >>>> thanks, >>>> Coleen >>>>> >>>>> -Aleksey >>>>> >>>>> On 07/09/2018 08:35 PM, coleen.phillimore at oracle.com wrote: >>>>>> Okay, somehow the columns of numbers didn't look very useful on >>>>>> my screen to me, and I wanted to >>>>>> convert this to UL (and straighten out the logic), so that's why >>>>>> I made this change.?? I asked >>>>>> around internally to see which people would care about the format >>>>>> change and didn't find anyone >>>>>> specific.? Now I know! >>>>>> >>>>>> Let me rework this to use UL but keep the table. >>>>>> >>>>>> I'll withdraw this change for now. 
>>>>>> >>>>>> Thank you for the quick feedback. >>>>>> Coleen >>>>>> >>>>>> On 7/9/18 1:58 PM, Aleksey Shipilev wrote: >>>>>>> On 07/09/2018 07:48 PM, coleen.phillimore at oracle.com wrote: >>>>>>>> Summary: Convert PrintSafepointStatistics to UL >>>>>>>> >>>>>>>> open webrev at >>>>>>>> http://cr.openjdk.java.net/~coleenp/8198720.01/webrev >>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8198720 >>>>>>> The synopsis is misleading: it is not only obsoleting >>>>>>> PrintSafepoint* options, it also reformats the >>>>>>> output! >>>>>>> >>>>>>> We did JDK-8180482 not that long ago, and the reason was that >>>>>>> both people and machine tools are >>>>>>> accustomed to the particular non-noisy format for that table. I >>>>>>> am not at all convinced that >>>>>>> proposed format [2] is better than current version [3]. Can we >>>>>>> keep (at least some resemblance of) >>>>>>> the old format, please? >>>>>>> >>>>>>> -Aleksey >>>>>>> >>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8180482 >>>>>>> [2] >>>>>>> https://bugs.openjdk.java.net/secure/attachment/75330/out.safepoint-logging >>>>>>> [3] http://cr.openjdk.java.net/~shade/8180482/after.txt >>>>>>> >>>>> >>>> >>> >> > From ioi.lam at oracle.com Wed Jul 11 23:13:59 2018 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 11 Jul 2018 16:13:59 -0700 Subject: Proposal for improving CDS archive creation In-Reply-To: References: Message-ID: <20c6adc7-9910-435b-bb5b-549ade7034b9@oracle.com> I had an off-line discussion with Jiangli, and she has an alternative proposal: When -Xshare:autocreate is specified, but the CDS archive is not available, 1. Load classes as normal. After each InstanceKlass is loaded, but before it's used, ?? make a deep copy of this class into an internal cache. 2. The deep copy includes all methods, etc, for this class. However, if a Method is ?? inherited from a super class, then only a reference to this Method is copied. 3. 
At a certain point (probably at VM exit), copy all the (suitable) classes from the ?? cache and write them into the CDS archive. The advantage of this approach is we will be able to archive classes that were loaded by custom loaders, but have been freed at VM exit time because the class loaders were GC'ed. Note: When a class X is loaded, if its supertype(s) have already been redefined, we probably should not copy X into the buffer. That's because the vtable of X may point to some redefined methods from a supertype, which do not match the bytecodes of these methods in the supertype's original class file, so it's a messy situation. Thanks - Ioi On 7/10/18 12:50 PM, Ioi Lam wrote: > Fixing some sloppy text below .... > > > On 7/10/18 10:16 AM, Ioi Lam wrote: >> I have a proposal for improving the process of creating of the CDS >> archive(s), >> so we can make CDS easier to use and support more use cases. >> >> ?? - better support for custom loaders >> ?? - remove explicit training run >> ?? - support 2 levels of shared archives >> >> I think the proposal is relatively straight-forward to implement, as >> we already >> have most of the required infrastructures: >> >> ?? + the ability to use Java class loaders at archive creation time >> ?? + the ability to relocate MetaspaceObjects >> >> Parts of this proposal will also simplify the CDS code and make it more >> maintainable. >> >> Current process of creating the base archive - [C] >> ================================================== >> >> Currently each JVM process can map at most one CDS archive. Let's >> call this >> the "base archive". It is created by [ref1]: >> >> ?C1. Reserve a region R of 3GB at 0x800000000. >> ?C2. Load all classes specified in the class list. All data for these >> classes >> ???? live outside of R. >> ???? (E.g., the Klass objects are loaded into tmp_class_space, which is >> ????? adjacent to R). >> ?C3. Copy the metadata of all archivable classes (e.g, exclude generated >> ???? 
Lambda classes) into R. At this step, R is divided into several >> ? ?? sections (RO, RW, etc). >> >> >> ? //? +-- SharedBaseAddress?? (default = 0x800000000) >> ? //? +-- _narrow_klass._base >> ? //? | >> ? //? |?????????????????????????????? +-tmp_class_space.base >> ? //? v?????????????????????????????? V >> ? //? +----+----+----+----+----+-....-+-------------------+ >> ? //? |<-?????????? R?????????????? ->| >> ? //? | MC | RW | RO | MD | OD |unused| tmp_class_space?? | >> ? //? +----+----+----+----+----+------+-------------------+ >> ? //? |<--? 3GB??????? -------------->| >> ? //? |<-- UnscaledClassSpaceMax = 4GB ------------------>| >> >> >> New process for creating the base archive - [N] >> =============================================== >> >> Currently we have a lot of "if (DumpSharedSpaces)" code to for >> special case >> handling of the above scheme. We can improve it by >> >> ?N1. Remove all code for special memory layout initialization for >> -Xshare:dump. >> ???? As a result, we will reserve a region R of 1GB at 0x800000000, >> which >> ???? is used by Klass objects (this is the same as if -Xshare:off were >> ???? specified.) >> ?N2. Load all classes in the class list. >> ?N3. Now R contains the Klass objects of all loaded classes. >> ???? Allocate a temporary space T, and copy all contents of R into T. >> ?N4. Now R is empty. Copy the metadata of all archivable classes into R. >> >> >> Dump-as-you-go for the base archive - [G] >> ========================================= >> >> Note that the [N] scheme will work even if you're running an app with >> -Xshare:off. At some point (e.g., when the VM is about to exit), you >> can: >> >> ?G1. Enter a safe point >> ?G2. Go to step [N3]. >> >> The benefit of [G] is you don't need a separate run to dump the >> archive, and >> there's no need to use the class list. Instead, we can have an option >> like: >> >> ?? 
java -Xshare:autocreate -cp app.jar -XX:SharedArchiveFile=foo.jsa App >> >> If foo.jsa is not available, we run in [G] mode. At VM exit, we dump >> into >> foo.jsa. >> >> This way, we don't need to have an explicit training run with >> -XX:DumpLoadedClassList. Instead, the training run is >> > I meant, "Instead, your first run, when the archive is not yet > available, becomes the > training run". > > Thanks to Calvin and Dan for spotting this :-) > - Ioi > >> This also makes it easy to support the classes from custom loaders. >> There's no >> need for special tooling to convert -Xlog:class+load=debug output into a >> classlist. [ref2] >> >> >> Dumping for second-level archive - [S] >> ====================================== >> >> ?S1. Load the base archive >> ?S2. Run the app as normal >> ?S3. All Klass objects of the dynamically loaded classes will be >> loaded in >> ???? the region R, which immediately follows the end of the base >> archive. >> >> ? //? +-- SharedBaseAddress >> ? //? |????????????????????????? +--- dynamically loaded Klasses >> ? //? |????????????????????????? |??? start from here. >> ? //? v????????????????????????? v >> ? //? +--------------------------+---------...-----------------| >> ? //? | base archive???????????? | region R??????????????????? | >> ? //? +--------------------------+---------...-----------------| >> ? //? |<- size of base archive ->| >> ? //? |<--??????????? 1GB -->| >> >> >> ? S4. At some point (possible when the VM is about to exit) we start >> ????? dumping the second level archive >> ? S5. Enter safe point >> ? S6. Now R contains the Klass objects of all dynamically loaded >> classes. >> ????? Allocate a temporary space T, and copy all contents of R into T. >> ? S7. Now R is empty. Copy the metadata of all archivable, >> dynamically loaded >> ????? classes into R. >> ? S8. Create a new shared_dictionary (and shared_symbol_table) that >> contains >> ????? 
all the Klasses (Symbols) from both the base and second-level >> archives. >> >> References >> ========== >> >> [ref1] Current initialization of memory space layout during -Xshare:dump >> http://hg.openjdk.java.net/jdk/jdk/file/e0028bb6dd3d/src/hotspot/share/memory/metaspaceShared.cpp#l250 >> >> [ref2] Volker Simonis's tool for support custom class loaders in CDS >> ?????? https://github.com/simonis/cl4cds >> ---------------------------------------------------------------------- >> >> >> >> Any thoughts? >> >> Thanks >> - Ioi > From david.holmes at oracle.com Wed Jul 11 23:27:26 2018 From: david.holmes at oracle.com (David Holmes) Date: Thu, 12 Jul 2018 09:27:26 +1000 Subject: [11] RFR(S): 8207067: [test] prevent timeouts in serviceability/tmtools/jstat/{GcTest02, GcCauseTest02}.java In-Reply-To: References: Message-ID: Hi Volker, On 12/07/2018 3:30 AM, Volker Simonis wrote: > Hi, > can I please have a review for the following test fix which prevents > eventual test timeouts: > > https://bugs.openjdk.java.net/browse/JDK-8207067 > http://cr.openjdk.java.net/~simonis/webrevs/2018/8207067/ > > The two tests hotspot/jtreg/serviceability/tmtools/jstat/{GcTest02,GcCauseTest02}.java > produce more than 90_000 classes until they eat up ~70% of the 128M > meta space they run with. The loading of each of these classes > triggers a full dependency check for ALL nmethods in debug/fastdebug > builds because 'VerifyDependencies' is 'true' there. This slows down > the tests from about 3 sec. in the opt build to about 88 sec. in the > fastdebug build on x86_64 and from about 4 sec. to about 560 sec. on > ppc64. > > Because the tests are not about dependency checking, it makes sense to > switch of 'VerifyDependencies' if they are run inside a > debug/fastdebug VM and decrease the execution time down to about 6 > sec. on both x86_64 and ppc64. It's very annoying that this can't be fixed the obvious way by just specifying -XX:-VerifyDependencies. 
Maybe jtreg could add a debug_only(...) capability ... :( It's unclear to me whether it is safe to disable VerifyDependencies after the VM has commenced execution. This might lead to inconsistencies. Probably need the compiler folk to clarify that. That aside the comments are far too elaborate and better suited for the bug report. The tests could use @bug lines (though it would be nice to track down the original bug that added the tests). The comments could reduce to a simple: // This test produces more than 90_000 classes until it eats up ~70% of the 128M meta space. // With VerifyDependencies enabled in debug builds this slows the test down considerably. // As it is a develop flag, if we see that it is "constant" then we know this is a product build. Though there is already Platform.isDebugBuild() if you wanted something more direct. (Given you need WB to change the flag it doesn't really make much difference.) Thanks, David > Thank you and best regards, > Volker > From jiangli.zhou at oracle.com Thu Jul 12 00:46:22 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Wed, 11 Jul 2018 17:46:22 -0700 Subject: Proposal for improving CDS archive creation In-Reply-To: <20c6adc7-9910-435b-bb5b-549ade7034b9@oracle.com> References: <20c6adc7-9910-435b-bb5b-549ade7034b9@oracle.com> Message-ID: <064d4f4d-e707-682e-cd7b-99d7a0afec86@oracle.com> Volker originally suggested the idea in the email thread "Improving AppCDS for Custom Loaders". I think this is a cleaner approach. Thanks, Jiangli On 7/11/18 4:13 PM, Ioi Lam wrote: > I had an off-line discussion with Jiangli, and she has an alternative > proposal: > > When -Xshare:autocreate is specified, but the CDS archive is not > available, > > 1. Load classes as normal. After each InstanceKlass is loaded, but > before it's used, > ?? make a deep copy of this class into an internal cache. > > 2. The deep copy includes all methods, etc, for this class. However, > if a Method is > ?? 
inherited from a super class, then only a reference to this Method > is copied. > > 3. At a certain point (probably at VM exit), copy all the (suitable) > classes from the > ?? cache and write them into the CDS archive. > > The advantage of this approach is we will be able to archive classes > that were > loaded by custom loaders, but have been freed at VM exit time because > the class > loaders were GC'ed. > > > Note: When a class X is loaded, if its supertype(s) have already been > redefined, > we probably should not copy X into the buffer. That's because the > vtable of X may > point to some redefined methods from a supertype, which do not match > the bytecodes > of these methods in the supertype's original class file, so it's a > messy situation. > > Thanks > - Ioi > > > > On 7/10/18 12:50 PM, Ioi Lam wrote: >> Fixing some sloppy text below .... >> >> >> On 7/10/18 10:16 AM, Ioi Lam wrote: >>> I have a proposal for improving the process of creating of the CDS >>> archive(s), >>> so we can make CDS easier to use and support more use cases. >>> >>> ?? - better support for custom loaders >>> ?? - remove explicit training run >>> ?? - support 2 levels of shared archives >>> >>> I think the proposal is relatively straight-forward to implement, as >>> we already >>> have most of the required infrastructures: >>> >>> ?? + the ability to use Java class loaders at archive creation time >>> ?? + the ability to relocate MetaspaceObjects >>> >>> Parts of this proposal will also simplify the CDS code and make it more >>> maintainable. >>> >>> Current process of creating the base archive - [C] >>> ================================================== >>> >>> Currently each JVM process can map at most one CDS archive. Let's >>> call this >>> the "base archive". It is created by [ref1]: >>> >>> ?C1. Reserve a region R of 3GB at 0x800000000. >>> ?C2. Load all classes specified in the class list. All data for >>> these classes >>> ???? live outside of R. >>> ???? 
(E.g., the Klass objects are loaded into tmp_class_space, which is >>> ????? adjacent to R). >>> ?C3. Copy the metadata of all archivable classes (e.g, exclude >>> generated >>> ???? Lambda classes) into R. At this step, R is divided into several >>> ? ?? sections (RO, RW, etc). >>> >>> >>> ? //? +-- SharedBaseAddress?? (default = 0x800000000) >>> ? //? +-- _narrow_klass._base >>> ? //? | >>> ? //? |?????????????????????????????? +-tmp_class_space.base >>> ? //? v?????????????????????????????? V >>> ? //? +----+----+----+----+----+-....-+-------------------+ >>> ? //? |<-?????????? R?????????????? ->| >>> ? //? | MC | RW | RO | MD | OD |unused| tmp_class_space?? | >>> ? //? +----+----+----+----+----+------+-------------------+ >>> ? //? |<--? 3GB??????? -------------->| >>> ? //? |<-- UnscaledClassSpaceMax = 4GB ------------------>| >>> >>> >>> New process for creating the base archive - [N] >>> =============================================== >>> >>> Currently we have a lot of "if (DumpSharedSpaces)" code to for >>> special case >>> handling of the above scheme. We can improve it by >>> >>> ?N1. Remove all code for special memory layout initialization for >>> -Xshare:dump. >>> ???? As a result, we will reserve a region R of 1GB at 0x800000000, >>> which >>> ???? is used by Klass objects (this is the same as if -Xshare:off were >>> ???? specified.) >>> ?N2. Load all classes in the class list. >>> ?N3. Now R contains the Klass objects of all loaded classes. >>> ???? Allocate a temporary space T, and copy all contents of R into T. >>> ?N4. Now R is empty. Copy the metadata of all archivable classes >>> into R. >>> >>> >>> Dump-as-you-go for the base archive - [G] >>> ========================================= >>> >>> Note that the [N] scheme will work even if you're running an app with >>> -Xshare:off. At some point (e.g., when the VM is about to exit), you >>> can: >>> >>> ?G1. Enter a safe point >>> ?G2. Go to step [N3]. 
>>> >>> The benefit of [G] is you don't need a separate run to dump the >>> archive, and >>> there's no need to use the class list. Instead, we can have an >>> option like: >>> >>> ?? java -Xshare:autocreate -cp app.jar -XX:SharedArchiveFile=foo.jsa >>> App >>> >>> If foo.jsa is not available, we run in [G] mode. At VM exit, we dump >>> into >>> foo.jsa. >>> >>> This way, we don't need to have an explicit training run with >>> -XX:DumpLoadedClassList. Instead, the training run is >>> >> I meant, "Instead, your first run, when the archive is not yet >> available, becomes the >> training run". >> >> Thanks to Calvin and Dan for spotting this :-) >> - Ioi >> >>> This also makes it easy to support the classes from custom loaders. >>> There's no >>> need for special tooling to convert -Xlog:class+load=debug output >>> into a >>> classlist. [ref2] >>> >>> >>> Dumping for second-level archive - [S] >>> ====================================== >>> >>> ?S1. Load the base archive >>> ?S2. Run the app as normal >>> ?S3. All Klass objects of the dynamically loaded classes will be >>> loaded in >>> ???? the region R, which immediately follows the end of the base >>> archive. >>> >>> ? //? +-- SharedBaseAddress >>> ? //? |????????????????????????? +--- dynamically loaded Klasses >>> ? //? |????????????????????????? |??? start from here. >>> ? //? v????????????????????????? v >>> ? // +--------------------------+---------...-----------------| >>> ? //? | base archive???????????? | region R | >>> ? // +--------------------------+---------...-----------------| >>> ? //? |<- size of base archive ->| >>> ? //? |<--??????????? 1GB -->| >>> >>> >>> ? S4. At some point (possible when the VM is about to exit) we start >>> ????? dumping the second level archive >>> ? S5. Enter safe point >>> ? S6. Now R contains the Klass objects of all dynamically loaded >>> classes. >>> ????? Allocate a temporary space T, and copy all contents of R into T. >>> ? S7. Now R is empty. 
Copy the metadata of all archivable, >>> dynamically loaded >>> ????? classes into R. >>> ? S8. Create a new shared_dictionary (and shared_symbol_table) that >>> contains >>> ????? all the Klasses (Symbols) from both the base and second-level >>> archives. >>> >>> References >>> ========== >>> >>> [ref1] Current initialization of memory space layout during >>> -Xshare:dump >>> http://hg.openjdk.java.net/jdk/jdk/file/e0028bb6dd3d/src/hotspot/share/memory/metaspaceShared.cpp#l250 >>> >>> [ref2] Volker Simonis's tool for support custom class loaders in CDS >>> ?????? https://github.com/simonis/cl4cds >>> ---------------------------------------------------------------------- >>> >>> >>> >>> Any thoughts? >>> >>> Thanks >>> - Ioi >> > From david.holmes at oracle.com Thu Jul 12 07:20:46 2018 From: david.holmes at oracle.com (David Holmes) Date: Thu, 12 Jul 2018 17:20:46 +1000 Subject: RFR(M): 8206977: Minor improvements of runtime code. In-Reply-To: <0e37ae822a5845a1bee78f01f5325cd1@sap.com> References: <571d727a270e47cb8230d8a88b58a2a1@sap.com> <864959d1-1e58-a91b-01ee-355178cae2db@oracle.com> <0e37ae822a5845a1bee78f01f5325cd1@sap.com> Message-ID: On 10/07/2018 10:32 PM, Lindenmaier, Goetz wrote: > Hi David, > > Take your time (within the RDP1 timeframe ??) to look at the issues themselves. I have 24 hours left before I'm on vacation and I won't be around to argue about this. Other's seem quite happy to do whatever the checks say regardless of whether it is really necessary or not. So be it. > Just for the basic comments on this: >> I see some asserts changed to guarantees which is unnecessary in general >> - but again appeases static checkers looking at product builds. > This has been done to a large extend for parfait: > http://hg.openjdk.java.net/jdk/jdk/search/?rev=parfait&revcount=40 In many cases we file "false positive" bugs against parfait or just ignore these warnings. 
We typically use these kind of asserts to sanity check preconditions that are statically known. So when we run the code in debug mode and it doesn't assert we know it won't encounter bad values in product mode either. I don't agree with changing them just to pacify a static analysis tool. The fact the code path may not be performance sensitive is not in itself justification to me. "Performance death by a thousand cuts" >> I also don't see this as a P3 bug, as there seems only 1 potential real >> bug there (you yourself call these "minor improvements"). So this seems >> unsuitable for JDK 11 now we are in RDP1. But fine for 12. > When else should I do this? I can only do this when development > is closed, else I have to re-run and do fixes again and again for > incoming changes. > We are required to run the checker and fix issues before releasing > a VM. And you also have to comply with the RDP processes of OpenJDK. I don't make the rules. And until GA you still may have to re-run to cover all the changes coming in - that's the nature of the beast. If it were me releasing my own builds I'd be applying these changes locally and then after the release I'd upstream the patch. But I'm not commenting further. Cheers, David > Best regards, > Goetz. > > > > > > > > http://hg.openjdk.java.net/jdk/jdk/search/?rev=parfait&revcount=40 > >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Dienstag, 10. Juli 2018 14:10 >> To: Lindenmaier, Goetz ; hotspot-runtime- >> dev at openjdk.java.net >> Subject: Re: RFR(M): 8206977: Minor improvements of runtime code. >> >> Hi Goetz, >> >> On 10/07/2018 8:53 PM, Lindenmaier, Goetz wrote: >>> Hi, >>> >>> I ran coverity on the jdk11 hotspot sources and want to propose the >>> following fixes to the runtime code. I scanned the linux x86_64 build. >>> Some issues are similar to previous parfait fixes (check for NULL, add >>> guarantees etc.) I also identified some issues I consider real problems. 
>>> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/ >> >> It will take a while to go through these. I see some false positives >> caused by too local an examination - which is typical of code checkers. >> For example in os_linux.cpp >> >> + if (buflen > 7) { >> >> we know the buffer coming in is O_BUFLEN in size. Add an assert if you >> like but no need for a hard-wired guard. >> >> I see some asserts changed to guarantees which is unnecessary in general >> - but again appeases static checkers looking at product builds. >> >> I also don't see this as a P3 bug, as there seems only 1 potential real >> bug there (you yourself call these "minor improvements"). So this seems >> unsuitable for JDK 11 now we are in RDP1. But fine for 12. >> >> Will try to go through in more detail tomorrow, but it is somewhat >> tedious to have to work through these in detail to refute the code >> checkers claims of incorrectness. >> >> Thanks, >> David >> ----- >> >> >>> In detail: >>> >>> Real issues: >>> ------------ >>> >>> jvmtiEnvBase.cpp >>> || should be &&. >>> Attention, this is the only change that really will change behaviour. >>> But if thr == NULL we will see a crash below. >>> >>> perfMemory_linux.cpp: >>> Wrong buffer length used. >>> >>> systemDictionary.cpp: >>> Move code dereferencing ik under if (ik != NULL). >>> >>> virtualspace.cpp >>> Initialization is missing. Moved constructor up to the other >>> constructors. >>> >>> >>> Useful code improvements: >>> ------------------------- >>> >>> vm_version_ext_x86.cpp >>> Assure buffer is not accessed at offset -1. >>> >>> os_linux.cpp >>> Numa_max_node returns int, and a -1 in some cases. >>> >>> moduleEntry.cpp >>> name might be NULL. Just a fix for tracing. >>> >>> systemDictionaryShared.cpp >>> clearify code. >>> It would be wrong if only entry == NULL would hold, one >>> would hit the assertion below. >>> >>> verifier.cpp >>> Fix tracing. >>> Illegal opcode is -1 and should not be passed to name array. 
>>> >>> logOutput.cpp >>> If n_selections == 0, best_selection would be NULL. >>> Move up the assertion and turn into a guarantee. >>> >>> filemap.cpp >>> Either base can be NULL, or parts of the code before are dead. >>> >>> metaspace.cpp >>> We now an exception is pending. >>> >>> klassVtable.cpp >>> Coverity does not like the format in a variable. >>> Anyways this is quite rough coding, transformed to use stringStream >>> as with other similar exceptions. >>> >>> jvmFlag.cpp >>> match might be NULL. >>> >>> writableFlags.cpp >>> name might be NULL. >>> >>> ostream.cpp >>> If ftell returns error code -1, we need not continue. >>> Especially we should not fseek(-1). >>> >>> logTestUtils.inline.hpp >>> ftell returns -1. >>> >>> test_metachunk.cpp >>> wrong datatype. >>> From david.holmes at oracle.com Thu Jul 12 07:29:13 2018 From: david.holmes at oracle.com (David Holmes) Date: Thu, 12 Jul 2018 17:29:13 +1000 Subject: RFR(M): 8206977: Minor improvements of runtime code. In-Reply-To: References: <571d727a270e47cb8230d8a88b58a2a1@sap.com> <864959d1-1e58-a91b-01ee-355178cae2db@oracle.com> <0e37ae822a5845a1bee78f01f5325cd1@sap.com> Message-ID: <637e13b1-373d-7814-c45b-5cb4fd9f23c1@oracle.com> Sorry Goetz this comes over far too abrupt, which wasn't my intention. Cheers, David On 12/07/2018 5:20 PM, David Holmes wrote: > On 10/07/2018 10:32 PM, Lindenmaier, Goetz wrote: >> Hi David, >> >> Take your time (within the RDP1 timeframe ??) to look at the issues >> themselves. > > I have 24 hours left before I'm on vacation and I won't be around to > argue about this. Other's seem quite happy to do whatever the checks say > regardless of whether it is really necessary or not. So be it. > >> Just for the basic comments on this: >>> I see some asserts changed to guarantees which is unnecessary in general >>> - but again appeases static checkers looking at product builds. >> This has been done to a large extend for parfait: >> ? 
http://hg.openjdk.java.net/jdk/jdk/search/?rev=parfait&revcount=40 > > In many cases we file "false positive" bugs against parfait or just > ignore these warnings. We typically use these kind of asserts to sanity > check preconditions that are statically known. So when we run the code > in debug mode and it doesn't assert we know it won't encounter bad > values in product mode either. > > I don't agree with changing them just to pacify a static analysis tool. > The fact the code path may not be performance sensitive is not in itself > justification to me. "Performance death by a thousand cuts" > >>> I also don't see this as a P3 bug, as there seems only 1 potential real >>> bug there (you yourself call these "minor improvements"). So this seems >>> unsuitable for JDK 11 now we are in RDP1. But fine for 12. >> When else should I do this? I can only do this when development >> is closed, else I have to re-run and do fixes again and again for >> incoming changes. >> We are required to run the checker and fix issues before releasing >> a VM. > > And you also have to comply with the RDP processes of OpenJDK. I don't > make the rules. > > And until GA you still may have to re-run to cover all the changes > coming in - that's the nature of the beast. > > If it were me releasing my own builds I'd be applying these changes > locally and then after the release I'd upstream the patch. > > But I'm not commenting further. > > Cheers, > David > >> Best regards, >> ?? Goetz. >> >> >> >> >> >> >> >> http://hg.openjdk.java.net/jdk/jdk/search/?rev=parfait&revcount=40 >> >>> -----Original Message----- >>> From: David Holmes [mailto:david.holmes at oracle.com] >>> Sent: Dienstag, 10. Juli 2018 14:10 >>> To: Lindenmaier, Goetz ; hotspot-runtime- >>> dev at openjdk.java.net >>> Subject: Re: RFR(M): 8206977: Minor improvements of runtime code. 
>>>
>>> Hi Goetz,
>>>
>>> On 10/07/2018 8:53 PM, Lindenmaier, Goetz wrote:
>>>> Hi,
>>>>
>>>> I ran coverity on the jdk11 hotspot sources and want to propose the
>>>> following fixes to the runtime code. I scanned the linux x86_64 build.
>>>> Some issues are similar to previous parfait fixes (check for NULL, add
>>>> guarantees etc.) I also identified some issues I consider real
>>>> problems.
>>>> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/
>>>
>>> It will take a while to go through these. I see some false positives
>>> caused by too local an examination - which is typical of code checkers.
>>> For example in os_linux.cpp
>>>
>>> +     if (buflen > 7) {
>>>
>>> we know the buffer coming in is O_BUFLEN in size. Add an assert if you
>>> like but no need for a hard-wired guard.
>>>
>>> I see some asserts changed to guarantees which is unnecessary in general
>>> - but again appeases static checkers looking at product builds.
>>>
>>> I also don't see this as a P3 bug, as there seems only 1 potential real
>>> bug there (you yourself call these "minor improvements"). So this seems
>>> unsuitable for JDK 11 now we are in RDP1. But fine for 12.
>>>
>>> Will try to go through in more detail tomorrow, but it is somewhat
>>> tedious to have to work through these in detail to refute the code
>>> checker's claims of incorrectness.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>
>>>> In detail:
>>>>
>>>> Real issues:
>>>> ------------
>>>>
>>>> jvmtiEnvBase.cpp
>>>>     || should be &&.
>>>>     Attention, this is the only change that really will change behaviour.
>>>>     But if thr == NULL we will see a crash below.
>>>>
>>>> perfMemory_linux.cpp:
>>>>     Wrong buffer length used.
>>>>
>>>> systemDictionary.cpp:
>>>>     Move code dereferencing ik under if (ik != NULL).
>>>>
>>>> virtualspace.cpp
>>>>     Initialization is missing. Moved constructor up to the other
>>>>     constructors.
>>>>
>>>>
>>>> Useful code improvements:
>>>> -------------------------
>>>>
>>>> vm_version_ext_x86.cpp
>>>>     Ensure buffer is not accessed at offset -1.
>>>>
>>>> os_linux.cpp
>>>>     Numa_max_node returns int, and a -1 in some cases.
>>>>
>>>> moduleEntry.cpp
>>>>     name might be NULL. Just a fix for tracing.
>>>>
>>>> systemDictionaryShared.cpp
>>>>     clarify code.
>>>>     It would be wrong if only entry == NULL would hold, one
>>>>     would hit the assertion below.
>>>>
>>>> verifier.cpp
>>>>     Fix tracing.
>>>>     Illegal opcode is -1 and should not be passed to name array.
>>>>
>>>> logOutput.cpp
>>>>     If n_selections == 0, best_selection would be NULL.
>>>>     Move up the assertion and turn into a guarantee.
>>>>
>>>> filemap.cpp
>>>>     Either base can be NULL, or parts of the code before are dead.
>>>>
>>>> metaspace.cpp
>>>>     We know an exception is pending.
>>>>
>>>> klassVtable.cpp
>>>>     Coverity does not like the format in a variable.
>>>>     Anyways this is quite rough coding, transformed to use stringStream
>>>>     as with other similar exceptions.
>>>>
>>>> jvmFlag.cpp
>>>>     match might be NULL.
>>>>
>>>> writeableFlags.cpp
>>>>     name might be NULL.
>>>>
>>>> ostream.cpp
>>>>     If ftell returns error code -1, we need not continue.
>>>>     Especially we should not fseek(-1).
>>>>
>>>> logTestUtils.inline.hpp
>>>>     ftell returns -1.
>>>>
>>>> test_metachunk.cpp
>>>>     wrong datatype.
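
The assert-versus-guarantee disagreement in the review above rests on one fact: in HotSpot, 'assert' is only checked in debug builds, while 'guarantee' is also checked in product builds. The following is a minimal standalone sketch of that distinction, not HotSpot code. The macro names SKETCH_ASSERT and SKETCH_GUARANTEE, the SKETCH_DEBUG_BUILD switch, the counter, and the best_selection function are all hypothetical and only loosely modelled on the logOutput.cpp item in the list.

```cpp
#include <cstdio>
#include <cstdlib>

// Counts how many checks were actually evaluated at run time.
static int checks_evaluated = 0;

// Hypothetical helper: records that a check ran and returns its result.
static bool record_check(bool condition) {
  ++checks_evaluated;
  return condition;
}

// "guarantee"-style check: evaluated in every build.
#define SKETCH_GUARANTEE(cond, msg)                         \
  do {                                                      \
    if (!record_check(cond)) {                              \
      std::fprintf(stderr, "guarantee failed: %s\n", msg);  \
      std::abort();                                         \
    }                                                       \
  } while (0)

// "assert"-style check: compiled away unless SKETCH_DEBUG_BUILD is defined.
#ifdef SKETCH_DEBUG_BUILD
#define SKETCH_ASSERT(cond, msg) SKETCH_GUARANTEE(cond, msg)
#else
#define SKETCH_ASSERT(cond, msg) ((void)0)
#endif

// Returns the index of the highest score (hypothetical example function).
int best_selection(const int* scores, int n_selections) {
  // Precondition the caller is expected to honour: debug builds only.
  SKETCH_ASSERT(scores != nullptr, "scores must not be NULL");
  // Condition that must hold even in product builds.
  SKETCH_GUARANTEE(n_selections > 0, "need at least one selection");
  int best = 0;
  for (int i = 1; i < n_selections; i++) {
    if (scores[i] > scores[best]) {
      best = i;
    }
  }
  return best;
}
```

Built without -DSKETCH_DEBUG_BUILD, only the guarantee-style check is evaluated at run time. That small, always-present cost is what the two sides of the review weigh against the extra safety.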
>>>>

From volker.simonis at gmail.com Thu Jul 12 07:44:10 2018
From: volker.simonis at gmail.com (Volker Simonis)
Date: Thu, 12 Jul 2018 09:44:10 +0200
Subject: [11] RFR(S): 8207067: [test] prevent timeouts in serviceability/tmtools/jstat/{GcTest02, GcCauseTest02}.java
In-Reply-To:
References:
Message-ID:

On Thu, Jul 12, 2018 at 1:27 AM, David Holmes wrote:
> Hi Volker,
>
> On 12/07/2018 3:30 AM, Volker Simonis wrote:
>>
>> Hi,
>> can I please have a review for the following test fix which prevents
>> eventual test timeouts:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8207067
>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8207067/
>>
>> The two tests
>> hotspot/jtreg/serviceability/tmtools/jstat/{GcTest02,GcCauseTest02}.java
>> produce more than 90_000 classes until they eat up ~70% of the 128M
>> meta space they run with. The loading of each of these classes
>> triggers a full dependency check for ALL nmethods in debug/fastdebug
>> builds because 'VerifyDependencies' is 'true' there. This slows down
>> the tests from about 3 sec. in the opt build to about 88 sec. in the
>> fastdebug build on x86_64 and from about 4 sec. to about 560 sec. on
>> ppc64.
>>
>> Because the tests are not about dependency checking, it makes sense to
>> switch off 'VerifyDependencies' if they are run inside a
>> debug/fastdebug VM and decrease the execution time down to about 6
>> sec. on both x86_64 and ppc64.
>

Hi David,

thanks for looking at my change!

>
> It's very annoying that this can't be fixed the obvious way by just
> specifying -XX:-VerifyDependencies. Maybe jtreg could add a debug_only(...)
> capability ... :(
>

Agree...

> It's unclear to me whether it is safe to disable VerifyDependencies after
> the VM has commenced execution. This might lead to inconsistencies. Probably
> need the compiler folk to clarify that.
>

I've checked that prior to doing this change. 'VerifyDependencies' is
a very simple flag which maintains no state at all.
It is only used once in codeCache.cpp, in
'CodeCache::mark_for_deoptimization()', to protect the call to
'nmethod::check_all_dependencies(changes)' (i.e. the one which
consumes all the time in the mentioned tests). Besides that, there are
a few uses of 'VerifyDependencies' in dependencies.cpp which all
simply guard some verification code. Turning 'VerifyDependencies' off
in the middle of a compilation or class loading event will simply
prevent further checks without doing any harm. But I've CC-ed
hotspot-compiler-dev just in case.

> That aside the comments are far too elaborate and better suited for the bug
> report. The tests could use @bug lines (though it would be nice to track
> down the original bug that added the tests). The comments could reduce to a
> simple:
>
> // This test produces more than 90_000 classes until it eats up ~70% of the 128M meta space.
> // With VerifyDependencies enabled in debug builds this slows the test down considerably.
> // As it is a develop flag, if we see that it is "constant" then we know this is a product build.
>

I'm happy to change the comments as suggested by you.

Thank you and best regards,
Volker

> Though there is already Platform.isDebugBuild() if you wanted something more
> direct. (Given you need WB to change the flag it doesn't really make much
> difference.)
>
> Thanks,
> David
>
>
>> Thank you and best regards,
>> Volker
>>
>

From goetz.lindenmaier at sap.com Thu Jul 12 08:13:56 2018
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 12 Jul 2018 08:13:56 +0000
Subject: RFR(M): 8206977: Minor improvements of runtime code.
In-Reply-To: <637e13b1-373d-7814-c45b-5cb4fd9f23c1@oracle.com>
References: <571d727a270e47cb8230d8a88b58a2a1@sap.com> <864959d1-1e58-a91b-01ee-355178cae2db@oracle.com> <0e37ae822a5845a1bee78f01f5325cd1@sap.com> <637e13b1-373d-7814-c45b-5cb4fd9f23c1@oracle.com>
Message-ID:

Hi David,

> Sorry Goetz this comes over far too abrupt, which wasn't my intention.
No, it's fine.
I know you are always a critical reviewer, but also always
constructive and very thorough.

I wish you a fine vacation!

Best regards,
  Goetz.

From goetz.lindenmaier at sap.com Thu Jul 12 09:02:42 2018
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 12 Jul 2018 09:02:42 +0000
Subject: RFR(M): 8206977: Minor improvements of runtime code.
In-Reply-To: <3d2d5c75-a219-9153-33e3-57a77bf88d92@oracle.com>
References: <571d727a270e47cb8230d8a88b58a2a1@sap.com> <3d2d5c75-a219-9153-33e3-57a77bf88d92@oracle.com>
Message-ID: <720f92fadf7b4327b63a995fa5aca667@sap.com>

Hi Coleen,

thanks for looking at my change.

New webrev:
http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/02/

> + name ? name->as_C_string() : "");
> Can you change to:
> + name != NULL ? name->as_C_string() : "");
Fixed.

> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/src/hotspot/share/oops/klassVtable.cpp.udiff.html
> This looks a lot nicer!  Similar code is in linkResolver.cpp, can you
> look at changing it too?
I removed it and will leave it to Lois.

> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/src/hotspot/share/services/writeableFlags.cpp.udiff.html
> If name is null here, what would this do?  Should there be an 'else' to
> print something?
I think the case with the null name is for "MISSING_NAME"
and thus no message is needed here. (If range is null no
message is printed either.)

Best regards,
  Goetz.

>
> I think this looks fine.  It doesn't look major to me.
The asserts
> turned to guarantees don't appear to be anywhere performance sensitive.
>
> Thanks,
> Coleen

From coleen.phillimore at oracle.com Thu Jul 12 12:04:19 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 12 Jul 2018 08:04:19 -0400
Subject: RFR(M): 8206977: Minor improvements of runtime code.
In-Reply-To: <720f92fadf7b4327b63a995fa5aca667@sap.com>
References: <571d727a270e47cb8230d8a88b58a2a1@sap.com> <3d2d5c75-a219-9153-33e3-57a77bf88d92@oracle.com> <720f92fadf7b4327b63a995fa5aca667@sap.com>
Message-ID:

On 7/12/18 5:02 AM, Lindenmaier, Goetz wrote:
> Hi Coleen,
>
> thanks for looking at my change.
>
> New webrev:
> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/02/
>
>> + name ? name->as_C_string() : "");
>> Can you change to:
>> + name != NULL ? name->as_C_string() : "");
> Fixed.
>
>> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/src/hotspot/share/oops/klassVtable.cpp.udiff.html
>> This looks a lot nicer!  Similar code is in linkResolver.cpp, can you
>> look at changing it too?
> I removed it and leave it to Lois.

Great, that's even better.

>
>> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/src/hotspot/share/services/writeableFlags.cpp.udiff.html
>> If name is null here, what would this do?  Should there be an 'else' to
>> print something?
> I think the case with the null name is for "MISSING_NAME"
> and thus no message is needed here. (If range is null no
> message is printed either.)

Okay this is fine.
For the record, if you think these are false positives, you should
file a report against the static analysis tool, but I think you should
still make these simple changes.

Thanks,
Coleen

From goetz.lindenmaier at sap.com Thu Jul 12 14:15:37 2018
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 12 Jul 2018 14:15:37 +0000
Subject: RFR(M): 8206977: Minor improvements of runtime code.
In-Reply-To:
References: <571d727a270e47cb8230d8a88b58a2a1@sap.com> <3d2d5c75-a219-9153-33e3-57a77bf88d92@oracle.com> <720f92fadf7b4327b63a995fa5aca667@sap.com>
Message-ID: <1d8bbfbffb29489f9f66ad9f25c5836c@sap.com>

Hi Coleen,

yes, we have a Coverity team in-house, and I keep giving them input to
improve the tool. Also, I marked far more items as "false positive" in
the database than I actually fix.

The change passed jdk/submit (jdk/submit11 was not reachable) and all
our testing, so I think I can push this now.

Best regards,
  Goetz.
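
The ostream.cpp item in the RFR list (do not continue, and in particular do not call fseek, when ftell reports -1) can be sketched in isolation. This is a standalone illustration under assumptions, not the actual ostream.cpp change; 'stream_size_or_error' is a hypothetical helper name.

```cpp
#include <cstdio>

// Hypothetical helper, not HotSpot code: returns the size of the stream in
// bytes and restores the original file position, or returns -1 on any error.
long stream_size_or_error(FILE* f) {
  long cur = std::ftell(f);
  if (cur < 0) {
    // ftell failed: stop here instead of passing -1 on to fseek later.
    return -1;
  }
  if (std::fseek(f, 0, SEEK_END) != 0) {
    return -1;
  }
  long size = std::ftell(f);  // may itself be -1, which is passed through
  // Only seek back with a position that is known to be valid.
  if (std::fseek(f, cur, SEEK_SET) != 0) {
    return -1;
  }
  return size;
}
```

The point of the early return is that a failed ftell leaves no position worth restoring, so there is nothing useful the remaining code could do with the stream.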
From goetz.lindenmaier at sap.com Thu Jul 12 14:51:31 2018 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 12 Jul 2018 14:51:31 +0000 Subject: [11] RFR(S): 8207067: [test] prevent timeouts in serviceability/tmtools/jstat/{GcTest02, GcCauseTest02}.java In-Reply-To: References: Message-ID: <8b9a7f665ac541e4838263b7fb495554@sap.com> Hi Volker, I had a look at your change. Your intent is plausible. Won?t it help to just set -XX:+IgnoreUnrecognizedVMOptions? Alternatively, you could specify two test setups, one with @requires vm.debug, the other with !vm.debug. If non of these are possible, your change looks good. Best regards, Goetz. > -----Original Message----- > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > bounces at openjdk.java.net] On Behalf Of David Holmes > Sent: Donnerstag, 12. Juli 2018 01:27 > To: Volker Simonis ; hotspot-runtime- > dev at openjdk.java.net runtime > Subject: Re: [11] RFR(S): 8207067: [test] prevent timeouts in > serviceability/tmtools/jstat/{GcTest02, GcCauseTest02}.java > > Hi Volker, > > On 12/07/2018 3:30 AM, Volker Simonis wrote: > > Hi, > > can I please have a review for the following test fix which prevents > > eventual test timeouts: > > > > https://bugs.openjdk.java.net/browse/JDK-8207067 > > http://cr.openjdk.java.net/~simonis/webrevs/2018/8207067/ > > > > The two tests > hotspot/jtreg/serviceability/tmtools/jstat/{GcTest02,GcCauseTest02}.java > > produce more than 90_000 classes until they eat up ~70% of the 128M > > meta space they run with. The loading of each of these classes > > triggers a full dependency check for ALL nmethods in debug/fastdebug > > builds because 'VerifyDependencies' is 'true' there. This slows down > > the tests from about 3 sec. in the opt build to about 88 sec. in the > > fastdebug build on x86_64 and from about 4 sec. to about 560 sec. on > > ppc64. 
> > > > Because the tests are not about dependency checking, it makes sense to > > switch of 'VerifyDependencies' if they are run inside a > > debug/fastdebug VM and decrease the execution time down to about 6 > > sec. on both x86_64 and ppc64. > > It's very annoying that this can't be fixed the obvious way by just > specifying -XX:-VerifyDependencies. Maybe jtreg could add a > debug_only(...) capability ... :( > > It's unclear to me whether it is safe to disable VerifyDependencies > after the VM has commenced execution. This might lead to > inconsistencies. Probably need the compiler folk to clarify that. > > That aside the comments are far too elaborate and better suited for the > bug report. The tests could use @bug lines (though it would be nice to > track down the original bug that added the tests). The comments could > reduce to a simple: > > // This test produces more than 90_000 classes until it eats up ~70% of > the 128M meta space. > // With VerifyDependencies enabled in debug builds this slows the test > down considerably. > // As it is a develop flag, if we see that it is "constant" then we know > this is a product build. > > Though there is already Platform.isDebugBuild() if you wanted something > more direct. (Given you need WB to change the flag it doesn't really > make much difference.) > > Thanks, > David > > > Thank you and best regards, > > Volker > > From lois.foltan at oracle.com Thu Jul 12 14:56:27 2018 From: lois.foltan at oracle.com (Lois Foltan) Date: Thu, 12 Jul 2018 10:56:27 -0400 Subject: RFR(M): 8206977: Minor improvements of runtime code. In-Reply-To: <720f92fadf7b4327b63a995fa5aca667@sap.com> References: <571d727a270e47cb8230d8a88b58a2a1@sap.com> <3d2d5c75-a219-9153-33e3-57a77bf88d92@oracle.com> <720f92fadf7b4327b63a995fa5aca667@sap.com> Message-ID: <6a40000d-58df-4c51-a455-11585fe9ae91@oracle.com> On 7/12/2018 5:02 AM, Lindenmaier, Goetz wrote: > Hi Coleen, > > thanks for looking at my change. 
> > New webrev: > http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/02/ > >> + name ? name->as_C_string() : ""); >> Can you change to: >> + name != NULL ? name->as_C_string() : ""); > Fixed. > >> http://cr.openjdk.java.net/~goetz/wr18/8206977- >> covRuntime/01/src/hotspot/share/oops/klassVtable.cpp.udiff.html >> This looks a lot nicer!?? Similar code is in linkResolver.cpp, can you >> look at changing it too? > I removed it and leave it to Lois. Sounds good.? I will pick this change up for klassVtable.cpp and make similar changes to linkResolver.cpp. Thanks! Lois > >> http://cr.openjdk.java.net/~goetz/wr18/8206977- >> covRuntime/01/src/hotspot/share/services/writeableFlags.cpp.udiff.html >> If name is null here, what would this do?? Should there be an 'else' to >> print something? > I think the case with the null name is for "MISSING_NAME" > and thus no message is needed here. (If range is null no > message is printed either.) > > Best regards, > Goetz. > > >> I think this looks fine.? It doesn't look major to me.? The asserts >> turned to guarantees don't appear to be anywhere performance sensitive. >> >> Thanks, >> Coleen >> >> >> >> On 7/10/18 6:53 AM, Lindenmaier, Goetz wrote: >>> Hi, >>> >>> I ran coverity on the jdk11 hotspot sources and want to propose the >>> following fixes to the runtime code. I scanned the linux x86_64 build. >>> Some issues are similar to previous parfait fixes (check for NULL, add >>> guarantees etc.) I also identified some issues I consider real problems. >>> http://cr.openjdk.java.net/~goetz/wr18/8206977-covRuntime/01/ >>> >>> In detail: >>> >>> Real issues: >>> ------------ >>> >>> jvmtiEnvBase.cpp >>> || should be &&. >>> Attention, this is the only change that really will change behaviour. >>> But if thr == NULL we will see a crash below. >>> >>> perfMemory_linux.cpp: >>> Wrong buffer length used. >>> >>> systemDictionary.cpp: >>> Move code dereferencing ik under if (ik != NULL). 
>>>
>>> virtualspace.cpp
>>> Initialization is missing. Moved constructor up to the other
>>> constructors.
>>>
>>>
>>> Useful code improvements:
>>> -------------------------
>>>
>>> vm_version_ext_x86.cpp
>>> Assure buffer is not accessed at offset -1.
>>>
>>> os_linux.cpp
>>> Numa_max_node returns int, and a -1 in some cases.
>>>
>>> moduleEntry.cpp
>>> name might be NULL. Just a fix for tracing.
>>>
>>> systemDictionaryShared.cpp
>>> clarify code.
>>> It would be wrong if only entry == NULL would hold, one
>>> would hit the assertion below.
>>>
>>> verifier.cpp
>>> Fix tracing.
>>> Illegal opcode is -1 and should not be passed to name array.
>>>
>>> logOutput.cpp
>>> If n_selections == 0, best_selection would be NULL.
>>> Move up the assertion and turn into a guarantee.
>>>
>>> filemap.cpp
>>> Either base can be NULL, or parts of the code before are dead.
>>>
>>> metaspace.cpp
>>> We know an exception is pending.
>>>
>>> klassVtable.cpp
>>> Coverity does not like the format in a variable.
>>> Anyways this is quite rough coding, transformed to use stringStream
>>> as with other similar exceptions.
>>>
>>> jvmFlag.cpp
>>> match might be NULL.
>>>
>>> writeableFlags.cpp
>>> name might be NULL.
>>>
>>> ostream.cpp
>>> If ftell returns error code -1, we need not continue.
>>> Especially we should not fseek(-1).
>>>
>>> logTestUtils.inline.hpp
>>> ftell returns -1.
>>>
>>> test_metachunk.cpp
>>> wrong datatype.
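
For illustration only (this is not taken from the webrev): the ostream.cpp and logTestUtils.inline.hpp items in the list above both come down to checking the result of ftell before using it. A minimal standalone sketch of that pattern follows. The helper name stream_size() is hypothetical and does not exist in HotSpot; the sketch only assumes the standard C stdio behaviour that ftell returns -1 on error.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper, not HotSpot code: returns the size of an open
// stream in bytes, or -1 on any failure. The caller's file position is
// restored before returning.
long stream_size(FILE* f) {
  if (f == nullptr) {
    return -1;
  }
  long pos = std::ftell(f);   // ftell reports failure as -1
  if (pos < 0) {
    return -1;                // stop here; never pass -1 on to fseek
  }
  if (std::fseek(f, 0, SEEK_END) != 0) {
    return -1;
  }
  long end = std::ftell(f);   // may itself be -1; returned as-is below
  if (std::fseek(f, pos, SEEK_SET) != 0) {
    return -1;                // could not restore the original position
  }
  return end;
}
```

The point of the early return is the one made in the list: once ftell has failed there is nothing useful to continue with, and seeking back to offset -1 would only turn one error into two.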
From lois.foltan at oracle.com Thu Jul 12 16:54:36 2018
From: lois.foltan at oracle.com (Lois Foltan)
Date: Thu, 12 Jul 2018 12:54:36 -0400
Subject: RFR (S) JDK-8178712: ResourceMark may be missing inside initialize_[vi]table
In-Reply-To: <5B451446.6000905@oracle.com>
References: <3822eeeb-0a58-cdbc-46a3-ed6c02e365ab@oracle.com> <5B4504A0.5070607@oracle.com> <5B451446.6000905@oracle.com>
Message-ID:

On 7/10/2018 4:17 PM, Calvin Cheung wrote:
>
>
> On 7/10/18, 12:34 PM, Lois Foltan wrote:
>> On 7/10/2018 3:10 PM, Calvin Cheung wrote:
>>
>>> Hi Lois,
>>>
>>> I'm wondering if the ResourceMark in the following function in
>>> universe.cpp could be removed?
>>> If I understand the code correctly, the ResourceMark is necessary
>>> for Universe::reinitialize_itables() which calls into
>>> klassItable::initialize_itable() where you've added ResourceMark
>>> with your change.
>>>
>>> bool universe_post_init() {
>>>   assert(!is_init_completed(), "Error: initialization not yet completed!");
>>>   Universe::_fully_initialized = true;
>>>   EXCEPTION_MARK;
>>>   { ResourceMark rm;
>>>     Interpreter::initialize();      // needed for interpreter entry points
>>>     if (!UseSharedSpaces) {
>>>       HandleMark hm(THREAD);
>>>       Klass* ok = SystemDictionary::Object_klass();
>>>       Universe::reinitialize_vtable_of(ok, CHECK_false);
>>>       Universe::reinitialize_itables(CHECK_false);
>>>     }
>>>   }
>>
>> Thanks Calvin for the review! I wondered that as well, but I think
>> the ResourceMark may be needed for the Interpreter::initialize(). For
>> example, it calls TemplateTable::initialize() which logs timer
>> information which I suspect may need a ResourceMark. So, it wasn't
>> clear that the ResourceMark in universe_post_init() was solely needed
>> for the reinitialize_vtable and itables.
> In timerTrace.hpp:
> // TraceTime is used for tracing the execution time of a block
> // Usage:
> //  {
> //    TraceTime t("some timer", TIMERTRACE_LOG(Info, startuptime, tagX...));
> //    some_code();
> //  }
> //
>
> I looked at several usage of TraceTime and they all don't have
> ResourceMark before it.
>
> I'm fine with leaving the ResourceMark in universe_post_init() if you
> want to play it safe.

Thanks Calvin for looking at this! I think I am going to leave it as is.
Lois

>
> thanks,
> Calvin
>
>>
>> Thanks,
>> Lois
>>
>>>
>>> It looks good otherwise.
>>>
>>> thanks,
>>> Calvin
>>>
>>> On 7/10/18, 10:19 AM, Lois Foltan wrote:
>>>> Please review this clean up change to correctly set ResourceMark
>>>> from within klassVtable::initialize_vtable() and
>>>> klassItable::initialize_itable() when applicable, instead of having
>>>> all instances of calls to these two methods establish a
>>>> ResourceMark unnecessarily prior to.
>>>>
>>>> open webrev at http://cr.openjdk.java.net/~lfoltan/bug_jdk8178712/
>>>> bug link at https://bugs.openjdk.java.net/browse/JDK-8178712
>>>>
>>>> Testing: hs-tier1-3, jdk-tier1-3 (complete)
>>>>          hs-tier4-5 (in progress)
>>>>
>>>> Thanks,
>>>> Lois
>>

From lois.foltan at oracle.com Thu Jul 12 16:55:07 2018
From: lois.foltan at oracle.com (Lois Foltan)
Date: Thu, 12 Jul 2018 12:55:07 -0400
Subject: RFR (S) JDK-8178712: ResourceMark may be missing inside initialize_[vi]table
In-Reply-To: <8bddaa7d-7c11-4f48-be56-3d2a44e9ffdc@oracle.com>
References: <3822eeeb-0a58-cdbc-46a3-ed6c02e365ab@oracle.com> <172838b3-c501-c2e3-75ef-e67eb67ce791@oracle.com> <8bddaa7d-7c11-4f48-be56-3d2a44e9ffdc@oracle.com>
Message-ID:

On 7/10/2018 4:16 PM, Ioi Lam wrote:
>
>
> On 7/10/18 1:12 PM, Lois Foltan wrote:
>> On 7/10/2018 3:55 PM, Ioi Lam wrote:
>>
>>> Hi Lois,
>>>
>>> Looks good.
>>>
>>>  905 int klassVtable::fill_in_mirandas(int initialized) {
>>>  906   ResourceMark rm(Thread::current());
>>>
>>> maybe this function can have an addition THREAD parameter? That way
>>> you can avoid calling Thread::current(), which may be expensive.
>>
>> Thanks Ioi! Good point, new webrev in case you want to see it at
>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8178712.1/webrev/
>> Lois
>>
>
> Looks good. Thanks!
> - Ioi

Thanks again for the review!
Lois

>>>
>>> Thanks
>>>
>>> - Ioi
>>>
>>>
>>> On 7/10/18 10:19 AM, Lois Foltan wrote:
>>>> Please review this clean up change to correctly set ResourceMark
>>>> from within klassVtable::initialize_vtable() and
>>>> klassItable::initialize_itable() when applicable, instead of having
>>>> all instances of calls to these two methods establish a
>>>> ResourceMark unnecessarily prior to.
>>>>
>>>> open webrev at http://cr.openjdk.java.net/~lfoltan/bug_jdk8178712/
>>>> bug link at https://bugs.openjdk.java.net/browse/JDK-8178712
>>>>
>>>> Testing: hs-tier1-3, jdk-tier1-3 (complete)
>>>>          hs-tier4-5 (in progress)
>>>>
>>>> Thanks,
>>>> Lois
>>>
>>
>

From martinrb at google.com Thu Jul 12 18:52:45 2018
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 12 Jul 2018 11:52:45 -0700
Subject: What to do: clang-4.0 fastdebug assertion failure in os_linux_x86:os::verify_stack_alignment()
In-Reply-To:
References: <58385C26-3644-447C-8E25-80C23355F83E@twitter.com> <21F9626A-F919-41CA-804A-276EE71490A6@twitter.com>
Message-ID:

From the bug:

I'm thinking all of this fiddling with esp is bogus, and that os::verify_stack_alignment() is more trouble than it's worth. But I'm still looking for something that will be approved. How about simply checking that __builtin_frame_address(0) is always aligned, at least with clang and gcc?

 #ifndef PRODUCT
-void os::verify_stack_alignment() {
+NOINLINE void os::verify_stack_alignment() {
 #ifdef AMD64
-  assert(((intptr_t)os::current_stack_pointer() & (StackAlignmentInBytes-1)) == 0, "incorrect stack alignment");
+
+#ifdef SPARC_WORKS
+  register void *esp;
+  __asm__("mov %%" SPELL_REG_SP ", %0":"=r"(esp));
+#else
+  // __builtin_frame_address supported by both gcc and clang
+  void *esp = __builtin_frame_address(0);
+#endif
+
+  assert(((intptr_t)esp & (StackAlignmentInBytes-1)) == 0, "incorrect stack alignment");
 #endif
 }
 #endif

From mikhailo.seledtsov at oracle.com Thu Jul 12 19:26:55 2018
From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov)
Date: Thu, 12 Jul 2018 12:26:55 -0700
Subject: RFR(XS) 8139876: Exclude hanging nsk/stress/stack from execution with deoptimization enabled
In-Reply-To: <8eb00ecb-9c89-ee85-59ec-6cb2a5cce334@oracle.com>
References: <9F7B6107-AA4E-46AE-BB34-1930839C35A5@oracle.com> <8eb00ecb-9c89-ee85-59ec-6cb2a5cce334@oracle.com>
Message-ID: <5B47AB7F.7000301@oracle.com>

+1

On 7/10/18, 5:27 PM, Vladimir Kozlov wrote:
> Looks good.
>
> Thanks,
> Vladimir
>
> On 7/10/18 5:13 PM, Leonid Mesnik wrote:
>> Hi
>>
>> Could you please review following fix which add run
>> nsk/stress/stack/* tests only when DeoptimizeALot is disabled.
>> The reason of exclusion is same as for
>> JDK-8172854
>> [TESTBUG] Exclude
>> runtime/ReservedStack/ReservedStackTest.java from being run with
>> DeoptimizeALot option
>> Tests create a lot of recursive calls to trigger stack overflow.
>> Running tests with DeoptimizeALot increase time significantly. (Up to
>> 10 hours...)
>>
>> Also fix slightly update tests runtime/ReservedStack/* to use
>> correct option name. I occasionally found that test still executed
>> after JDK-8172854
>> because of typo "Alot" instead of "ALot".
>>
>> webrev: http://cr.openjdk.java.net/~lmesnik/8139876/webrev.00/
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8139876
>>
>>
>> Please note that bug
>> JDK-819988
>> [TESTBUG]
>> fromTonga/nsk/stress/stack tests fail by timeout when -Xcomp is used
>> is different. The tests hang with Xcomp intermittently and this issue
>> require additional investigation and fix.
>>
>> Leonid
>>

From navy.xliu at gmail.com Thu Jul 12 20:51:12 2018
From: navy.xliu at gmail.com (Liu Xin)
Date: Thu, 12 Jul 2018 13:51:12 -0700
Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86
In-Reply-To: <63920997-A885-471E-88D6-A70A902F22F1@gmail.com>
References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com>
Message-ID: <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com>

Could you review this patch again? Revision #2.

Bug: https://bugs.openjdk.java.net/browse/JDK-8206075
CR: https://s3-us-west-2.amazonaws.com/openjdk-webrevs/openjdk8u/webrev/index.html

The idea is simple. I just reset the problematic label when a C1 compilation bailout happens.
I manually ran tier1 on my laptop, and all tests pass. Paul helped me submit the patch to submit, and here is the run result.

Build Details: 2018-07-12-1736388.hohensee.source
0 Failed Tests
Mach5 Tasks Results Summary

    PASSED: 75
    UNABLE_TO_RUN: 0
    KILLED: 0
    NA: 0
    FAILED: 0
    EXECUTED_WITH_FAILURE: 0

Thanks,
--lx

> On Jul 11, 2018, at 10:35 AM, Liu Xin wrote:
>
> Thank you for your reviews. Indeed, I didn't deal with bailout situation. "compiler/codegen/TestCharVect2.java" is the case of codeBuffer overflow and leave a unbound label behind.
> I made another revision. I will run tests thoroughly.
>
> Thanks,
> --lx
>
>> On Jul 11, 2018, at 7:49 AM, Hohensee, Paul wrote:
>>
>> Imo it's still good hygiene to require that Labels be bound if they're used, even if the generated code will never be executed.
E.g., code that generates code for sizing purposes may be repurposed to generate executable code, in which case an unbound label may be a lurking bug. Also, I'm unaware (I may be corrected!) of any situation where bailing out happens in such a way as to both leave a Label unbound and execute its destructor. Even if there are, I'd say that'd be indicative of another real problem, such as code buffer overflow, so no harm would result.
>>
>> Thanks,
>>
>> Paul
>>
>> On 7/11/18, 3:41 AM, "hotspot-runtime-dev on behalf of Doerr, Martin" wrote:
>>
>> Hi,
>>
>> I think the idea is good, but doesn't work in all cases.
>> We may bail out from code generation and discard the generated code leaving the label unbound.
>> We also may generate code with the purpose to determine its size. We don't need to bind labels because the code will never get executed.
>>
>> Best regards,
>> Martin
>>
>>
>> -----Original Message-----
>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov
>> Sent: Mittwoch, 11. Juli 2018 03:34
>> To: Liu Xin ; hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86
>>
>> I hit new assert in few other tests:
>>
>> compiler/codegen/TestCharVect2.java
>> compiler/c2/cr6340864/*
>>
>> Regards,
>> Vladimir
>>
>> On 7/10/18 5:08 PM, Vladimir Kozlov wrote:
>>> Fix looks reasonable. I will test it in our framework.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 7/10/18 9:50 AM, Liu Xin wrote:
>>>> Hi, Community,
>>>> Could you please review this small patch?
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075
>>>>
>>>> CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/
>>>>
>>>> Problem:
>>>> X86-32/64 will leave an unbound label if UseOnStackReplacement is OFF.
>>>> This patch align up x86 with other architectures(ppc, arm).
>>>> Add an assertion to the destructor of Label. It will be wiped out in release build.
>>>> Previously, hotspot cannot pass this test with assertion on x86-64.
>>>> make run-test TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java
>>>> If this CR is approved, Paul Hohensee will push it.
>>>> Thanks,
>>>> --lx
>>>>
>>
>>
>

From david.holmes at oracle.com Thu Jul 12 21:55:40 2018
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 13 Jul 2018 07:55:40 +1000
Subject: [11] RFR(S): 8207067: [test] prevent timeouts in serviceability/tmtools/jstat/{GcTest02, GcCauseTest02}.java
In-Reply-To: <8b9a7f665ac541e4838263b7fb495554@sap.com>
References: <8b9a7f665ac541e4838263b7fb495554@sap.com>
Message-ID:

On 13/07/2018 12:51 AM, Lindenmaier, Goetz wrote:
> Hi Volker,
>
> I had a look at your change.
> Your intent is plausible.
>
> Won't it help to just set -XX:+IgnoreUnrecognizedVMOptions?

Yes! Great suggestion.

> Alternatively, you could specify two test setups, one with @requires vm.debug, the
> other with !vm.debug.

I thought about that one too but the amount of duplicated @xxx stuff looked pretty ugly.

Cheers,
David

> If none of these are possible, your change looks good.
>
> Best regards,
> Goetz.
>
>> -----Original Message-----
>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of David Holmes
>> Sent: Donnerstag, 12.
Juli 2018 01:27 >> To: Volker Simonis ; hotspot-runtime- >> dev at openjdk.java.net runtime >> Subject: Re: [11] RFR(S): 8207067: [test] prevent timeouts in >> serviceability/tmtools/jstat/{GcTest02, GcCauseTest02}.java >> >> Hi Volker, >> >> On 12/07/2018 3:30 AM, Volker Simonis wrote: >>> Hi, >>> can I please have a review for the following test fix which prevents >>> eventual test timeouts: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8207067 >>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8207067/ >>> >>> The two tests >> hotspot/jtreg/serviceability/tmtools/jstat/{GcTest02,GcCauseTest02}.java >>> produce more than 90_000 classes until they eat up ~70% of the 128M >>> meta space they run with. The loading of each of these classes >>> triggers a full dependency check for ALL nmethods in debug/fastdebug >>> builds because 'VerifyDependencies' is 'true' there. This slows down >>> the tests from about 3 sec. in the opt build to about 88 sec. in the >>> fastdebug build on x86_64 and from about 4 sec. to about 560 sec. on >>> ppc64. >>> >>> Because the tests are not about dependency checking, it makes sense to >>> switch of 'VerifyDependencies' if they are run inside a >>> debug/fastdebug VM and decrease the execution time down to about 6 >>> sec. on both x86_64 and ppc64. >> >> It's very annoying that this can't be fixed the obvious way by just >> specifying -XX:-VerifyDependencies. Maybe jtreg could add a >> debug_only(...) capability ... :( >> >> It's unclear to me whether it is safe to disable VerifyDependencies >> after the VM has commenced execution. This might lead to >> inconsistencies. Probably need the compiler folk to clarify that. >> >> That aside the comments are far too elaborate and better suited for the >> bug report. The tests could use @bug lines (though it would be nice to >> track down the original bug that added the tests). 
The comments could >> reduce to a simple: >> >> // This test produces more than 90_000 classes until it eats up ~70% of >> the 128M meta space. >> // With VerifyDependencies enabled in debug builds this slows the test >> down considerably. >> // As it is a develop flag, if we see that it is "constant" then we know >> this is a product build. >> >> Though there is already Platform.isDebugBuild() if you wanted something >> more direct. (Given you need WB to change the flag it doesn't really >> make much difference.) >> >> Thanks, >> David >> >>> Thank you and best regards, >>> Volker >>> From patricio.chilano.mateo at oracle.com Thu Jul 12 23:25:56 2018 From: patricio.chilano.mateo at oracle.com (patricio.chilano.mateo at oracle.com) Date: Thu, 12 Jul 2018 19:25:56 -0400 Subject: RFR: 8206470: Incorrect use of os::lasterror in ClassListParser Message-ID: Hi all, Could you please review this small change? Summary: The change is for future-proof the code in case errno gets overwritten inside the allocation logic. Bug URL: https://bugs.openjdk.java.net/browse/JDK-8206470 Webrev URL: http://cr.openjdk.java.net/~coleenp/8206470.01/webrev/index.html Thanks, Patricio From david.holmes at oracle.com Thu Jul 12 23:35:39 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 13 Jul 2018 09:35:39 +1000 Subject: RFR: 8206470: Incorrect use of os::lasterror in ClassListParser In-Reply-To: References: Message-ID: <28449e61-01f2-9f26-06bd-0d67469c5f01@oracle.com> Looks good Patricio - and trivial. Thanks, David On 13/07/2018 9:25 AM, patricio.chilano.mateo at oracle.com wrote: > Hi all, > > Could you please review this small change? > > Summary: The change is for future-proof the code in case errno gets > overwritten inside the allocation logic. 
>
> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8206470
> Webrev URL:
> http://cr.openjdk.java.net/~coleenp/8206470.01/webrev/index.html
>
>
> Thanks,
> Patricio

From patricio.chilano.mateo at oracle.com Thu Jul 12 23:54:56 2018
From: patricio.chilano.mateo at oracle.com (patricio.chilano.mateo at oracle.com)
Date: Thu, 12 Jul 2018 19:54:56 -0400
Subject: RFR: 8206470: Incorrect use of os::lasterror in ClassListParser
In-Reply-To: <28449e61-01f2-9f26-06bd-0d67469c5f01@oracle.com>
References: <28449e61-01f2-9f26-06bd-0d67469c5f01@oracle.com>
Message-ID: <900d6a27-745f-0252-b425-9b112a5d112a@oracle.com>

Thanks David!

Patricio

On 7/12/18 7:35 PM, David Holmes wrote:
> Looks good Patricio - and trivial.
>
> Thanks,
> David
>
> On 13/07/2018 9:25 AM, patricio.chilano.mateo at oracle.com wrote:
>> Hi all,
>>
>> Could you please review this small change?
>>
>> Summary: The change is for future-proof the code in case errno gets
>> overwritten inside the allocation logic.
>>
>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8206470
>> Webrev URL:
>> http://cr.openjdk.java.net/~coleenp/8206470.01/webrev/index.html
>>
>>
>> Thanks,
>> Patricio

From volker.simonis at gmail.com Fri Jul 13 07:33:38 2018
From: volker.simonis at gmail.com (Volker Simonis)
Date: Fri, 13 Jul 2018 09:33:38 +0200
Subject: [11] RFR(S): 8207067: [test] prevent timeouts in serviceability/tmtools/jstat/{GcTest02, GcCauseTest02}.java
In-Reply-To:
References: <8b9a7f665ac541e4838263b7fb495554@sap.com>
Message-ID:

On Thu, Jul 12, 2018 at 11:55 PM, David Holmes wrote:
> On 13/07/2018 12:51 AM, Lindenmaier, Goetz wrote:
>>
>> Hi Volker,
>>
>> I had a look at your change.
>> Your intent is plausible.
>>
>> Won't it help to just set -XX:+IgnoreUnrecognizedVMOptions?
>
>
> Yes! Great suggestion.
>

I wouldn't have believed that it works, because VerifyDependencies is actually not "unknown" but "constant" in non-debug builds. But it actually really does!

This of course considerably simplifies the change.
I've also shortened the comment as suggested by David. Please find the updated version here: http://cr.openjdk.java.net/~simonis/webrevs/2018/8207067.v1/ OK to push now? Thank you and best regards, Volker >> Alternatively, you could specify two test setups, one with @requires >> vm.debug, the >> other with !vm.debug. > > > I thought about that one too but the amount of duplicated @xxx stuff looked > pretty ugly. > > Cheers, > David > > >> If non of these are possible, your change looks good. >> >> Best regards, >> Goetz. >> >>> -----Original Message----- >>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>> bounces at openjdk.java.net] On Behalf Of David Holmes >>> Sent: Donnerstag, 12. Juli 2018 01:27 >>> To: Volker Simonis ; hotspot-runtime- >>> dev at openjdk.java.net runtime >>> Subject: Re: [11] RFR(S): 8207067: [test] prevent timeouts in >>> serviceability/tmtools/jstat/{GcTest02, GcCauseTest02}.java >>> >>> Hi Volker, >>> >>> On 12/07/2018 3:30 AM, Volker Simonis wrote: >>>> >>>> Hi, >>>> can I please have a review for the following test fix which prevents >>>> eventual test timeouts: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8207067 >>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8207067/ >>>> >>>> The two tests >>> >>> hotspot/jtreg/serviceability/tmtools/jstat/{GcTest02,GcCauseTest02}.java >>>> >>>> produce more than 90_000 classes until they eat up ~70% of the 128M >>>> meta space they run with. The loading of each of these classes >>>> triggers a full dependency check for ALL nmethods in debug/fastdebug >>>> builds because 'VerifyDependencies' is 'true' there. This slows down >>>> the tests from about 3 sec. in the opt build to about 88 sec. in the >>>> fastdebug build on x86_64 and from about 4 sec. to about 560 sec. on >>>> ppc64. 
>>>> >>>> Because the tests are not about dependency checking, it makes sense to >>>> switch of 'VerifyDependencies' if they are run inside a >>>> debug/fastdebug VM and decrease the execution time down to about 6 >>>> sec. on both x86_64 and ppc64. >>> >>> >>> It's very annoying that this can't be fixed the obvious way by just >>> specifying -XX:-VerifyDependencies. Maybe jtreg could add a >>> debug_only(...) capability ... :( >>> >>> It's unclear to me whether it is safe to disable VerifyDependencies >>> after the VM has commenced execution. This might lead to >>> inconsistencies. Probably need the compiler folk to clarify that. >>> >>> That aside the comments are far too elaborate and better suited for the >>> bug report. The tests could use @bug lines (though it would be nice to >>> track down the original bug that added the tests). The comments could >>> reduce to a simple: >>> >>> // This test produces more than 90_000 classes until it eats up ~70% of >>> the 128M meta space. >>> // With VerifyDependencies enabled in debug builds this slows the test >>> down considerably. >>> // As it is a develop flag, if we see that it is "constant" then we know >>> this is a product build. >>> >>> Though there is already Platform.isDebugBuild() if you wanted something >>> more direct. (Given you need WB to change the flag it doesn't really >>> make much difference.) >>> >>> Thanks, >>> David >>> >>>> Thank you and best regards, >>>> Volker >>>> > From goetz.lindenmaier at sap.com Fri Jul 13 07:39:33 2018 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 13 Jul 2018 07:39:33 +0000 Subject: [11] RFR(S): 8207067: [test] prevent timeouts in serviceability/tmtools/jstat/{GcTest02, GcCauseTest02}.java In-Reply-To: References: <8b9a7f665ac541e4838263b7fb495554@sap.com> , Message-ID: <2742F564-AF49-47D8-8B7A-CD19ACB7EA85@sap.com> Yes, Looks Good! 
Thanks, Goetz > Am 13.07.2018 um 09:33 schrieb Volker Simonis : > >> On Thu, Jul 12, 2018 at 11:55 PM, David Holmes wrote: >>> On 13/07/2018 12:51 AM, Lindenmaier, Goetz wrote: >>> >>> Hi Volker, >>> >>> I had a look at your change. >>> Your intent is plausible. >>> >>> Won?t it help to just set -XX:+IgnoreUnrecognizedVMOptions? >> >> >> Yes! Great suggestion. >> > > I wouldn't have believed that it works, because VerifiyDependencies is > actually not "unknown" but "constant" in non-debug builds. But it > actually really does! > > This of course considerably simplifies the change. I've also shortened > the comment as suggested by David. Please find the updated version > here: > > http://cr.openjdk.java.net/~simonis/webrevs/2018/8207067.v1/ > > OK to push now? > > Thank you and best regards, > Volker > >>> Alternatively, you could specify two test setups, one with @requires >>> vm.debug, the >>> other with !vm.debug. >> >> >> I thought about that one too but the amount of duplicated @xxx stuff looked >> pretty ugly. >> >> Cheers, >> David >> >> >>> If non of these are possible, your change looks good. >>> >>> Best regards, >>> Goetz. >>> >>>> -----Original Message----- >>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>>> bounces at openjdk.java.net] On Behalf Of David Holmes >>>> Sent: Donnerstag, 12. 
Juli 2018 01:27 >>>> To: Volker Simonis ; hotspot-runtime- >>>> dev at openjdk.java.net runtime >>>> Subject: Re: [11] RFR(S): 8207067: [test] prevent timeouts in >>>> serviceability/tmtools/jstat/{GcTest02, GcCauseTest02}.java >>>> >>>> Hi Volker, >>>> >>>>> On 12/07/2018 3:30 AM, Volker Simonis wrote: >>>>> >>>>> Hi, >>>>> can I please have a review for the following test fix which prevents >>>>> eventual test timeouts: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8207067 >>>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8207067/ >>>>> >>>>> The two tests >>>> >>>> hotspot/jtreg/serviceability/tmtools/jstat/{GcTest02,GcCauseTest02}.java >>>>> >>>>> produce more than 90_000 classes until they eat up ~70% of the 128M >>>>> meta space they run with. The loading of each of these classes >>>>> triggers a full dependency check for ALL nmethods in debug/fastdebug >>>>> builds because 'VerifyDependencies' is 'true' there. This slows down >>>>> the tests from about 3 sec. in the opt build to about 88 sec. in the >>>>> fastdebug build on x86_64 and from about 4 sec. to about 560 sec. on >>>>> ppc64. >>>>> >>>>> Because the tests are not about dependency checking, it makes sense to >>>>> switch of 'VerifyDependencies' if they are run inside a >>>>> debug/fastdebug VM and decrease the execution time down to about 6 >>>>> sec. on both x86_64 and ppc64. >>>> >>>> >>>> It's very annoying that this can't be fixed the obvious way by just >>>> specifying -XX:-VerifyDependencies. Maybe jtreg could add a >>>> debug_only(...) capability ... :( >>>> >>>> It's unclear to me whether it is safe to disable VerifyDependencies >>>> after the VM has commenced execution. This might lead to >>>> inconsistencies. Probably need the compiler folk to clarify that. >>>> >>>> That aside the comments are far too elaborate and better suited for the >>>> bug report. The tests could use @bug lines (though it would be nice to >>>> track down the original bug that added the tests). 
The comments could >>>> reduce to a simple: >>>> >>>> // This test produces more than 90_000 classes until it eats up ~70% of >>>> the 128M meta space. >>>> // With VerifyDependencies enabled in debug builds this slows the test >>>> down considerably. >>>> // As it is a develop flag, if we see that it is "constant" then we know >>>> this is a product build. >>>> >>>> Though there is already Platform.isDebugBuild() if you wanted something >>>> more direct. (Given you need WB to change the flag it doesn't really >>>> make much difference.) >>>> >>>> Thanks, >>>> David >>>> >>>>> Thank you and best regards, >>>>> Volker >>>>> >> From david.holmes at oracle.com Fri Jul 13 08:32:33 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 13 Jul 2018 18:32:33 +1000 Subject: [11] RFR(S): 8207067: [test] prevent timeouts in serviceability/tmtools/jstat/{GcTest02, GcCauseTest02}.java In-Reply-To: References: <8b9a7f665ac541e4838263b7fb495554@sap.com> Message-ID: On 13/07/2018 5:33 PM, Volker Simonis wrote: > On Thu, Jul 12, 2018 at 11:55 PM, David Holmes wrote: >> On 13/07/2018 12:51 AM, Lindenmaier, Goetz wrote: >>> >>> Hi Volker, >>> >>> I had a look at your change. >>> Your intent is plausible. >>> >>> Won?t it help to just set -XX:+IgnoreUnrecognizedVMOptions? >> >> >> Yes! Great suggestion. >> > > I wouldn't have believed that it works, because VerifiyDependencies is > actually not "unknown" but "constant" in non-debug builds. But it > actually really does! I was surprised too. :) > This of course considerably simplifies the change. I've also shortened > the comment as suggested by David. Please find the updated version > here: > > http://cr.openjdk.java.net/~simonis/webrevs/2018/8207067.v1/ The comment placement seems odd. You could just add an extra line to the @comment part, or just move to be a class level comment rather than inside main. No need to see any update regardless. Thanks, David > OK to push now? 
> > Thank you and best regards, > Volker > >>> Alternatively, you could specify two test setups, one with @requires >>> vm.debug, the >>> other with !vm.debug. >> >> >> I thought about that one too but the amount of duplicated @xxx stuff looked >> pretty ugly. >> >> Cheers, >> David >> >> >>> If non of these are possible, your change looks good. >>> >>> Best regards, >>> Goetz. >>> >>>> -----Original Message----- >>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>>> bounces at openjdk.java.net] On Behalf Of David Holmes >>>> Sent: Donnerstag, 12. Juli 2018 01:27 >>>> To: Volker Simonis ; hotspot-runtime- >>>> dev at openjdk.java.net runtime >>>> Subject: Re: [11] RFR(S): 8207067: [test] prevent timeouts in >>>> serviceability/tmtools/jstat/{GcTest02, GcCauseTest02}.java >>>> >>>> Hi Volker, >>>> >>>> On 12/07/2018 3:30 AM, Volker Simonis wrote: >>>>> >>>>> Hi, >>>>> can I please have a review for the following test fix which prevents >>>>> eventual test timeouts: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8207067 >>>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8207067/ >>>>> >>>>> The two tests >>>> >>>> hotspot/jtreg/serviceability/tmtools/jstat/{GcTest02,GcCauseTest02}.java >>>>> >>>>> produce more than 90_000 classes until they eat up ~70% of the 128M >>>>> meta space they run with. The loading of each of these classes >>>>> triggers a full dependency check for ALL nmethods in debug/fastdebug >>>>> builds because 'VerifyDependencies' is 'true' there. This slows down >>>>> the tests from about 3 sec. in the opt build to about 88 sec. in the >>>>> fastdebug build on x86_64 and from about 4 sec. to about 560 sec. on >>>>> ppc64. >>>>> >>>>> Because the tests are not about dependency checking, it makes sense to >>>>> switch of 'VerifyDependencies' if they are run inside a >>>>> debug/fastdebug VM and decrease the execution time down to about 6 >>>>> sec. on both x86_64 and ppc64. 
>>>> >>>> >>>> It's very annoying that this can't be fixed the obvious way by just >>>> specifying -XX:-VerifyDependencies. Maybe jtreg could add a >>>> debug_only(...) capability ... :( >>>> >>>> It's unclear to me whether it is safe to disable VerifyDependencies >>>> after the VM has commenced execution. This might lead to >>>> inconsistencies. Probably need the compiler folk to clarify that. >>>> >>>> That aside the comments are far too elaborate and better suited for the >>>> bug report. The tests could use @bug lines (though it would be nice to >>>> track down the original bug that added the tests). The comments could >>>> reduce to a simple: >>>> >>>> // This test produces more than 90_000 classes until it eats up ~70% of >>>> the 128M meta space. >>>> // With VerifyDependencies enabled in debug builds this slows the test >>>> down considerably. >>>> // As it is a develop flag, if we see that it is "constant" then we know >>>> this is a product build. >>>> >>>> Though there is already Platform.isDebugBuild() if you wanted something >>>> more direct. (Given you need WB to change the flag it doesn't really >>>> make much difference.) >>>> >>>> Thanks, >>>> David >>>> >>>>> Thank you and best regards, >>>>> Volker >>>>> >> From volker.simonis at gmail.com Fri Jul 13 09:09:37 2018 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 13 Jul 2018 11:09:37 +0200 Subject: [11] RFR(S): 8207067: [test] prevent timeouts in serviceability/tmtools/jstat/{GcTest02, GcCauseTest02}.java In-Reply-To: References: <8b9a7f665ac541e4838263b7fb495554@sap.com> Message-ID: Thanks David & Goetz! Regards, Volker On Fri, Jul 13, 2018 at 10:32 AM, David Holmes wrote: > On 13/07/2018 5:33 PM, Volker Simonis wrote: >> >> On Thu, Jul 12, 2018 at 11:55 PM, David Holmes >> wrote: >>> >>> On 13/07/2018 12:51 AM, Lindenmaier, Goetz wrote: >>>> >>>> >>>> Hi Volker, >>>> >>>> I had a look at your change. >>>> Your intent is plausible. 
>>>> >>>> Won?t it help to just set -XX:+IgnoreUnrecognizedVMOptions? >>> >>> >>> >>> Yes! Great suggestion. >>> >> >> I wouldn't have believed that it works, because VerifiyDependencies is >> actually not "unknown" but "constant" in non-debug builds. But it >> actually really does! > > > I was surprised too. :) > >> This of course considerably simplifies the change. I've also shortened >> the comment as suggested by David. Please find the updated version >> here: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2018/8207067.v1/ > > > The comment placement seems odd. You could just add an extra line to the > @comment part, or just move to be a class level comment rather than inside > main. > > No need to see any update regardless. > > Thanks, > David > > >> OK to push now? >> >> Thank you and best regards, >> Volker >> >>>> Alternatively, you could specify two test setups, one with @requires >>>> vm.debug, the >>>> other with !vm.debug. >>> >>> >>> >>> I thought about that one too but the amount of duplicated @xxx stuff >>> looked >>> pretty ugly. >>> >>> Cheers, >>> David >>> >>> >>>> If non of these are possible, your change looks good. >>>> >>>> Best regards, >>>> Goetz. >>>> >>>>> -----Original Message----- >>>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>>>> bounces at openjdk.java.net] On Behalf Of David Holmes >>>>> Sent: Donnerstag, 12. 
Juli 2018 01:27 >>>>> To: Volker Simonis ; hotspot-runtime- >>>>> dev at openjdk.java.net runtime >>>>> Subject: Re: [11] RFR(S): 8207067: [test] prevent timeouts in >>>>> serviceability/tmtools/jstat/{GcTest02, GcCauseTest02}.java >>>>> >>>>> Hi Volker, >>>>> >>>>> On 12/07/2018 3:30 AM, Volker Simonis wrote: >>>>>> >>>>>> >>>>>> Hi, >>>>>> can I please have a review for the following test fix which prevents >>>>>> eventual test timeouts: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8207067 >>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8207067/ >>>>>> >>>>>> The two tests >>>>> >>>>> >>>>> >>>>> hotspot/jtreg/serviceability/tmtools/jstat/{GcTest02,GcCauseTest02}.java >>>>>> >>>>>> >>>>>> produce more than 90_000 classes until they eat up ~70% of the 128M >>>>>> meta space they run with. The loading of each of these classes >>>>>> triggers a full dependency check for ALL nmethods in debug/fastdebug >>>>>> builds because 'VerifyDependencies' is 'true' there. This slows down >>>>>> the tests from about 3 sec. in the opt build to about 88 sec. in the >>>>>> fastdebug build on x86_64 and from about 4 sec. to about 560 sec. on >>>>>> ppc64. >>>>>> >>>>>> Because the tests are not about dependency checking, it makes sense to >>>>>> switch of 'VerifyDependencies' if they are run inside a >>>>>> debug/fastdebug VM and decrease the execution time down to about 6 >>>>>> sec. on both x86_64 and ppc64. >>>>> >>>>> >>>>> >>>>> It's very annoying that this can't be fixed the obvious way by just >>>>> specifying -XX:-VerifyDependencies. Maybe jtreg could add a >>>>> debug_only(...) capability ... :( >>>>> >>>>> It's unclear to me whether it is safe to disable VerifyDependencies >>>>> after the VM has commenced execution. This might lead to >>>>> inconsistencies. Probably need the compiler folk to clarify that. >>>>> >>>>> That aside the comments are far too elaborate and better suited for the >>>>> bug report. 
The tests could use @bug lines (though it would be nice to >>>>> track down the original bug that added the tests). The comments could >>>>> reduce to a simple: >>>>> >>>>> // This test produces more than 90_000 classes until it eats up ~70% of >>>>> the 128M meta space. >>>>> // With VerifyDependencies enabled in debug builds this slows the test >>>>> down considerably. >>>>> // As it is a develop flag, if we see that it is "constant" then we >>>>> know >>>>> this is a product build. >>>>> >>>>> Though there is already Platform.isDebugBuild() if you wanted something >>>>> more direct. (Given you need WB to change the flag it doesn't really >>>>> make much difference.) >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Thank you and best regards, >>>>>> Volker >>>>>> >>> > From martin.doerr at sap.com Fri Jul 13 09:54:31 2018 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 13 Jul 2018 09:54:31 +0000 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> Message-ID: <4d861aa62585483b8f2c9f626406e346@sap.com> Hi, thanks for fixing the issue in templateTable_x86. It looks correct. I think even better would be "UseOnStackReplacement ? &backedge_counter_overflow : NULL" and "if (where != NULL) { jcc(cond, *where); }" in interp_masm_x86.cpp. But I leave it up to you if you want to change it. I'm also ok with your version. I'm not convinced that the label assertion is reliable. There may be many more places in hotspot where we bail out having an unbound label. Running a few tests on x86 is by far not sufficient. The assertion may fire sporadically in large scenarios on some platforms. 
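[Editorial sketch] The nullable-target pattern Martin suggests above can be illustrated with a small, self-contained C++ model. This is only a sketch under stated assumptions: `ToyAssembler`, the simplified `Label`, `increment_mask_and_jump`, and `generate_backedge_counter` are hypothetical stand-ins, not the real HotSpot classes or signatures. It shows how passing a null label pointer when OSR is off avoids ever creating a used-but-unbound label.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for an assembler label (not HotSpot's Label).
struct Label {
  bool bound = false;  // set once the label's position is known
  int  uses  = 0;      // number of jumps emitted that target this label
};

enum Condition { zero, notZero };

// Hypothetical toy assembler that only records instruction names.
struct ToyAssembler {
  std::vector<const char*> code;

  void jcc(Condition /*cond*/, Label& target) {
    target.uses++;
    code.push_back("jcc");
  }
  void bind(Label& l) { l.bound = true; }

  // Emits the counter update and branches only if the caller supplied a target.
  void increment_mask_and_jump(Condition cond, Label* where) {
    code.push_back("add");
    if (where != nullptr) {
      jcc(cond, *where);
    }
  }
};

// Returns true if the label ended up used but unbound (the case the
// proposed destructor assertion is meant to catch).
bool generate_backedge_counter(ToyAssembler& masm, bool use_osr) {
  Label backedge_counter_overflow;
  masm.increment_mask_and_jump(zero, use_osr ? &backedge_counter_overflow : nullptr);
  if (use_osr) {
    masm.bind(backedge_counter_overflow);  // overflow handler would be emitted here
  }
  return backedge_counter_overflow.uses > 0 && !backedge_counter_overflow.bound;
}
```

With the null pointer, the non-OSR path emits no branch at all, so there is nothing left to bind and nothing for a debug check to complain about.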
Best regards,
Martin

-----Original Message-----
From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Liu Xin
Sent: Thursday, 12 July 2018 22:51
To: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86

Could you review this patch again?

Revision #2.
Bug: https://bugs.openjdk.java.net/browse/JDK-8206075
CR: https://s3-us-west-2.amazonaws.com/openjdk-webrevs/openjdk8u/webrev/index.html

The idea is simple: I just reset the problematic label when a C1 compilation bailout happens.
I manually ran tier1 on my laptop, and all tests pass.
Paul helped me submit the patch to submit, and here is the run result.

Build Details: 2018-07-12-1736388.hohensee.source

0 Failed Tests

Mach5 Tasks Results Summary

PASSED: 75
UNABLE_TO_RUN: 0
KILLED: 0
NA: 0
FAILED: 0
EXECUTED_WITH_FAILURE: 0

Thanks,
--lx

> On Jul 11, 2018, at 10:35 AM, Liu Xin wrote:
>
> Thank you for your reviews. Indeed, I didn't deal with the bailout situation. "compiler/codegen/TestCharVect2.java" is a case of codeBuffer overflow that leaves an unbound label behind.
> I made another revision. I will run tests thoroughly.
>
> Thanks,
> --lx
>
>> On Jul 11, 2018, at 7:49 AM, Hohensee, Paul wrote:
>>
>> Imo it's still good hygiene to require that Labels be bound if they're used, even if the generated code will never be executed. E.g., code that generates code for sizing purposes may be repurposed to generate executable code, in which case an unbound label may be a lurking bug. Also, I'm unaware (I may be corrected!) of any situation where bailing out happens in such a way as to both leave a Label unbound and execute its destructor. Even if there are, I'd say that'd be indicative of another real problem, such as code buffer overflow, so no harm would result.
>>
>> Thanks,
>>
>> Paul
>>
>> On 7/11/18, 3:41 AM, "hotspot-runtime-dev on behalf of Doerr, Martin" wrote:
>>
>> Hi,
>>
>> I think the idea is good, but doesn't work in all cases.
>> We may bail out from code generation and discard the generated code leaving the label unbound.
>> We also may generate code with the purpose to determine its size. We don't need to bind labels because the code will never get executed.
>>
>> Best regards,
>> Martin
>>
>>
>> -----Original Message-----
>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov
>> Sent: Wednesday, 11 July 2018 03:34
>> To: Liu Xin ; hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86
>>
>> I hit the new assert in a few other tests:
>>
>> compiler/codegen/TestCharVect2.java
>> compiler/c2/cr6340864/*
>>
>> Regards,
>> Vladimir
>>
>> On 7/10/18 5:08 PM, Vladimir Kozlov wrote:
>>> Fix looks reasonable. I will test it in our framework.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 7/10/18 9:50 AM, Liu Xin wrote:
>>>> Hi, Community,
>>>> Could you please review this small patch?
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075
>>>>
>>>> CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/
>>>>
>>>> Problem:
>>>> X86-32/64 will leave an unbound label if UseOnStackReplacement is OFF.
>>>> This patch aligns x86 with the other architectures (ppc, arm).
>>>> Add an assertion to the destructor of Label. It will be compiled out in release builds.
>>>> Previously, hotspot could not pass this test with the assertion on x86-64:
>>>> make run-test TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java
>>>> If this CR is approved, Paul Hohensee will push it.
>>>> Thanks,
>>>> --lx
>>>>
>>
>>
>

From jcbeyler at google.com Fri Jul 13 17:23:43 2018
From: jcbeyler at google.com (JC Beyler)
Date: Fri, 13 Jul 2018 10:23:43 -0700
Subject: RFR (S): C1 still does eden allocations when TLAB is enabled
Message-ID:

Hi all,

(Not sure this is the right list, it's a C1 fix but regarding runtime interactions...)

With Robbin Ehn, we had worked together on making TLAB and contiguous inlining consistent in the interpreter across architectures. When testing my heap monitoring system, it turned out that C1 is still being inconsistent.

I'm not sure if we had left this case intentionally or not but, if we want it all to be consistent, we should perhaps fix it.

I created this to track this:
https://bugs.openjdk.java.net/browse/JDK-8190862

I also created a fix to make it consistent:
http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.00/

Basically, we say:
- If TLAB is enabled, then go directly to the slow path in these cases
- If TLAB is disabled, try inline contiguous allocations if possible

Let me know if you have any questions and if someone could review if we agree to it, that would be great!

Thanks,
Jc

From ioi.lam at oracle.com Fri Jul 13 18:50:13 2018
From: ioi.lam at oracle.com (Ioi Lam)
Date: Fri, 13 Jul 2018 11:50:13 -0700
Subject: Proposal for improving CDS archive creation
In-Reply-To: <064d4f4d-e707-682e-cd7b-99d7a0afec86@oracle.com>
References: <20c6adc7-9910-435b-bb5b-549ade7034b9@oracle.com> <064d4f4d-e707-682e-cd7b-99d7a0afec86@oracle.com>
Message-ID:

When writing into the buffer, the algorithm works like this:

    MetaspaceObj* get_buffered(MetaspaceObj *p) {
        MetaspaceObj* saved = buffer_find(p);
        if (saved == NULL) {
            saved = buffer_write(p);
        }
        return saved;
    }

So when you're writing a vtable into the buffer:

    Method** vtable = ...; // points to the "real" class X
    Method** vtable_buffered = ...; // points to the "buffered" class X

    for (int i=0; i<...; i++) {
        Method* m = vtable[i];
        Method* buffered_m = get_buffered(m);
        vtable_buffered[i] = buffered_m;
    }

buffer_write(m) will not happen if m is a method defined by a super class of X.

However, when some classes are unloaded and the metaspace blocks are being reused, a new MetaspaceObj may happen to occupy the exact same address as an old MetaspaceObj from an unloaded class. This would make the buffering operation more complicated.

We have 2 choices:

[1] Disable the deallocation of MetaspaceObjs when -Xshare:autocreate is specified.
[2] When a MetaspaceObj is deallocated, remove it from the hash table used by buffer_find().

We can start with [1] as it has a lesser chance of working incorrectly (except it might run out of metaspace memory for some pathological cases).

- Ioi

On 7/11/18 5:46 PM, Jiangli Zhou wrote:
> Volker originally suggested the idea in the email thread "Improving AppCDS for Custom Loaders". I think this is a cleaner approach.
>
> Thanks,
>
> Jiangli
>
>
> On 7/11/18 4:13 PM, Ioi Lam wrote:
>> I had an off-line discussion with Jiangli, and she has an alternative proposal:
>>
>> When -Xshare:autocreate is specified, but the CDS archive is not available,
>>
>> 1. Load classes as normal. After each InstanceKlass is loaded, but before it's used,
>>    make a deep copy of this class into an internal cache.
>>
>> 2. The deep copy includes all methods, etc, for this class. However, if a Method is
>>    inherited from a super class, then only a reference to this Method is copied.
>>
>> 3. At a certain point (probably at VM exit), copy all the (suitable) classes from the
>>    cache and write them into the CDS archive.
>>
>> The advantage of this approach is we will be able to archive classes that were
>> loaded by custom loaders, but have been freed at VM exit time because the class
>> loaders were GC'ed.
>>
>>
>> Note: When a class X is loaded, if its supertype(s) have already been redefined,
>> we probably should not copy X into the buffer. That's because the vtable of X may
>> point to some redefined methods from a supertype, which do not match the bytecodes
>> of these methods in the supertype's original class file, so it's a messy situation.
>>
>> Thanks
>> - Ioi
>>
>>
>>
>> On 7/10/18 12:50 PM, Ioi Lam wrote:
>>> Fixing some sloppy text below ....
>>>
>>>
>>> On 7/10/18 10:16 AM, Ioi Lam wrote:
>>>> I have a proposal for improving the process of creating the CDS archive(s),
>>>> so we can make CDS easier to use and support more use cases.
>>>>
>>>>    - better support for custom loaders
>>>>    - remove explicit training run
- support 2 levels of shared archives >>>> >>>> I think the proposal is relatively straight-forward to implement, >>>> as we already >>>> have most of the required infrastructures: >>>> >>>> ?? + the ability to use Java class loaders at archive creation time >>>> ?? + the ability to relocate MetaspaceObjects >>>> >>>> Parts of this proposal will also simplify the CDS code and make it >>>> more >>>> maintainable. >>>> >>>> Current process of creating the base archive - [C] >>>> ================================================== >>>> >>>> Currently each JVM process can map at most one CDS archive. Let's >>>> call this >>>> the "base archive". It is created by [ref1]: >>>> >>>> ?C1. Reserve a region R of 3GB at 0x800000000. >>>> ?C2. Load all classes specified in the class list. All data for >>>> these classes >>>> ???? live outside of R. >>>> ???? (E.g., the Klass objects are loaded into tmp_class_space, >>>> which is >>>> ????? adjacent to R). >>>> ?C3. Copy the metadata of all archivable classes (e.g, exclude >>>> generated >>>> ???? Lambda classes) into R. At this step, R is divided into several >>>> ? ?? sections (RO, RW, etc). >>>> >>>> >>>> ? //? +-- SharedBaseAddress?? (default = 0x800000000) >>>> ? //? +-- _narrow_klass._base >>>> ? //? | >>>> ? //? |?????????????????????????????? +-tmp_class_space.base >>>> ? //? v?????????????????????????????? V >>>> ? //? +----+----+----+----+----+-....-+-------------------+ >>>> ? //? |<-?????????? R?????????????? ->| >>>> ? //? | MC | RW | RO | MD | OD |unused| tmp_class_space?? | >>>> ? //? +----+----+----+----+----+------+-------------------+ >>>> ? //? |<--? 3GB??????? -------------->| >>>> ? //? |<-- UnscaledClassSpaceMax = 4GB ------------------>| >>>> >>>> >>>> New process for creating the base archive - [N] >>>> =============================================== >>>> >>>> Currently we have a lot of "if (DumpSharedSpaces)" code to for >>>> special case >>>> handling of the above scheme. 
We can improve it by >>>> >>>> ?N1. Remove all code for special memory layout initialization for >>>> -Xshare:dump. >>>> ???? As a result, we will reserve a region R of 1GB at 0x800000000, >>>> which >>>> ???? is used by Klass objects (this is the same as if -Xshare:off were >>>> ???? specified.) >>>> ?N2. Load all classes in the class list. >>>> ?N3. Now R contains the Klass objects of all loaded classes. >>>> ???? Allocate a temporary space T, and copy all contents of R into T. >>>> ?N4. Now R is empty. Copy the metadata of all archivable classes >>>> into R. >>>> >>>> >>>> Dump-as-you-go for the base archive - [G] >>>> ========================================= >>>> >>>> Note that the [N] scheme will work even if you're running an app with >>>> -Xshare:off. At some point (e.g., when the VM is about to exit), you >>>> can: >>>> >>>> ?G1. Enter a safe point >>>> ?G2. Go to step [N3]. >>>> >>>> The benefit of [G] is you don't need a separate run to dump the >>>> archive, and >>>> there's no need to use the class list. Instead, we can have an >>>> option like: >>>> >>>> ?? java -Xshare:autocreate -cp app.jar >>>> -XX:SharedArchiveFile=foo.jsa App >>>> >>>> If foo.jsa is not available, we run in [G] mode. At VM exit, we >>>> dump into >>>> foo.jsa. >>>> >>>> This way, we don't need to have an explicit training run with >>>> -XX:DumpLoadedClassList. Instead, the training run is >>>> >>> I meant, "Instead, your first run, when the archive is not yet >>> available, becomes the >>> training run". >>> >>> Thanks to Calvin and Dan for spotting this :-) >>> - Ioi >>> >>>> This also makes it easy to support the classes from custom loaders. >>>> There's no >>>> need for special tooling to convert -Xlog:class+load=debug output >>>> into a >>>> classlist. [ref2] >>>> >>>> >>>> Dumping for second-level archive - [S] >>>> ====================================== >>>> >>>> ?S1. Load the base archive >>>> ?S2. Run the app as normal >>>> ?S3. 
All Klass objects of the dynamically loaded classes will be >>>> loaded in >>>> ???? the region R, which immediately follows the end of the base >>>> archive. >>>> >>>> ? //? +-- SharedBaseAddress >>>> ? //? |????????????????????????? +--- dynamically loaded Klasses >>>> ? //? |????????????????????????? |??? start from here. >>>> ? //? v????????????????????????? v >>>> ? // +--------------------------+---------...-----------------| >>>> ? //? | base archive???????????? | region R | >>>> ? // +--------------------------+---------...-----------------| >>>> ? //? |<- size of base archive ->| >>>> ? //? |<--??????????? 1GB -->| >>>> >>>> >>>> ? S4. At some point (possible when the VM is about to exit) we start >>>> ????? dumping the second level archive >>>> ? S5. Enter safe point >>>> ? S6. Now R contains the Klass objects of all dynamically loaded >>>> classes. >>>> ????? Allocate a temporary space T, and copy all contents of R into T. >>>> ? S7. Now R is empty. Copy the metadata of all archivable, >>>> dynamically loaded >>>> ????? classes into R. >>>> ? S8. Create a new shared_dictionary (and shared_symbol_table) that >>>> contains >>>> ????? all the Klasses (Symbols) from both the base and second-level >>>> archives. >>>> >>>> References >>>> ========== >>>> >>>> [ref1] Current initialization of memory space layout during >>>> -Xshare:dump >>>> http://hg.openjdk.java.net/jdk/jdk/file/e0028bb6dd3d/src/hotspot/share/memory/metaspaceShared.cpp#l250 >>>> >>>> [ref2] Volker Simonis's tool for support custom class loaders in CDS >>>> ?????? https://github.com/simonis/cl4cds >>>> ---------------------------------------------------------------------- >>>> >>>> >>>> >>>> Any thoughts? 
>>>>
>>>> Thanks
>>>> - Ioi
>>>
>>
>

From jiangli.zhou at oracle.com Fri Jul 13 19:43:14 2018
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Fri, 13 Jul 2018 12:43:14 -0700
Subject: Proposal for improving CDS archive creation
In-Reply-To:
References: <20c6adc7-9910-435b-bb5b-549ade7034b9@oracle.com> <064d4f4d-e707-682e-cd7b-99d7a0afec86@oracle.com>
Message-ID:

On 7/13/18 11:50 AM, Ioi Lam wrote:
> When writing into the buffer, the algorithm works like this
>
>     MetaspaceObj* get_buffered(MetaspaceObj *p) {
>         MetaspaceObj* saved = buffer_find(p);
>         if (saved == NULL) {
>             saved = buffer_write(p);
>         }
>     }
>
> So when you're writing a vtable into the buffer:
>
>     Method** vtable = ...; // points to the "real" class X
>     Method** vtable_buffered = ...; // points to the "buffered" class X
>
>     for (int i=0; i<...; i++) {
>         Method* m = vtable[i];
>         Method* buffered_m = get_buffered(m);
>         vtable_buffered[i] = buffered_m;
>     }
>
> buffer_write(m) will not happen if m is a method defined by a super class of X.
>
> However, with some class are unloaded and the metaspace blocks are being reused,
> a new MetaspaceObject may happen to occupy the exact same address as an old
> MetaspaceObject from an unloaded class. This would make the buffering operation
> more complicated.
>
> We have 2 choices:
>
> [1] Disable the deallocation of MetaspaceObjects when -Xshare:autocreate is specified.
> [2] When a MetaspaceObject is deallocated, remove it from the hash table used by buffer_find().
>
> We can start with [1] as it has a lesser chance of working incorrectly,
> (except it might run out of metaspace memory for some pathological cases).

That sounds reasonable to me. Could you please also add a note to the RFE report so we can keep track of the design decision.
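[Editorial sketch] The buffering scheme and choice [2] from Ioi's message can be modelled in a few lines of standalone C++. Everything here is an assumption made for illustration: `CopyBuffer`, `on_deallocate`, and the one-field `MetaspaceObj` are hypothetical, and the real `buffer_find`/`buffer_write` are not shown in the thread. The sketch keeps an address-keyed table of copies and drops an entry when the original is deallocated, so a new object reusing the same address gets its own fresh copy.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a metadata object; the payload is illustrative only.
struct MetaspaceObj {
  int payload;
};

class CopyBuffer {
  // original address -> buffered copy (plays the role of buffer_find's hash table)
  std::unordered_map<const MetaspaceObj*, MetaspaceObj*> _table;
  // owns the buffered copies; pointees stay put when the vector grows
  std::vector<std::unique_ptr<MetaspaceObj>> _storage;

 public:
  // Returns the existing copy of *p, or makes one on first use.
  MetaspaceObj* get_buffered(const MetaspaceObj* p) {
    auto it = _table.find(p);
    if (it != _table.end()) {
      return it->second;
    }
    _storage.push_back(std::unique_ptr<MetaspaceObj>(new MetaspaceObj(*p)));
    MetaspaceObj* saved = _storage.back().get();
    _table.emplace(p, saved);
    return saved;
  }

  // Choice [2]: forget the mapping when the original is deallocated, so a
  // later object at the same address is not mistaken for the old one.
  void on_deallocate(const MetaspaceObj* p) { _table.erase(p); }

  std::size_t copies() const { return _storage.size(); }
};
```

Choice [1] corresponds to never calling `on_deallocate` because addresses are never reused; it is simpler, at the cost of retaining memory that would otherwise be freed.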
Thanks, Jiangli > > > - Ioi > > On 7/11/18 5:46 PM, Jiangli Zhou wrote: >> Volker originally suggested the idea in the email thread "Improving >> AppCDS for Custom Loaders". I think this is a cleaner approach. >> >> Thanks, >> >> Jiangli >> >> >> On 7/11/18 4:13 PM, Ioi Lam wrote: >>> I had an off-line discussion with Jiangli, and she has an >>> alternative proposal: >>> >>> When -Xshare:autocreate is specified, but the CDS archive is not >>> available, >>> >>> 1. Load classes as normal. After each InstanceKlass is loaded, but >>> before it's used, >>> ?? make a deep copy of this class into an internal cache. >>> >>> 2. The deep copy includes all methods, etc, for this class. However, >>> if a Method is >>> ?? inherited from a super class, then only a reference to this >>> Method is copied. >>> >>> 3. At a certain point (probably at VM exit), copy all the (suitable) >>> classes from the >>> ?? cache and write them into the CDS archive. >>> >>> The advantage of this approach is we will be able to archive classes >>> that were >>> loaded by custom loaders, but have been freed at VM exit time >>> because the class >>> loaders were GC'ed. >>> >>> >>> Note: When a class X is loaded, if its supertype(s) have already >>> been redefined, >>> we probably should not copy X into the buffer. That's because the >>> vtable of X may >>> point to some redefined methods from a supertype, which do not match >>> the bytecodes >>> of these methods in the supertype's original class file, so it's a >>> messy situation. >>> >>> Thanks >>> - Ioi >>> >>> >>> >>> On 7/10/18 12:50 PM, Ioi Lam wrote: >>>> Fixing some sloppy text below .... >>>> >>>> >>>> On 7/10/18 10:16 AM, Ioi Lam wrote: >>>>> I have a proposal for improving the process of creating of the CDS >>>>> archive(s), >>>>> so we can make CDS easier to use and support more use cases. >>>>> >>>>> ?? - better support for custom loaders >>>>> ?? - remove explicit training run >>>>> ?? 
- support 2 levels of shared archives >>>>> >>>>> I think the proposal is relatively straight-forward to implement, >>>>> as we already >>>>> have most of the required infrastructures: >>>>> >>>>> ?? + the ability to use Java class loaders at archive creation time >>>>> ?? + the ability to relocate MetaspaceObjects >>>>> >>>>> Parts of this proposal will also simplify the CDS code and make it >>>>> more >>>>> maintainable. >>>>> >>>>> Current process of creating the base archive - [C] >>>>> ================================================== >>>>> >>>>> Currently each JVM process can map at most one CDS archive. Let's >>>>> call this >>>>> the "base archive". It is created by [ref1]: >>>>> >>>>> ?C1. Reserve a region R of 3GB at 0x800000000. >>>>> ?C2. Load all classes specified in the class list. All data for >>>>> these classes >>>>> ???? live outside of R. >>>>> ???? (E.g., the Klass objects are loaded into tmp_class_space, >>>>> which is >>>>> ????? adjacent to R). >>>>> ?C3. Copy the metadata of all archivable classes (e.g, exclude >>>>> generated >>>>> ???? Lambda classes) into R. At this step, R is divided into several >>>>> ? ?? sections (RO, RW, etc). >>>>> >>>>> >>>>> ? //? +-- SharedBaseAddress?? (default = 0x800000000) >>>>> ? //? +-- _narrow_klass._base >>>>> ? //? | >>>>> ? //? | +-tmp_class_space.base >>>>> ? //? v?????????????????????????????? V >>>>> ? // +----+----+----+----+----+-....-+-------------------+ >>>>> ? //? |<-?????????? R?????????????? ->| >>>>> ? //? | MC | RW | RO | MD | OD |unused| tmp_class_space | >>>>> ? // +----+----+----+----+----+------+-------------------+ >>>>> ? //? |<--? 3GB??????? -------------->| >>>>> ? //? |<-- UnscaledClassSpaceMax = 4GB ------------------>| >>>>> >>>>> >>>>> New process for creating the base archive - [N] >>>>> =============================================== >>>>> >>>>> Currently we have a lot of "if (DumpSharedSpaces)" code to for >>>>> special case >>>>> handling of the above scheme. 
We can improve it by >>>>> >>>>> ?N1. Remove all code for special memory layout initialization for >>>>> -Xshare:dump. >>>>> ???? As a result, we will reserve a region R of 1GB at >>>>> 0x800000000, which >>>>> ???? is used by Klass objects (this is the same as if -Xshare:off >>>>> were >>>>> ???? specified.) >>>>> ?N2. Load all classes in the class list. >>>>> ?N3. Now R contains the Klass objects of all loaded classes. >>>>> ???? Allocate a temporary space T, and copy all contents of R into T. >>>>> ?N4. Now R is empty. Copy the metadata of all archivable classes >>>>> into R. >>>>> >>>>> >>>>> Dump-as-you-go for the base archive - [G] >>>>> ========================================= >>>>> >>>>> Note that the [N] scheme will work even if you're running an app with >>>>> -Xshare:off. At some point (e.g., when the VM is about to exit), you >>>>> can: >>>>> >>>>> ?G1. Enter a safe point >>>>> ?G2. Go to step [N3]. >>>>> >>>>> The benefit of [G] is you don't need a separate run to dump the >>>>> archive, and >>>>> there's no need to use the class list. Instead, we can have an >>>>> option like: >>>>> >>>>> ?? java -Xshare:autocreate -cp app.jar >>>>> -XX:SharedArchiveFile=foo.jsa App >>>>> >>>>> If foo.jsa is not available, we run in [G] mode. At VM exit, we >>>>> dump into >>>>> foo.jsa. >>>>> >>>>> This way, we don't need to have an explicit training run with >>>>> -XX:DumpLoadedClassList. Instead, the training run is >>>>> >>>> I meant, "Instead, your first run, when the archive is not yet >>>> available, becomes the >>>> training run". >>>> >>>> Thanks to Calvin and Dan for spotting this :-) >>>> - Ioi >>>> >>>>> This also makes it easy to support the classes from custom >>>>> loaders. There's no >>>>> need for special tooling to convert -Xlog:class+load=debug output >>>>> into a >>>>> classlist. [ref2] >>>>> >>>>> >>>>> Dumping for second-level archive - [S] >>>>> ====================================== >>>>> >>>>> ?S1. Load the base archive >>>>> ?S2. 
Run the app as normal >>>>> ?S3. All Klass objects of the dynamically loaded classes will be >>>>> loaded in >>>>> ???? the region R, which immediately follows the end of the base >>>>> archive. >>>>> >>>>> ? //? +-- SharedBaseAddress >>>>> ? //? |????????????????????????? +--- dynamically loaded Klasses >>>>> ? //? |????????????????????????? |??? start from here. >>>>> ? //? v????????????????????????? v >>>>> ? // +--------------------------+---------...-----------------| >>>>> ? //? | base archive???????????? | region R | >>>>> ? // +--------------------------+---------...-----------------| >>>>> ? //? |<- size of base archive ->| >>>>> ? //? |<--??????????? 1GB -->| >>>>> >>>>> >>>>> ? S4. At some point (possible when the VM is about to exit) we start >>>>> ????? dumping the second level archive >>>>> ? S5. Enter safe point >>>>> ? S6. Now R contains the Klass objects of all dynamically loaded >>>>> classes. >>>>> ????? Allocate a temporary space T, and copy all contents of R >>>>> into T. >>>>> ? S7. Now R is empty. Copy the metadata of all archivable, >>>>> dynamically loaded >>>>> ????? classes into R. >>>>> ? S8. Create a new shared_dictionary (and shared_symbol_table) >>>>> that contains >>>>> ????? all the Klasses (Symbols) from both the base and >>>>> second-level archives. >>>>> >>>>> References >>>>> ========== >>>>> >>>>> [ref1] Current initialization of memory space layout during >>>>> -Xshare:dump >>>>> http://hg.openjdk.java.net/jdk/jdk/file/e0028bb6dd3d/src/hotspot/share/memory/metaspaceShared.cpp#l250 >>>>> >>>>> [ref2] Volker Simonis's tool for support custom class loaders in CDS >>>>> ?????? https://github.com/simonis/cl4cds >>>>> ---------------------------------------------------------------------- >>>>> >>>>> >>>>> >>>>> >>>>> Any thoughts? 
>>>>> >>>>> Thanks >>>>> - Ioi >>>> >>> >> > From john.r.rose at oracle.com Fri Jul 13 20:08:37 2018 From: john.r.rose at oracle.com (John Rose) Date: Fri, 13 Jul 2018 13:08:37 -0700 Subject: RFR (S): C1 still does eden allocations when TLAB is enabled In-Reply-To: References: Message-ID: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> On Jul 13, 2018, at 10:23 AM, JC Beyler wrote: > > I'm not sure if we had left this case intentionally or not but, if we want > it all to be consistent, we should perhaps fix it. Well, you put in that logic last February, so unless somebody speaks up quickly, I support your adjusting it to be the way you want it. Doing "hg grep -u supports_inline_contig_alloc -I src/hotspot/share" suggests that the GC group is most active in touching this feature. If Robbin is OK with it, there's your reviewer. FWIW, you can use me as a reviewer, but I'd get one other person working on the GC to OK it. ? John From navy.xliu at gmail.com Fri Jul 13 20:29:35 2018 From: navy.xliu at gmail.com (Liu Xin) Date: Fri, 13 Jul 2018 13:29:35 -0700 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: <4d861aa62585483b8f2c9f626406e346@sap.com> References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com> Message-ID: Hello, Martin, Thanks for reviewing it. I got your point. I made it "if (where != NULL) { jcc(cond, *where); }" and is running tests. The background of this Assertion is that our engineer used to spend many hour to trace down a corner case. it's trivial if fastdebug/slowdebug stop and tell you immediately. I am curious about this "We also may generate code with the purpose to determine its size.". Could you tell me where is it? it looks quite slow to get buffer size in this way. 
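[Editorial sketch] The debug-only destructor check discussed in this thread, together with the "reset the label on bailout" approach from revision #2, can be sketched as follows. This is a simplified model, not HotSpot's actual `Label`: the member names and `emit_with_forward_jump` are hypothetical, and the real class tracks patch locations rather than a plain counter.

```cpp
#include <cassert>

// Hypothetical, simplified label: counts pending forward jumps instead of
// recording patch locations.
class Label {
  int  _pending_jumps = 0;
  bool _bound = false;

 public:
  void add_jump() { _pending_jumps++; }             // a jump to this label was emitted
  void bind()     { _bound = true; _pending_jumps = 0; }
  void reset()    { _bound = false; _pending_jumps = 0; }

  bool is_unbound_and_used() const { return !_bound && _pending_jumps > 0; }

  // Debug-only check; assert() compiles away when NDEBUG is defined,
  // mirroring "wiped out in release build".
  ~Label() { assert(!is_unbound_and_used() && "label used but never bound"); }
};

// Emits a forward jump and then either binds the label or bails out.
// On bailout the generated code is discarded, so the label is reset
// before its destructor runs.
bool emit_with_forward_jump(bool bailout) {
  Label done;
  done.add_jump();
  if (bailout) {
    done.reset();
    return false;
  }
  done.bind();
  return true;
}
```

The trade-off raised in the review still applies to this model: every bailout path has to remember the `reset()`, otherwise the check fires on code that was never going to run.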
thanks,
--lx

On Fri, Jul 13, 2018 at 2:54 AM, Doerr, Martin wrote:

> Hi,
>
> thanks for fixing the issue in templateTable_x86. It looks correct.
> I think even better would be
> "UseOnStackReplacement ? &backedge_counter_overflow : NULL"
> and
> "if (where != NULL) { jcc(cond, *where); }" in interp_masm_x86.cpp.
> But I leave it up to you if you want to change it. I'm also ok with your
> version.
>
> I'm not convinced that the label assertion is reliable. There may be many
> more places in hotspot where we bail out having an unbound label. Running a
> few tests on x86 is by far not sufficient. The assertion may fire
> sporadically in large scenarios on some platforms.
>
> Best regards,
> Martin
>
>
> -----Original Message-----
> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Liu Xin
> Sent: Thursday, July 12, 2018 22:51
> To: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86
>
> Could you review this patch again?
>
> Revision #2.
> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075
> CR: https://s3-us-west-2.amazonaws.com/openjdk-webrevs/openjdk8u/webrev/index.html
>
> The idea is simple. I just reset the problematic label when c1 compilation
> bailout happens.
> I manually ran tier1 on my laptop. It passes all of them.
> Paul helped me submit the patch to submit, and here is the run result.
> Build Details: 2018-07-12-1736388.hohensee.source
>
> 0 Failed Tests
>
> Mach5 Tasks Results Summary
>
> PASSED: 75
> UNABLE_TO_RUN: 0
> KILLED: 0
> NA: 0
> FAILED: 0
> EXECUTED_WITH_FAILURE: 0
>
>
> Thanks,
> --lx
> > On Jul 11, 2018, at 10:35 AM, Liu Xin wrote:
> >
> > Thank you for your reviews. Indeed, I didn't deal with bailout situation.
> > "compiler/codegen/TestCharVect2.java" is the case of
> > codeBuffer overflow and leaves an unbound label behind.
> > I made another revision. I will run tests thoroughly.
> >
> > Thanks,
> > --lx
> >
> >> On Jul 11, 2018, at 7:49 AM, Hohensee, Paul wrote:
> >>
> >> Imo it's still good hygiene to require that Labels be bound if they're
> >> used, even if the generated code will never be executed. E.g., code that
> >> generates code for sizing purposes may be repurposed to generate executable
> >> code, in which case an unbound label may be a lurking bug. Also, I'm
> >> unaware (I may be corrected!) of any situation where bailing out happens in
> >> such a way as to both leave a Label unbound and execute its destructor.
> >> Even if there are, I'd say that'd be indicative of another real problem,
> >> such as code buffer overflow, so no harm would result.
> >>
> >> Thanks,
> >>
> >> Paul
> >>
> >> On 7/11/18, 3:41 AM, "hotspot-runtime-dev on behalf of Doerr, Martin" <hotspot-runtime-dev-bounces at openjdk.java.net on behalf of martin.doerr at sap.com> wrote:
> >>
> >> Hi,
> >>
> >> I think the idea is good, but doesn't work in all cases.
> >> We may bail out from code generation and discard the generated code
> >> leaving the label unbound.
> >> We also may generate code with the purpose to determine its size. We
> >> don't need to bind labels because the code will never get executed.
> >>
> >> Best regards,
> >> Martin
> >>
> >>
> >> -----Original Message-----
> >> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov
> >> Sent: Wednesday, July 11, 2018 03:34
> >> To: Liu Xin; hotspot-runtime-dev at openjdk.java.net
> >> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86
> >>
> >> I hit new assert in few other tests:
> >>
> >> compiler/codegen/TestCharVect2.java
> >> compiler/c2/cr6340864/*
> >>
> >> Regards,
> >> Vladimir
> >>
> >> On 7/10/18 5:08 PM, Vladimir Kozlov wrote:
> >>> Fix looks reasonable. I will test it in our framework.
> >>>
> >>> Thanks,
> >>> Vladimir
> >>>
> >>> On 7/10/18 9:50 AM, Liu Xin wrote:
> >>>> Hi, Community,
> >>>> Could you please review this small patch?
> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075
> >>>>
> >>>> CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/
> >>>>
> >>>> Problem:
> >>>> X86-32/64 will leave an unbound label if UseOnStackReplacement is OFF.
> >>>> This patch aligns x86 with other architectures (ppc, arm).
> >>>> Add an assertion to the destructor of Label. It will be wiped out in release build.
> >>>> Previously, hotspot could not pass this test with assertion on x86-64.
> >>>> make run-test TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java
> >>>> If this CR is approved, Paul Hohensee will push it.
> >>>> Thanks,
> >>>> --lx
> >>>>

From jcbeyler at google.com Fri Jul 13 20:54:43 2018
From: jcbeyler at google.com (JC Beyler)
Date: Fri, 13 Jul 2018 13:54:43 -0700
Subject: RFR (S): C1 still does eden allocations when TLAB is enabled
In-Reply-To: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
Message-ID:

Yes, you are right, I did those changes due to:
https://bugs.openjdk.java.net/browse/JDK-8194084

If Robbin agrees to this change, and if no one sees an issue, I'll go ahead
and propagate the change across architectures.

Thanks for the review, I'll wait for Robbin (or anyone else's comment and
review) :)
Jc

On Fri, Jul 13, 2018 at 1:08 PM John Rose wrote:

> On Jul 13, 2018, at 10:23 AM, JC Beyler wrote:
>
>
> I'm not sure if we had left this case intentionally or not but, if we want
> it all to be consistent, we should perhaps fix it.
>
>
> Well, you put in that logic last February, so unless somebody speaks up
> quickly, I support your adjusting it to be the way you want it.
>
> Doing "hg grep -u supports_inline_contig_alloc -I src/hotspot/share"
> suggests that the GC group is most active in touching this feature.
> If Robbin is OK with it, there's your reviewer.
>
> FWIW, you can use me as a reviewer, but I'd get one other person
> working on the GC to OK it.
>
> -- John
>

--

Thanks,
Jc

From kim.barrett at oracle.com Sat Jul 14 00:17:26 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 13 Jul 2018 20:17:26 -0400
Subject: RFR (S): C1 still does eden allocations when TLAB is enabled
In-Reply-To:
References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
Message-ID: <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>

Robbin is on vacation; you might not hear from him for a while.

I'm assuming you'll open a new bug for this?

Except for a few minor nits (below), this looks okay to me.

The comment at line 1052 needs updating.

pre-existing: The retry_tlab label declared on line 1054 is unused.

pre-existing: The try_eden label declared on line 1054 is bound at
line 1058, but unreferenced.

I like the wording of the comment at 1139 better than the wording at 1016.

From jcbeyler at google.com Sat Jul 14 04:16:39 2018
From: jcbeyler at google.com (JC Beyler)
Date: Fri, 13 Jul 2018 21:16:39 -0700
Subject: RFR (S): C1 still does eden allocations when TLAB is enabled
In-Reply-To: <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
Message-ID:

Hi Kim,

I opened this bug
https://bugs.openjdk.java.net/browse/JDK-8190862

and now I've done an update:
http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/

I basically have done your nits but also removed the try_eden (it was used
to bind a label but was not used). I updated the comments to use the one
you preferred.

I still have to do the other architectures though but at least we seem to
have a consensus on this architecture, correct?

Thanks for the review,
Jc

--

Thanks,
Jc

From martin.doerr at sap.com Mon Jul 16 08:30:47 2018
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 16 Jul 2018 08:30:47 +0000
Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86
In-Reply-To:
References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com>
Message-ID:

Hi Liu Xin,

thanks for changing.

> The background of this Assertion is that our engineer used to spend many hours to trace down a corner case.
> it's trivial if fastdebug/slowdebug stop and tell you immediately.

I understand that. But an assertion should only get added when we are convinced that it won't produce false positives.
It's very annoying if long running tests break due to an incorrect assertion after running many days.

> I am curious about this "We also may generate code with the purpose to determine its size.".
> Could you tell me where is it? it looks quite slow to get buffer size in this way.

C2 Compiler does that in Compile::scratch_emit_size.

Please note that I'll be on vacation soon, so other people will have to review.
Thanks again for fixing the -XX:-UseOnStackReplacement issue.

Best regards,
Martin


From: Liu Xin [mailto:navy.xliu at gmail.com]
Sent: Friday, July 13, 2018 22:30
To: Doerr, Martin
Cc: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86

Hello, Martin,

Thanks for reviewing it.

I got your point. I made it "if (where != NULL) { jcc(cond, *where); }" and am running tests.

The background of this Assertion is that our engineer used to spend many hours to trace down a corner case. it's trivial if fastdebug/slowdebug stop and tell you immediately.

I am curious about this "We also may generate code with the purpose to determine its size.". Could you tell me where is it? it looks quite slow to get buffer size in this way.

thanks,
--lx

From martin.doerr at sap.com Mon Jul 16 14:06:30 2018
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 16 Jul 2018 14:06:30 +0000
Subject: RFR(S): 8207342: error occurred during error reporting (printing register info)
Message-ID: <00ed04d166534bc694af29b3cb244296@sap.com>

Hi,

I'd like to fix the "printing register info" step in hs_err files for jdk11 if possible.
The function os::print_location misses a check if the pointer is readable.
For example "jdk/bin/java -XX:+CrashGCForDumpingJavaThread -version" generates a hs_err file which doesn't analyze the registers correctly because of
"error occurred during error reporting (printing register info)" in section "Register to memory mapping".
In addition, registers are missing on PPC64 and s390.

My proposal looks a little larger than S, but it's small besides moving the duplicated "is_readable_pointer" from codeHeapState and misc_aix to os:
http://cr.openjdk.java.net/~mdoerr/8207342_register_info/webrev.00/

Please review.

Best regards,
Martin

From harold.seigel at oracle.com Mon Jul 16 19:24:39 2018
From: harold.seigel at oracle.com (Harold David Seigel)
Date: Mon, 16 Jul 2018 15:24:39 -0400
Subject: RFR 8202171: Some oopDesc functions compare this with NULL
Message-ID:

Hi,

Please review this JDK-12 fix for bug JDK-8202171. The fix changes a few functions in oop.cpp into static functions to avoid comparisons between 'this' and NULL.

Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8202171/webrev/index.html

JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8202171

This fix was regression tested by running Mach5 tiers 1 and 2 tests and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, running tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and VM tests on Linux-x64.

Thanks, Harold

From navy.xliu at gmail.com Mon Jul 16 21:09:59 2018
From: navy.xliu at gmail.com (Liu Xin)
Date: Mon, 16 Jul 2018 14:09:59 -0700
Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86
In-Reply-To:
References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com>
Message-ID: <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com>

Hi, List,

Could you review this new revision?
https://s3-us-west-2.amazonaws.com/openjdk-webrevs/jdk/label_bugfix/index.html

i) I took a look at all architectures, arm/aarch64/ppc64/sparc/x86. I don't understand all the assemblies, but I think they are guarded for UseOnStackReplacement in templateTable_xxx.cpp ::branch(bool is_jsr, bool is_wide).
TemplateTable_arm.cpp is a little different. It explicitly binds it later.

  if (!UseOnStackReplacement) {
    __ bind(backedge_counter_overflow);
  }

ii) I checked the Compile::scratch_emit_size. It only uses the label fakeL for those MachBranch nodes. Because fakeL will be bound to a trivial address if the nodes are MachBranch, it's also safe for the assertion.

  bool is_branch = n->is_MachBranch();
  if (is_branch) {
    MacroAssembler masm(&buf);
    masm.bind(fakeL);
    n->as_MachBranch()->save_label(&saveL, &save_bnum);
    n->as_MachBranch()->label_set(&fakeL, 0);
  }

Thanks,
--lx

From kim.barrett at oracle.com Mon Jul 16 21:33:33 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 16 Jul 2018 17:33:33 -0400
Subject: RFR 8202171: Some oopDesc functions compare this with NULL
In-Reply-To:
References:
Message-ID: <9F96C987-F5FB-43F7-8B50-125F8A8FB202@oracle.com>

> On Jul 16, 2018, at 3:24 PM, Harold David Seigel wrote:
>
> Hi,
>
> Please review this JDK-12 fix for bug JDK-8202171. The fix changes a few functions in oop.cpp into static functions to avoid comparisons between 'this' and NULL.
>
> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8202171/webrev/index.html
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8202171
>
> This fix was regression tested by running Mach5 tiers 1 and 2 tests and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, running tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and VM tests on Linux-x64.
>
> Thanks, Harold

This looks good. However, there is additional work to be done
(described below). I think that work can be handled by new CRs, if
not already covered by other existing CRs.

------------------------------------------------------------------------------
src/hotspot/share/oops/oop.cpp

  void oopDesc::verify() {
    verify_on(tty, this);
  }

The compiler could elide the NULL check in the call to verify_on from
here, since "this" cannot be NULL here. (Inline the call to verify_on
from verify, and notice that the argument to verify_on was "this",
which cannot be NULL.)

I think oopDesc::verify() needs similar treatment, e.g. making it a
static function taking an oop argument. I don't know how many places
call oopDesc::verify(). If it's a lot, dealing with this could be done
as a followup.

I think there are some other functions with similar issues, e.g.
print, print_value, print_address, print_string, print_value_string
(not sure I listed all of them).

------------------------------------------------------------------------------

There are more than a dozen other comparisons of "this" with NULL.
Are there bugs for any of these?

find . -type f -exec egrep -H "this\s*(\!|=)=\s*NULL" {} \;
./share/adlc/formssel.cpp: if( this != NULL ) {
./share/adlc/formssel.cpp: if( this == NULL ) return;
./share/libadt/set.cpp: if( this == NULL ) return os::strdup("{no set}");
./share/runtime/perfData.cpp: if (this == NULL)
./share/opto/chaitin.cpp: if( this == NULL ) { // Not got anything?
./share/asm/codeBuffer.cpp: if (this == NULL) {
./share/oops/metadata.cpp: if (this == NULL) {
./share/oops/symbol.cpp: if (this == NULL) {
./share/oops/symbol.cpp: if (this == NULL) {
./share/oops/metadata.hpp: if (this == NULL)
./share/oops/metadata.hpp: if (this == NULL)
./share/oops/method.cpp: if (this == NULL) {
./os/bsd/osThread_bsd.cpp: assert(this != NULL, "check");
./os/aix/osThread_aix.cpp: assert(this != NULL, "check");
./os/linux/osThread_linux.cpp: assert(this != NULL, "check");

Some of these might be things that really can't happen, and the check
is a mistaken attempt to be defensive. In other cases, a NULL value
may be possible, but might not be handled as expected.

8202171 deals with these:
./share/oops/oop.cpp: if (this == NULL) {
./share/oops/oop.cpp: if (this == NULL) {
./share/oops/oop.cpp: if (this != NULL) {

------------------------------------------------------------------------------

From jcbeyler at google.com Mon Jul 16 21:58:20 2018
From: jcbeyler at google.com (JC Beyler)
Date: Mon, 16 Jul 2018 14:58:20 -0700
Subject: RFR (S): C1 still does eden allocations when TLAB is enabled
In-Reply-To:
References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
Message-ID:

Hi all,

Here is a webrev that does all the architectures in the same way:
http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/

Could anyone review the other architectures and test?
  - arm, sparc & aarch64 are also modified now to follow the same "if no
    tlab, then consider eden space allocation" logic.

Thanks for your help!
Jc

--

Thanks,
Jc

From coleen.phillimore at oracle.com Mon Jul 16 23:10:02 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 16 Jul 2018 19:10:02 -0400
Subject: [11] RFR 8207368: Race with ConcurrentHashTable deleting items on insert with cleanup thread
Message-ID:

This is a straight export/import from the change checked into jdk12:

https://bugs.openjdk.java.net/browse/JDK-8206471

open webrev at http://cr.openjdk.java.net/~coleenp/8207368.01/webrev

Tested with hs-tier1,2.

Thanks,
Coleen

From kim.barrett at oracle.com Mon Jul 16 23:58:23 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 16 Jul 2018 19:58:23 -0400
Subject: [11] RFR 8207368: Race with ConcurrentHashTable deleting items on insert with cleanup thread
In-Reply-To:
References:
Message-ID: <2F8186FF-CBC0-4F7B-BAA0-20A888309BA6@oracle.com>

> On Jul 16, 2018, at 7:10 PM, coleen.phillimore at oracle.com wrote:
>
> This is a straight export/import from the change checked into jdk12:
>
> https://bugs.openjdk.java.net/browse/JDK-8206471
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8207368.01/webrev
>
> Tested with hs-tier1,2.
>
> Thanks,
> Coleen

Looks good.
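
For readers following the 8202171 thread above: a minimal sketch of the pattern being discussed, under stated assumptions. The type Obj and its functions are hypothetical stand-ins and are not HotSpot's oopDesc or its API. The sketch assumes only what the review says, namely that member functions comparing 'this' with NULL are turned into static functions that take the pointer as an ordinary argument. Calling a member function through a null pointer is undefined behavior in C++, so a compiler may treat "this == NULL" as always false and drop the check; a check on a plain pointer argument does not have that problem.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for an object type; not HotSpot's oopDesc.
struct Obj {
  int value;

  // Problematic pattern (left commented out so this sketch itself has no
  // undefined behavior): the compiler may assume `this` is never NULL and
  // remove the comparison entirely.
  //
  //   std::string describe() const {
  //     if (this == NULL) return "NULL";
  //     return "Obj(" + std::to_string(value) + ")";
  //   }

  // Alternative in the spirit of the fix: a static function whose pointer
  // argument may legitimately be NULL, so the check is well defined.
  static std::string describe(const Obj* obj) {
    if (obj == NULL) return "NULL";
    return "Obj(" + std::to_string(obj->value) + ")";
  }

  // Same idea for a predicate: NULL is handled before any member access.
  static bool is_valid(const Obj* obj) {
    return obj != NULL && obj->value >= 0;
  }

  // Where NULL really must not happen, assert on the argument instead of
  // asserting on `this`.
  static int value_of(const Obj* obj) {
    assert(obj != NULL && "caller must pass a valid object");
    return obj->value;
  }
};
```

A caller would then write Obj::describe(p) instead of p->describe(), which stays well defined even when p is NULL. Whether a given call site can actually see NULL is a separate question that the sketch does not answer.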

From coleen.phillimore at oracle.com Tue Jul 17 00:34:52 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 16 Jul 2018 20:34:52 -0400
Subject: [11] RFR 8207368: Race with ConcurrentHashTable deleting items on insert with cleanup thread
In-Reply-To: <2F8186FF-CBC0-4F7B-BAA0-20A888309BA6@oracle.com>
References: <2F8186FF-CBC0-4F7B-BAA0-20A888309BA6@oracle.com>
Message-ID: <9d5ee83f-c109-008a-f6cc-a653335c638d@oracle.com>

On 7/16/18 7:58 PM, Kim Barrett wrote:
>> On Jul 16, 2018, at 7:10 PM, coleen.phillimore at oracle.com wrote:
>>
>> This is a straight export/import from the change checked into jdk12:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8206471
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8207368.01/webrev
>>
>> Tested with hs-tier1,2.
>>
>> Thanks,
>> Coleen
> Looks good.

Since it's an export/import, I believe I have to keep the original
reviewers, but thank you for reviewing this, Kim.

Coleen

From daniel.daugherty at oracle.com Tue Jul 17 00:40:30 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 16 Jul 2018 20:40:30 -0400
Subject: [11] RFR 8207368: Race with ConcurrentHashTable deleting items on insert with cleanup thread
In-Reply-To:
References:
Message-ID: <8d117daf-1f4e-dfee-0765-74677b8d02ac@oracle.com>

Your subject line and webrev use the backport ID: 8207368

Please make sure that you push with the main ID: 8206471

Dan

On 7/16/18 7:10 PM, coleen.phillimore at oracle.com wrote:
> This is a straight export/import from the change checked into jdk12:
>
> https://bugs.openjdk.java.net/browse/JDK-8206471
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8207368.01/webrev
>
> Tested with hs-tier1,2.
>
> Thanks,
> Coleen

From martin.doerr at sap.com Tue Jul 17 09:31:25 2018
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 17 Jul 2018 09:31:25 +0000
Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86
In-Reply-To: <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com>
References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com> <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com>
Message-ID:

Hi Liu Xin,

I also believe that the -UseOnStackReplacement Label problem only exists on x86.
But thanks for looking at other platforms (s390 is missing in your list).

About Compile::scratch_emit_size:
The concern is not about the is_branch case.
Problematic could be
  n->emit(buf, this->regalloc());
while in_scratch_emit_size() is true.
The node specific emit function may use labels without binding.

If this assertion is desired, I'd expect at least an attempt to identify all bail out situations.
And there should be some test coverage on all platforms.
Unfortunately, I don't have time to help with that in the near future.

Best regards,
Martin


-----Original Message-----
From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Liu Xin
Sent: Monday, July 16, 2018 23:10
To: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86

Hi, List,

Could you review this new revision?
https://s3-us-west-2.amazonaws.com/openjdk-webrevs/jdk/label_bugfix/index.html

i) I took a look at all architectures, arm/aarch64/ppc64/sparc/x86. I don't understand all the assemblies, but I think they are guarded for UseOnStackReplacement in templateTable_xxx.cpp ::branch(bool is_jsr, bool is_wide).
TemplateTable_arm.cpp is a little different. It explicitly binds it later.
if (!UseOnStackReplacement) { __ bind(backedge_counter_overflow); } i) I checked the Compile::scratch_emit_size. It only uses the label fakeL for those MachBranch nodes. Because fakeL will be bound to a trivial address if the nodes are MachBranch, It?s also safe for the assertion. bool is_branch = n->is_MachBranch(); if (is_branch) { MacroAssembler masm(&buf); masm.bind(fakeL); n->as_MachBranch()->save_label(&saveL, &save_bnum); n->as_MachBranch()->label_set(&fakeL, 0); } Thanks, ?lx > On Jul 16, 2018, at 1:30 AM, Doerr, Martin wrote: > > Hi Liu Xin, > > thanks for changing. > > > The background of this Assertion is that our engineer used to spend many hour to trace down a corner case. > > it's trivial if fastdebug/slowdebug stop and tell you immediately. > > I understand that. But an assertion should only get added when we are convinced that it won?t produce false positives. > It?s very annoying if long running tests break due to an incorrect assertion after running many days. > > > I am curious about this "We also may generate code with the purpose to determine its size.". > > Could you tell me where is it? it looks quite slow to get buffer size in this way. > > C2 Compiler does that in Compile::scratch_emit_size. > > Please note that I?ll be on vacation soon, so other people will have to review. > Thanks again for fixing the -XX:-UseOnStackReplacement issue. > > Best regards, > Martin > > > From: Liu Xin [mailto:navy.xliu at gmail.com] > Sent: Freitag, 13. Juli 2018 22:30 > To: Doerr, Martin > Cc: hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 > > Hello, Martin, > > Thanks for reviewing it. > > I got your point. I made it "if (where != NULL) { jcc(cond, *where); }" and is running tests. > > The background of this Assertion is that our engineer used to spend many hour to trace down a corner case. it's trivial if fastdebug/slowdebug stop and tell you immediately. 
>
> I am curious about this "We also may generate code with the purpose to determine its size.". Could you tell me where is it? it looks quite slow to get buffer size in this way.
>
> thanks,
> --lx
>
>
> On Fri, Jul 13, 2018 at 2:54 AM, Doerr, Martin wrote:
> Hi,
>
> thanks for fixing the issue in templateTable_x86. It looks correct.
> I think even better would be
> "UseOnStackReplacement ? &backedge_counter_overflow : NULL"
> and
> "if (where != NULL) { jcc(cond, *where); }" in interp_masm_x86.cpp.
> But I leave it up to you if you want to change it. I'm also ok with your version.
>
> I'm not convinced that the label assertion is reliable. There may be many more places in hotspot where we bail out having an unbound label. Running a few tests on x86 is by far not sufficient. The assertion may fire sporadically in large scenarios on some platforms.
>
> Best regards,
> Martin
>
>
> -----Original Message-----
> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Liu Xin
> Sent: Thursday, 12 July 2018 22:51
> To: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86
>
> Could you review this patch again?
>
> Revision #2.
> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075
> CR: https://s3-us-west-2.amazonaws.com/openjdk-webrevs/openjdk8u/webrev/index.html
>
> The idea is simple. I just reset the problematic label when c1 compilation bailout happen.
> I manually ran tier1 on my laptop. it can pass all of them.
> Paul help me submit the patch to submit and here is the run result.
> Build Details: 2018-07-12-1736388.hohensee.source
>
> 0 Failed Tests
>
> Mach5 Tasks Results Summary
>
> PASSED: 75
> UNABLE_TO_RUN: 0
> KILLED: 0
> NA: 0
> FAILED: 0
> EXECUTED_WITH_FAILURE: 0
>
>
> Thanks,
> --lx
> > On Jul 11, 2018, at 10:35 AM, Liu Xin wrote:
> >
> > Thank you for your reviews. Indeed, I didn't deal with bailout situation. "compiler/codegen/TestCharVect2.java" is the case of codeBuffer overflow and leave a unbound label behind.
> > I made another revision. I will run tests thoroughly.
> >
> > Thanks,
> > --lx
> >
> >> On Jul 11, 2018, at 7:49 AM, Hohensee, Paul wrote:
> >>
> >> Imo it's still good hygiene to require that Labels be bound if they're used, even if the generated code will never be executed. E.g., code that generates code for sizing purposes may be repurposed to generate executable code, in which case an unbound label may be a lurking bug. Also, I'm unaware (I may be corrected!) of any situation where bailing out happens in such a way as to both leave a Label unbound and execute its destructor. Even if there are, I'd say that'd be indicative of another real problem, such as code buffer overflow, so no harm would result.
> >>
> >> Thanks,
> >>
> >> Paul
> >>
> >> On 7/11/18, 3:41 AM, "hotspot-runtime-dev on behalf of Doerr, Martin" wrote:
> >>
> >> Hi,
> >>
> >> I think the idea is good, but doesn't work in all cases.
> >> We may bail out from code generation and discard the generated code leaving the label unbound.
> >> We also may generate code with the purpose to determine its size. We don't need to bind labels because the code will never get executed.
> >>
> >> Best regards,
> >> Martin
> >>
> >>
> >> -----Original Message-----
> >> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov
> >> Sent: Wednesday, 11 July 2018 03:34
> >> To: Liu Xin; hotspot-runtime-dev at openjdk.java.net
> >> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86
> >>
> >> I hit new assert in few other tests:
> >>
> >> compiler/codegen/TestCharVect2.java
> >> compiler/c2/cr6340864/*
> >>
> >> Regards,
> >> Vladimir
> >>
> >> On 7/10/18 5:08 PM, Vladimir Kozlov wrote:
> >>> Fix looks reasonable. I will test it in our framework.
> >>>
> >>> Thanks,
> >>> Vladimir
> >>>
> >>> On 7/10/18 9:50 AM, Liu Xin wrote:
> >>>> Hi, Community,
> >>>> Could you please review this small patch?
> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075
> >>>> CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/
> >>>> Problem:
> >>>> X86-32/64 will leave an unbound label if UseOnStackReplacement is OFF.
> >>>> This patch align up x86 with other architectures(ppc, arm).
> >>>> Add an assertion to the destructor of Label. It will be wiped out in release build.
> >>>> Previously, hotspot cannot pass this test with assertion on x86-64.
> >>>> make run-test TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java
> >>>> If this CR is approved, Paul Hohensee will push it.
> >>>> Thanks,
> >>>> --lx

From goetz.lindenmaier at sap.com Tue Jul 17 09:41:06 2018
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 17 Jul 2018 09:41:06 +0000
Subject: RFR(S): 8207342: error occurred during error reporting (printing register info)
In-Reply-To: <00ed04d166534bc694af29b3cb244296@sap.com>
References: <00ed04d166534bc694af29b3cb244296@sap.com>
Message-ID: <6c9f9cc81fd943d29ae272cce1eca2de@sap.com>

Hi Martin,

thanks for making this fix. It's good to have this fix in 11, even if making is_readable_pointer available is a sensible, but untypical refactoring for RDP.

How did you test this change?

Best regards,
Goetz.

From: Doerr, Martin
Sent: Monday, 16 July 2018 16:07
To: hotspot-runtime-dev at openjdk.java.net; Lindenmaier, Goetz
Subject: RFR(S): 8207342: error occurred during error reporting (printing register info)

Hi,

I'd like to fix the "printing register info" step in hs_err files for jdk11 if possible.

The function os::print_location misses a check if the pointer is readable.
For example "jdk/bin/java -XX:+CrashGCForDumpingJavaThread -version" generates a hs_err file which doesn't analyze the registers correctly because of "error occurred during error reporting (printing register info)" in section "Register to memory mapping".

In addition, registers are missing on PPC64 and s390.

My proposal looks a little larger than S, but it's small besides moving the duplicated "is_readable_pointer" from codeHeapState and misc_aix to os:
http://cr.openjdk.java.net/~mdoerr/8207342_register_info/webrev.00/

Please review.

Best regards,
Martin

From martin.doerr at sap.com Tue Jul 17 09:57:40 2018
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 17 Jul 2018 09:57:40 +0000
Subject: RFR(S): 8207342: error occurred during error reporting (printing register info)
In-Reply-To: <6c9f9cc81fd943d29ae272cce1eca2de@sap.com>
References: <00ed04d166534bc694af29b3cb244296@sap.com> <6c9f9cc81fd943d29ae272cce1eca2de@sap.com>
Message-ID:

Hi Götz,

thanks for the review!

I have tested it on the command line on x86, PPC64 (linux+AIX) and s390:
    jdk/bin/java -XX:+CrashGCForDumpingJavaThread -version

I can see all registers in the hs_err file which is not the case without the fix.

I'd like to contribute more improvements for hs_err files for jdk12 at a later point of time.
I think we should add jtreg tests when doing that.

Best regards,
Martin

From goetz.lindenmaier at sap.com Tue Jul 17 09:59:23 2018
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 17 Jul 2018 09:59:23 +0000
Subject: RFR(S): 8207342: error occurred during error reporting (printing register info)
In-Reply-To:
References: <00ed04d166534bc694af29b3cb244296@sap.com> <6c9f9cc81fd943d29ae272cce1eca2de@sap.com>
Message-ID: <04de6563f298475cac62e524330c2e5e@sap.com>

Ok, Reviewed.

Best regards,
Goetz.

From coleen.phillimore at oracle.com Tue Jul 17 14:03:42 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 17 Jul 2018 10:03:42 -0400
Subject: [11] RFR 8207368: Race with ConcurrentHashTable deleting items on insert with cleanup thread
In-Reply-To: <8d117daf-1f4e-dfee-0765-74677b8d02ac@oracle.com>
References: <8d117daf-1f4e-dfee-0765-74677b8d02ac@oracle.com>
Message-ID:

On 7/16/18 8:40 PM, Daniel D. Daugherty wrote:
> Your subject line and webrev use the backport ID: 8207368
>
> Please make sure that you push with the main ID: 8206471

Yes, that is my plan. The backport import was clean so the original bug id will be used.

Thanks!
Coleen

From coleen.phillimore at oracle.com Tue Jul 17 14:10:40 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 17 Jul 2018 10:10:40 -0400
Subject: RFR(S): 8207342: error occurred during error reporting (printing register info)
In-Reply-To: <00ed04d166534bc694af29b3cb244296@sap.com>
References: <00ed04d166534bc694af29b3cb244296@sap.com>
Message-ID: <2726446d-23af-c0cf-4eec-d51b935d0360@oracle.com>

This looks really good, helpful, and worth having in jdk11.

Thanks,
Coleen

From vaibhav.x.choudhary at oracle.com Tue Jul 17 14:31:52 2018
From: vaibhav.x.choudhary at oracle.com (Vaibhav Choudhary)
Date: Tue, 17 Jul 2018 20:01:52 +0530
Subject: RFR:8189762: [TESTBUG] Create tests for JDK-8146115 container awareness and resource configuration
Message-ID: <6D1F8A5A-769C-499C-B647-26DE01D072EA@oracle.com>

Hi,

Please review the following backport test enhancement for JDK8u written for container awareness.

Webrev : http://cr.openjdk.java.net/~rpatil/8189762/webrev.00/
Bug https://bugs.openjdk.java.net/browse/JDK-8189762 [TESTBUG] Create tests for JDK-8146115 container awareness and resource configuration

Its a backport from JDK10.
JDK10 changeset: http://hg.openjdk.java.net/jdk/jdk/rev/d6d00f785f39
JDK10 review thread : http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-November/025086.html

Description: Tests are very similar to JDK10, but differs in logging mechanism. -XX options like UseContainerSupport, PrintContainerInfo has been used in place of -Xlog. Few changes has been done in the Util files to make the test compatible.

Testing: Testing has been done on Ubuntu with and without Docker environment.
Thanks,
Vaibhav C

From coleen.phillimore at oracle.com Tue Jul 17 15:51:42 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 17 Jul 2018 11:51:42 -0400
Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero
Message-ID: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com>

Summary: Use cmpxchg for non permanent symbol refcounting, and pack refcount and length into an int.

This is a precursor change to the concurrent SymbolTable change. Zeroed refcounted entries can be deleted at any time, so they cannot be allowed to be zero in runtime code. Thanks to Kim for writing the packing function and helping me avoid undefined behavior.

open webrev at http://cr.openjdk.java.net/~coleenp/8207359.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8207359

Tested with solaris ptrace helper, mach5 tier1-5 including solaris. Added multithreaded gtest which exercises the code.

Thanks,
Coleen

From calvin.cheung at oracle.com Tue Jul 17 16:18:59 2018
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Tue, 17 Jul 2018 09:18:59 -0700
Subject: RFR(S): 8204591: Expire/remove the UseAppCDS option in JDK 12
Message-ID: <5B4E16F3.507@oracle.com>

bug: https://bugs.openjdk.java.net/browse/JDK-8204591

webrev: http://cr.openjdk.java.net/~ccheung/8204591/webrev.00/

The UseAppCDS option has been obsoleted in JDK 11, it will be removed in JDK 12.

With this change, if the UseAppCDS option is specified, vm will not start.

    bash-4.2$ $MYJDK/bin/java -XX:+UseAppCDS -version
    Unrecognized VM option 'UseAppCDS'
    Error: Could not create the Java Virtual Machine.
    Error: A fatal exception has occurred. Program will exit.

Ran hs-tier{1,2,3} tests successfully.
thanks,
Calvin

From jiangli.zhou at oracle.com Tue Jul 17 16:26:50 2018
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Tue, 17 Jul 2018 09:26:50 -0700
Subject: RFR(S): 8204591: Expire/remove the UseAppCDS option in JDK 12
In-Reply-To: <5B4E16F3.507@oracle.com>
References: <5B4E16F3.507@oracle.com>
Message-ID:

Hi Calvin,

Looks good! You probably have already done it, please double check all jtreg tests to make sure there is no additional reference to UseAppCDS flag left.

Thanks,

Jiangli

From navy.xliu at gmail.com Tue Jul 17 17:17:25 2018
From: navy.xliu at gmail.com (Liu Xin)
Date: Tue, 17 Jul 2018 10:17:25 -0700
Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86
In-Reply-To: <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com>
References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com> <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com>
Message-ID:

Hi, Martin,

Thank you for the feedback. I totally agree with you that we shouldn't introduce false positive assertion. Let's insist on the high bar here.

I browsed many sources in hotspot recently. Hotspot is the most monolithic software I ever seen. I am glad to be directed by a guidance and clear target.
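For reference, the assertion under discussion in this thread can be illustrated with a minimal sketch. It assumes a simplified stand-in class: SketchLabel, its members, and emit_forward_branch are hypothetical names chosen for illustration and are not HotSpot's actual Label or assembler API. It only shows the idea: a label that has been branched to must be bound, or explicitly reset on bailout, before it is destroyed.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for an assembler label (not HotSpot's Label).
class SketchLabel {
 public:
  // Record a branch to this label that still needs its target patched in.
  void add_patch_site() { ++unresolved_; }

  // Bind the label to a code position; pending branches count as resolved.
  void bind(int pos) { pos_ = pos; unresolved_ = 0; }

  // Explicitly abandon the label, e.g. when code generation bails out.
  void reset() { pos_ = -1; unresolved_ = 0; }

  bool is_bound() const { return pos_ >= 0; }
  bool is_unused() const { return pos_ < 0 && unresolved_ == 0; }
  bool is_safe_to_destroy() const { return is_bound() || is_unused(); }

  // Debug-only check; compiled out when NDEBUG is defined (release builds).
  ~SketchLabel() { assert(is_safe_to_destroy() && "label was used but never bound"); }

 private:
  int pos_ = -1;        // -1 means "not bound yet"
  int unresolved_ = 0;  // branches emitted before the label was bound
};

// Emits a forward branch to target. Returns false if generation bails out,
// in which case the label is reset so its destructor does not fire the assert.
bool emit_forward_branch(SketchLabel& target, bool bailout) {
  target.add_patch_site();
  if (bailout) {
    target.reset();
    return false;
  }
  target.bind(16);  // arbitrary code position for the sketch
  return true;
}
```

The reset() call mirrors the bailout handling described for revision #2 of the patch. Whether every bailout and size-estimation path can be covered this way is the open question raised in the review.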
I think I dealt with c1 bailout case. This case triggers "codebuffer overflow" in middle of c1 compilation.
compiler/codegen/TestCharVect2.java

I am still not sure about c2 bailout case. Let me try to make one.

For case #2, I got what you concerned. Indeed, the generated ad_x86.cpp contains many emits methods for MachNode. I will double-check if they could leave unused labels.

Thanks,
--lx

From mikhailo.seledtsov at oracle.com Tue Jul 17 17:26:52 2018
From: mikhailo.seledtsov at oracle.com (mikhailo)
Date: Tue, 17 Jul 2018 10:26:52 -0700
Subject: RFR(S): 8204591: Expire/remove the UseAppCDS option in JDK 12
In-Reply-To:
References: <5B4E16F3.507@oracle.com>
Message-ID:

+1, change looks good.

I agree with Jiangli, please grep all HotSpot and JDK tests for 'UseAppCDS', as well as task definitions (closed/task-definitions/), just to be sure.

Also, check the make files (CDS mode execution).

And make sure to run all CDS and AppCDS tests.

Thank you,
Misha

From calvin.cheung at oracle.com Tue Jul 17 17:48:41 2018
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Tue, 17 Jul 2018 10:48:41 -0700
Subject: RFR(S): 8204591: Expire/remove the UseAppCDS option in JDK 12
In-Reply-To:
References: <5B4E16F3.507@oracle.com>
Message-ID: <5B4E2BF9.1020501@oracle.com>

Hi Misha, Jiangli,

Thanks for your quick review.

I already checked the entire repo and the UseAppCDS flag is no longer being used.
I also ran all CDS/AppCDS tests both locally and using mach5.

thanks,
Calvin

From ioi.lam at oracle.com Tue Jul 17 17:56:40 2018
From: ioi.lam at oracle.com (Ioi Lam)
Date: Tue, 17 Jul 2018 10:56:40 -0700
Subject: RFR(S): 8204591: Expire/remove the UseAppCDS option in JDK 12
In-Reply-To: <5B4E2BF9.1020501@oracle.com>
References: <5B4E16F3.507@oracle.com> <5B4E2BF9.1020501@oracle.com>
Message-ID:

Looks good!

- Ioi

From calvin.cheung at oracle.com Tue Jul 17 18:46:22 2018
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Tue, 17 Jul 2018 11:46:22 -0700
Subject: RFR(S): 8204591: Expire/remove the UseAppCDS option in JDK 12
In-Reply-To:
References: <5B4E16F3.507@oracle.com> <5B4E2BF9.1020501@oracle.com>
Message-ID: <5B4E397E.9090404@oracle.com>

Thanks Ioi.

From gerard.ziemski at oracle.com Tue Jul 17 20:13:53 2018
From: gerard.ziemski at oracle.com (Gerard Ziemski)
Date: Tue, 17 Jul 2018 15:13:53 -0500
Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero
In-Reply-To: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com>
References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com>
Message-ID: <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com>

Thank you Coleen (and Kim)!
#1 Need copyright year updates:

src/hotspot/share/oops/symbol.cpp
src/hotspot/share/classfile/symbolTable.cpp
src/hotspot/share/classfile/compactHashtable.inline.hpp

#2 What's the purpose of this code in src/hotspot/share/oops/symbol.cpp

   38 STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1));

when we have:

  117 enum {
  118   // max_symbol_length is constrained by type of _length
  119   max_symbol_length = (1 << 16) -1
  120 };

Wouldn't that always be true? Is it to make sure that nobody changes max_symbol_length, because the implementation needs it to be that? If so, should we add a comment to:

  119   max_symbol_length = (1 << 16) -1

with a big warning of some sort?

#3 If we have:

   39 STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1));

then why not

  101 #define PERM_REFCOUNT ((1 << 16) - 1) // 0xffff

or

   39 STATIC_ASSERT(PERM_REFCOUNT == 0xffff);
  101 #define PERM_REFCOUNT 0xffff

#4 We have:

  221 void Symbol::increment_refcount() {
  222   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
  223     if (!try_increment_refcount()) {
  224 #ifdef ASSERT
  225       print();
  226 #endif
  227       fatal("refcount has gone to zero");

but

  233 void Symbol::decrement_refcount() {
  234   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
  235     int new_value = Atomic::sub((uint32_t)1, &_length_and_refcount);
  236 #ifdef ASSERT
  237     // Check if we have transitioned to 0xffff
  238     if (extract_refcount(new_value) == PERM_REFCOUNT) {
  239       print();
  240       fatal("refcount underflow");
  241     }
  242 #endif

Where the line:

  240       fatal("refcount underflow");

is inside #ifdef ASSERT, but:

  227       fatal("refcount has gone to zero");

is outside. Shouldn't "fatal" be consistent in both?

cheers

> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com wrote:
>
> Summary: Use cmpxchg for non permanent symbol refcounting, and pack refcount and length into an int.
>
> This is a precursor change to the concurrent SymbolTable change.
> Zeroed refcounted entries can be deleted at any time so they cannot be allowed to be zero in runtime code. Thanks to Kim for writing the packing function and helping me avoid undefined behavior.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8207359
>
> Tested with solaris ptrace helper, mach5 tier1-5 including solaris. Added multithreaded gtest which exercises the code.
>
> Thanks,
> Coleen

From coleen.phillimore at oracle.com Tue Jul 17 21:08:40 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 17 Jul 2018 17:08:40 -0400
Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero
In-Reply-To: <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com>
References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com> <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com>
Message-ID:

Gerard, thank you for the code review.

On 7/17/18 4:13 PM, Gerard Ziemski wrote:
> Thank you Coleen (and Kim)!
>
> #1 Need copyright year updates:
>
> src/hotspot/share/oops/symbol.cpp
> src/hotspot/share/classfile/symbolTable.cpp
> src/hotspot/share/classfile/compactHashtable.inline.hpp

Yes, I'll update with my commit.

>
> #2 What's the purpose of this code in src/hotspot/share/oops/symbol.cpp
>
>    38 STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1));
>
> when we have:
>
>   117 enum {
>   118   // max_symbol_length is constrained by type of _length
>   119   max_symbol_length = (1 << 16) -1
>   120 };
>
> Wouldn't that always be true? Is it to make sure that nobody changes max_symbol_length, because the implementation needs it to be that? If so, should we add a comment to:
>
>   119   max_symbol_length = (1 << 16) -1
>
> with a big warning of some sort?

Yes, it's so that we can store the length of the symbol into 16 bits.

How about I change the comment above max_symbol_length from:

    // max_symbol_length is constrained by type of _length

to

    // max_symbol_length must fit into the top 16 bits of _length_and_refcount

>
> #3 If we have:
>
>    39 STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1));
>
> then why not
>
>   101 #define PERM_REFCOUNT ((1 << 16) - 1) // 0xffff
>
> or
>    39 STATIC_ASSERT(PERM_REFCOUNT == 0xffff);
>   101 #define PERM_REFCOUNT 0xffff
>

I can change PERM_REFCOUNT to ((1 << 16) - 1) to be consistent.

> #4 We have:
>
>   221 void Symbol::increment_refcount() {
>   222   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>   223     if (!try_increment_refcount()) {
>   224 #ifdef ASSERT
>   225       print();
>   226 #endif
>   227       fatal("refcount has gone to zero");
>
> but
>
>   233 void Symbol::decrement_refcount() {
>   234   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>   235     int new_value = Atomic::sub((uint32_t)1, &_length_and_refcount);
>   236 #ifdef ASSERT
>   237     // Check if we have transitioned to 0xffff
>   238     if (extract_refcount(new_value) == PERM_REFCOUNT) {
>   239       print();
>   240       fatal("refcount underflow");
>   241     }
>   242 #endif
>
> Where the line:
>
>   240       fatal("refcount underflow");
>
> is inside #ifdef ASSERT, but:
>
>   227       fatal("refcount has gone to zero");
>
> is outside. Shouldn't "fatal" be consistent in both?
>

I thought that looked strange too. I'll move the #endif from 226 to after 227.

Thank you for reviewing the code!
Coleen

> cheers
>
>
>> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com wrote:
>>
>> Summary: Use cmpxchg for non permanent symbol refcounting, and pack refcount and length into an int.
>>
>> This is a precursor change to the concurrent SymbolTable change. Zeroed refcounted entries can be deleted at any time so they cannot be allowed to be zero in runtime code. Thanks to Kim for writing the packing function and helping me avoid undefined behavior.
>> >> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8207359 >> >> Tested with solaris ptrace helper, mach5 tier1-5 including solaris. Added multithreaded gtest which exercises the code. >> >> Thanks, >> Coleen From mikhailo.seledtsov at oracle.com Tue Jul 17 21:18:04 2018 From: mikhailo.seledtsov at oracle.com (mikhailo) Date: Tue, 17 Jul 2018 14:18:04 -0700 Subject: RFR(S): 8204591: Expire/remove the UseAppCDS option in JDK 12 In-Reply-To: <5B4E2BF9.1020501@oracle.com> References: <5B4E16F3.507@oracle.com> <5B4E2BF9.1020501@oracle.com> Message-ID: <83f3245d-4b48-131e-b0fa-8f69d24f34dd@oracle.com> Thank you Calvin. Just wanted to make sure. Looks good to me. Misha On 07/17/2018 10:48 AM, Calvin Cheung wrote: > Hi Misha, Jiangli, > > Thanks for your quick review. > > I already checked the entire repo and the UseAppCDS flag is no longer > being used. > I also ran all CDS/AppCDS tests both locally and using mach5. > > thanks, > Calvin > > On 7/17/18, 10:26 AM, mikhailo wrote: >> +1, change looks good. >> >> >> I agree with Jiangli, please grep all HotSpot and JDK tests for >> 'UseAppCDS', as well as task definitions (closed/task-definitions/), >> just to be sure. >> >> Also, check the make files (CDS mode execution). >> >> And make sure to run all CDS and AppCDS tests. >> >> >> Thank you, >> >> Misha >> >> >> On 07/17/2018 09:26 AM, Jiangli Zhou wrote: >>> Hi Calvin, >>> >>> Looks good! You probably have already done it, please double check >>> all jtreg tests to make sure there is no additional reference to >>> UseAppCDS flag left. >>> >>> Thanks, >>> >>> Jiangli >>> >>> >>> On 7/17/18 9:18 AM, Calvin Cheung wrote: >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8204591 >>>> >>>> webrev: http://cr.openjdk.java.net/~ccheung/8204591/webrev.00/ >>>> >>>> The UseAppCDS option has been obsoleted in JDK 11, it will be >>>> removed in JDK 12. 
>>>> >>>> With this change, if the UseAppCDS option is specified, vm will not >>>> start. >>>> >>>> bash-4.2$ $MYJDK/bin/java -XX:+UseAppCDS -version >>>> Unrecognized VM option 'UseAppCDS' >>>> Error: Could not create the Java Virtual Machine. >>>> Error: A fatal exception has occurred. Program will exit. >>>> >>>> Ran hs-tier{1,2,3} tests successfully. >>>> >>>> thanks, >>>> Calvin >>> >> From coleen.phillimore at oracle.com Tue Jul 17 21:38:00 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 17 Jul 2018 17:38:00 -0400 Subject: RFR 8202171: Some oopDesc functions compare this with NULL In-Reply-To: References: Message-ID: <733004c3-2cd6-c700-adba-f4305f3de8a9@oracle.com> Hi Harold, Looking at this change, I would like us to keep the nonstatic print() and print_on(outputStream*) functions because other Metadata and types within the jvm have these functions.? I think the few places where the oop can be NULL at the caller should be checked instead and remove the this == NULL check in the oopDesc::print_on() function.? Most places already do check for NULL.? The verify function seems fine to make a static member function though. I agree with Kim that there are other places where "this" is compared to NULL which shouldn't be done, and we should file separate RFEs to deal with them, specifically Method::is_valid_method() and Metadata::print_{value_}on_maybe_null() functions. Thanks, Coleen On 7/16/18 3:24 PM, Harold David Seigel wrote: > Hi, > > Please review this JDK-12 fix for bug JDK-8202171.? The fix changes a > few functions in oop.cpp into static functions to avoid comparisons > between 'this' and NULL. > > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8202171/webrev/index.html > > JBS Bug:? 
https://bugs.openjdk.java.net/browse/JDK-8202171
>
> This fix was regression tested by running Mach5 tiers 1 and 2 tests
> and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, running
> tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and VM tests
> on Linux-x64.
>
> Thanks, Harold
>

From navy.xliu at gmail.com Wed Jul 18 07:31:42 2018
From: navy.xliu at gmail.com (Liu Xin)
Date: Wed, 18 Jul 2018 00:31:42 -0700
Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86
In-Reply-To:
References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com> <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com>
Message-ID:

I made a special build which intentionally disables 'compile1' and enables the assertion. I ran hs-tier1 and the result has 11 failures, but they are not related to the Label dtor. I think this exercised C2 quite heavily on x86_64.

i) For the C2 bailout case, I think the array of labels here is suspicious.

void Compile::fill_buffer(CodeBuffer* cb, uint* blk_starts) in opto/output.cpp

  1124 // Create an array of labels, one for each basic block
  1125: Label *blk_labels = NEW_RESOURCE_ARRAY(Label, nblocks+1);

Compilation may abort early, before these labels are bound. It's an odd case, because I don't think HotSpot ever gets a chance to call their destructors. Actually, HotSpot doesn't even get a chance to call their constructors, because of 'NEW_RESOURCE_ARRAY'. It's very misleading to invoke the method 'init()' on an uninitialized object. Am I right here? Should I fix it?

ii) I don't think scratch_emit_size is a problem. I took a look at all the generated x86/arm code. It really binds all labels except for MachBranch nodes. MachBranch nodes refer to fakeL, which is bound as well in scratch_emit_size().
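
For readers following the thread, here is a minimal standalone sketch of the invariant under discussion: a label may be destroyed only if it was bound or never used. SketchLabel and all of its members are hypothetical names made up for illustration. This is not HotSpot's Label class and not the code in the webrev; it only assumes the assertion has roughly this shape.

```cpp
#include <cassert>

// Hypothetical stand-in for an assembler label; not HotSpot's Label.
class SketchLabel {
 public:
  SketchLabel() : _bound_pos(-1), _pending_uses(0) {}

  // Record a branch emitted to this label before its position is known.
  void add_use() { if (!is_bound()) ++_pending_uses; }

  // Fix the label at a code position; pending branches are considered patched.
  void bind(int pos) { _bound_pos = pos; _pending_uses = 0; }

  // Forget all state, e.g. when code generation bails out.
  void reset() { _bound_pos = -1; _pending_uses = 0; }

  bool is_bound() const { return _bound_pos >= 0; }
  bool is_unused() const { return !is_bound() && _pending_uses == 0; }

  // The condition a debug-only destructor assertion would check.
  bool safe_to_destroy() const { return is_bound() || is_unused(); }

  // With NDEBUG defined the assert compiles away, as in a release build.
  ~SketchLabel() { assert(safe_to_destroy() && "label used but never bound"); }

 private:
  int _bound_pos;     // -1 while unbound
  int _pending_uses;  // branches waiting for bind()
};
```

In this sketch a bailout path would call reset() on any label it abandons, which mirrors the idea of resetting the problematic label mentioned earlier in the thread. A label living in raw, unconstructed storage never runs the destructor at all, so the check would not see it.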
thanks, --lx From martin.doerr at sap.com Wed Jul 18 09:33:00 2018 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 18 Jul 2018 09:33:00 +0000 Subject: RFR(S): 8207342: error occurred during error reporting (printing register info) In-Reply-To: <2726446d-23af-c0cf-4eec-d51b935d0360@oracle.com> References: <00ed04d166534bc694af29b3cb244296@sap.com> <2726446d-23af-c0cf-4eec-d51b935d0360@oracle.com> Message-ID: Hi Coleen, thanks for the review. Pushed. Best regards, Martin -----Original Message----- From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of coleen.phillimore at oracle.com Sent: Dienstag, 17. Juli 2018 16:11 To: hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR(S): 8207342: error occurred during error reporting (printing register info) This looks really good, helpful, and worth having in jdk11. Thanks, Coleen On 7/16/18 10:06 AM, Doerr, Martin wrote: > Hi, > > I'd like to fix the "printing register info" step in hs_err files for jdk11 if possible. > > The function os::print_location misses a check if the pointer is readable. > For example "jdk/bin/java -XX:+CrashGCForDumpingJavaThread -version" generates a hs_err file which doesn't analyze the registers correctly because of "error occurred during error reporting (printing register info)" in section "Register to memory mapping". > > In addition, registers are missing on PPC64 and s390. > > My proposal looks a little larger than S, but it's small besides moving the duplicated "is_readable_pointer" from codeHeapState and misc_aix to os: > http://cr.openjdk.java.net/~mdoerr/8207342_register_info/webrev.00/ > > Please review. 
> > Best regards, > Martin > From martin.doerr at sap.com Wed Jul 18 11:07:30 2018 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 18 Jul 2018 11:07:30 +0000 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com> <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com> Message-ID: Hi Liu Xin, thanks for understanding my point and checking other places. The templateTable_x86.cpp was reviewed by me. I can?t review the label assertion before my vacation. If other reviewers are convinced that the it is correct, ok. Would be great if somebody could assist with testing other platforms. Best regards, Martin From: Liu Xin [mailto:navy.xliu at gmail.com] Sent: Dienstag, 17. Juli 2018 19:17 To: Doerr, Martin Cc: hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 Hi, Martin, Thank you for the feedback. I totally agree with you that we shouldn?t introduce false positive assertion. Let?s insist on the high bar here. I browsed many sources in hotspot recently. Hotspot is the most monolithic software I ever seen. I am glad to be directed by a guidance and clear target. I think I dealt with c1 bailout case. This case triggers "codebuffer overflow" in middle of c1 compilation. compiler/codegen/TestCharVect2.java I am still not sure about c2 bailout case. Let me try to make one. For case #2, I got what you concerned. Indeed, the generated ad_x86.cpp contains many emits methods for MachNode. I will double-check if they could leave unused labels. Thanks, ?lx On Jul 16, 2018, at 2:09 PM, Liu Xin > wrote: Hi, List, Could you review this new revision? 
https://s3-us-west-2.amazonaws.com/openjdk-webrevs/jdk/label_bugfix/index.html i) I took a look at all architectures, arm/aarch64/ppc64/sparc/x86. I don?t understand all the assemblies, but I think they are guarded for UseOnStackReplacement in templateTable_xxx.cpp ::branch(bool is_jsr, bool is_wide). TemplateTable_arm.cpp is a little different. It explicitly binds it later. if (!UseOnStackReplacement) { __ bind(backedge_counter_overflow); } i) I checked the Compile::scratch_emit_size. It only uses the label fakeL for those MachBranch nodes. Because fakeL will be bound to a trivial address if the nodes are MachBranch, It?s also safe for the assertion. bool is_branch = n->is_MachBranch(); if (is_branch) { MacroAssembler masm(&buf); masm.bind(fakeL); n->as_MachBranch()->save_label(&saveL, &save_bnum); n->as_MachBranch()->label_set(&fakeL, 0); } Thanks, ?lx On Jul 16, 2018, at 1:30 AM, Doerr, Martin > wrote: Hi Liu Xin, thanks for changing. > The background of this Assertion is that our engineer used to spend many hour to trace down a corner case. > it's trivial if fastdebug/slowdebug stop and tell you immediately. I understand that. But an assertion should only get added when we are convinced that it won?t produce false positives. It?s very annoying if long running tests break due to an incorrect assertion after running many days. > I am curious about this "We also may generate code with the purpose to determine its size.". > Could you tell me where is it? it looks quite slow to get buffer size in this way. C2 Compiler does that in Compile::scratch_emit_size. Please note that I?ll be on vacation soon, so other people will have to review. Thanks again for fixing the -XX:-UseOnStackReplacement issue. Best regards, Martin From: Liu Xin [mailto:navy.xliu at gmail.com] Sent: Freitag, 13. 
Juli 2018 22:30 To: Doerr, Martin > Cc: hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 Hello, Martin, Thanks for reviewing it. I got your point. I made it "if (where != NULL) { jcc(cond, *where); }" and is running tests. The background of this Assertion is that our engineer used to spend many hour to trace down a corner case. it's trivial if fastdebug/slowdebug stop and tell you immediately. I am curious about this "We also may generate code with the purpose to determine its size.". Could you tell me where is it? it looks quite slow to get buffer size in this way. thanks, --lx On Fri, Jul 13, 2018 at 2:54 AM, Doerr, Martin > wrote: Hi, thanks for fixing the issue in templateTable_x86. It looks correct. I think even better would be "UseOnStackReplacement ? &backedge_counter_overflow : NULL" and "if (where != NULL) { jcc(cond, *where); }" in interp_masm_x86.cpp. But I leave it up to you if you want to change it. I'm also ok with your version. I'm not convinced that the label assertion is reliable. There may be many more places in hotspot where we bail out having an unbound label. Running a few tests on x86 is by far not sufficient. The assertion may fire sporadically in large scenarios on some platforms. Best regards, Martin -----Original Message----- From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Liu Xin Sent: Donnerstag, 12. Juli 2018 22:51 To: hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 Could you review this patch again? Revision #2. Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 CR: https://s3-us-west-2.amazonaws.com/openjdk-webrevs/openjdk8u/webrev/index.html The idea is simple. I just reset the problematic label when c1 compilation bailout happen. I manually ran tier1 on my laptop. it can pass all of them. 
Paul help me submit the patch to submit and here is the run result. Build Details: 2018-07-12-1736388.hohensee.source 0 Failed Tests Mach5 Tasks Results Summary PASSED: 75 UNABLE_TO_RUN: 0 KILLED: 0 NA: 0 FAILED: 0 EXECUTED_WITH_FAILURE: 0 Thanks, ?lx > On Jul 11, 2018, at 10:35 AM, Liu Xin > wrote: > > Thank you for your reviews. Indeed, I didn?t deal with bailout situation. "compiler/codegen/TestCharVect2.java? is the case of codeBuffer overflow and leave a unbound label behind. > I made another revision. I will run tests thoroughly. > > Thanks, > ?lx > >> On Jul 11, 2018, at 7:49 AM, Hohensee, Paul > wrote: >> >> Imo it's still good hygiene to require that Labels be bound if they're used, even if the generated code will never be executed. E.g., code that generates code for sizing purposes may be repurposed to generate executable code, in which case an unbound label may be a lurking bug. Also, I'm unaware (I may be corrected!) of any situation where bailing out happens in such a way as to both leave a Label unbound and execute its destructor. Even if there are, I'd say that'd be indicative of another real problem, such as code buffer overflow, so no harm would result. >> >> Thanks, >> >> Paul >> >> ?On 7/11/18, 3:41 AM, "hotspot-runtime-dev on behalf of Doerr, Martin" on behalf of martin.doerr at sap.com> wrote: >> >> Hi, >> >> I think the idea is good, but doesn't work in all cases. >> We may bail out from code generation and discard the generated code leaving the label unbound. >> We also may generate code with the purpose to determine its size. We don't need to bind labels because the code will never get executed. >> >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov >> Sent: Mittwoch, 11. 
Juli 2018 03:34 >> To: Liu Xin >; hotspot-runtime-dev at openjdk.java.net >> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 >> >> I hit new assert in few other tests: >> >> compiler/codegen/TestCharVect2.java >> compiler/c2/cr6340864/* >> >> Regards, >> Vladimir >> >> On 7/10/18 5:08 PM, Vladimir Kozlov wrote: >>> Fix looks reasonable. I will test it in our framework. >>> >>> Thanks, >>> Vladimir >>> >>> On 7/10/18 9:50 AM, Liu Xin wrote: >>>> Hi, Community, >>>> Could you please review this small patch? >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 >>>> >>>> CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/ >>>> >>>> Problem: >>>> X86-32/64 will leave an unbound label if UseOnStackReplacement is OFF. >>>> This patch align up x86 with other architectures(ppc, arm). >>>> Add an assertion to the destructor of Label. It will be wiped out in release build. >>>> Previously, hotspot cannot pass this test with assertion on x86-64. >>>> make run-test TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java >>>> If this CR is approved, Paul Hohensee will push it. >>>> Thanks, >>>> --lx >>>> >> >> > From harold.seigel at oracle.com Wed Jul 18 12:44:53 2018 From: harold.seigel at oracle.com (Harold David Seigel) Date: Wed, 18 Jul 2018 08:44:53 -0400 Subject: RFR 8202171: Some oopDesc functions compare this with NULL In-Reply-To: <733004c3-2cd6-c700-adba-f4305f3de8a9@oracle.com> References: <733004c3-2cd6-c700-adba-f4305f3de8a9@oracle.com> Message-ID: <37e16040-a847-eefe-fe41-088891d4ae07@oracle.com> Hi Coleen, Kim, Thanks for your comments! I'll make the changes suggested by Coleen and put out a new webrev. Thanks, Harold On 7/17/2018 5:38 PM, coleen.phillimore at oracle.com wrote: > > Hi Harold, > > Looking at this change, I would like us to keep the nonstatic print() > and print_on(outputStream*) functions because other Metadata and types > within the jvm have these functions.? 
I think the few places where the > oop can be NULL at the caller should be checked instead and remove the > this == NULL check in the oopDesc::print_on() function.? Most places > already do check for NULL.? The verify function seems fine to make a > static member function though. > > I agree with Kim that there are other places where "this" is compared > to NULL which shouldn't be done, and we should file separate RFEs to > deal with them, specifically Method::is_valid_method() and > Metadata::print_{value_}on_maybe_null() functions. > > Thanks, > Coleen > > On 7/16/18 3:24 PM, Harold David Seigel wrote: >> Hi, >> >> Please review this JDK-12 fix for bug JDK-8202171.? The fix changes a >> few functions in oop.cpp into static functions to avoid comparisons >> between 'this' and NULL. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8202171/webrev/index.html >> >> JBS Bug:? https://bugs.openjdk.java.net/browse/JDK-8202171 >> >> This fix was regression tested by running Mach5 tiers 1 and 2 tests >> and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, >> running tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and >> VM tests on Linux-x64. >> >> Thanks, Harold >> > From jcbeyler at google.com Wed Jul 18 16:21:19 2018 From: jcbeyler at google.com (JC Beyler) Date: Wed, 18 Jul 2018 09:21:19 -0700 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> Message-ID: Subject Was: Re: RFR (S): C1 still does eden allocations when TLAB is enabled + serviceability-dev Hi all, Could anyone else give me a review of this webrev and check/test the various architecture changes? http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ Thanks for all your help! 
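
As a rough illustration of the "if no tlab, then consider eden space allocation" rule that the quoted messages in this thread describe, the decision can be sketched as a small pure function. The enum, the function name, and both parameters are hypothetical; use_tlab stands in for the UseTLAB flag and inline_contig_ok for the heap's supports_inline_contig_alloc answer. The real change lives in per-architecture C1 stub generators and may differ in detail.

```cpp
// Hypothetical names for illustration; HotSpot does not define these.
enum class StubAllocPath { RuntimeCall, EdenInline };

// Pick the allocation path for a C1 allocation stub in this sketch:
// inline eden allocation is considered only when TLABs are off and the
// heap allows inline contiguous allocation; otherwise call the runtime.
inline StubAllocPath choose_stub_alloc_path(bool use_tlab, bool inline_contig_ok) {
  if (!use_tlab && inline_contig_ok) {
    return StubAllocPath::EdenInline;
  }
  return StubAllocPath::RuntimeCall;
}
```

The point of writing it this way is that every architecture can share one rule, which is the consistency being asked for in the review request.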
Jc On Mon, Jul 16, 2018 at 2:58 PM JC Beyler wrote: > Hi all, > > Here is a webrev that does all the architectures in the same way: > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ > > Could anyone review the other architectures and test? > - arm, sparc & aarch64 are also modified now to follow the same "if no > tlab, then consider eden space allocation" logic. > > Thanks for your help! > Jc > > On Fri, Jul 13, 2018 at 9:16 PM JC Beyler wrote: > >> Hi Kim, >> >> I opened this bug >> https://bugs.openjdk.java.net/browse/JDK-8190862 >> >> and now I've done an update: >> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ >> >> I basically have done your nits but also removed the try_eden (it was >> used to bind a label but was not used). I updated the comments to use the >> one you preferred. >> >> I still have to do the other architectures though but at least we seem to >> have a consensus on this architecture, correct? >> >> Thanks for the review, >> Jc >> >> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett >> wrote: >> >>> > On Jul 13, 2018, at 4:54 PM, JC Beyler wrote: >>> > >>> > Yes, you are right, I did those changes due to: >>> > https://bugs.openjdk.java.net/browse/JDK-8194084 >>> > >>> > If Robbin agrees to this change, and if no one sees an issue, I'll go >>> ahead >>> > and propagate the change across architectures. >>> > >>> > Thanks for the review, I'll wait for Robbin (or anyone else's comment >>> and >>> > review) :) >>> > Jc >>> > >>> > On Fri, Jul 13, 2018 at 1:08 PM John Rose >>> wrote: >>> > >>> >> On Jul 13, 2018, at 10:23 AM, JC Beyler wrote: >>> >> >>> >> >>> >> I'm not sure if we had left this case intentionally or not but, if we >>> want >>> >> it all to be consistent, we should perhaps fix it. >>> >> >>> >> >>> >> Well, you put in that logic last February, so unless somebody speaks >>> up >>> >> quickly, I support your adjusting it to be the way you want it. 
>>> >>
>>> >> Doing "hg grep -u supports_inline_contig_alloc -I src/hotspot/share"
>>> >> suggests that the GC group is most active in touching this feature.
>>> >> If Robbin is OK with it, there's your reviewer.
>>> >>
>>> >> FWIW, you can use me as a reviewer, but I'd get one other person
>>> >> working on the GC to OK it.
>>> >>
>>> >> -- John
>>> >>
>>> >
>>> >
>>> > --
>>> >
>>> > Thanks,
>>> > Jc
>>>
>>> Robbin is on vacation; you might not hear from him for a while.
>>>
>>> I'm assuming you'll open a new bug for this?
>>>
>>> Except for a few minor nits (below), this looks okay to me.
>>>
>>> The comment at line 1052 needs updating.
>>>
>>> pre-existing: The retry_tlab label declared on line 1054 is unused.
>>>
>>> pre-existing: The try_eden label declared on line 1054 is bound at
>>> line 1058, but unreferenced.
>>>
>>> I like the wording of the comment at 1139 better than the wording at
>>> 1016.
>>>
>>>
>>
>> --
>>
>> Thanks,
>> Jc
>>
>
>
> --
>
> Thanks,
> Jc
>

--

Thanks,
Jc

From ioi.lam at oracle.com Wed Jul 18 21:14:53 2018
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 18 Jul 2018 14:14:53 -0700
Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero
In-Reply-To:
References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com> <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com>
Message-ID:

Hi Coleen,

The changes look good! The new operations on _length_and_refcount are much cleaner than my old ATOMIC_SHORT_PAIR hack.

symbolTable.cpp:

 SymbolTable::lookup_dynamic() {
 ...
 214       Symbol* sym = e->literal();
 215       if (sym->equals(name, len) && sym->try_increment_refcount()) {
 216         // something is referencing this symbol now.
 217         return sym;
 218       }

symbol.cpp:

 221 void Symbol::increment_refcount() {
 222   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
 223     if (!try_increment_refcount()) {
 224 #ifdef ASSERT
 225       print();
 226 #endif
 227       fatal("refcount has gone to zero");
 228     }
 229     NOT_PRODUCT(Atomic::inc(&_total_count);)
 230   }
 231 }

 246 // Atomically increment while checking for zero, zero is bad.
 247 bool Symbol::try_increment_refcount() {
 248   uint32_t old_value = _length_and_refcount;  // fetch once
 249   int refc = extract_refcount(old_value);
 250
 251   if (refc == PERM_REFCOUNT) {
 252     return true;
 253   } else if (refc == 0) {
 254     return false; // effectively dead, can't revive
 255   }
 256
 257   uint32_t now;
 258   while ((now = Atomic::cmpxchg(old_value + 1, &_length_and_refcount, old_value)) != old_value) {
 259     // failed to increment, check refcount again.
 260     refc = extract_refcount(now);
 261     if (refc == 0) {
 262       return false; // just died
 263     } else if (refc == PERM_REFCOUNT) {
 264       return true; // just became permanent
 265     }
 266     old_value = now; // refcount changed, try again
 267   }
 268   return true;
 269 }

So is it valid for Symbol::try_increment_refcount() to return false? SymbolTable::lookup_dynamic() seems to suggest YES, but Symbol::increment_refcount() seems to suggest NO.

If it's always an invalid condition, I think the fatal() should be moved inside try_increment_refcount.

Otherwise, I think you need to add comments in all 3 places, to say when it's possible to get a 0 refcount, and when it's not.

And, it might be worth expanding on why "zero is bad" :-)

My guess is:

+ if you're doing a lookup, you might be seeing Symbols that have already been marked for deletion, which is indicated by a 0 refcount. You want to skip such Symbols.

+ if you're incrementing the refcount, that means you're holding a valid Symbol, which means this Symbol should have never been marked for deletion.

Is this correct?

Thanks
- Ioi

On 7/17/18 2:08 PM, coleen.phillimore at oracle.com wrote:
>
> Gerard, thank you for the code review.
>
> On 7/17/18 4:13 PM, Gerard Ziemski wrote:
>> Thank you Coleen (and Kim)!
>> >> #1 Need copyright year updates: >> >> src/hotspot/share/oops/symbol.cpp >> src/hotspot/share/classfile/symbolTable.cpp >> src/hotspot/share/classfile/compactHashtable.inline.hpp > > Yes, I'll update with my commit. >> >> #2 What?s the purpose of this code in src/hotspot/share/oops/symbol.cpp >> >> ?? 38?? STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1)); >> >> when we have: >> >> ? 117?? enum { >> ? 118???? // max_symbol_length is constrained by type of _length >> ? 119???? max_symbol_length = (1 << 16) -1 >> ? 120?? }; >> >> Wouldn?t that always be true?? Is it to make sure that nobody changes >> max_symbol_length, because the implementation needs it to be that? If >> so, should we add comment to: >> >> ? 119???? max_symbol_length = (1 << 16) -1 >> >> with a big warning of some sorts? > > Yes, it's so that we can store the length of the symbol into 16 bits. > > How I change the comment above max_symbol_length from: > > ??? // max_symbol_length is constrained by type of _length > > to > > ??? // max_symbol_length must fit into the top 16 bits of > _length_and_refcount > >> >> #3 If we have: >> >> ?? 39?? STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1)); >> >> then why not >> >> ? 101 #define PERM_REFCOUNT ((1 << 16) - 1)) // 0xffff >> >> or >> ?? 39?? STATIC_ASSERT(PERM_REFCOUNT == 0xffff; >> ? 101 #define PERM_REFCOUNT 0xffff >> > I can change PERM_REFCOUNT to ((1 << 16)) -1) to be consistent. > >> #4 We have: >> >> ? 221 void Symbol::increment_refcount() { >> ? 222?? if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >> ? 223???? if (!try_increment_refcount()) { >> ? 224 #ifdef ASSERT >> ? 225?????? print(); >> ? 226 #endif >> ? 227?????? fatal("refcount has gone to zero"); >> >> but >> >> ? 233 void Symbol::decrement_refcount() { >> ? 234?? if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >> ? 235???? int new_value = Atomic::sub((uint32_t)1, >> &_length_and_refcount); >> ? 236 #ifdef ASSERT >> ? 237???? 
// Check if we have transitioned to 0xffff >> ? 238???? if (extract_refcount(new_value) == PERM_REFCOUNT) { >> ? 239?????? print(); >> ? 240?????? fatal("refcount underflow"); >> ? 241???? } >> ? 242 #endif >> >> Where the line: >> >> ? 240?????? fatal("refcount underflow?); >> >> is inside #ifdef ASSERT, but: >> >> 227?????? fatal("refcount has gone to zero?); >> >> is outside. Shouldn't ?fatal" be consistent in both? >> > > I was thought that looked strange too.? I'll move the #endif from 226 > to after 227. > > Thank you for reviewing the code! > Coleen > >> cheers >> >> >>> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com wrote: >>> >>> Summary: Use cmpxchg for non permanent symbol refcounting, and pack >>> refcount and length into an int. >>> >>> This is a precurser change to the concurrent SymbolTable change. >>> Zeroed refcounted entries can be deleted at anytime so they cannot >>> be allowed to be zero in runtime code.? Thanks to Kim for writing >>> the packing function and helping me avoid undefined behavior. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8207359 >>> >>> Tested with solaris ptrace helper, mach5 tier1-5 including solaris. >>> Added multithreaded gtest which exercises the code. >>> >>> Thanks, >>> Coleen > From coleen.phillimore at oracle.com Wed Jul 18 21:45:35 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 18 Jul 2018 17:45:35 -0400 Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero In-Reply-To: References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com> <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com> Message-ID: On 7/18/18 5:14 PM, Ioi Lam wrote: > Hi Coleen, > > The changes look good! The new operations on _length_and_refcount are > much cleaner than my old ATOMIC_SHORT_PAIR hack. Yes, this makes more sense to me. 
>
> symbolTable.cpp:
>
>  SymbolTable::lookup_dynamic() {
>  ...
>  214       Symbol* sym = e->literal();
>  215       if (sym->equals(name, len) && sym->try_increment_refcount()) {
>  216         // something is referencing this symbol now.
>  217         return sym;
>  218       }
>
>
> symbol.cpp:
>
>  221 void Symbol::increment_refcount() {
>  222   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>  223     if (!try_increment_refcount()) {
>  224 #ifdef ASSERT
>  225       print();
>  226 #endif
>  227       fatal("refcount has gone to zero");
>  228     }
>  229     NOT_PRODUCT(Atomic::inc(&_total_count);)
>  230   }
>  231 }
>
>  246 // Atomically increment while checking for zero, zero is bad.
>  247 bool Symbol::try_increment_refcount() {
>  248   uint32_t old_value = _length_and_refcount;  // fetch once
>  249   int refc = extract_refcount(old_value);
>  250
>  251   if (refc == PERM_REFCOUNT) {
>  252     return true;
>  253   } else if (refc == 0) {
>  254     return false; // effectively dead, can't revive
>  255   }
>  256
>  257   uint32_t now;
>  258   while ((now = Atomic::cmpxchg(old_value + 1,
> &_length_and_refcount, old_value)) != old_value) {
>  259     // failed to increment, check refcount again.
>  260     refc = extract_refcount(now);
>  261     if (refc == 0) {
>  262       return false; // just died
>  263     } else if (refc == PERM_REFCOUNT) {
>  264       return true; // just became permanent
>  265     }
>  266     old_value = now; // refcount changed, try again
>  267   }
>  268   return true;
>  269 }
>
>
> So is it valid for Symbol::try_increment_refcount() to return false?
> SymbolTable::lookup_dynamic() seems to suggest YES, but
> Symbol::increment_refcount() seems to suggest NO.

True.  If you are looking up a symbol and some other thread has
decremented the refcount to zero, this symbol should not be returned.
My test exercises this code even without the concurrent hashtable.
When the hashtable is concurrent, a zero-ed Symbol could be deallocated
so we don't want to return it.

In the case where you call increment_refcount() not during lookup, it
is assumed that you have a symbol with a non-zero refcount and it can't
go away while you are holding it.

>
> If it's always an invalid condition, I think the fatal() should be
> moved inside try_increment_refcount.
>

It isn't fatal at lookup.  The lookup must skip a zero-ed entry.

> Otherwise, I think you need to add comments in all 3 places, to say
> when it's possible to get a 0 refcount, and when it's not. And, it
> might be worth expanding on why "zero is bad" :-)

How about this comment to try_increment_refcount:

// Increment refcount while checking for zero.  If the Symbol's refcount becomes zero
// a thread could be concurrently removing the Symbol.  This is used during SymbolTable
// lookup to avoid reviving a dead Symbol.

>
> My guess is:
> + if you're doing a lookup, you might be seeing Symbols that have
> already been marked for deletion, which is indicated by a 0 refcount.
> You want to skip such Symbols.
>
> + if you're incrementing the refcount, that means you're holding a
> valid Symbol, which means this Symbol should have never been marked
> for deletion.
>
> Is this correct?

Yes, both true.

Thanks,
Coleen

>
> Thanks
> - Ioi
>
>
> On 7/17/18 2:08 PM, coleen.phillimore at oracle.com wrote:
>>
>> Gerard, thank you for the code review.
>>
>> On 7/17/18 4:13 PM, Gerard Ziemski wrote:
>>> Thank you Coleen (and Kim)!
>>>
>>> #1 Need copyright year updates:
>>>
>>> src/hotspot/share/oops/symbol.cpp
>>> src/hotspot/share/classfile/symbolTable.cpp
>>> src/hotspot/share/classfile/compactHashtable.inline.hpp
>>
>> Yes, I'll update with my commit.
>>>
>>> #2 What's the purpose of this code in src/hotspot/share/oops/symbol.cpp
>>>
>>>    38   STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1));
>>>
>>> when we have:
>>>
>>>   117   enum {
>>>   118     
// max_symbol_length is constrained by type of _length >>> ? 119???? max_symbol_length = (1 << 16) -1 >>> ? 120?? }; >>> >>> Wouldn?t that always be true?? Is it to make sure that nobody >>> changes max_symbol_length, because the implementation needs it to be >>> that? If so, should we add comment to: >>> >>> ? 119???? max_symbol_length = (1 << 16) -1 >>> >>> with a big warning of some sorts? >> >> Yes, it's so that we can store the length of the symbol into 16 bits. >> >> How I change the comment above max_symbol_length from: >> >> ??? // max_symbol_length is constrained by type of _length >> >> to >> >> ??? // max_symbol_length must fit into the top 16 bits of >> _length_and_refcount >> >>> >>> #3 If we have: >>> >>> ?? 39?? STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1)); >>> >>> then why not >>> >>> ? 101 #define PERM_REFCOUNT ((1 << 16) - 1)) // 0xffff >>> >>> or >>> ?? 39?? STATIC_ASSERT(PERM_REFCOUNT == 0xffff; >>> ? 101 #define PERM_REFCOUNT 0xffff >>> >> I can change PERM_REFCOUNT to ((1 << 16)) -1) to be consistent. >> >>> #4 We have: >>> >>> ? 221 void Symbol::increment_refcount() { >>> ? 222?? if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >>> ? 223???? if (!try_increment_refcount()) { >>> ? 224 #ifdef ASSERT >>> ? 225?????? print(); >>> ? 226 #endif >>> ? 227?????? fatal("refcount has gone to zero"); >>> >>> but >>> >>> ? 233 void Symbol::decrement_refcount() { >>> ? 234?? if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >>> ? 235???? int new_value = Atomic::sub((uint32_t)1, >>> &_length_and_refcount); >>> ? 236 #ifdef ASSERT >>> ? 237???? // Check if we have transitioned to 0xffff >>> ? 238???? if (extract_refcount(new_value) == PERM_REFCOUNT) { >>> ? 239?????? print(); >>> ? 240?????? fatal("refcount underflow"); >>> ? 241???? } >>> ? 242 #endif >>> >>> Where the line: >>> >>> ? 240?????? fatal("refcount underflow?); >>> >>> is inside #ifdef ASSERT, but: >>> >>> 227?????? 
fatal("refcount has gone to zero?); >>> >>> is outside. Shouldn't ?fatal" be consistent in both? >>> >> >> I was thought that looked strange too.? I'll move the #endif from 226 >> to after 227. >> >> Thank you for reviewing the code! >> Coleen >> >>> cheers >>> >>> >>>> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> Summary: Use cmpxchg for non permanent symbol refcounting, and pack >>>> refcount and length into an int. >>>> >>>> This is a precurser change to the concurrent SymbolTable change. >>>> Zeroed refcounted entries can be deleted at anytime so they cannot >>>> be allowed to be zero in runtime code. Thanks to Kim for writing >>>> the packing function and helping me avoid undefined behavior. >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8207359 >>>> >>>> Tested with solaris ptrace helper, mach5 tier1-5 including solaris. >>>> Added multithreaded gtest which exercises the code. >>>> >>>> Thanks, >>>> Coleen >> > From kim.barrett at oracle.com Wed Jul 18 22:14:14 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 18 Jul 2018 18:14:14 -0400 Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero In-Reply-To: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com> References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com> Message-ID: > On Jul 17, 2018, at 11:51 AM, coleen.phillimore at oracle.com wrote: > > Summary: Use cmpxchg for non permanent symbol refcounting, and pack refcount and length into an int. > > This is a precurser change to the concurrent SymbolTable change. Zeroed refcounted entries can be deleted at anytime so they cannot be allowed to be zero in runtime code. Thanks to Kim for writing the packing function and helping me avoid undefined behavior. 
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8207359
>
> Tested with solaris ptrace helper, mach5 tier1-5 including solaris. Added multithreaded gtest which exercises the code.
>
> Thanks,
> Coleen

------------------------------------------------------------------------------
src/hotspot/os/solaris/dtrace/jhelper.d
 467 OFFSET_Symbol_length_and_refcount);
 474 OFFSET_Symbol_length_and_refcount);
 483 OFFSET_Symbol_length_and_refcount);

I think these only work on a big-endian platform, and are making
further assumptions about the Symbol implementation. I have no idea
what, if anything, to do about that here. At the very least, some
comments seem warranted.

Similarly in java.base/solaris/native/libjvm_db/libjvm_db.c, though
some comments were provided there.

------------------------------------------------------------------------------
src/hotspot/share/oops/symbol.cpp
 233 void Symbol::decrement_refcount() {
 234   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
 235     int new_value = Atomic::sub((uint32_t)1, &_length_and_refcount);

This isn't safe, and can lose counts. Consider refcount is currently
PERM-1. Decrement checks and finds it's not PERM. Then three other
threads add references, one setting to PERM and two others leaving it
at PERM. Now decrement does the sub back to PERM-1, discarding two
valid references.

------------------------------------------------------------------------------
src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Symbol.java
  78   public long getLength() {
  79     int i = (int)length.getValue(this.addr);
  80     return (i >> 16);
  81   }

Consider a symbol whose length is in the range [2^15, 2^16). The cast
result is implementation defined. The right shift of a negative value
is implementation defined, and might (or might not) return a negative value.

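The two concerns above (a decrement that can step off the permanent value, and a signed shift when the length has its top bit set) can be illustrated with a small C++ sketch. This is not the webrev code: it uses std::atomic instead of HotSpot's Atomic class, the helper names are hypothetical, and the layout (length in the high 16 bits, refcount in the low 16 bits, 0xffff meaning permanent) is only what the thread describes.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumed layout, taken from the discussion: length in the high 16 bits,
// refcount in the low 16 bits, and 0xffff marking a permanent symbol.
constexpr uint32_t PERM_REFCOUNT = 0xffffu;

constexpr uint32_t pack_length_and_refcount(uint32_t length, uint32_t refcount) {
  return (length << 16) | (refcount & 0xffffu);
}

// Unsigned shifts and masks are fully defined even when bit 31 is set,
// so a length in [2^15, 2^16) never comes back negative.
constexpr uint32_t extract_length(uint32_t v)   { return v >> 16; }
constexpr uint32_t extract_refcount(uint32_t v) { return v & 0xffffu; }

// Hypothetical CAS-based decrement. The PERM check and the subtraction are
// applied to the same observed value, so a concurrent transition to PERM
// cannot be undone. Returns false when nothing was decremented
// (the count is permanent, or it is already zero).
inline bool decrement_refcount(std::atomic<uint32_t>& word) {
  uint32_t old_value = word.load();
  for (;;) {
    uint32_t refc = extract_refcount(old_value);
    if (refc == PERM_REFCOUNT || refc == 0) {
      return false;
    }
    // On failure compare_exchange_weak refreshes old_value with the current
    // contents, so the checks above run again on fresh data.
    if (word.compare_exchange_weak(old_value, old_value - 1)) {
      return true;
    }
  }
}
```

With a plain load followed by an unconditional atomic subtract, the check and the update see different values, which is the window the review comment describes; the loop closes that window at the cost of a retry under contention.
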
------------------------------------------------------------------------------ test/hotspot/gtest/classfile/test_symbolTable.cpp 79 // Test overflowing refcount making symbol permanent 80 Symbol* bigsym = SymbolTable::new_symbol("bigsym", CATCH); Add another check after that a PERM_REFCOUNT value is sticky against decrement. ------------------------------------------------------------------------------ test/hotspot/gtest/threadHelper.inline.hpp 24 #ifndef GTEST_THREAD_HELPER_INLINE_HPP s/THREAD_HELPER/THREADHELPER/ to match the usual Hotspot naming. ------------------------------------------------------------------------------ From ioi.lam at oracle.com Wed Jul 18 22:35:50 2018 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 18 Jul 2018 15:35:50 -0700 Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero In-Reply-To: References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com> <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com> Message-ID: <30cd0f77-3d62-2867-2a37-f68d6a1a401f@oracle.com> On 7/18/18 2:45 PM, coleen.phillimore at oracle.com wrote: > > > On 7/18/18 5:14 PM, Ioi Lam wrote: >> Hi Coleen, >> >> The changes look good! The new operations on _length_and_refcount are >> much cleaner than my old ATOMIC_SHORT_PAIR hack. > > Yes, this makes more sense to me. >> >> symbolTable.cpp: >> >> ?SymbolTable::lookup_dynamic() { >> ?... >> ?214?????? Symbol* sym = e->literal(); >> ?215?????? if (sym->equals(name, len) && >> sym->try_increment_refcount()) { >> ?216???????? // something is referencing this symbol now. >> ?217???????? return sym; >> ?218?????? } >> >> >> symbol.cpp: >> >> ?221 void Symbol::increment_refcount() { >> ?222?? if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >> ?223???? if (!try_increment_refcount()) { >> ?224 #ifdef ASSERT >> ?225?????? print(); >> ?226 #endif >> ?227?????? fatal("refcount has gone to zero"); >> ?228???? } >> ?229???? NOT_PRODUCT(Atomic::inc(&_total_count);) >> ?230?? 
} >> ?231 } >> >> ?246 // Atomically increment while checking for zero, zero is bad. >> ?247 bool Symbol::try_increment_refcount() { >> ?248?? uint32_t old_value = _length_and_refcount;? // fetch once >> ?249?? int refc = extract_refcount(old_value); >> ?250 >> ?251?? if (refc == PERM_REFCOUNT) { >> ?252???? return true; >> ?253?? } else if (refc == 0) { >> ?254???? return false; // effectively dead, can't revive >> ?255?? } >> ?256 >> ?257?? uint32_t now; >> ?258?? while ((now = Atomic::cmpxchg(old_value + 1, >> &_length_and_refcount, old_value)) != old_value) { >> ?259???? // failed to increment, check refcount again. >> ?260???? refc = extract_refcount(now); >> ?261???? if (refc == 0) { >> ?262?????? return false; // just died >> ?263???? } else if (refc == PERM_REFCOUNT) { >> ?264?????? return true; // just became permanent >> ?265???? } >> ?266???? old_value = now; // refcount changed, try again >> ?267?? } >> ?268?? return true; >> ?269 } >> >> >> So is it valid for Symbol::try_increment_refcount() to return false? >> SymbolTable::lookup_dynamic() seems to suggest YES, but >> Symbol::increment_refcount() seems to suggest NO. > > True.? If you are looking up a symbol and someone other thread has > decremented the refcount to zero, this symbol should not be returned.? > My test exercises this code even without the concurrent hashtable.? > When the hashtable is concurrent, a zero-ed Symbol could be > deallocated so we don't want to return it. > I think the following should be added as a comment in increment_refcount(). > In the case where you call increment_refcount() not during lookup, it > is assumed that you have a symbol with a non-zero refcount and it > can't go away while you are holding it. >> >> If it's always an invalid condition, I think the fatal() should be >> moved inside try_increment_refcount. >> > > It isn't fatal at lookup.? The lookup must skip a zero-ed entry. 
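
The lookup-versus-holder distinction discussed here can be sketched outside HotSpot. The following is a minimal illustration under stated assumptions, not the patch itself: it uses std::atomic rather than Atomic::cmpxchg, and the free function and constant names are hypothetical stand-ins for the Symbol members quoted above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumption from the thread: the refcount lives in the low 16 bits of a
// packed word and 0xffff is the sticky "permanent" value.
constexpr uint32_t PERM_REFCOUNT = 0xffffu;

constexpr uint32_t extract_refcount(uint32_t v) { return v & 0xffffu; }

// Returns true if the caller now holds a reference (or the symbol is
// permanent). Returns false if the count was zero, meaning the entry may be
// in the middle of removal and a lookup must skip it.
inline bool try_increment_refcount(std::atomic<uint32_t>& word) {
  uint32_t old_value = word.load();
  for (;;) {
    uint32_t refc = extract_refcount(old_value);
    if (refc == PERM_REFCOUNT) {
      return true;   // permanent: never changes again
    }
    if (refc == 0) {
      return false;  // dead: must not be revived
    }
    // On failure old_value is refreshed and both checks run again.
    if (word.compare_exchange_weak(old_value, old_value + 1)) {
      return true;
    }
  }
}
```

A lookup would treat a false result as "no match, keep searching", while a caller that already owns a reference would treat it as a broken invariant, which matches the split between lookup_dynamic() and increment_refcount() described in the quoted replies.
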
>> Otherwise, I think you need to add comments in all 3 places, to say >> when it's possible to get a 0 refcount, and when it's not. And, it >> might be worth expanding on why "zero is bad" :-) > > How about this comment to try_increment_refcount: > > // Increment refcount while checking for zero.? If the Symbol's > refcount becomes zero > // a thread could be concurrently removing the Symbol.? This is used > during SymbolTable > // lookup to avoid reviving a dead Symbol. Sounds good. Thanks - Ioi >> >> My guess is: >> + if you're doing a lookup, you might be seeing Symbols that have >> already been marked for deletion, which is indicated by a 0 refcount. >> You want to skip such Symbols. >> >> + if you're incrementing the refcount, that means you're holding a >> valid Symbol, which means this Symbol should have never been marked >> for deletion. >> >> Is this correct? > > Yes, both true. > > Thanks, > Coleen >> >> Thanks >> - Ioi >> >> >> On 7/17/18 2:08 PM, coleen.phillimore at oracle.com wrote: >>> >>> Gerard, thank you for the code review. >>> >>> On 7/17/18 4:13 PM, Gerard Ziemski wrote: >>>> Thank you Coleen (and Kim)! >>>> >>>> #1 Need copyright year updates: >>>> >>>> src/hotspot/share/oops/symbol.cpp >>>> src/hotspot/share/classfile/symbolTable.cpp >>>> src/hotspot/share/classfile/compactHashtable.inline.hpp >>> >>> Yes, I'll update with my commit. >>>> >>>> #2 What?s the purpose of this code in >>>> src/hotspot/share/oops/symbol.cpp >>>> >>>> ?? 38?? STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1)); >>>> >>>> when we have: >>>> >>>> ? 117?? enum { >>>> ? 118???? // max_symbol_length is constrained by type of _length >>>> ? 119???? max_symbol_length = (1 << 16) -1 >>>> ? 120?? }; >>>> >>>> Wouldn?t that always be true?? Is it to make sure that nobody >>>> changes max_symbol_length, because the implementation needs it to >>>> be that? If so, should we add comment to: >>>> >>>> ? 119???? 
max_symbol_length = (1 << 16) -1 >>>> >>>> with a big warning of some sorts? >>> >>> Yes, it's so that we can store the length of the symbol into 16 bits. >>> >>> How I change the comment above max_symbol_length from: >>> >>> ??? // max_symbol_length is constrained by type of _length >>> >>> to >>> >>> ??? // max_symbol_length must fit into the top 16 bits of >>> _length_and_refcount >>> >>>> >>>> #3 If we have: >>>> >>>> ?? 39?? STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1)); >>>> >>>> then why not >>>> >>>> ? 101 #define PERM_REFCOUNT ((1 << 16) - 1)) // 0xffff >>>> >>>> or >>>> ?? 39?? STATIC_ASSERT(PERM_REFCOUNT == 0xffff; >>>> ? 101 #define PERM_REFCOUNT 0xffff >>>> >>> I can change PERM_REFCOUNT to ((1 << 16)) -1) to be consistent. >>> >>>> #4 We have: >>>> >>>> ? 221 void Symbol::increment_refcount() { >>>> ? 222?? if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >>>> ? 223???? if (!try_increment_refcount()) { >>>> ? 224 #ifdef ASSERT >>>> ? 225?????? print(); >>>> ? 226 #endif >>>> ? 227?????? fatal("refcount has gone to zero"); >>>> >>>> but >>>> >>>> ? 233 void Symbol::decrement_refcount() { >>>> ? 234?? if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >>>> ? 235???? int new_value = Atomic::sub((uint32_t)1, >>>> &_length_and_refcount); >>>> ? 236 #ifdef ASSERT >>>> ? 237???? // Check if we have transitioned to 0xffff >>>> ? 238???? if (extract_refcount(new_value) == PERM_REFCOUNT) { >>>> ? 239?????? print(); >>>> ? 240?????? fatal("refcount underflow"); >>>> ? 241???? } >>>> ? 242 #endif >>>> >>>> Where the line: >>>> >>>> ? 240?????? fatal("refcount underflow?); >>>> >>>> is inside #ifdef ASSERT, but: >>>> >>>> 227?????? fatal("refcount has gone to zero?); >>>> >>>> is outside. Shouldn't ?fatal" be consistent in both? >>>> >>> >>> I was thought that looked strange too.? I'll move the #endif from >>> 226 to after 227. >>> >>> Thank you for reviewing the code! 
>>> Coleen >>> >>>> cheers >>>> >>>> >>>>> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> Summary: Use cmpxchg for non permanent symbol refcounting, and >>>>> pack refcount and length into an int. >>>>> >>>>> This is a precurser change to the concurrent SymbolTable change. >>>>> Zeroed refcounted entries can be deleted at anytime so they cannot >>>>> be allowed to be zero in runtime code. Thanks to Kim for writing >>>>> the packing function and helping me avoid undefined behavior. >>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8207359 >>>>> >>>>> Tested with solaris ptrace helper, mach5 tier1-5 including >>>>> solaris. Added multithreaded gtest which exercises the code. >>>>> >>>>> Thanks, >>>>> Coleen >>> >> > From manc at google.com Thu Jul 19 01:53:19 2018 From: manc at google.com (Man Cao) Date: Wed, 18 Jul 2018 18:53:19 -0700 Subject: Patch to inline os::SpinPause() for X86 on non-Windows OS Message-ID: Hello, The Java platform team at Google has maintained a local patch to inline os::SpinPause() since 2014. We would like to upstream this patch to OpenJDK. Could someone sponsor this patch? It is difficult to demonstrate performance improvement in Java benchmarks. It is more of a code refactoring to better utilize modern GCC. It partly addresses the comment about inlining SpinPause() above its declaration in os.hpp. I found an interesting discussion about PAUSE and a microbenchmark in: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2012-August/004352.html However, the microbenchmark has a large variance in our experiment, making it difficult to tell if there's any benefit from inlining PAUSE. Inlining PAUSE does seem to reduce the variance a bit. 
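
For readers unfamiliar with how such a hint is consumed, here is a rough C++ sketch of a bounded spin-wait built on an inlined pause helper. The names spin_pause and spin_until_set are hypothetical and do not appear in HotSpot; the inline assembly assumes GCC or Clang on x86 and falls back to a no-op elsewhere.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for an inlined SpinPause(): emits the x86 PAUSE hint
// when building with GCC/Clang on x86, and does nothing on other targets.
static inline int spin_pause() {
#if (defined(__i386__) || defined(__x86_64__)) && (defined(__GNUC__) || defined(__clang__))
  __asm__ __volatile__("pause");
  return 1;
#else
  return 0;
#endif
}

// Spins until the flag is observed set or max_spins polls have been made.
// Returns true if the flag was seen set.
inline bool spin_until_set(const std::atomic<bool>& flag, int max_spins) {
  for (int i = 0; i < max_spins; ++i) {
    if (flag.load(std::memory_order_acquire)) {
      return true;
    }
    spin_pause();  // inlined: no call/ret overhead around the hint
  }
  return flag.load(std::memory_order_acquire);
}
```

Because the helper is a static inline function, the compiler can place the hint directly in the loop body, which is the effect the patch aims for with SpinPause().
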
The patch is inlined and attached below:

diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
--- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
+++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
@@ -63,15 +63,6 @@
         popl %eax
         ret
 
-        .globl SYMBOL(SpinPause)
-        ELF_TYPE(SpinPause,@function)
-        .p2align 4,,15
-SYMBOL(SpinPause):
-        rep
-        nop
-        movl $1, %eax
-        ret
-
         # Support for void Copy::conjoint_bytes(void* from,
         #                                       void* to,
         #                                       size_t count)
diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
--- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
+++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
@@ -46,15 +46,6 @@
 
         .text
 
-        .globl SYMBOL(SpinPause)
-        .p2align 4,,15
-        ELF_TYPE(SpinPause,@function)
-SYMBOL(SpinPause):
-        rep
-        nop
-        movq $1, %rax
-        ret
-
         # Support for void Copy::arrayof_conjoint_bytes(void* from,
         #                                               void* to,
         #                                               size_t count)
diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
--- a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
+++ b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
@@ -42,15 +42,6 @@
 
         .text
 
-        .globl SpinPause
-        .type SpinPause,@function
-        .p2align 4,,15
-SpinPause:
-        rep
-        nop
-        movl $1, %eax
-        ret
-
         # Support for void Copy::conjoint_bytes(void* from,
         #                                       void* to,
         #                                       size_t count)
diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
--- a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
+++ b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
@@ -38,15 +38,6 @@
 
         .text
 
-        .globl SpinPause
-        .align 16
-        .type SpinPause,@function
-SpinPause:
-        rep
-        nop
-        movq $1, %rax
-        ret
-
         # Support for void Copy::arrayof_conjoint_bytes(void* from,
         #                                               void* to,
         #                                               size_t count)
diff --git a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
--- a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
+++ b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
@@ -51,15 +51,6 @@
         movq %fs:0x0,%rax
         ret
 
-        .globl SpinPause
-        .align 16
-SpinPause:
-        rep
-        nop
-        movq $1, %rax
-        ret
-
-
         / Support for void Copy::arrayof_conjoint_bytes(void* from,
         /                                               void* to,
         /                                               size_t count)
diff --git a/src/hotspot/share/runtime/os.hpp b/src/hotspot/share/runtime/os.hpp
--- a/src/hotspot/share/runtime/os.hpp
+++ b/src/hotspot/share/runtime/os.hpp
@@ -1031,6 +1031,13 @@
 // of the global SpinPause() with C linkage.
 // It'd also be eligible for inlining on many platforms.
 
+#if defined(X86) && !defined(_WINDOWS)
+extern "C" int inline SpinPause() {
+  __asm__ __volatile__ ("pause");
+  return 1;
+}
+#else
 extern "C" int SpinPause();
+#endif
 
 #endif // SHARE_VM_RUNTIME_OS_HPP

-Man
-------------- next part --------------
A non-text attachment was scrubbed...
Name: inline_spinpause.patch
Type: text/x-patch
Size: 3778 bytes
Desc: not available
URL:

From coleen.phillimore at oracle.com Thu Jul 19 02:48:25 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 18 Jul 2018 22:48:25 -0400
Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero
In-Reply-To:
References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com>
Message-ID: <410d56a6-3602-485e-f17f-90a143ec2cd1@oracle.com>

On 7/18/18 6:14 PM, Kim Barrett wrote:
>> On Jul 17, 2018, at 11:51 AM, coleen.phillimore at oracle.com wrote:
>>
>> Summary: Use cmpxchg for non permanent symbol refcounting, and pack refcount and length into an int.
>>
>> This is a precurser change to the concurrent SymbolTable change. Zeroed refcounted entries can be deleted at anytime so they cannot be allowed to be zero in runtime code. Thanks to Kim for writing the packing function and helping me avoid undefined behavior.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8207359
>>
>> Tested with solaris ptrace helper, mach5 tier1-5 including solaris. Added multithreaded gtest which exercises the code.
>>
>> Thanks,
>> Coleen
> ------------------------------------------------------------------------------
> src/hotspot/os/solaris/dtrace/jhelper.d
>  467 OFFSET_Symbol_length_and_refcount);
>  474 OFFSET_Symbol_length_and_refcount);
>  483 OFFSET_Symbol_length_and_refcount);
>
> I think these only work on a big-endian platform, and are making
> further assumptions about the Symbol implementation. I have no idea
> what, if anything, to do about that here. At the very least, some
> comments seem warranted.
>
> Similarly in java.base/solaris/native/libjvm_db/libjvm_db.c, though
> some comments were provided there.

Yes, I can add a comment, like:

  /* Because sparc is big endian, the top half length is at the correct offset. */

I don't know the changes in dtrace language.

>
> ------------------------------------------------------------------------------
> src/hotspot/share/oops/symbol.cpp
>  233 void Symbol::decrement_refcount() {
>  234   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>  235     int new_value = Atomic::sub((uint32_t)1, &_length_and_refcount);
>
> This isn't safe, and can lose counts. Consider refcount is currently
> PERM-1. Decrement checks and finds it's not PERM. Then three other
> threads add references, one setting to PERM and two others leaving it
> at PERM. Now decrement does the sub back to PERM-1, discarding two
> valid references.

I've fixed decrement_refcount() to use a CAS loop also.  Thank you for
finding and discussing this problem.  I'll post a new webrev after
testing.

>
> ------------------------------------------------------------------------------
> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Symbol.java
>   78   public long getLength() {
>   79     int i = (int)length.getValue(this.addr);
>   80     return (i >> 16);
>   81   }
>
> Consider a symbol whose length is in the range [2^15, 2^16). The cast
> result is implementation defined. The right shift of a negative value
> is implementation defined, and might (or might not) return a negative value.

There is no unsigned int in Java, so I added back the & 0xffff which
seems to work, maybe with less undefined behavior (?)

  public long getLength() {
    long i = length.getValue(this.addr);
    return (i >> 16) & 0xffff;
  }

>
> ------------------------------------------------------------------------------
> test/hotspot/gtest/classfile/test_symbolTable.cpp
>   79 // Test overflowing refcount making symbol permanent
>   80 Symbol* bigsym = SymbolTable::new_symbol("bigsym", CATCH);
>
> Add another check after that a PERM_REFCOUNT value is sticky against
> decrement.

Ok.

>
> ------------------------------------------------------------------------------
> test/hotspot/gtest/threadHelper.inline.hpp
>   24 #ifndef GTEST_THREAD_HELPER_INLINE_HPP
>
> s/THREAD_HELPER/THREADHELPER/ to match the usual Hotspot naming.

Fixed.

> ------------------------------------------------------------------------------
>

Thanks for the thorough review.
Coleen

From coleen.phillimore at oracle.com Thu Jul 19 02:50:17 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 18 Jul 2018 22:50:17 -0400
Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero
In-Reply-To: <30cd0f77-3d62-2867-2a37-f68d6a1a401f@oracle.com>
References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com> <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com> <30cd0f77-3d62-2867-2a37-f68d6a1a401f@oracle.com>
Message-ID: <01234404-22a6-f02b-20b3-55e059deabab@oracle.com>

On 7/18/18 6:35 PM, Ioi Lam wrote:
>
>
> On 7/18/18 2:45 PM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 7/18/18 5:14 PM, Ioi Lam wrote:
>>> Hi Coleen,
>>>
>>> The changes look good! The new operations on _length_and_refcount
>>> are much cleaner than my old ATOMIC_SHORT_PAIR hack.
>>
>> Yes, this makes more sense to me.
>>>
>>> symbolTable.cpp:
>>>
>>>  SymbolTable::lookup_dynamic() {
>>>  ...
>>> 214 Symbol* sym = e->literal(); >>> 215 if (sym->equals(name, len) && >>> sym->try_increment_refcount()) { >>> 216 // something is referencing this symbol now. >>> 217 return sym; >>> 218 } >>> >>> >>> symbol.cpp: >>> >>> 221 void Symbol::increment_refcount() { >>> 222 if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >>> 223 if (!try_increment_refcount()) { >>> 224 #ifdef ASSERT >>> 225 print(); >>> 226 #endif >>> 227 fatal("refcount has gone to zero"); >>> 228 } >>> 229 NOT_PRODUCT(Atomic::inc(&_total_count);) >>> 230 } >>> 231 } >>> >>> 246 // Atomically increment while checking for zero, zero is bad. >>> 247 bool Symbol::try_increment_refcount() { >>> 248 uint32_t old_value = _length_and_refcount; // fetch once >>> 249 int refc = extract_refcount(old_value); >>> 250 >>> 251 if (refc == PERM_REFCOUNT) { >>> 252 return true; >>> 253 } else if (refc == 0) { >>> 254 return false; // effectively dead, can't revive >>> 255 } >>> 256 >>> 257 uint32_t now; >>> 258 while ((now = Atomic::cmpxchg(old_value + 1, >>> &_length_and_refcount, old_value)) != old_value) { >>> 259 // failed to increment, check refcount again. >>> 260 refc = extract_refcount(now); >>> 261 if (refc == 0) { >>> 262 return false; // just died >>> 263 } else if (refc == PERM_REFCOUNT) { >>> 264 return true; // just became permanent >>> 265 } >>> 266 old_value = now; // refcount changed, try again >>> 267 } >>> 268 return true; >>> 269 } >>> >>> >>> So is it valid for Symbol::try_increment_refcount() to return false? >>> SymbolTable::lookup_dynamic() seems to suggest YES, but >>> Symbol::increment_refcount() seems to suggest NO. >> >> True. If you are looking up a symbol and some other thread has >> decremented the refcount to zero, this symbol should not be >> returned. 
My test exercises this code even without the concurrent >> hashtable. When the hashtable is concurrent, a zero-ed Symbol could >> be deallocated so we don't want to return it. >> > I think the following should be added as a comment in > increment_refcount(). >> In the case where you call increment_refcount() not during lookup, it >> is assumed that you have a symbol with a non-zero refcount and it >> can't go away while you are holding it. Ok, added. > >>> >>> If it's always an invalid condition, I think the fatal() should be >>> moved inside try_increment_refcount. >>> >> >> It isn't fatal at lookup. The lookup must skip a zero-ed entry. >>> Otherwise, I think you need to add comments in all 3 places, to say >>> when it's possible to get a 0 refcount, and when it's not. And, it >>> might be worth expanding on why "zero is bad" :-) >> >> How about this comment to try_increment_refcount: >> >> // Increment refcount while checking for zero. If the Symbol's >> refcount becomes zero >> // a thread could be concurrently removing the Symbol. This is used >> during SymbolTable >> // lookup to avoid reviving a dead Symbol. > Sounds good. Thanks, Ioi. Coleen > > Thanks > - Ioi > >>> >>> My guess is: >>> + if you're doing a lookup, you might be seeing Symbols that have >>> already been marked for deletion, which is indicated by a 0 >>> refcount. You want to skip such Symbols. >>> >>> + if you're incrementing the refcount, that means you're holding a >>> valid Symbol, which means this Symbol should have never been marked >>> for deletion. >>> >>> Is this correct? >> >> Yes, both true. >> >> Thanks, >> Coleen >>> >>> Thanks >>> - Ioi >>> >>> >>> On 7/17/18 2:08 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> Gerard, thank you for the code review. >>>> >>>> On 7/17/18 4:13 PM, Gerard Ziemski wrote: >>>>> Thank you Coleen (and Kim)! 
>>>>> >>>>> #1 Need copyright year updates: >>>>> >>>>> src/hotspot/share/oops/symbol.cpp >>>>> src/hotspot/share/classfile/symbolTable.cpp >>>>> src/hotspot/share/classfile/compactHashtable.inline.hpp >>>> >>>> Yes, I'll update with my commit. >>>>> >>>>> #2 What's the purpose of this code in >>>>> src/hotspot/share/oops/symbol.cpp >>>>> >>>>> 38 STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1)); >>>>> >>>>> when we have: >>>>> >>>>> 117 enum { >>>>> 118 // max_symbol_length is constrained by type of _length >>>>> 119 max_symbol_length = (1 << 16) -1 >>>>> 120 }; >>>>> >>>>> Wouldn't that always be true? Is it to make sure that nobody >>>>> changes max_symbol_length, because the implementation needs it to >>>>> be that? If so, should we add comment to: >>>>> >>>>> 119 max_symbol_length = (1 << 16) -1 >>>>> >>>>> with a big warning of some sorts? >>>> >>>> Yes, it's so that we can store the length of the symbol into 16 bits. >>>> >>>> How about I change the comment above max_symbol_length from: >>>> >>>> // max_symbol_length is constrained by type of _length >>>> >>>> to >>>> >>>> // max_symbol_length must fit into the top 16 bits of >>>> _length_and_refcount >>>> >>>>> >>>>> #3 If we have: >>>>> >>>>> 39 STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1)); >>>>> >>>>> then why not >>>>> >>>>> 101 #define PERM_REFCOUNT ((1 << 16) - 1) // 0xffff >>>>> >>>>> or >>>>> 39 STATIC_ASSERT(PERM_REFCOUNT == 0xffff); >>>>> 101 #define PERM_REFCOUNT 0xffff >>>>> >>>> I can change PERM_REFCOUNT to ((1 << 16) - 1) to be consistent. >>>> >>>>> #4 We have: >>>>> >>>>> 221 void Symbol::increment_refcount() { >>>>> 222 if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >>>>> 223 if (!try_increment_refcount()) { >>>>> 224 #ifdef ASSERT >>>>> 225 print(); >>>>> 226 #endif >>>>> 227 fatal("refcount has gone to zero"); >>>>> >>>>> but >>>>> >>>>> 233 void Symbol::decrement_refcount() { >>>>> 
234 if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >>>>> 235 int new_value = Atomic::sub((uint32_t)1, >>>>> &_length_and_refcount); >>>>> 236 #ifdef ASSERT >>>>> 237 // Check if we have transitioned to 0xffff >>>>> 238 if (extract_refcount(new_value) == PERM_REFCOUNT) { >>>>> 239 print(); >>>>> 240 fatal("refcount underflow"); >>>>> 241 } >>>>> 242 #endif >>>>> >>>>> Where the line: >>>>> >>>>> 240 fatal("refcount underflow"); >>>>> >>>>> is inside #ifdef ASSERT, but: >>>>> >>>>> 227 fatal("refcount has gone to zero"); >>>>> >>>>> is outside. Shouldn't "fatal" be consistent in both? >>>>> >>>> >>>> I thought that looked strange too. I'll move the #endif from >>>> 226 to after 227. >>>> >>>> Thank you for reviewing the code! >>>> Coleen >>>> >>>>> cheers >>>>> >>>>> >>>>>> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> Summary: Use cmpxchg for non-permanent symbol refcounting, and >>>>>> pack refcount and length into an int. >>>>>> >>>>>> This is a precursor change to the concurrent SymbolTable change. >>>>>> Zeroed refcounted entries can be deleted at any time so they >>>>>> cannot be allowed to be zero in runtime code. Thanks to Kim for >>>>>> writing the packing function and helping me avoid undefined >>>>>> behavior. >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8207359 >>>>>> >>>>>> Tested with solaris ptrace helper, mach5 tier1-5 including >>>>>> solaris. Added multithreaded gtest which exercises the code. 
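For readers following the summary above, here is a minimal sketch of the packed length/refcount word and a compare-and-swap retry loop. It is not the HotSpot code: it uses std::atomic in place of Atomic::cmpxchg, and the names pack, try_increment, decrement and kPermRefcount are hypothetical. It assumes the layout described in the thread (length in the top 16 bits, refcount in the low 16 bits, 0xffff meaning permanent). The decrement uses the same loop so that it cannot undo a concurrent transition to the permanent value, which is the concern raised in the review.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the packed field discussed in the thread:
// length in the top 16 bits, refcount in the low 16 bits.
constexpr uint32_t kPermRefcount = 0xffff;  // assumed sticky "permanent" value

inline uint32_t pack(uint32_t length, uint32_t refcount) {
  assert(length <= 0xffff && refcount <= 0xffff);
  return (length << 16) | refcount;
}
inline uint32_t extract_length(uint32_t v)   { return v >> 16; }
inline uint32_t extract_refcount(uint32_t v) { return v & 0xffff; }

// Returns false if the count is already zero (the entry may be about to be
// removed and must not be revived). Returns true if the count was
// incremented or is permanent.
inline bool try_increment(std::atomic<uint32_t>& word) {
  uint32_t old_value = word.load();
  for (;;) {
    uint32_t refc = extract_refcount(old_value);
    if (refc == kPermRefcount) return true;
    if (refc == 0) return false;
    // On failure compare_exchange_weak reloads old_value for the retry.
    if (word.compare_exchange_weak(old_value, old_value + 1)) return true;
  }
}

// Decrement with the same pattern. This sketch silently ignores a decrement
// at zero; real code would more likely treat that as an underflow error.
inline void decrement(std::atomic<uint32_t>& word) {
  uint32_t old_value = word.load();
  for (;;) {
    uint32_t refc = extract_refcount(old_value);
    if (refc == kPermRefcount || refc == 0) return;
    if (word.compare_exchange_weak(old_value, old_value - 1)) return;
  }
}
```

Because the check and the update happen in one atomic step, a count that another thread has just pushed to 0xffff stays there, and the length bits are never disturbed by count changes.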
>>>>>> >>>>>> Thanks, >>>>>> Coleen >>>> >>> >> > From kim.barrett at oracle.com Thu Jul 19 03:59:07 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 18 Jul 2018 23:59:07 -0400 Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero In-Reply-To: <410d56a6-3602-485e-f17f-90a143ec2cd1@oracle.com> References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com> <410d56a6-3602-485e-f17f-90a143ec2cd1@oracle.com> Message-ID: <107D1071-7285-47CD-B233-9C7D8D54B6B9@oracle.com> > On Jul 18, 2018, at 10:48 PM, coleen.phillimore at oracle.com wrote: > > On 7/18/18 6:14 PM, Kim Barrett wrote: >> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Symbol.java >> 78 public long getLength() { >> 79 int i = (int)length.getValue(this.addr); >> 80 return (i >> 16); >> 81 } >> >> Consider a symbol whose length is in the range [2^15, 2^16). The cast >> result is implementation defined. The right shift of a negative value >> is implementation defined, and might (or might not) return a negative value. > > There is no unsigned int in Java, so I added back the & 0xffff which seems to work, maybe with less undefined behavior (?) > > public long getLength() { > long i = length.getValue(this.addr); > return (i >> 16) & 0xffff; > } Oh, right, this is Java, so it's all portably defined, and yes, the mask is needed and sufficient. From felix.yang at huawei.com Thu Jul 19 07:39:14 2018 From: felix.yang at huawei.com (Yangfei (Felix)) Date: Thu, 19 Jul 2018 07:39:14 +0000 Subject: RFR: 8207838: AArch64: fix the order in which float registers are restored in restore_args Message-ID: Hi, JIRA: https://bugs.openjdk.java.net/browse/JDK-8207838 JIT code snippet of the native wrapper: 178 0x0000007f7857f438: str d0, [sp,#-16]! <==== save_args 179 0x0000007f7857f43c: str d1, [sp,#-16]! 180 0x0000007f7857f440: str d2, [sp,#-16]! 181 0x0000007f7857f444: str d3, [sp,#-16]! 182 0x0000007f7857f448: stp x1, xzr, [sp,#-16]! 
183 0x0000007f7857f44c: mov x0, x19 184 0x0000007f7857f450: mov x1, x13 185 0x0000007f7857f454: mov x2, x28 186 0x0000007f7857f458: stp x8, x12, [sp,#-16]! 187 0x0000007f7857f45c: mov x8, #0xc560 // #50528 188 0x0000007f7857f460: movk x8, #0x8dc8, lsl #16 189 0x0000007f7857f464: movk x8, #0x7f, lsl #32 190 0x0000007f7857f468: blr x8 191 0x0000007f7857f46c: ldp x8, x12, [sp],#16 192 0x0000007f7857f470: isb 193 0x0000007f7857f474: ldp x1, xzr, [sp],#16 194 0x0000007f7857f478: ldr d0, [sp],#16 <==== restore_args 195 0x0000007f7857f47c: ldr d1, [sp],#16 196 0x0000007f7857f480: ldr d2, [sp],#16 197 0x0000007f7857f484: ldr d3, [sp],#16 Here the order in which float registers are restored in restore_args does not match save_args in sharedRuntime_aarch64.cpp. Patch contributed by guoge1 at huawei.com: diff -r a25c48c0a1ab src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp --- a/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp Mon Jul 16 15:09:19 2018 -0700 +++ b/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp Thu Jul 19 15:14:08 2018 +0800 @@ -1107,7 +1107,7 @@ } } __ pop(x, sp); - for ( int i = first_arg ; i < arg_count ; i++ ) { + for ( int i = arg_count - 1 ; i >= first_arg ; i-- ) { if (args[i].first()->is_Register()) { ; } else if (args[i].first()->is_FloatRegister()) { Tested with jtreg hotspot. Will post a valid webrev later. Is it OK for jdk/jdk11? 
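The ordering problem in the report above can be modelled outside the assembler. The following is only an illustrative sketch with hypothetical names (Frame, save_args, restore_args, restore_args_ascending) and a vector standing in for the machine stack; it is not the sharedRuntime_aarch64.cpp code. It shows why values pushed in ascending register order have to be popped in descending order, which is what the loop change in the patch does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: "registers" are array slots, the stack is a vector.
struct Frame {
  std::vector<double> stack;
};

// Push registers first..count-1 in ascending order.
inline void save_args(Frame& f, const std::vector<double>& regs,
                      int first, int count) {
  for (int i = first; i < count; i++) f.stack.push_back(regs[i]);
}

// Pop in descending order so each value returns to the slot it came from.
inline void restore_args(Frame& f, std::vector<double>& regs,
                         int first, int count) {
  assert(f.stack.size() >= static_cast<std::size_t>(count - first));
  for (int i = count - 1; i >= first; i--) {
    regs[i] = f.stack.back();
    f.stack.pop_back();
  }
}

// The ascending pop order: the last value pushed lands in the first slot,
// so the register contents come back reversed.
inline void restore_args_ascending(Frame& f, std::vector<double>& regs,
                                   int first, int count) {
  assert(f.stack.size() >= static_cast<std::size_t>(count - first));
  for (int i = first; i < count; i++) {
    regs[i] = f.stack.back();
    f.stack.pop_back();
  }
}
```

With four saved values, the ascending restore hands the fourth value to the first slot and the first value to the fourth, which corresponds to the d0..d3 mismatch visible in the listing.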
Thanks, Felix From rahul.v.raghavan at oracle.com Thu Jul 19 07:48:26 2018 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Thu, 19 Jul 2018 13:18:26 +0530 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> Message-ID: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled (just adding + hotspot-compiler-dev also) On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote: Subject Was: Re: RFR (S): C1 still does eden allocations when TLAB is enabled + serviceability-dev Hi all, Could anyone else give me a review of this webrev and check/test the various architecture changes? http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ Thanks for all your help! Jc > On Mon, Jul 16, 2018 at 2:58 PM JC Beyler wrote: > >> Hi all, >> >> Here is a webrev that does all the architectures in the same way: >> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >> >> Could anyone review the other architectures and test? >> - arm, sparc & aarch64 are also modified now to follow the same "if no >> tlab, then consider eden space allocation" logic. >> >> Thanks for your help! >> Jc >> >> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler wrote: >> >>> Hi Kim, >>> >>> I opened this bug >>> https://bugs.openjdk.java.net/browse/JDK-8190862 >>> >>> and now I've done an update: >>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ >>> >>> I basically have done your nits but also removed the try_eden (it was >>> used to bind a label but was not used). I updated the comments to use the >>> one you preferred. >>> >>> I still have to do the other architectures though but at least we seem to >>> have a consensus on this architecture, correct? 
>>> >>> Thanks for the review, >>> Jc >>> >>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett wrote: >>> >>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler wrote: >>>>> >>>>> Yes, you are right, I did those changes due to: >>>>> https://bugs.openjdk.java.net/browse/JDK-8194084 >>>>> >>>>> If Robbin agrees to this change, and if no one sees an issue, I'll go ahead >>>>> and propagate the change across architectures. >>>>> >>>>> Thanks for the review, I'll wait for Robbin (or anyone else's comment and >>>>> review) :) >>>>> Jc >>>>> >>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose wrote: >>>>> >>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler wrote: >>>>>> >>>>>> >>>>>> I'm not sure if we had left this case intentionally or not but, if we want >>>>>> it all to be consistent, we should perhaps fix it. >>>>>> >>>>>> >>>>>> Well, you put in that logic last February, so unless somebody speaks up >>>>>> quickly, I support your adjusting it to be the way you want it. >>>>>> >>>>>> Doing "hg grep -u supports_inline_contig_alloc -I src/hotspot/share" >>>>>> suggests that the GC group is most active in touching this feature. >>>>>> If Robbin is OK with it, there's your reviewer. >>>>>> >>>>>> FWIW, you can use me as a reviewer, but I'd get one other person >>>>>> working on the GC to OK it. >>>>>> >>>>>> - John >>>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> Thanks, >>>>> Jc >>>> >>>> Robbin is on vacation; you might not hear from him for a while. >>>> >>>> I'm assuming you'll open a new bug for this? >>>> >>>> Except for a few minor nits (below), this looks okay to me. >>>> >>>> The comment at line 1052 needs updating. >>>> >>>> pre-existing: The retry_tlab label declared on line 1054 is unused. >>>> >>>> pre-existing: The try_eden label declared on line 1054 is bound at >>>> line 1058, but unreferenced. >>>> >>>> I like the wording of the comment at 1139 better than the wording at >>>> 1016. 
>>>> >>>> >>> >>> -- >>> >>> Thanks, >>> Jc >>> >> >> >> -- >> >> Thanks, >> Jc >> > > From aph at redhat.com Thu Jul 19 08:23:51 2018 From: aph at redhat.com (Andrew Haley) Date: Thu, 19 Jul 2018 09:23:51 +0100 Subject: RFR: 8207838: AArch64: fix the order in which float registers are restored in restore_args In-Reply-To: References: Message-ID: On 07/19/2018 08:39 AM, Yangfei (Felix) wrote: > Here the order in which float registers are restored in restore_args does not match save_args in sharedRuntime_aarch64.cpp. > Patch contributed by guoge1 at huawei.com: OK, thanks. I'm investigating. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From gunter.haug at sap.com Thu Jul 19 10:53:00 2018 From: gunter.haug at sap.com (Haug, Gunter) Date: Thu, 19 Jul 2018 10:53:00 +0000 Subject: PPC64: jfr profiling doesn't work (PPC64 only) Message-ID: <51BCC98D-788C-43BD-A739-A304DB3EA847@sap.com> Hi all, can I please have reviews and a sponsor for the following fix: https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8207392?filter=allopenissues http://cr.openjdk.java.net/~ghaug/webrevs/8207392/ JFR profiling on linux PPC64 has not been implemented correctly so far, the VM crashes when it is turned on. Therefore hotspot/jtreg/runtime/appcds/TestWithProfiler.java fails. With this fix the test succeeds. I've analyzed a couple of benchmarks with JMC and results look plausible when compared to linux x86. 
Thanks and best regards, Gunter From coleen.phillimore at oracle.com Thu Jul 19 12:34:56 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 19 Jul 2018 08:34:56 -0400 Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero In-Reply-To: <01234404-22a6-f02b-20b3-55e059deabab@oracle.com> References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com> <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com> <30cd0f77-3d62-2867-2a37-f68d6a1a401f@oracle.com> <01234404-22a6-f02b-20b3-55e059deabab@oracle.com> Message-ID: <5a8a9837-48d5-683a-271b-ba6ff27369df@oracle.com> Please review the revision to this change. Summary: * made decrement_refcount() use CAS loop. * fixed duplicated logic in try_increment_refcount() thanks to Kim * added gtest case for decrement_refcount. * fixed SA code. * added a bunch of comments open webrev at http://cr.openjdk.java.net/~coleenp/8207359.02/webrev Retested with hs-tier1-3. Thanks, Coleen From vaibhav.x.choudhary at oracle.com Thu Jul 19 15:54:05 2018 From: vaibhav.x.choudhary at oracle.com (Vaibhav Choudhary) Date: Thu, 19 Jul 2018 21:24:05 +0530 Subject: RFR:8189762: [TESTBUG] Create tests for JDK-8146115 container awareness and resource configuration In-Reply-To: <6D1F8A5A-769C-499C-B647-26DE01D072EA@oracle.com> References: <6D1F8A5A-769C-499C-B647-26DE01D072EA@oracle.com> Message-ID: <6EFB0C80-4104-4C79-B121-D037C1047220@oracle.com> ping ! > On 17-Jul-2018, at 8:01 PM, Vaibhav Choudhary wrote: > > Hi, > > Please review the following backport test enhancement for JDK8u written for container awareness. 
> Webrev : http://cr.openjdk.java.net/~rpatil/8189762/webrev.00/ > > Bug https://bugs.openjdk.java.net/browse/JDK-8189762 > [TESTBUG] Create tests for JDK-8146115 container awareness and resource configuration > > Its a backport from JDK10. > > JDK10 changeset: http://hg.openjdk.java.net/jdk/jdk/rev/d6d00f785f39 > JDK10 review thread : http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-November/025086.html > > Description: Tests are very similar to JDK10, but differs in logging mechanism. -XX options like UseContainerSupport, PrintContainerInfo has been used in place of -Xlog. Few changes has been done in the Util files to make the test compatible. > > Testing: Testing has been done on Ubuntu with and without Docker environment. > > Thanks, > Vaibhav C From kim.barrett at oracle.com Thu Jul 19 16:47:59 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 19 Jul 2018 12:47:59 -0400 Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero In-Reply-To: <5a8a9837-48d5-683a-271b-ba6ff27369df@oracle.com> References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com> <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com> <30cd0f77-3d62-2867-2a37-f68d6a1a401f@oracle.com> <01234404-22a6-f02b-20b3-55e059deabab@oracle.com> <5a8a9837-48d5-683a-271b-ba6ff27369df@oracle.com> Message-ID: > On Jul 19, 2018, at 8:34 AM, coleen.phillimore at oracle.com wrote: > > Please review the revision to this change. Summary: > > * made decrement_refcount() use CAS loop. > * fixed duplicated logic in try_increment_refcount() thanks to Kim > * added gtest case for decrement_refcount. > * fixed SA code. > * added a bunch of comments > > open webrev at http://cr.openjdk.java.net/~coleenp/8207359.02/webrev > > Retested with hs-tier1-3. > Thanks, > Coleen Looks good. A few minor nits for which I don't need another webrev. 
------------------------------------------------------------------------------ test/hotspot/gtest/classfile/test_symbolTable.cpp 87 for (int i = 0; i < PERM_REFCOUNT + 100; i++) { 88 bigsym->decrement_refcount(); 89 } 90 ASSERT_EQ(bigsym->refcount(), PERM_REFCOUNT) << "should be sticky"; I think one decrement is enough; no need for the loop. ------------------------------------------------------------------------------ src/hotspot/os/solaris/dtrace/jhelper.d The comment added for nameSymbolLength also applies to the other symbol lengths. ------------------------------------------------------------------------------ src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Symbol.java 78 public long getLength() { 79 long i = length.getValue(this.addr); 80 return (i >> 16) & 0xffff; I forgot to mention this before, but I think the length field should be renamed to lengthAndRefcount. ------------------------------------------------------------------------------ From coleen.phillimore at oracle.com Thu Jul 19 17:11:04 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 19 Jul 2018 13:11:04 -0400 Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero In-Reply-To: References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com> <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com> <30cd0f77-3d62-2867-2a37-f68d6a1a401f@oracle.com> <01234404-22a6-f02b-20b3-55e059deabab@oracle.com> <5a8a9837-48d5-683a-271b-ba6ff27369df@oracle.com> Message-ID: On 7/19/18 12:47 PM, Kim Barrett wrote: >> On Jul 19, 2018, at 8:34 AM, coleen.phillimore at oracle.com wrote: >> >> Please review the revision to this change. Summary: >> >> * made decrement_refcount() use CAS loop. >> * fixed duplicated logic in try_increment_refcount() thanks to Kim >> * added gtest case for decrement_refcount. >> * fixed SA code. >> * added a bunch of comments >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.02/webrev >> >> Retested with hs-tier1-3. 
>> Thanks, >> Coleen > Looks good. > > A few minor nits for which I don't need another webrev. > > ------------------------------------------------------------------------------ > test/hotspot/gtest/classfile/test_symbolTable.cpp > 87 for (int i = 0; i < PERM_REFCOUNT + 100; i++) { > 88 bigsym->decrement_refcount(); > 89 } > 90 ASSERT_EQ(bigsym->refcount(), PERM_REFCOUNT) << "should be sticky"; > > I think one decrement is enough; no need for the loop. > > ------------------------------------------------------------------------------ > src/hotspot/os/solaris/dtrace/jhelper.d > > The comment added for nameSymbolLength also applies to the other > symbol lengths. > > ------------------------------------------------------------------------------ > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Symbol.java > 78 public long getLength() { > 79 long i = length.getValue(this.addr); > 80 return (i >> 16) & 0xffff; > > I forgot to mention this before, but I think the length field should > be renamed to lengthAndRefcount. > > ------------------------------------------------------------------------------ > Okay, I'll fix all these. Thank you for the code review! Coleen From bob.vandette at oracle.com Thu Jul 19 18:45:25 2018 From: bob.vandette at oracle.com (Bob Vandette) Date: Thu, 19 Jul 2018 14:45:25 -0400 Subject: RFR:8189762: [TESTBUG] Create tests for JDK-8146115 container awareness and resource configuration In-Reply-To: <6D1F8A5A-769C-499C-B647-26DE01D072EA@oracle.com> References: <6D1F8A5A-769C-499C-B647-26DE01D072EA@oracle.com> Message-ID: <6CF7A513-90FA-4BC9-8652-A889A687BF50@oracle.com> Could you try to ask Misha to review these changes (mikhailo.seledtsov at oracle.com) since he wrote these tests? It would be helpful to have a webrev comparing the JDK11 test sources against yours. In JDK 10, we are using @requires docker.support. Is this not possible in JDK8? There have been a few fixes to the docker tests in JDK 11. 
You should make sure to get the latest versions of these tests. We have also re-worked some of these tests during the addition of the Container Metrics API and associated tests in JDK 11 to move out common utility classes. I try to add the "docker" label to any tests and improvements related to cgroups or docker. Here's a query for JDK11 AND Label == docker. https://bugs.openjdk.java.net/issues/?filter=33939&jql=project%20%3D%20JDK%20AND%20fixVersion%20%3D%20%2211%22%20AND%20labels%20%3D%20docker%20ORDER%20BY%20priority%20DESC Bob. > On Jul 17, 2018, at 10:31 AM, Vaibhav Choudhary wrote: > > Hi, > > Please review the following backport test enhancement for JDK8u written for container awareness. > Webrev : http://cr.openjdk.java.net/~rpatil/8189762/webrev.00/ > > Bug https://bugs.openjdk.java.net/browse/JDK-8189762 > [TESTBUG] Create tests for JDK-8146115 container awareness and resource configuration > > Its a backport from JDK10. > > JDK10 changeset: http://hg.openjdk.java.net/jdk/jdk/rev/d6d00f785f39 > JDK10 review thread : http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-November/025086.html > > Description: Tests are very similar to JDK10, but differs in logging mechanism. -XX options like UseContainerSupport, PrintContainerInfo has been used in place of -Xlog. Few changes has been done in the Util files to make the test compatible. > > Testing: Testing has been done on Ubuntu with and without Docker environment. 
>
> Thanks,
> Vaibhav C

From ioi.lam at oracle.com Thu Jul 19 19:07:31 2018
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 19 Jul 2018 12:07:31 -0700
Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero
In-Reply-To: <5a8a9837-48d5-683a-271b-ba6ff27369df@oracle.com>
References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com>
 <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com>
 <30cd0f77-3d62-2867-2a37-f68d6a1a401f@oracle.com>
 <01234404-22a6-f02b-20b3-55e059deabab@oracle.com>
 <5a8a9837-48d5-683a-271b-ba6ff27369df@oracle.com>
Message-ID: <3c50d9f1-bb92-1fbb-db44-e5d154a59d5d@oracle.com>

Looks good!

Thanks
- Ioi

On 7/19/18 5:34 AM, coleen.phillimore at oracle.com wrote:
> Please review the revision to this change.  Summary:
>
> * made decrement_refcount() use CAS loop.
> * fixed duplicated logic in try_increment_refcount() thanks to Kim
> * added gtest case for decrement_refcount.
> * fixed SA code.
> * added a bunch of comments
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.02/webrev
>
> Retested with hs-tier1-3.
> Thanks,
> Coleen
>
> On 7/18/18 10:50 PM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 7/18/18 6:35 PM, Ioi Lam wrote:
>>>
>>>
>>> On 7/18/18 2:45 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>>
>>>> On 7/18/18 5:14 PM, Ioi Lam wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> The changes look good! The new operations on _length_and_refcount
>>>>> are much cleaner than my old ATOMIC_SHORT_PAIR hack.
>>>>
>>>> Yes, this makes more sense to me.
>>>>>
>>>>> symbolTable.cpp:
>>>>>
>>>>>  SymbolTable::lookup_dynamic() {
>>>>>  ...
>>>>>  214       Symbol* sym = e->literal();
>>>>>  215       if (sym->equals(name, len) &&
>>>>> sym->try_increment_refcount()) {
>>>>>  216         // something is referencing this symbol now.
>>>>>  217         return sym;
>>>>>  218       }
>>>>>
>>>>>
>>>>> symbol.cpp:
>>>>>
>>>>>  221 void Symbol::increment_refcount() {
>>>>>  222   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>>>>  223     if (!try_increment_refcount()) {
>>>>>  224 #ifdef ASSERT
>>>>>  225       print();
>>>>>  226 #endif
>>>>>  227       fatal("refcount has gone to zero");
>>>>>  228     }
>>>>>  229     NOT_PRODUCT(Atomic::inc(&_total_count);)
>>>>>  230   }
>>>>>  231 }
>>>>>
>>>>>  246 // Atomically increment while checking for zero, zero is bad.
>>>>>  247 bool Symbol::try_increment_refcount() {
>>>>>  248   uint32_t old_value = _length_and_refcount;  // fetch once
>>>>>  249   int refc = extract_refcount(old_value);
>>>>>  250
>>>>>  251   if (refc == PERM_REFCOUNT) {
>>>>>  252     return true;
>>>>>  253   } else if (refc == 0) {
>>>>>  254     return false; // effectively dead, can't revive
>>>>>  255   }
>>>>>  256
>>>>>  257   uint32_t now;
>>>>>  258   while ((now = Atomic::cmpxchg(old_value + 1,
>>>>> &_length_and_refcount, old_value)) != old_value) {
>>>>>  259     // failed to increment, check refcount again.
>>>>>  260     refc = extract_refcount(now);
>>>>>  261     if (refc == 0) {
>>>>>  262       return false; // just died
>>>>>  263     } else if (refc == PERM_REFCOUNT) {
>>>>>  264       return true; // just became permanent
>>>>>  265     }
>>>>>  266     old_value = now; // refcount changed, try again
>>>>>  267   }
>>>>>  268   return true;
>>>>>  269 }
>>>>>
>>>>>
>>>>> So is it valid for Symbol::try_increment_refcount() to return
>>>>> false? SymbolTable::lookup_dynamic() seems to suggest YES, but
>>>>> Symbol::increment_refcount() seems to suggest NO.
>>>>
>>>> True.  If you are looking up a symbol and some other thread has
>>>> decremented the refcount to zero, this symbol should not be
>>>> returned.  My test exercises this code even without the concurrent
>>>> hashtable.  When the hashtable is concurrent, a zero-ed Symbol
>>>> could be deallocated so we don't want to return it.
>>>>
>>> I think the following should be added as a comment in
>>> increment_refcount().
>>>> In the case where you call increment_refcount() not during lookup,
>>>> it is assumed that you have a symbol with a non-zero refcount and
>>>> it can't go away while you are holding it.
>>
>> Ok, added.
>>>
>>>>>
>>>>> If it's always an invalid condition, I think the fatal() should be
>>>>> moved inside try_increment_refcount.
>>>>>
>>>>
>>>> It isn't fatal at lookup.  The lookup must skip a zero-ed entry.
>>>>> Otherwise, I think you need to add comments in all 3 places, to
>>>>> say when it's possible to get a 0 refcount, and when it's not.
>>>>> And, it might be worth expanding on why "zero is bad" :-)
>>>>
>>>> How about this comment to try_increment_refcount:
>>>>
>>>> // Increment refcount while checking for zero.  If the Symbol's refcount becomes zero
>>>> // a thread could be concurrently removing the Symbol.  This is used during SymbolTable
>>>> // lookup to avoid reviving a dead Symbol.
>>> Sounds good.
>>
>> Thanks, Ioi.
>> Coleen
>>>
>>> Thanks
>>> - Ioi
>>>
>>>>>
>>>>> My guess is:
>>>>> + if you're doing a lookup, you might be seeing Symbols that have
>>>>> already been marked for deletion, which is indicated by a 0
>>>>> refcount. You want to skip such Symbols.
>>>>>
>>>>> + if you're incrementing the refcount, that means you're holding a
>>>>> valid Symbol, which means this Symbol should have never been
>>>>> marked for deletion.
>>>>>
>>>>> Is this correct?
>>>>
>>>> Yes, both true.
>>>>
>>>> Thanks,
>>>> Coleen
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>>
>>>>> On 7/17/18 2:08 PM, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> Gerard, thank you for the code review.
>>>>>>
>>>>>> On 7/17/18 4:13 PM, Gerard Ziemski wrote:
>>>>>>> Thank you Coleen (and Kim)!
>>>>>>>
>>>>>>> #1 Need copyright year updates:
>>>>>>>
>>>>>>> src/hotspot/share/oops/symbol.cpp
>>>>>>> src/hotspot/share/classfile/symbolTable.cpp
>>>>>>> src/hotspot/share/classfile/compactHashtable.inline.hpp
>>>>>>
>>>>>> Yes, I'll update with my commit.
>>>>>>>
>>>>>>> #2 What's the purpose of this code in
>>>>>>> src/hotspot/share/oops/symbol.cpp
>>>>>>>
>>>>>>>    38   STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1));
>>>>>>>
>>>>>>> when we have:
>>>>>>>
>>>>>>>   117   enum {
>>>>>>>   118     // max_symbol_length is constrained by type of _length
>>>>>>>   119     max_symbol_length = (1 << 16) -1
>>>>>>>   120   };
>>>>>>>
>>>>>>> Wouldn't that always be true?  Is it to make sure that nobody
>>>>>>> changes max_symbol_length, because the implementation needs it
>>>>>>> to be that? If so, should we add a comment to:
>>>>>>>
>>>>>>>   119     max_symbol_length = (1 << 16) -1
>>>>>>>
>>>>>>> with a big warning of some sort?
>>>>>>
>>>>>> Yes, it's so that we can store the length of the symbol into 16
>>>>>> bits.
>>>>>>
>>>>>> How about I change the comment above max_symbol_length from:
>>>>>>
>>>>>>     // max_symbol_length is constrained by type of _length
>>>>>>
>>>>>> to
>>>>>>
>>>>>>     // max_symbol_length must fit into the top 16 bits of _length_and_refcount
>>>>>>
>>>>>>>
>>>>>>> #3 If we have:
>>>>>>>
>>>>>>>    39   STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1));
>>>>>>>
>>>>>>> then why not
>>>>>>>
>>>>>>>   101 #define PERM_REFCOUNT ((1 << 16) - 1) // 0xffff
>>>>>>>
>>>>>>> or
>>>>>>>    39   STATIC_ASSERT(PERM_REFCOUNT == 0xffff);
>>>>>>>   101 #define PERM_REFCOUNT 0xffff
>>>>>>>
>>>>>> I can change PERM_REFCOUNT to ((1 << 16) - 1) to be consistent.
>>>>>>
>>>>>>> #4 We have:
>>>>>>>
>>>>>>>   221 void Symbol::increment_refcount() {
>>>>>>>   222   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>>>>>>   223     if (!try_increment_refcount()) {
>>>>>>>   224 #ifdef ASSERT
>>>>>>>   225       print();
>>>>>>>   226 #endif
>>>>>>>   227       fatal("refcount has gone to zero");
>>>>>>>
>>>>>>> but
>>>>>>>
>>>>>>>   233 void Symbol::decrement_refcount() {
>>>>>>>   234   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>>>>>>   235     int new_value = Atomic::sub((uint32_t)1,
>>>>>>> &_length_and_refcount);
>>>>>>>   236 #ifdef ASSERT
>>>>>>>   237     // Check if we have transitioned to 0xffff
>>>>>>>   238     if (extract_refcount(new_value) == PERM_REFCOUNT) {
>>>>>>>   239       print();
>>>>>>>   240       fatal("refcount underflow");
>>>>>>>   241     }
>>>>>>>   242 #endif
>>>>>>>
>>>>>>> Where the line:
>>>>>>>
>>>>>>>   240       fatal("refcount underflow");
>>>>>>>
>>>>>>> is inside #ifdef ASSERT, but:
>>>>>>>
>>>>>>>   227       fatal("refcount has gone to zero");
>>>>>>>
>>>>>>> is outside. Shouldn't "fatal" be consistent in both?
>>>>>>>
>>>>>>
>>>>>> I thought that looked strange too.  I'll move the #endif from
>>>>>> 226 to after 227.
>>>>>>
>>>>>> Thank you for reviewing the code!
>>>>>> Coleen
>>>>>>
>>>>>>> cheers
>>>>>>>
>>>>>>>
>>>>>>>> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com wrote:
>>>>>>>>
>>>>>>>> Summary: Use cmpxchg for non permanent symbol refcounting, and
>>>>>>>> pack refcount and length into an int.
>>>>>>>>
>>>>>>>> This is a precursor change to the concurrent SymbolTable
>>>>>>>> change. Zeroed refcounted entries can be deleted at any time so
>>>>>>>> they cannot be allowed to be zero in runtime code. Thanks to
>>>>>>>> Kim for writing the packing function and helping me avoid
>>>>>>>> undefined behavior.
>>>>>>>>
>>>>>>>> open webrev at
>>>>>>>> http://cr.openjdk.java.net/~coleenp/8207359.01/webrev
>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8207359
>>>>>>>>
>>>>>>>> Tested with solaris ptrace helper, mach5 tier1-5 including
>>>>>>>> solaris. Added multithreaded gtest which exercises the code.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>
>>>>>
>>>>
>>>
>>
>

From coleen.phillimore at oracle.com Thu Jul 19 19:09:40 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 19 Jul 2018 15:09:40 -0400
Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero
In-Reply-To: <3c50d9f1-bb92-1fbb-db44-e5d154a59d5d@oracle.com>
References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com>
 <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com>
 <30cd0f77-3d62-2867-2a37-f68d6a1a401f@oracle.com>
 <01234404-22a6-f02b-20b3-55e059deabab@oracle.com>
 <5a8a9837-48d5-683a-271b-ba6ff27369df@oracle.com>
 <3c50d9f1-bb92-1fbb-db44-e5d154a59d5d@oracle.com>
Message-ID: <3efca5db-b37a-0afd-e560-50ebfd93e638@oracle.com>

On 7/19/18 3:07 PM, Ioi Lam wrote:
> Looks good!

Thanks, Ioi!
Coleen
>
> Thanks
>
> - Ioi

From coleen.phillimore at oracle.com Thu Jul 19 22:14:58 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 19 Jul 2018 18:14:58 -0400
Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero
In-Reply-To: <3efca5db-b37a-0afd-e560-50ebfd93e638@oracle.com>
References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com>
 <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com>
 <30cd0f77-3d62-2867-2a37-f68d6a1a401f@oracle.com>
 <01234404-22a6-f02b-20b3-55e059deabab@oracle.com>
 <5a8a9837-48d5-683a-271b-ba6ff27369df@oracle.com>
 <3c50d9f1-bb92-1fbb-db44-e5d154a59d5d@oracle.com>
 <3efca5db-b37a-0afd-e560-50ebfd93e638@oracle.com>
Message-ID: <8a37d7da-5547-75b3-aec3-fd3bbe8e6a78@oracle.com>

Hi,

There is a closed test that does 100,000 lookups on a class that fails resolution, so creates 100,000 Symbols with TempNewSymbol. This results in many zeroed refcounted Symbols in the table which increases lookup time with the current SymbolTable.  With the new concurrent symbol table, which this change is intended to support, the zero refcount symbols are cleaned up on insert and concurrently.

I have a workaround so that this test doesn't time out.  These are the times for this test on my machine.

old hashtable no patch: 7.32 seconds
without workaround: 367 seconds (which can time out on a slow machine)
with workaround:  61.075 seconds
with new hashtable: 9.135 seconds

There are several ways to fix the old hashtable so that it cleans more frequently for this situation but it's not worth doing with the new concurrent hashtable coming.
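
To illustrate why lookups have to tolerate entries whose refcount is zero, here is a minimal, self-contained sketch of the try-increment pattern discussed in this thread. It uses std::atomic rather than HotSpot's Atomic class, and the names PackedSymbol and kPermRefcount are made up for the example. The layout (length in the top 16 bits, refcount in the low 16 bits) is taken from the review comments; everything else is an assumption and this is not the code in the webrev.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for PERM_REFCOUNT: a refcount that never changes.
constexpr uint32_t kPermRefcount = 0xffff;

// Hypothetical stand-in for Symbol: one 32-bit word holding
// length (top 16 bits) and refcount (low 16 bits).
struct PackedSymbol {
  std::atomic<uint32_t> length_and_refcount;

  PackedSymbol(uint16_t length, uint16_t refcount)
      : length_and_refcount((uint32_t(length) << 16) | refcount) {}

  static uint32_t extract_refcount(uint32_t v) { return v & 0xffff; }
  static uint32_t extract_length(uint32_t v) { return v >> 16; }

  // Returns false if the refcount is zero (or drops to zero while we
  // race with another thread): the entry is dead and must not be revived.
  bool try_increment_refcount() {
    uint32_t old_value = length_and_refcount.load();
    for (;;) {
      uint32_t refc = extract_refcount(old_value);
      if (refc == kPermRefcount) {
        return true;   // permanent, nothing to count
      }
      if (refc == 0) {
        return false;  // dead, the caller should skip this entry
      }
      // refc < 0xffff here, so +1 cannot carry into the length bits.
      // On failure compare_exchange_weak reloads old_value and we re-check.
      if (length_and_refcount.compare_exchange_weak(old_value, old_value + 1)) {
        return true;
      }
    }
  }
};
```

A lookup built on this would call try_increment_refcount() on a matching entry and keep scanning when it returns false, so dead entries cost lookup time until something removes them from the table.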

open webrev at http://cr.openjdk.java.net/~coleenp/03.incr/webrev

Thanks,
Coleen

On 7/19/18 3:09 PM, coleen.phillimore at oracle.com wrote:
>
>
> On 7/19/18 3:07 PM, Ioi Lam wrote:
>> Looks good!
>
> Thanks, Ioi!
> Coleen
>>
>> Thanks
>>
>> - Ioi

From serguei.spitsyn at oracle.com Thu Jul 19 23:32:38 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Jul 2018 16:32:38 -0700
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
In-Reply-To:
References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
 <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
Message-ID: <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com>

Thanks, Rahul!

In fact, there are no good experts for this area in the serviceability team. It would be much better if anyone from the Compiler team could do it.

Vladimir K.,

Is there anyone from the Compiler team available to review this? Otherwise, I could try to review it but am not sure about my review quality.

Thanks,
Serguei

On 7/19/18 00:48, Rahul Raghavan wrote:
> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
>
> (just adding + hotspot-compiler-dev also)
>
>
> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote:
> Subject Was:
> Re: RFR (S): C1 still does eden allocations when TLAB is enabled
>
> + serviceability-dev
>
> Hi all,
>
> Could anyone else give me a review of this webrev and check/test the
> various architecture changes?
>
> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>
>
> Thanks for all your help!
> Jc
>
>
>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler wrote:
>>
>>> Hi all,
>>>
>>> Here is a webrev that does all the architectures in the same way:
>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>>>
>>> Could anyone review the other architectures and test?
>>>    - arm, sparc & aarch64 are also modified now to follow the same
>>>      "if no tlab, then consider eden space allocation" logic.
>>>
>>> Thanks for your help!
>>> Jc >>> >>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler wrote: >>> >>>> Hi Kim, >>>> >>>> I opened this bug >>>> https://bugs.openjdk.java.net/browse/JDK-8190862 >>>> >>>> and now I've done an update: >>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ >>>> >>>> I basically have done your nits but also removed the try_eden (it was >>>> used to bind a label but was not used). I updated the comments to >>>> use the >>>> one you preferred. >>>> >>>> I still have to do the other architectures though but at least we >>>> seem to >>>> have a consensus on this architecture, correct? >>>> >>>> Thanks for the review, >>>> Jc >>>> >>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett >>>> wrote: >>>> >>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler wrote: >>>>>> >>>>>> Yes, you are right, I did those changes due to: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084 >>>>>> >>>>>> If Robbin agrees to this change, and if no one sees an issue, >>>>>> I'll go >>>>> ahead >>>>>> and propagate the change across architectures. >>>>>> >>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's >>>>>> comment >>>>> and >>>>>> review) :) >>>>>> Jc >>>>>> >>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose >>>>> wrote: >>>>>> >>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler >>>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> I'm not sure if we had left this case intentionally or not but, >>>>>>> if we >>>>> want >>>>>>> it all to be consistent, we should perhaps fix it. >>>>>>> >>>>>>> >>>>>>> Well, you put in that logic last February, so unless somebody >>>>>>> speaks >>>>> up >>>>>>> quickly, I support your adjusting it to be the way you want it. >>>>>>> >>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I >>>>>>> src/hotspot/share" >>>>>>> suggests that the GC group is most active in touching this feature. >>>>>>> If Robbin is OK with it, there's your reviewer. >>>>>>> >>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person >>>>>>> working on the GC to OK it. 
>>>>>>> >>>>>>> ? John >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> >>>>>> Thanks, >>>>>> Jc >>>>> >>>>> Robbin is on vacation; you might not hear from him for a while. >>>>> >>>>> I'm assuming you'll open a new bug for this? >>>>> >>>>> Except for a few minor nits (below), this looks okay to me. >>>>> >>>>> The comment at line 1052 needs updating. >>>>> >>>>> pre-existing: The retry_tlab label declared on line 1054 is unused. >>>>> >>>>> pre-existing: The try_eden label declared on line 1054 is bound at >>>>> line 1058, but unreferenced. >>>>> >>>>> I like the wording of the comment at 1139 better than the wording at >>>>> 1016. >>>>> >>>>> >>>> >>>> -- >>>> >>>> Thanks, >>>> Jc >>>> >>> >>> >>> -- >>> >>> Thanks, >>> Jc >>> >> >> From navy.xliu at gmail.com Fri Jul 20 07:16:05 2018 From: navy.xliu at gmail.com (Liu Xin) Date: Fri, 20 Jul 2018 00:16:05 -0700 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com> <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com> Message-ID: Hello, Vladimir, Could you run on other platform on behalf of Martin? I locally tested on x86_64. I hope the Reviewer can help me verify it works on other platforms. Furthermore, I am sure if we should add this additional patch. Label class is not POD, we should properly call constructor /destructor even though those labels are allocated on arena. thanks, --lx On Wed, Jul 18, 2018 at 4:07 AM, Doerr, Martin wrote: > Hi Liu Xin, > > > > thanks for understanding my point and checking other places. > > > > The templateTable_x86.cpp was reviewed by me. > > I can?t review the label assertion before my vacation. If other reviewers > are convinced that the it is correct, ok. 
> > > > Would be great if somebody could assist with testing other platforms. > > > > Best regards, > > Martin > > > > > > *From:* Liu Xin [mailto:navy.xliu at gmail.com] > *Sent:* Dienstag, 17. Juli 2018 19:17 > > *To:* Doerr, Martin > *Cc:* hotspot-runtime-dev at openjdk.java.net > *Subject:* Re: RFR(S): 8206075: add assertion for unbound assembler > Labels for x86 > > > > Hi, Martin, > > > > Thank you for the feedback. > > > > I totally agree with you that we shouldn?t introduce false positive > assertion. Let?s insist on the high bar here. > > I browsed many sources in hotspot recently. Hotspot is the most monolithic > software I ever seen. I am glad to be directed by a guidance and clear > target. > > > > I think I dealt with c1 bailout case. This case triggers "codebuffer > overflow" in middle of c1 compilation. > > compiler/codegen/TestCharVect2.java > > > > I am still not sure about c2 bailout case. Let me try to make one. > > > > For case #2, I got what you concerned. Indeed, the generated ad_x86.cpp > contains many emits methods for MachNode. I will double-check if they could > leave unused labels. > > > > Thanks, > > ?lx > > > > > > On Jul 16, 2018, at 2:09 PM, Liu Xin wrote: > > > > Hi, List, > > > > Could you review this new revision? > > https://s3-us-west-2.amazonaws.com/openjdk-webrevs/ > jdk/label_bugfix/index.html > > > > > > i) I took a look at all architectures, arm/aarch64/ppc64/sparc/x86. I > don?t understand all the assemblies, but I think they are guarded > for UseOnStackReplacement > > in templateTable_xxx.cpp ::branch(bool is_jsr, bool is_wide). > > > > TemplateTable_arm.cpp is a little different. It explicitly binds it later. > > if (!UseOnStackReplacement) { > > __ bind(backedge_counter_overflow); > > } > > > > i) I checked the Compile::scratch_emit_size. It only uses the label fakeL > for those MachBranch nodes. > > Because fakeL will be bound to a trivial address if the nodes are > MachBranch, It?s also safe for the assertion. 
> > > > bool is_branch = n->is_MachBranch(); > > if (is_branch) { > > MacroAssembler masm(&buf); > > masm.bind(fakeL); > > n->as_MachBranch()->save_label(&saveL, &save_bnum); > > n->as_MachBranch()->label_set(&fakeL, 0); > > } > > > > Thanks, > > ?lx > > > > > > > > On Jul 16, 2018, at 1:30 AM, Doerr, Martin wrote: > > > > Hi Liu Xin, > > > > thanks for changing. > > > > > The background of this Assertion is that our engineer used to spend many > hour to trace down a corner case. > > > it's trivial if fastdebug/slowdebug stop and tell you immediately. > > > > I understand that. But an assertion should only get added when we are > convinced that it won?t produce false positives. > > It?s very annoying if long running tests break due to an incorrect > assertion after running many days. > > > > > I am curious about this "We also may generate code with the purpose to > determine its size.". > > > Could you tell me where is it? it looks quite slow to get buffer size in > this way. > > > > C2 Compiler does that in Compile::scratch_emit_size. > > > > Please note that I?ll be on vacation soon, so other people will have to > review. > > Thanks again for fixing the -XX:-UseOnStackReplacement issue. > > > > Best regards, > > Martin > > > > > > *From:* Liu Xin [mailto:navy.xliu at gmail.com ] > *Sent:* Freitag, 13. Juli 2018 22:30 > *To:* Doerr, Martin > *Cc:* hotspot-runtime-dev at openjdk.java.net > *Subject:* Re: RFR(S): 8206075: add assertion for unbound assembler > Labels for x86 > > > > Hello, Martin, > > > > Thanks for reviewing it. > > > > I got your point. I made it "if (where != NULL) { jcc(cond, *where); }" > and is running tests. > > > > The background of this Assertion is that our engineer used to spend many > hour to trace down a corner case. it's trivial if fastdebug/slowdebug stop > and tell you immediately. > > > > I am curious about this "We also may generate code with the purpose to > determine its size.". Could you tell me where is it? 
it looks quite slow > to get buffer size in this way. > > > > thanks, > > --lx > > > > > > On Fri, Jul 13, 2018 at 2:54 AM, Doerr, Martin > wrote: > > Hi, > > thanks for fixing the issue in templateTable_x86. It looks correct. > I think even better would be > "UseOnStackReplacement ? &backedge_counter_overflow : NULL" > and > "if (where != NULL) { jcc(cond, *where); }" in interp_masm_x86.cpp. > But I leave it up to you if you want to change it. I'm also ok with your > version. > > I'm not convinced that the label assertion is reliable. There may be many > more places in hotspot where we bail out having an unbound label. Running a > few tests on x86 is by far not sufficient. The assertion may fire > sporadically in large scenarios on some platforms. > > Best regards, > Martin > > > -----Original Message----- > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > bounces at openjdk.java.net] On Behalf Of Liu Xin > Sent: Donnerstag, 12. Juli 2018 22:51 > To: hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels > for x86 > > Could you review this patch again? > > Revision #2. > Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 ttps://bugs.openjdk.java.net/browse/JDK-8206075> > CR: https://s3-us-west-2.amazonaws.com/openjdk-webrevs/ > openjdk8u/webrev/index.html com/openjdk-webrevs/openjdk8u/webrev/index.html> > > > > The idea is simple. I just reset the problematic label when c1 compilation > bailout happen. > I manually ran tier1 on my laptop. it can pass all of them. > Paul help me submit the patch to submit and here is the run result. > Build Details: 2018-07-12-1736388.hohensee.source > > 0 Failed Tests > > Mach5 Tasks Results Summary > > PASSED: 75 > UNABLE_TO_RUN: 0 > KILLED: 0 > NA: 0 > FAILED: 0 > EXECUTED_WITH_FAILURE: 0 > > > Thanks, > ?lx > > On Jul 11, 2018, at 10:35 AM, Liu Xin wrote: > > > > Thank you for your reviews. Indeed, I didn?t deal with bailout > situation. 
"compiler/codegen/TestCharVect2.java? is the case of > codeBuffer overflow and leave a unbound label behind. > > I made another revision. I will run tests thoroughly. > > > > Thanks, > > ?lx > > > >> On Jul 11, 2018, at 7:49 AM, Hohensee, Paul > wrote: > >> > >> Imo it's still good hygiene to require that Labels be bound if they're > used, even if the generated code will never be executed. E.g., code that > generates code for sizing purposes may be repurposed to generate executable > code, in which case an unbound label may be a lurking bug. Also, I'm > unaware (I may be corrected!) of any situation where bailing out happens in > such a way as to both leave a Label unbound and execute its destructor. > Even if there are, I'd say that'd be indicative of another real problem, > such as code buffer overflow, so no harm would result. > >> > >> Thanks, > >> > >> Paul > >> > >> ?On 7/11/18, 3:41 AM, "hotspot-runtime-dev on behalf of Doerr, Martin" < > hotspot-runtime-dev-bounces at openjdk.java.net on behalf of > martin.doerr at sap.com> wrote: > >> > >> Hi, > >> > >> I think the idea is good, but doesn't work in all cases. > >> We may bail out from code generation and discard the generated code > leaving the label unbound. > >> We also may generate code with the purpose to determine its size. We > don't need to bind labels because the code will never get executed. > >> > >> Best regards, > >> Martin > >> > >> > >> -----Original Message----- > >> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov > >> Sent: Mittwoch, 11. 
Juli 2018 03:34 > >> To: Liu Xin ; hotspot > -runtime-dev at openjdk.java.net > >> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler > Labels for x86 > >> > >> I hit new assert in few other tests: > >> > >> compiler/codegen/TestCharVect2.java > >> compiler/c2/cr6340864/* > >> > >> Regards, > >> Vladimir > >> > >> On 7/10/18 5:08 PM, Vladimir Kozlov wrote: > >>> Fix looks reasonable. I will test it in our framework. > >>> > >>> Thanks, > >>> Vladimir > >>> > >>> On 7/10/18 9:50 AM, Liu Xin wrote: > >>>> Hi, Community, > >>>> Could you please review this small patch? > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 > >>>> > >>>> CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/ > >>>> > >>>> Problem: > >>>> X86-32/64 will leave an unbound label if UseOnStackReplacement is OFF. > >>>> This patch align up x86 with other architectures(ppc, arm). > >>>> Add an assertion to the destructor of Label. It will be wiped out in > release build. > >>>> Previously, hotspot cannot pass this test with assertion on x86-64. > >>>> make run-test TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java > >>>> If this CR is approved, Paul Hohensee will push it. > >>>> Thanks, > >>>> --lx > >>>> > >> > >> > > > > > > > From goetz.lindenmaier at sap.com Fri Jul 20 07:29:41 2018 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 20 Jul 2018 07:29:41 +0000 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com> <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com> Message-ID: Hi Liu, Martin had put the patch into our testing queue. All the platforms we build are fine. This are: windows x86_64, linux: ppc64, ppc64le, x86_64, s390x, aix ppc64, solaris sparcv9, mac. 
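
For context on what was tested: the thread proposes a debug-only assertion in the Label destructor that fires when a label was used as a branch target but never bound, with bailout paths resetting the label first. A minimal sketch of that idea, using a hypothetical SketchLabel class rather than HotSpot's Label (whose real fields and patching logic are not shown in this thread), might look like:

```cpp
#include <cassert>

// Hypothetical stand-in for an assembler label; not the HotSpot Label class.
class SketchLabel {
  int  _loc  = -1;     // bound code offset, -1 while unbound
  bool _used = false;  // true once some branch targets this label

 public:
  void bind(int loc) { _loc = loc; }
  void add_use() { _used = true; }

  // What a bailout path would call before discarding generated code.
  void reset() { _loc = -1; _used = false; }

  bool is_bound() const { return _loc >= 0; }
  bool is_unused() const { return !_used; }
  bool is_safe_to_destroy() const { return is_bound() || is_unused(); }

  // assert() compiles away under NDEBUG, so release builds pay nothing.
  ~SketchLabel() { assert(is_safe_to_destroy() && "label used but never bound"); }
};
```

In this sketch the UseOnStackReplacement case from the thread corresponds to a label that gets add_use() on one path but no bind() on any path; the fix is either to skip the branch or to bind the label anyway.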
Best regards, Goetz. > -----Original Message----- > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > bounces at openjdk.java.net] On Behalf Of Liu Xin > Sent: Freitag, 20. Juli 2018 09:16 > To: Vladimir Kozlov > Cc: hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for > x86 > > Hello, Vladimir, > Could you run on other platform on behalf of Martin? > I locally tested on x86_64. I hope the Reviewer can help me verify it works > on other platforms. > > > Furthermore, I am sure if we should add this additional patch. > Label class is not POD, we should properly call constructor /destructor > even though those labels are allocated on arena. > > > thanks, > --lx > > On Wed, Jul 18, 2018 at 4:07 AM, Doerr, Martin > wrote: > > > Hi Liu Xin, > > > > > > > > thanks for understanding my point and checking other places. > > > > > > > > The templateTable_x86.cpp was reviewed by me. > > > > I can?t review the label assertion before my vacation. If other reviewers > > are convinced that the it is correct, ok. > > > > > > > > Would be great if somebody could assist with testing other platforms. > > > > > > > > Best regards, > > > > Martin > > > > > > > > > > > > *From:* Liu Xin [mailto:navy.xliu at gmail.com] > > *Sent:* Dienstag, 17. Juli 2018 19:17 > > > > *To:* Doerr, Martin > > *Cc:* hotspot-runtime-dev at openjdk.java.net > > *Subject:* Re: RFR(S): 8206075: add assertion for unbound assembler > > Labels for x86 > > > > > > > > Hi, Martin, > > > > > > > > Thank you for the feedback. > > > > > > > > I totally agree with you that we shouldn?t introduce false positive > > assertion. Let?s insist on the high bar here. > > > > I browsed many sources in hotspot recently. Hotspot is the most monolithic > > software I ever seen. I am glad to be directed by a guidance and clear > > target. > > > > > > > > I think I dealt with c1 bailout case. 
This case triggers "codebuffer > > overflow" in middle of c1 compilation. > > > > compiler/codegen/TestCharVect2.java > > > > > > > > I am still not sure about c2 bailout case. Let me try to make one. > > > > > > > > For case #2, I got what you concerned. Indeed, the generated ad_x86.cpp > > contains many emits methods for MachNode. I will double-check if they > could > > leave unused labels. > > > > > > > > Thanks, > > > > ?lx > > > > > > > > > > > > On Jul 16, 2018, at 2:09 PM, Liu Xin wrote: > > > > > > > > Hi, List, > > > > > > > > Could you review this new revision? > > > > https://s3-us-west-2.amazonaws.com/openjdk-webrevs/ > > jdk/label_bugfix/index.html > > > > > > > > > > > > i) I took a look at all architectures, arm/aarch64/ppc64/sparc/x86. I > > don?t understand all the assemblies, but I think they are guarded > > for UseOnStackReplacement > > > > in templateTable_xxx.cpp ::branch(bool is_jsr, bool is_wide). > > > > > > > > TemplateTable_arm.cpp is a little different. It explicitly binds it later. > > > > if (!UseOnStackReplacement) { > > > > __ bind(backedge_counter_overflow); > > > > } > > > > > > > > i) I checked the Compile::scratch_emit_size. It only uses the label fakeL > > for those MachBranch nodes. > > > > Because fakeL will be bound to a trivial address if the nodes are > > MachBranch, It?s also safe for the assertion. > > > > > > > > bool is_branch = n->is_MachBranch(); > > > > if (is_branch) { > > > > MacroAssembler masm(&buf); > > > > masm.bind(fakeL); > > > > n->as_MachBranch()->save_label(&saveL, &save_bnum); > > > > n->as_MachBranch()->label_set(&fakeL, 0); > > > > } > > > > > > > > Thanks, > > > > ?lx > > > > > > > > > > > > > > > > On Jul 16, 2018, at 1:30 AM, Doerr, Martin wrote: > > > > > > > > Hi Liu Xin, > > > > > > > > thanks for changing. > > > > > > > > > The background of this Assertion is that our engineer used to spend > many > > hour to trace down a corner case. 
> > > > > it's trivial if fastdebug/slowdebug stop and tell you immediately. > > > > > > > > I understand that. But an assertion should only get added when we are > > convinced that it won?t produce false positives. > > > > It?s very annoying if long running tests break due to an incorrect > > assertion after running many days. > > > > > > > > > I am curious about this "We also may generate code with the purpose to > > determine its size.". > > > > > Could you tell me where is it? it looks quite slow to get buffer size in > > this way. > > > > > > > > C2 Compiler does that in Compile::scratch_emit_size. > > > > > > > > Please note that I?ll be on vacation soon, so other people will have to > > review. > > > > Thanks again for fixing the -XX:-UseOnStackReplacement issue. > > > > > > > > Best regards, > > > > Martin > > > > > > > > > > > > *From:* Liu Xin [mailto:navy.xliu at gmail.com ] > > *Sent:* Freitag, 13. Juli 2018 22:30 > > *To:* Doerr, Martin > > *Cc:* hotspot-runtime-dev at openjdk.java.net > > *Subject:* Re: RFR(S): 8206075: add assertion for unbound assembler > > Labels for x86 > > > > > > > > Hello, Martin, > > > > > > > > Thanks for reviewing it. > > > > > > > > I got your point. I made it "if (where != NULL) { jcc(cond, *where); }" > > and is running tests. > > > > > > > > The background of this Assertion is that our engineer used to spend many > > hour to trace down a corner case. it's trivial if fastdebug/slowdebug stop > > and tell you immediately. > > > > > > > > I am curious about this "We also may generate code with the purpose to > > determine its size.". Could you tell me where is it? it looks quite slow > > to get buffer size in this way. > > > > > > > > thanks, > > > > --lx > > > > > > > > > > > > On Fri, Jul 13, 2018 at 2:54 AM, Doerr, Martin > > wrote: > > > > Hi, > > > > thanks for fixing the issue in templateTable_x86. It looks correct. > > I think even better would be > > "UseOnStackReplacement ? 
&backedge_counter_overflow : NULL" > > and > > "if (where != NULL) { jcc(cond, *where); }" in interp_masm_x86.cpp. > > But I leave it up to you if you want to change it. I'm also ok with your > > version. > > > > I'm not convinced that the label assertion is reliable. There may be many > > more places in hotspot where we bail out having an unbound label. > Running a > > few tests on x86 is by far not sufficient. The assertion may fire > > sporadically in large scenarios on some platforms. > > > > Best regards, > > Martin > > > > > > -----Original Message----- > > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > > bounces at openjdk.java.net] On Behalf Of Liu Xin > > Sent: Donnerstag, 12. Juli 2018 22:51 > > To: hotspot-runtime-dev at openjdk.java.net > > Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels > > for x86 > > > > Could you review this patch again? > > > > Revision #2. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 > ttps://bugs.openjdk.java.net/browse/JDK-8206075> > > CR: https://s3-us-west-2.amazonaws.com/openjdk-webrevs/ > > openjdk8u/webrev/index.html > com/openjdk-webrevs/openjdk8u/webrev/index.html> > > > > > > > > The idea is simple. I just reset the problematic label when c1 compilation > > bailout happen. > > I manually ran tier1 on my laptop. it can pass all of them. > > Paul help me submit the patch to submit and here is the run result. > > Build Details: 2018-07-12-1736388.hohensee.source > > > > 0 Failed Tests > > > > Mach5 Tasks Results Summary > > > > PASSED: 75 > > UNABLE_TO_RUN: 0 > > KILLED: 0 > > NA: 0 > > FAILED: 0 > > EXECUTED_WITH_FAILURE: 0 > > > > > > Thanks, > > ?lx > > > On Jul 11, 2018, at 10:35 AM, Liu Xin wrote: > > > > > > Thank you for your reviews. Indeed, I didn?t deal with bailout > > situation. "compiler/codegen/TestCharVect2.java? is the case of > > codeBuffer overflow and leave a unbound label behind. > > > I made another revision. I will run tests thoroughly. 
> > > > > > Thanks, > > > ?lx > > > > > >> On Jul 11, 2018, at 7:49 AM, Hohensee, Paul > > wrote: > > >> > > >> Imo it's still good hygiene to require that Labels be bound if they're > > used, even if the generated code will never be executed. E.g., code that > > generates code for sizing purposes may be repurposed to generate > executable > > code, in which case an unbound label may be a lurking bug. Also, I'm > > unaware (I may be corrected!) of any situation where bailing out happens > in > > such a way as to both leave a Label unbound and execute its destructor. > > Even if there are, I'd say that'd be indicative of another real problem, > > such as code buffer overflow, so no harm would result. > > >> > > >> Thanks, > > >> > > >> Paul > > >> > > >> ?On 7/11/18, 3:41 AM, "hotspot-runtime-dev on behalf of Doerr, Martin" > < > > hotspot-runtime-dev-bounces at openjdk.java.net on behalf of > > martin.doerr at sap.com> wrote: > > >> > > >> Hi, > > >> > > >> I think the idea is good, but doesn't work in all cases. > > >> We may bail out from code generation and discard the generated code > > leaving the label unbound. > > >> We also may generate code with the purpose to determine its size. We > > don't need to bind labels because the code will never get executed. > > >> > > >> Best regards, > > >> Martin > > >> > > >> > > >> -----Original Message----- > > >> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > > bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov > > >> Sent: Mittwoch, 11. Juli 2018 03:34 > > >> To: Liu Xin ; hotspot > > -runtime-dev at openjdk.java.net > > >> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler > > Labels for x86 > > >> > > >> I hit new assert in few other tests: > > >> > > >> compiler/codegen/TestCharVect2.java > > >> compiler/c2/cr6340864/* > > >> > > >> Regards, > > >> Vladimir > > >> > > >> On 7/10/18 5:08 PM, Vladimir Kozlov wrote: > > >>> Fix looks reasonable. I will test it in our framework. 
> > >>> > > >>> Thanks, > > >>> Vladimir > > >>> > > >>> On 7/10/18 9:50 AM, Liu Xin wrote: > > >>>> Hi, Community, > > >>>> Could you please review this small patch? > > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 > > >>>> > > >>>> CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/ > > >>>> > > >>>> Problem: > > >>>> X86-32/64 will leave an unbound label if UseOnStackReplacement is > OFF. > > >>>> This patch align up x86 with other architectures(ppc, arm). > > >>>> Add an assertion to the destructor of Label. It will be wiped out in > > release build. > > >>>> Previously, hotspot cannot pass this test with assertion on x86-64. > > >>>> make run-test > TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java > > >>>> If this CR is approved, Paul Hohensee will push it. > > >>>> Thanks, > > >>>> --lx > > >>>> > > >> > > >> > > > > > > > > > > > > > From goetz.lindenmaier at sap.com Fri Jul 20 08:24:13 2018 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 20 Jul 2018 08:24:13 +0000 Subject: PPC64: jfr profiling doesn't work (PPC64 only) In-Reply-To: <51BCC98D-788C-43BD-A739-A304DB3EA847@sap.com> References: <51BCC98D-788C-43BD-A739-A304DB3EA847@sap.com> Message-ID: <51f19327d29c4ce089706c8ec7e3aed9@sap.com> Hi Gunter, thanks for fixing these issues. frame_ppc.cpp: I think instead of JavaThread::stack_red_zone_size() + JavaThread::stack_yellow_zone_size() you should use JavaThread::stack_red_zone_size() + JavaThread::stack_yellow_reserved_zone_size() Minor stuff: Please add RFR(S): to the Subject of your mail. Also, ususally the bug title is prefixed with [ppc]. Please remove redundant spaces from code like address sp = (address)_sp; (fp <= thread->stack_base()) && (fp > sp) as well as double newlines. No space before ) please: if (_cb != NULL ) { Also please break some of the comments to shorter lines. thread_linux_ppc.cpp also just minor stuff: Please fix indentations. Please indent by two with spaces, no tabs. 
There is an empty if (ProfileInterpreter) { }. Why? I can sponsor this for you. Best regards, Goetz. > -----Original Message----- > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > bounces at openjdk.java.net] On Behalf Of Haug, Gunter > Sent: Donnerstag, 19. Juli 2018 12:53 > To: hotspot-runtime-dev at openjdk.java.net > Subject: [CAUTION] PPC64: jfr profiling doesn't work (PPC64 only) > > Hi all, > > can I please have reviews and a sponsor for the following fix: > > https://bugs.openjdk.java.net/projects/JDK/issues/JDK- > 8207392?filter=allopenissues > http://cr.openjdk.java.net/~ghaug/webrevs/8207392/ > > JFR profiling on linux PPC64 has not been implemented correctly so far, the > VM crashes when it is turned on. Therefore > hotspot/jtreg/runtime/appcds/TestWithProfiler.java fails. With this fix the > test succeeds. I've analyzed a couple of benchmarks with JMC and results > look plausible when compared to linux x86. > > Thanks and best regards, > Gunter > > > > From thomas.schatzl at oracle.com Fri Jul 20 10:22:56 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 20 Jul 2018 12:22:56 +0200 Subject: RFR (S): C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> Message-ID: Hi, On Mon, 2018-07-16 at 14:58 -0700, JC Beyler wrote: > Hi all, > > Here is a webrev that does all the architectures in the same way: > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ > > Could anyone review the other architectures and test? > - arm, sparc & aarch64 are also modified now to follow the same "if > no > tlab, then consider eden space allocation" logic. > > Thanks for your help! > Jc > looks good. I ran the change through hs-tier1-3 with no issues. It only tests on sparc and x64 though. 
I do not expect issues on the other platforms though :)

Thanks,
  Thomas

From volker.simonis at gmail.com Fri Jul 20 12:27:47 2018
From: volker.simonis at gmail.com (Volker Simonis)
Date: Fri, 20 Jul 2018 14:27:47 +0200
Subject: PPC64: jfr profiling doesn't work (PPC64 only)
In-Reply-To: <51BCC98D-788C-43BD-A739-A304DB3EA847@sap.com>
References: <51BCC98D-788C-43BD-A739-A304DB3EA847@sap.com>
Message-ID:

Hi Gunter,

thanks for fixing this! The change looks good in general. Please find
some comments and questions below:

src/hotspot/cpu/ppc/frame_ppc.cpp
==========================

  78   // an fp must be within the stack and above (but not equal) sp
  79   bool fp_safe = (fp <= thread->stack_base()) && (fp > sp) && ((fp - sp) >= (ijava_state_size + top_ijava_frame_abi_size));

Is this check for interpreter frames only? Then better name it
'fp_interp_safe' and adapt the comment. Otherwise, why does the
'fp - sp' have to be larger than the java interpreter state?
Also the line is quite long. Better break it after '&&'.

  81   // We know sp/unextended_sp are safe only fp is questionable here

Better put a comma (or period) after 'safe' to make it more readable.

  88   // First check if frame is complete and tester is reliable
  89   // Unfortunately we can only check frame complete for runtime stubs and nmethod
  90   // other generic buffer blobs are more problematic so we just assume they are
  91   // ok. adapter blobs never have a frame complete and are never ok.

Better: "First check if the frame is complete and the test is reliable.
Unfortunately we can only check frame completeness for runtime stubs
and nmethods. Other generic buffer blobs are more problematic so we
just assume they are OK. Adapter blobs never have a complete frame and
are never OK."

In general please start comments with an uppercase letter and use a
period at the end of sentences.
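
As a reading aid, the three conditions combined in the quoted line 79 can be written out separately. This is a sketch under assumptions, not the webrev code: the function name fp_looks_safe and the min_frame_size parameter are hypothetical, and the stack is assumed to grow downward so that stack_base is the highest valid address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if fp is a plausible frame pointer for a frame whose stack
// pointer is sp. min_frame_size stands in for whatever minimum the caller
// requires (in the quoted code, the interpreter state plus the top frame ABI).
bool fp_looks_safe(uintptr_t sp, uintptr_t fp, uintptr_t stack_base,
                   size_t min_frame_size) {
  const bool within_stack = fp <= stack_base;               // not past the stack base
  const bool above_sp     = fp > sp;                        // strictly above sp
  const bool big_enough   = above_sp && (fp - sp) >= min_frame_size;
  return within_stack && above_sp && big_enough;
}
```

Separating the size condition this way also makes the question raised above visible: the minimum is only meaningful for the frame kind it was computed for.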

  98   // Could just be some random pointer within the codeBlob
  99   if (!_cb->code_contains(_pc)) {

Shouldn't this be the first, basic check after we know that
'_cb != NULL' (i.e. even before we check for frame completeness)?

 103   // Entry frame checks
 104   if (is_entry_frame()) {
 105     // an entry frame must have a valid fp.
 106     return fp_safe && is_entry_frame_valid(thread);
 107   }

An entry frame is not an interpreter frame but you use 'fp_safe' as
computed for interpreter frames which is probably too conservative.
Maybe the check in 'is_entry_frame_valid()' is sufficient already?

 118   CodeBlob* sender_blob = CodeCache::find_blob_unsafe(sender_pc);
 119   if (sender_pc == NULL || sender_blob == NULL) {
 120     return false;
 121   }

'find_blob_unsafe()' returns NULL if the 'sender_pc' is NULL so
there's no need for the extra 'sender_pc == NULL' check in the
if-clause.

 135   // an fp must be within the stack and above (but not equal) current frame's _FP
 136
 137   bool sender_fp_safe = (sender_fp <= thread->stack_base()) && (sender_fp > fp);
 138
 139   if (!sender_fp_safe) {
 140     return false;
 141   }

Shorter:

 135   // sender_fp must be within the stack and above (but not equal) current frame's fp
 137   if (sender_fp > thread->stack_base() || sender_fp <= fp) {
 140     return false;
 141   }

 158   if (sender.is_entry_frame()) {
 159     // Validate the JavaCallWrapper an entry frame must have
 160
 161     address jcw = (address)sender.entry_frame_call_wrapper();
 162
 163     bool jcw_safe = (jcw <= thread->stack_base()) && (jcw > sender_fp);
 164     return jcw_safe;
 165   }

Why don't you use 'sender.is_entry_frame_valid()' instead of
duplicating that code here?

 173   // Could put some more validation for the potential non-interpreted sender
 174   // frame we'd create by calling sender if I could think of any. Wait for next crash in forte...
 175
 176   // One idea is seeing if the sender_pc we have is one that we'd expect to call to current cb

I think these comments are leftovers from other architectures which
can be removed (we don't support 'forte' on ppc :)

 184   // Must be native-compiled frame. Since sender will try and use fp to find
 185   // linkages it must be safe
 186
 187   if (!fp_safe) return false;

If it's a native compiled frame the 'fp_safe' check is too strict
because it was computed for interpreter frames.

 189   // could try and do some more potential verification of native frame if we could think of some...

Useless comment - can be removed.

src/hotspot/os_cpu/linux_ppc/thread_linux_ppc.cpp
=====================================

  45   assert(this->is_Java_thread(), "must be JavaThread");
  46   JavaThread* jt = (JavaThread *)this;

'this' is already a 'JavaThread' so no need for the assertion and the
new local variable 'jt'.

  81   if (ProfileInterpreter) {
  82   }

Unused - can be deleted.

Regards,
Volker

On Thu, Jul 19, 2018 at 12:53 PM, Haug, Gunter wrote:
> Hi all,
>
> can I please have reviews and a sponsor for the following fix:
>
> https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8207392?filter=allopenissues
> http://cr.openjdk.java.net/~ghaug/webrevs/8207392/
>
> JFR profiling on linux PPC64 has not been implemented correctly so far, the VM crashes when it is turned on. Therefore hotspot/jtreg/runtime/appcds/TestWithProfiler.java fails. With this fix the test succeeds. I've analyzed a couple of benchmarks with JMC and results look plausible when compared to linux x86.
> > Thanks and best regards, > Gunter > > > > > From gerard.ziemski at oracle.com Fri Jul 20 14:51:03 2018 From: gerard.ziemski at oracle.com (Gerard Ziemski) Date: Fri, 20 Jul 2018 09:51:03 -0500 Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero In-Reply-To: <8a37d7da-5547-75b3-aec3-fd3bbe8e6a78@oracle.com> References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com> <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com> <30cd0f77-3d62-2867-2a37-f68d6a1a401f@oracle.com> <01234404-22a6-f02b-20b3-55e059deabab@oracle.com> <5a8a9837-48d5-683a-271b-ba6ff27369df@oracle.com> <3c50d9f1-bb92-1fbb-db44-e5d154a59d5d@oracle.com> <3efca5db-b37a-0afd-e560-50ebfd93e638@oracle.com> <8a37d7da-5547-75b3-aec3-fd3bbe8e6a78@oracle.com> Message-ID: <48D0BDFB-5CB0-4D86-8BBB-D3A56E33E98F@oracle.com> hi Coleen, I agree with the principle of the workaround, but shouldn't it be something more like: int count = 0; for (HashtableEntry* e = bucket(index); e != NULL; e = e->next()) { count++; // count all entries in this bucket, not just ones with same hash if (e->hash() == hash) { Symbol* sym = e->literal(); - if (sym->equals(name, len) && sym->try_increment_refcount()) { + // Skip checking already dead symbols in the bucket. + if (sym->refcount() == 0) { + count--; // Don't count this symbol towards rehashing. + } else if (sym->equals(name, len)) { + if (sym->try_increment_refcount()) { // something is referencing this symbol now. return sym; + } else { + count--; // Don't count this symbol towards rehashing. + } } } } cheers > On Jul 19, 2018, at 5:14 PM, coleen.phillimore at oracle.com wrote: > > > Hi, There is a closed test that does 100,000 lookups on a class that fails resolution, so creates 100,000 Symbols with TempNewSymbol. This results in many zeroed refcounted Symbols in the table which increases lookup time with the current SymbolTable.
With the new concurrent symbol table, which this change is intended to support, the zero refcount symbols are cleaned up on insert and concurrently. > > I have a workaround so that this test doesn't time out. These are the times for this test on my machine. > > old hashtable no patch: 7.32 seconds > without workaround: 367 seconds (which can time out on a slow machine) > with workaround: 61.075 seconds > with new hashtable: 9.135 seconds > > There are several ways to fix the old hashtable so that it cleans more frequently for this situation but it's not worth doing with the new concurrent hashtable coming. > > open webrev at http://cr.openjdk.java.net/~coleenp/03.incr/webrev > > Thanks, > Coleen > > On 7/19/18 3:09 PM, coleen.phillimore at oracle.com wrote: >> >> >> On 7/19/18 3:07 PM, Ioi Lam wrote: >>> Looks good! >> >> Thanks, Ioi! >> Coleen >>> >>> Thanks >>> >>> - Ioi >>> >>> >>> On 7/19/18 5:34 AM, coleen.phillimore at oracle.com wrote: >>>> Please review the revision to this change. Summary: >>>> >>>> * made decrement_refcount() use CAS loop. >>>> * fixed duplicated logic in try_increment_refcount() thanks to Kim >>>> * added gtest case for decrement_refcount. >>>> * fixed SA code. >>>> * added a bunch of comments >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.02/webrev >>>> >>>> Retested with hs-tier1-3. >>>> Thanks, >>>> Coleen >>>> >>>> On 7/18/18 10:50 PM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> >>>>> On 7/18/18 6:35 PM, Ioi Lam wrote: >>>>>> >>>>>> >>>>>> On 7/18/18 2:45 PM, coleen.phillimore at oracle.com wrote: >>>>>>> >>>>>>> >>>>>>> On 7/18/18 5:14 PM, Ioi Lam wrote: >>>>>>>> Hi Coleen, >>>>>>>> >>>>>>>> The changes look good! The new operations on _length_and_refcount are much cleaner than my old ATOMIC_SHORT_PAIR hack. >>>>>>> >>>>>>> Yes, this makes more sense to me. >>>>>>>> >>>>>>>> symbolTable.cpp: >>>>>>>> >>>>>>>> SymbolTable::lookup_dynamic() { >>>>>>>> ... 
>>>>>>>> 214 Symbol* sym = e->literal(); >>>>>>>> 215 if (sym->equals(name, len) && sym->try_increment_refcount()) { >>>>>>>> 216 // something is referencing this symbol now. >>>>>>>> 217 return sym; >>>>>>>> 218 } >>>>>>>> >>>>>>>> >>>>>>>> symbol.cpp: >>>>>>>> >>>>>>>> 221 void Symbol::increment_refcount() { >>>>>>>> 222 if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >>>>>>>> 223 if (!try_increment_refcount()) { >>>>>>>> 224 #ifdef ASSERT >>>>>>>> 225 print(); >>>>>>>> 226 #endif >>>>>>>> 227 fatal("refcount has gone to zero"); >>>>>>>> 228 } >>>>>>>> 229 NOT_PRODUCT(Atomic::inc(&_total_count);) >>>>>>>> 230 } >>>>>>>> 231 } >>>>>>>> >>>>>>>> 246 // Atomically increment while checking for zero, zero is bad. >>>>>>>> 247 bool Symbol::try_increment_refcount() { >>>>>>>> 248 uint32_t old_value = _length_and_refcount; // fetch once >>>>>>>> 249 int refc = extract_refcount(old_value); >>>>>>>> 250 >>>>>>>> 251 if (refc == PERM_REFCOUNT) { >>>>>>>> 252 return true; >>>>>>>> 253 } else if (refc == 0) { >>>>>>>> 254 return false; // effectively dead, can't revive >>>>>>>> 255 } >>>>>>>> 256 >>>>>>>> 257 uint32_t now; >>>>>>>> 258 while ((now = Atomic::cmpxchg(old_value + 1, &_length_and_refcount, old_value)) != old_value) { >>>>>>>> 259 // failed to increment, check refcount again. >>>>>>>> 260 refc = extract_refcount(now); >>>>>>>> 261 if (refc == 0) { >>>>>>>> 262 return false; // just died >>>>>>>> 263 } else if (refc == PERM_REFCOUNT) { >>>>>>>> 264 return true; // just became permanent >>>>>>>> 265 } >>>>>>>> 266 old_value = now; // refcount changed, try again >>>>>>>> 267 } >>>>>>>> 268 return true; >>>>>>>> 269 } >>>>>>>> >>>>>>>> >>>>>>>> So is it valid for Symbol::try_increment_refcount() to return false? SymbolTable::lookup_dynamic() seems to suggest YES, but Symbol::increment_refcount() seems to suggest NO. >>>>>>> >>>>>>> True. 
If you are looking up a symbol and someone other thread has decremented the refcount to zero, this symbol should not be returned. My test exercises this code even without the concurrent hashtable. When the hashtable is concurrent, a zero-ed Symbol could be deallocated so we don't want to return it. >>>>>>> >>>>>> I think the following should be added as a comment in increment_refcount(). >>>>>>> In the case where you call increment_refcount() not during lookup, it is assumed that you have a symbol with a non-zero refcount and it can't go away while you are holding it. >>>>> >>>>> Ok, added. >>>>>> >>>>>>>> >>>>>>>> If it's always an invalid condition, I think the fatal() should be moved inside try_increment_refcount. >>>>>>>> >>>>>>> >>>>>>> It isn't fatal at lookup. The lookup must skip a zero-ed entry. >>>>>>>> Otherwise, I think you need to add comments in all 3 places, to say when it's possible to get a 0 refcount, and when it's not. And, it might be worth expanding on why "zero is bad" :-) >>>>>>> >>>>>>> How about this comment to try_increment_refcount: >>>>>>> >>>>>>> // Increment refcount while checking for zero. If the Symbol's refcount becomes zero >>>>>>> // a thread could be concurrently removing the Symbol. This is used during SymbolTable >>>>>>> // lookup to avoid reviving a dead Symbol. >>>>>> Sounds good. >>>>> >>>>> Thanks, Ioi. >>>>> Coleen >>>>>> >>>>>> Thanks >>>>>> - Ioi >>>>>> >>>>>>>> >>>>>>>> My guess is: >>>>>>>> + if you're doing a lookup, you might be seeing Symbols that have already been marked for deletion, which is indicated by a 0 refcount. You want to skip such Symbols. >>>>>>>> >>>>>>>> + if you're incrementing the refcount, that means you're holding a valid Symbol, which means this Symbol should have never been marked for deletion. >>>>>>>> >>>>>>>> Is this correct? >>>>>>> >>>>>>> Yes, both true. 
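The two cases confirmed above (a lookup must skip a Symbol whose refcount is already zero, while increment_refcount() assumes the caller holds a live Symbol) can be sketched outside HotSpot roughly as follows. This is only an illustration under stated assumptions, not the code in the webrev: RefCounted, pack() and kPermRefcount are hypothetical names, std::atomic's compare_exchange_weak stands in for Atomic::cmpxchg, and the layout (length in the top 16 bits, refcount in the low 16 bits) follows the comment proposed in this thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for PERM_REFCOUNT and max_symbol_length.
constexpr uint32_t kPermRefcount    = (1u << 16) - 1;  // 0xffff marks a permanent symbol
constexpr uint32_t kMaxSymbolLength = (1u << 16) - 1;  // length must fit in 16 bits

// Assumed layout: length in the top 16 bits, refcount in the low 16 bits.
constexpr uint32_t pack(uint32_t length, uint32_t refcount) {
  return (length << 16) | refcount;
}
constexpr uint32_t extract_refcount(uint32_t v) { return v & 0xffffu; }
constexpr uint32_t extract_length(uint32_t v)   { return v >> 16; }

// Hypothetical reduced model of a Symbol: only the packed counter word.
class RefCounted {
  std::atomic<uint32_t> _length_and_refcount;

 public:
  RefCounted(uint32_t length, uint32_t refcount)
      : _length_and_refcount(pack(length, refcount)) {
    assert(length <= kMaxSymbolLength && refcount <= kPermRefcount);
  }

  uint32_t refcount() const { return extract_refcount(_length_and_refcount.load()); }
  uint32_t length() const   { return extract_length(_length_and_refcount.load()); }

  // Increment unless the count is zero. A zero count means another thread
  // may be removing the entry, so the caller must treat it as dead.
  bool try_increment_refcount() {
    uint32_t old_value = _length_and_refcount.load();  // fetch once
    while (true) {
      uint32_t refc = extract_refcount(old_value);
      if (refc == kPermRefcount) return true;   // permanent: never counted
      if (refc == 0) return false;              // dead: do not revive
      // On failure, compare_exchange_weak reloads old_value and we re-check.
      if (_length_and_refcount.compare_exchange_weak(old_value, old_value + 1)) {
        return true;
      }
    }
  }
};
```

In this sketch a lookup would call try_increment_refcount() and skip the entry on false, whereas a caller that already holds a reference would treat false as a fatal error, which matches the split between lookup_dynamic() and increment_refcount() discussed above.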
>>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>>> >>>>>>>> Thanks >>>>>>>> - Ioi >>>>>>>> >>>>>>>> >>>>>>>> On 7/17/18 2:08 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>> >>>>>>>>> Gerard, thank you for the code review. >>>>>>>>> >>>>>>>>> On 7/17/18 4:13 PM, Gerard Ziemski wrote: >>>>>>>>>> Thank you Coleen (and Kim)! >>>>>>>>>> >>>>>>>>>> #1 Need copyright year updates: >>>>>>>>>> >>>>>>>>>> src/hotspot/share/oops/symbol.cpp >>>>>>>>>> src/hotspot/share/classfile/symbolTable.cpp >>>>>>>>>> src/hotspot/share/classfile/compactHashtable.inline.hpp >>>>>>>>> >>>>>>>>> Yes, I'll update with my commit. >>>>>>>>>> >>>>>>>>>> #2 What?s the purpose of this code in src/hotspot/share/oops/symbol.cpp >>>>>>>>>> >>>>>>>>>> 38 STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1)); >>>>>>>>>> >>>>>>>>>> when we have: >>>>>>>>>> >>>>>>>>>> 117 enum { >>>>>>>>>> 118 // max_symbol_length is constrained by type of _length >>>>>>>>>> 119 max_symbol_length = (1 << 16) -1 >>>>>>>>>> 120 }; >>>>>>>>>> >>>>>>>>>> Wouldn?t that always be true? Is it to make sure that nobody changes max_symbol_length, because the implementation needs it to be that? If so, should we add comment to: >>>>>>>>>> >>>>>>>>>> 119 max_symbol_length = (1 << 16) -1 >>>>>>>>>> >>>>>>>>>> with a big warning of some sorts? >>>>>>>>> >>>>>>>>> Yes, it's so that we can store the length of the symbol into 16 bits. 
>>>>>>>>> >>>>>>>>> How about I change the comment above max_symbol_length from: >>>>>>>>> >>>>>>>>> // max_symbol_length is constrained by type of _length >>>>>>>>> >>>>>>>>> to >>>>>>>>> >>>>>>>>> // max_symbol_length must fit into the top 16 bits of _length_and_refcount >>>>>>>>> >>>>>>>>>> >>>>>>>>>> #3 If we have: >>>>>>>>>> >>>>>>>>>> 39 STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1)); >>>>>>>>>> >>>>>>>>>> then why not >>>>>>>>>> >>>>>>>>>> 101 #define PERM_REFCOUNT ((1 << 16) - 1) // 0xffff >>>>>>>>>> >>>>>>>>>> or >>>>>>>>>> 39 STATIC_ASSERT(PERM_REFCOUNT == 0xffff); >>>>>>>>>> 101 #define PERM_REFCOUNT 0xffff >>>>>>>>>> >>>>>>>>> I can change PERM_REFCOUNT to ((1 << 16) - 1) to be consistent. >>>>>>>>> >>>>>>>>>> #4 We have: >>>>>>>>>> >>>>>>>>>> 221 void Symbol::increment_refcount() { >>>>>>>>>> 222 if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >>>>>>>>>> 223 if (!try_increment_refcount()) { >>>>>>>>>> 224 #ifdef ASSERT >>>>>>>>>> 225 print(); >>>>>>>>>> 226 #endif >>>>>>>>>> 227 fatal("refcount has gone to zero"); >>>>>>>>>> >>>>>>>>>> but >>>>>>>>>> >>>>>>>>>> 233 void Symbol::decrement_refcount() { >>>>>>>>>> 234 if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >>>>>>>>>> 235 int new_value = Atomic::sub((uint32_t)1, &_length_and_refcount); >>>>>>>>>> 236 #ifdef ASSERT >>>>>>>>>> 237 // Check if we have transitioned to 0xffff >>>>>>>>>> 238 if (extract_refcount(new_value) == PERM_REFCOUNT) { >>>>>>>>>> 239 print(); >>>>>>>>>> 240 fatal("refcount underflow"); >>>>>>>>>> 241 } >>>>>>>>>> 242 #endif >>>>>>>>>> >>>>>>>>>> Where the line: >>>>>>>>>> >>>>>>>>>> 240 fatal("refcount underflow"); >>>>>>>>>> >>>>>>>>>> is inside #ifdef ASSERT, but: >>>>>>>>>> >>>>>>>>>> 227 fatal("refcount has gone to zero"); >>>>>>>>>> >>>>>>>>>> is outside. Shouldn't "fatal" be consistent in both? >>>>>>>>>> >>>>>>>>> >>>>>>>>> I thought that looked strange too. I'll move the #endif from 226 to after 227.
>>>>>>>>> >>>>>>>>> Thank you for reviewing the code! >>>>>>>>> Coleen >>>>>>>>> >>>>>>>>>> cheers >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>> >>>>>>>>>>> Summary: Use cmpxchg for non permanent symbol refcounting, and pack refcount and length into an int. >>>>>>>>>>> >>>>>>>>>>> This is a precurser change to the concurrent SymbolTable change. Zeroed refcounted entries can be deleted at anytime so they cannot be allowed to be zero in runtime code. Thanks to Kim for writing the packing function and helping me avoid undefined behavior. >>>>>>>>>>> >>>>>>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.01/webrev >>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8207359 >>>>>>>>>>> >>>>>>>>>>> Tested with solaris ptrace helper, mach5 tier1-5 including solaris. Added multithreaded gtest which exercises the code. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Coleen >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From coleen.phillimore at oracle.com Fri Jul 20 14:53:25 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 20 Jul 2018 10:53:25 -0400 Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero In-Reply-To: <48D0BDFB-5CB0-4D86-8BBB-D3A56E33E98F@oracle.com> References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com> <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com> <30cd0f77-3d62-2867-2a37-f68d6a1a401f@oracle.com> <01234404-22a6-f02b-20b3-55e059deabab@oracle.com> <5a8a9837-48d5-683a-271b-ba6ff27369df@oracle.com> <3c50d9f1-bb92-1fbb-db44-e5d154a59d5d@oracle.com> <3efca5db-b37a-0afd-e560-50ebfd93e638@oracle.com> <8a37d7da-5547-75b3-aec3-fd3bbe8e6a78@oracle.com> <48D0BDFB-5CB0-4D86-8BBB-D3A56E33E98F@oracle.com> Message-ID: On 7/20/18 10:51 AM, Gerard Ziemski wrote: > hi Coleen, > > I agree with the principle of the workaround, but shouldn?t it be more something like: > > int count = 0; > for (HashtableEntry* e = 
bucket(index); e != NULL; e = e->next()) { > count++; // count all entries in this bucket, not just ones with same hash > if (e->hash() == hash) { > Symbol* sym = e->literal(); > > - if (sym->equals(name, len) && sym->try_increment_refcount()) { > + // Skip checking already dead symbols in the bucket. > + if (sym->refcount() == 0) { > + count--; // Don't count this symbol towards rehashing. > + } else if (sym->equals(name, len)) { > + if (sym->try_increment_refcount()) { > // something is referencing this symbol now. > return sym; > + } else { > + count--; // Don't count this symbol towards rehashing. > + } > } > } > } Yes, you're right. I'll change it to exactly this. Thanks! Coleen > > cheers > > >> On Jul 19, 2018, at 5:14 PM, coleen.phillimore at oracle.com wrote: >> >> >> Hi, There is a closed test that does 100,000 lookups on a class that fails resolution, so creates 100,000 Symbols with TempNewSymbol. This results in many zeroed refcounted Symbols in the table which increases lookup time with the current SymbolTable. With the new concurrent symbol table, which this change is intended to support, the zero refcount symbols are cleaned up on insert and concurrently. >> >> I have a workaround so that this test doesn't time out. These are the times for this test on my machine. >> >> old hashtable no patch: 7.32 seconds >> without workaround: 367 seconds (which can time out on a slow machine) >> with workaround: 61.075 seconds >> with new hashtable: 9.135 seconds >> >> There are several ways to fix the old hashtable so that it cleans more frequently for this situation but it's not worth doing with the new concurrent hashtable coming. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/03.incr/webrev >> >> Thanks, >> Coleen >> >> On 7/19/18 3:09 PM, coleen.phillimore at oracle.com wrote: >>> >>> On 7/19/18 3:07 PM, Ioi Lam wrote: >>>> Looks good! >>> Thanks, Ioi!
>>> Coleen >>>> Thanks >>>> >>>> - Ioi >>>> >>>> >>>> On 7/19/18 5:34 AM, coleen.phillimore at oracle.com wrote: >>>>> Please review the revision to this change. Summary: >>>>> >>>>> * made decrement_refcount() use CAS loop. >>>>> * fixed duplicated logic in try_increment_refcount() thanks to Kim >>>>> * added gtest case for decrement_refcount. >>>>> * fixed SA code. >>>>> * added a bunch of comments >>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.02/webrev >>>>> >>>>> Retested with hs-tier1-3. >>>>> Thanks, >>>>> Coleen >>>>> >>>>> On 7/18/18 10:50 PM, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> On 7/18/18 6:35 PM, Ioi Lam wrote: >>>>>>> >>>>>>> On 7/18/18 2:45 PM, coleen.phillimore at oracle.com wrote: >>>>>>>> >>>>>>>> On 7/18/18 5:14 PM, Ioi Lam wrote: >>>>>>>>> Hi Coleen, >>>>>>>>> >>>>>>>>> The changes look good! The new operations on _length_and_refcount are much cleaner than my old ATOMIC_SHORT_PAIR hack. >>>>>>>> Yes, this makes more sense to me. >>>>>>>>> symbolTable.cpp: >>>>>>>>> >>>>>>>>> SymbolTable::lookup_dynamic() { >>>>>>>>> ... >>>>>>>>> 214 Symbol* sym = e->literal(); >>>>>>>>> 215 if (sym->equals(name, len) && sym->try_increment_refcount()) { >>>>>>>>> 216 // something is referencing this symbol now. >>>>>>>>> 217 return sym; >>>>>>>>> 218 } >>>>>>>>> >>>>>>>>> >>>>>>>>> symbol.cpp: >>>>>>>>> >>>>>>>>> 221 void Symbol::increment_refcount() { >>>>>>>>> 222 if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >>>>>>>>> 223 if (!try_increment_refcount()) { >>>>>>>>> 224 #ifdef ASSERT >>>>>>>>> 225 print(); >>>>>>>>> 226 #endif >>>>>>>>> 227 fatal("refcount has gone to zero"); >>>>>>>>> 228 } >>>>>>>>> 229 NOT_PRODUCT(Atomic::inc(&_total_count);) >>>>>>>>> 230 } >>>>>>>>> 231 } >>>>>>>>> >>>>>>>>> 246 // Atomically increment while checking for zero, zero is bad. 
>>>>>>>>> 247 bool Symbol::try_increment_refcount() { >>>>>>>>> 248 uint32_t old_value = _length_and_refcount; // fetch once >>>>>>>>> 249 int refc = extract_refcount(old_value); >>>>>>>>> 250 >>>>>>>>> 251 if (refc == PERM_REFCOUNT) { >>>>>>>>> 252 return true; >>>>>>>>> 253 } else if (refc == 0) { >>>>>>>>> 254 return false; // effectively dead, can't revive >>>>>>>>> 255 } >>>>>>>>> 256 >>>>>>>>> 257 uint32_t now; >>>>>>>>> 258 while ((now = Atomic::cmpxchg(old_value + 1, &_length_and_refcount, old_value)) != old_value) { >>>>>>>>> 259 // failed to increment, check refcount again. >>>>>>>>> 260 refc = extract_refcount(now); >>>>>>>>> 261 if (refc == 0) { >>>>>>>>> 262 return false; // just died >>>>>>>>> 263 } else if (refc == PERM_REFCOUNT) { >>>>>>>>> 264 return true; // just became permanent >>>>>>>>> 265 } >>>>>>>>> 266 old_value = now; // refcount changed, try again >>>>>>>>> 267 } >>>>>>>>> 268 return true; >>>>>>>>> 269 } >>>>>>>>> >>>>>>>>> >>>>>>>>> So is it valid for Symbol::try_increment_refcount() to return false? SymbolTable::lookup_dynamic() seems to suggest YES, but Symbol::increment_refcount() seems to suggest NO. >>>>>>>> True. If you are looking up a symbol and someone other thread has decremented the refcount to zero, this symbol should not be returned. My test exercises this code even without the concurrent hashtable. When the hashtable is concurrent, a zero-ed Symbol could be deallocated so we don't want to return it. >>>>>>>> >>>>>>> I think the following should be added as a comment in increment_refcount(). >>>>>>>> In the case where you call increment_refcount() not during lookup, it is assumed that you have a symbol with a non-zero refcount and it can't go away while you are holding it. >>>>>> Ok, added. >>>>>>>>> If it's always an invalid condition, I think the fatal() should be moved inside try_increment_refcount. >>>>>>>>> >>>>>>>> It isn't fatal at lookup. The lookup must skip a zero-ed entry. 
>>>>>>>>> Otherwise, I think you need to add comments in all 3 places, to say when it's possible to get a 0 refcount, and when it's not. And, it might be worth expanding on why "zero is bad" :-) >>>>>>>> How about this comment to try_increment_refcount: >>>>>>>> >>>>>>>> // Increment refcount while checking for zero. If the Symbol's refcount becomes zero >>>>>>>> // a thread could be concurrently removing the Symbol. This is used during SymbolTable >>>>>>>> // lookup to avoid reviving a dead Symbol. >>>>>>> Sounds good. >>>>>> Thanks, Ioi. >>>>>> Coleen >>>>>>> Thanks >>>>>>> - Ioi >>>>>>> >>>>>>>>> My guess is: >>>>>>>>> + if you're doing a lookup, you might be seeing Symbols that have already been marked for deletion, which is indicated by a 0 refcount. You want to skip such Symbols. >>>>>>>>> >>>>>>>>> + if you're incrementing the refcount, that means you're holding a valid Symbol, which means this Symbol should have never been marked for deletion. >>>>>>>>> >>>>>>>>> Is this correct? >>>>>>>> Yes, both true. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>>>> Thanks >>>>>>>>> - Ioi >>>>>>>>> >>>>>>>>> >>>>>>>>> On 7/17/18 2:08 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>> Gerard, thank you for the code review. >>>>>>>>>> >>>>>>>>>> On 7/17/18 4:13 PM, Gerard Ziemski wrote: >>>>>>>>>>> Thank you Coleen (and Kim)! >>>>>>>>>>> >>>>>>>>>>> #1 Need copyright year updates: >>>>>>>>>>> >>>>>>>>>>> src/hotspot/share/oops/symbol.cpp >>>>>>>>>>> src/hotspot/share/classfile/symbolTable.cpp >>>>>>>>>>> src/hotspot/share/classfile/compactHashtable.inline.hpp >>>>>>>>>> Yes, I'll update with my commit. 
>>>>>>>>>>> #2 What?s the purpose of this code in src/hotspot/share/oops/symbol.cpp >>>>>>>>>>> >>>>>>>>>>> 38 STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1)); >>>>>>>>>>> >>>>>>>>>>> when we have: >>>>>>>>>>> >>>>>>>>>>> 117 enum { >>>>>>>>>>> 118 // max_symbol_length is constrained by type of _length >>>>>>>>>>> 119 max_symbol_length = (1 << 16) -1 >>>>>>>>>>> 120 }; >>>>>>>>>>> >>>>>>>>>>> Wouldn?t that always be true? Is it to make sure that nobody changes max_symbol_length, because the implementation needs it to be that? If so, should we add comment to: >>>>>>>>>>> >>>>>>>>>>> 119 max_symbol_length = (1 << 16) -1 >>>>>>>>>>> >>>>>>>>>>> with a big warning of some sorts? >>>>>>>>>> Yes, it's so that we can store the length of the symbol into 16 bits. >>>>>>>>>> >>>>>>>>>> How I change the comment above max_symbol_length from: >>>>>>>>>> >>>>>>>>>> // max_symbol_length is constrained by type of _length >>>>>>>>>> >>>>>>>>>> to >>>>>>>>>> >>>>>>>>>> // max_symbol_length must fit into the top 16 bits of _length_and_refcount >>>>>>>>>> >>>>>>>>>>> #3 If we have: >>>>>>>>>>> >>>>>>>>>>> 39 STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1)); >>>>>>>>>>> >>>>>>>>>>> then why not >>>>>>>>>>> >>>>>>>>>>> 101 #define PERM_REFCOUNT ((1 << 16) - 1)) // 0xffff >>>>>>>>>>> >>>>>>>>>>> or >>>>>>>>>>> 39 STATIC_ASSERT(PERM_REFCOUNT == 0xffff; >>>>>>>>>>> 101 #define PERM_REFCOUNT 0xffff >>>>>>>>>>> >>>>>>>>>> I can change PERM_REFCOUNT to ((1 << 16)) -1) to be consistent. 
>>>>>>>>>> >>>>>>>>>>> #4 We have: >>>>>>>>>>> >>>>>>>>>>> 221 void Symbol::increment_refcount() { >>>>>>>>>>> 222 if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >>>>>>>>>>> 223 if (!try_increment_refcount()) { >>>>>>>>>>> 224 #ifdef ASSERT >>>>>>>>>>> 225 print(); >>>>>>>>>>> 226 #endif >>>>>>>>>>> 227 fatal("refcount has gone to zero"); >>>>>>>>>>> >>>>>>>>>>> but >>>>>>>>>>> >>>>>>>>>>> 233 void Symbol::decrement_refcount() { >>>>>>>>>>> 234 if (refcount() != PERM_REFCOUNT) { // not a permanent symbol >>>>>>>>>>> 235 int new_value = Atomic::sub((uint32_t)1, &_length_and_refcount); >>>>>>>>>>> 236 #ifdef ASSERT >>>>>>>>>>> 237 // Check if we have transitioned to 0xffff >>>>>>>>>>> 238 if (extract_refcount(new_value) == PERM_REFCOUNT) { >>>>>>>>>>> 239 print(); >>>>>>>>>>> 240 fatal("refcount underflow"); >>>>>>>>>>> 241 } >>>>>>>>>>> 242 #endif >>>>>>>>>>> >>>>>>>>>>> Where the line: >>>>>>>>>>> >>>>>>>>>>> 240 fatal("refcount underflow?); >>>>>>>>>>> >>>>>>>>>>> is inside #ifdef ASSERT, but: >>>>>>>>>>> >>>>>>>>>>> 227 fatal("refcount has gone to zero?); >>>>>>>>>>> >>>>>>>>>>> is outside. Shouldn't ?fatal" be consistent in both? >>>>>>>>>>> >>>>>>>>>> I was thought that looked strange too. I'll move the #endif from 226 to after 227. >>>>>>>>>> >>>>>>>>>> Thank you for reviewing the code! >>>>>>>>>> Coleen >>>>>>>>>> >>>>>>>>>>> cheers >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>> >>>>>>>>>>>> Summary: Use cmpxchg for non permanent symbol refcounting, and pack refcount and length into an int. >>>>>>>>>>>> >>>>>>>>>>>> This is a precurser change to the concurrent SymbolTable change. Zeroed refcounted entries can be deleted at anytime so they cannot be allowed to be zero in runtime code. Thanks to Kim for writing the packing function and helping me avoid undefined behavior. 
>>>>>>>>>>>> >>>>>>>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.01/webrev >>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8207359 >>>>>>>>>>>> >>>>>>>>>>>> Tested with solaris ptrace helper, mach5 tier1-5 including solaris. Added multithreaded gtest which exercises the code. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Coleen From jcbeyler at google.com Fri Jul 20 15:30:37 2018 From: jcbeyler at google.com (JC Beyler) Date: Fri, 20 Jul 2018 08:30:37 -0700 Subject: RFR (S): C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> Message-ID: Awesome thanks Thomas! Here is the webrev with the extra information then: http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/ Thanks again for all the reviews everyone! Jc On Fri, Jul 20, 2018 at 3:23 AM Thomas Schatzl wrote: > Hi, > > On Mon, 2018-07-16 at 14:58 -0700, JC Beyler wrote: > > Hi all, > > > > Here is a webrev that does all the architectures in the same way: > > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ > > > > Could anyone review the other architectures and test? > > - arm, sparc & aarch64 are also modified now to follow the same "if > > no > > tlab, then consider eden space allocation" logic. > > > > Thanks for your help! > > Jc > > > > looks good. > > I ran the change through hs-tier1-3 with no issues. It only tests on > sparc and x64 though. > > I do not expect issues on the other platforms though :) > > Thanks, > Thomas > > -- Thanks, Jc From aph at redhat.com Fri Jul 20 15:30:51 2018 From: aph at redhat.com (Andrew Haley) Date: Fri, 20 Jul 2018 16:30:51 +0100 Subject: [aarch64-port-dev ] RFR: 8207838: AArch64: fix the order in which float registers are restored in restore_args In-Reply-To: References: Message-ID: On 07/19/2018 08:39 AM, Yangfei (Felix) wrote: > Is it OK for jdk/jdk11? Great catch! 
That bug was committed by me on Tue Apr 30 2013, which makes it more than five years old. I think that's a record for AArch64. I like the patch, but I think it'll need a proper jtreg test case. It's useful to test the slow JNI locking path on all arches, not just AArch64. You can make the test case fail more reliably by increasing the contention like this: public void run() { for (int i = 0; i < 1000; i++) { float d = JniStaticContextFloat.staticMethodFloat1((float) (1), (float) (2), (float) (4), (float) (8)); } } Thanks. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From coleen.phillimore at oracle.com Fri Jul 20 16:12:40 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 20 Jul 2018 12:12:40 -0400 Subject: [11] RFR 8203820: [TESTBUG] vmTestbase/metaspace/staticReferences/StaticReferences.java timed out Message-ID: Summary: Moved InMemoryJavaCompiler out of loops or reduced loops with InMemoryJavaCompiler I also reformatted StressRedefine.java which had the same problem as the two in the bug report. These tests were timing out in test runs in the javac compiler. See bug for more detail. open webrev at http://cr.openjdk.java.net/~coleenp/8203820.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8203820 Thanks, Coleen From vladimir.kozlov at oracle.com Fri Jul 20 16:18:52 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 20 Jul 2018 09:18:52 -0700 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com> <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com> Message-ID: <782c616f-128f-fadc-99e2-f74fe360567a@oracle.com> My testing also passed clean.
I tested next patch: https://s3-us-west-2.amazonaws.com/openjdk-webrevs/jdk/label_bugfix/index.html Please, post it on cr.openjdk server for final review. We can't review and use patches from other places. Thanks, Vladimir On 7/20/18 12:29 AM, Lindenmaier, Goetz wrote: > Hi Liu, > > Martin had put the patch into our testing queue. > All the platforms we build are fine. > This are: windows x86_64, linux: ppc64, ppc64le, x86_64, s390x, > aix ppc64, solaris sparcv9, mac. > > Best regards, > Goetz. > >> -----Original Message----- >> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >> bounces at openjdk.java.net] On Behalf Of Liu Xin >> Sent: Freitag, 20. Juli 2018 09:16 >> To: Vladimir Kozlov >> Cc: hotspot-runtime-dev at openjdk.java.net >> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for >> x86 >> >> Hello, Vladimir, >> Could you run on other platform on behalf of Martin? >> I locally tested on x86_64. I hope the Reviewer can help me verify it works >> on other platforms. >> >> >> Furthermore, I am sure if we should add this additional patch. >> Label class is not POD, we should properly call constructor /destructor >> even though those labels are allocated on arena. >> >> >> thanks, >> --lx >> >> On Wed, Jul 18, 2018 at 4:07 AM, Doerr, Martin >> wrote: >> >>> Hi Liu Xin, >>> >>> >>> >>> thanks for understanding my point and checking other places. >>> >>> >>> >>> The templateTable_x86.cpp was reviewed by me. >>> >>> I can?t review the label assertion before my vacation. If other reviewers >>> are convinced that the it is correct, ok. >>> >>> >>> >>> Would be great if somebody could assist with testing other platforms. >>> >>> >>> >>> Best regards, >>> >>> Martin >>> >>> >>> >>> >>> >>> *From:* Liu Xin [mailto:navy.xliu at gmail.com] >>> *Sent:* Dienstag, 17. 
Juli 2018 19:17 >>> >>> *To:* Doerr, Martin >>> *Cc:* hotspot-runtime-dev at openjdk.java.net >>> *Subject:* Re: RFR(S): 8206075: add assertion for unbound assembler >>> Labels for x86 >>> >>> >>> >>> Hi, Martin, >>> >>> >>> >>> Thank you for the feedback. >>> >>> >>> >>> I totally agree with you that we shouldn?t introduce false positive >>> assertion. Let?s insist on the high bar here. >>> >>> I browsed many sources in hotspot recently. Hotspot is the most monolithic >>> software I ever seen. I am glad to be directed by a guidance and clear >>> target. >>> >>> >>> >>> I think I dealt with c1 bailout case. This case triggers "codebuffer >>> overflow" in middle of c1 compilation. >>> >>> compiler/codegen/TestCharVect2.java >>> >>> >>> >>> I am still not sure about c2 bailout case. Let me try to make one. >>> >>> >>> >>> For case #2, I got what you concerned. Indeed, the generated ad_x86.cpp >>> contains many emits methods for MachNode. I will double-check if they >> could >>> leave unused labels. >>> >>> >>> >>> Thanks, >>> >>> ?lx >>> >>> >>> >>> >>> >>> On Jul 16, 2018, at 2:09 PM, Liu Xin wrote: >>> >>> >>> >>> Hi, List, >>> >>> >>> >>> Could you review this new revision? >>> >>> https://s3-us-west-2.amazonaws.com/openjdk-webrevs/ >>> jdk/label_bugfix/index.html >>> >>> >>> >>> >>> >>> i) I took a look at all architectures, arm/aarch64/ppc64/sparc/x86. I >>> don?t understand all the assemblies, but I think they are guarded >>> for UseOnStackReplacement >>> >>> in templateTable_xxx.cpp ::branch(bool is_jsr, bool is_wide). >>> >>> >>> >>> TemplateTable_arm.cpp is a little different. It explicitly binds it later. >>> >>> if (!UseOnStackReplacement) { >>> >>> __ bind(backedge_counter_overflow); >>> >>> } >>> >>> >>> >>> i) I checked the Compile::scratch_emit_size. It only uses the label fakeL >>> for those MachBranch nodes. >>> >>> Because fakeL will be bound to a trivial address if the nodes are >>> MachBranch, It?s also safe for the assertion. 
>>> >>> >>> >>> bool is_branch = n->is_MachBranch(); >>> >>> if (is_branch) { >>> >>> MacroAssembler masm(&buf); >>> >>> masm.bind(fakeL); >>> >>> n->as_MachBranch()->save_label(&saveL, &save_bnum); >>> >>> n->as_MachBranch()->label_set(&fakeL, 0); >>> >>> } >>> >>> >>> >>> Thanks, >>> >>> --lx >>> >>> >>> >>> >>> >>> >>> >>> On Jul 16, 2018, at 1:30 AM, Doerr, Martin wrote: >>> >>> >>> >>> Hi Liu Xin, >>> >>> >>> >>> thanks for changing. >>> >>> >>> >>>> The background of this Assertion is that our engineer used to spend >> many >>> hour to trace down a corner case. >>> >>>> it's trivial if fastdebug/slowdebug stop and tell you immediately. >>> >>> >>> >>> I understand that. But an assertion should only get added when we are >>> convinced that it won't produce false positives. >>> >>> It's very annoying if long running tests break due to an incorrect >>> assertion after running many days. >>> >>> >>> >>>> I am curious about this "We also may generate code with the purpose to >>> determine its size.". >>> >>>> Could you tell me where is it? it looks quite slow to get buffer size in >>> this way. >>> >>> >>> >>> C2 Compiler does that in Compile::scratch_emit_size. >>> >>> >>> >>> Please note that I'll be on vacation soon, so other people will have to >>> review. >>> >>> Thanks again for fixing the -XX:-UseOnStackReplacement issue. >>> >>> >>> >>> Best regards, >>> >>> Martin >>> >>> >>> >>> >>> >>> *From:* Liu Xin [mailto:navy.xliu at gmail.com ] >>> *Sent:* Freitag, 13. Juli 2018 22:30 >>> *To:* Doerr, Martin >>> *Cc:* hotspot-runtime-dev at openjdk.java.net >>> *Subject:* Re: RFR(S): 8206075: add assertion for unbound assembler >>> Labels for x86 >>> >>> >>> >>> Hello, Martin, >>> >>> >>> >>> Thanks for reviewing it. >>> >>> >>> >>> I got your point. I made it "if (where != NULL) { jcc(cond, *where); }" >>> and is running tests. >>> >>> >>> >>> The background of this Assertion is that our engineer used to spend many >>> hour to trace down a corner case.
it's trivial if fastdebug/slowdebug stop >>> and tell you immediately. >>> >>> >>> >>> I am curious about this "We also may generate code with the purpose to >>> determine its size.". Could you tell me where is it? it looks quite slow >>> to get buffer size in this way. >>> >>> >>> >>> thanks, >>> >>> --lx >>> >>> >>> >>> >>> >>> On Fri, Jul 13, 2018 at 2:54 AM, Doerr, Martin >>> wrote: >>> >>> Hi, >>> >>> thanks for fixing the issue in templateTable_x86. It looks correct. >>> I think even better would be >>> "UseOnStackReplacement ? &backedge_counter_overflow : NULL" >>> and >>> "if (where != NULL) { jcc(cond, *where); }" in interp_masm_x86.cpp. >>> But I leave it up to you if you want to change it. I'm also ok with your >>> version. >>> >>> I'm not convinced that the label assertion is reliable. There may be many >>> more places in hotspot where we bail out having an unbound label. >> Running a >>> few tests on x86 is by far not sufficient. The assertion may fire >>> sporadically in large scenarios on some platforms. >>> >>> Best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>> bounces at openjdk.java.net] On Behalf Of Liu Xin >>> Sent: Donnerstag, 12. Juli 2018 22:51 >>> To: hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels >>> for x86 >>> >>> Could you review this patch again? >>> >>> Revision #2. >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 >> ttps://bugs.openjdk.java.net/browse/JDK-8206075> >>> CR: https://s3-us-west-2.amazonaws.com/openjdk-webrevs/ >>> openjdk8u/webrev/index.html >> com/openjdk-webrevs/openjdk8u/webrev/index.html> >>> >>> >>> >>> The idea is simple. I just reset the problematic label when c1 compilation >>> bailout happen. >>> I manually ran tier1 on my laptop. it can pass all of them. >>> Paul help me submit the patch to submit and here is the run result. 
>>> Build Details: 2018-07-12-1736388.hohensee.source >>> >>> 0 Failed Tests >>> >>> Mach5 Tasks Results Summary >>> >>> PASSED: 75 >>> UNABLE_TO_RUN: 0 >>> KILLED: 0 >>> NA: 0 >>> FAILED: 0 >>> EXECUTED_WITH_FAILURE: 0 >>> >>> >>> Thanks, >>> ?lx >>>> On Jul 11, 2018, at 10:35 AM, Liu Xin wrote: >>>> >>>> Thank you for your reviews. Indeed, I didn?t deal with bailout >>> situation. "compiler/codegen/TestCharVect2.java? is the case of >>> codeBuffer overflow and leave a unbound label behind. >>>> I made another revision. I will run tests thoroughly. >>>> >>>> Thanks, >>>> ?lx >>>> >>>>> On Jul 11, 2018, at 7:49 AM, Hohensee, Paul >>> wrote: >>>>> >>>>> Imo it's still good hygiene to require that Labels be bound if they're >>> used, even if the generated code will never be executed. E.g., code that >>> generates code for sizing purposes may be repurposed to generate >> executable >>> code, in which case an unbound label may be a lurking bug. Also, I'm >>> unaware (I may be corrected!) of any situation where bailing out happens >> in >>> such a way as to both leave a Label unbound and execute its destructor. >>> Even if there are, I'd say that'd be indicative of another real problem, >>> such as code buffer overflow, so no harm would result. >>>>> >>>>> Thanks, >>>>> >>>>> Paul >>>>> >>>>> ?On 7/11/18, 3:41 AM, "hotspot-runtime-dev on behalf of Doerr, Martin" >> < >>> hotspot-runtime-dev-bounces at openjdk.java.net on behalf of >>> martin.doerr at sap.com> wrote: >>>>> >>>>> Hi, >>>>> >>>>> I think the idea is good, but doesn't work in all cases. >>>>> We may bail out from code generation and discard the generated code >>> leaving the label unbound. >>>>> We also may generate code with the purpose to determine its size. We >>> don't need to bind labels because the code will never get executed. 
>>>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>> bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov >>>>> Sent: Mittwoch, 11. Juli 2018 03:34 >>>>> To: Liu Xin ; hotspot >>> -runtime-dev at openjdk.java.net >>>>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler >>> Labels for x86 >>>>> >>>>> I hit new assert in few other tests: >>>>> >>>>> compiler/codegen/TestCharVect2.java >>>>> compiler/c2/cr6340864/* >>>>> >>>>> Regards, >>>>> Vladimir >>>>> >>>>> On 7/10/18 5:08 PM, Vladimir Kozlov wrote: >>>>>> Fix looks reasonable. I will test it in our framework. >>>>>> >>>>>> Thanks, >>>>>> Vladimir >>>>>> >>>>>> On 7/10/18 9:50 AM, Liu Xin wrote: >>>>>>> Hi, Community, >>>>>>> Could you please review this small patch? >>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 >>>>>>> >>>>>>> CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/ >>>>>>> >>>>>>> Problem: >>>>>>> X86-32/64 will leave an unbound label if UseOnStackReplacement is >> OFF. >>>>>>> This patch align up x86 with other architectures(ppc, arm). >>>>>>> Add an assertion to the destructor of Label. It will be wiped out in >>> release build. >>>>>>> Previously, hotspot cannot pass this test with assertion on x86-64. >>>>>>> make run-test >> TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java >>>>>>> If this CR is approved, Paul Hohensee will push it. 
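The proposal quoted above (an assertion in the destructor of Label that disappears in release builds) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class below is a hypothetical, much-simplified stand-in and not HotSpot's actual Label, the member names add_pending_branch, reset and safe_to_destroy are invented for the example, and the standard assert macro stands in for HotSpot's own debug-only assertion facility.

```cpp
#include <cassert>

// Hypothetical, much-simplified stand-in for an assembler label.
// It only tracks what is needed to illustrate the destructor check.
class Label {
 public:
  // Record that a branch was emitted against this label before it was bound.
  void add_pending_branch() { ++_pending; }

  // Bind the label to a code offset; pending branches count as patched.
  void bind(int offset) { _offset = offset; _pending = 0; }

  // Forget all state, e.g. when a compilation bails out and the
  // generated code is discarded.
  void reset() { _offset = -1; _pending = 0; }

  bool is_bound() const { return _offset >= 0; }
  bool is_unused() const { return _offset < 0 && _pending == 0; }

  // A label may be destroyed if it was bound or was never used.
  bool safe_to_destroy() const { return is_bound() || is_unused(); }

  ~Label() {
    // Compiled out when NDEBUG is defined, i.e. in a release-style build.
    assert(safe_to_destroy() && "Label was used but never bound");
  }

 private:
  int _offset = -1;  // -1 means "not bound yet"
  int _pending = 0;  // branches still waiting to be patched
};
```

In this sketch the bailout and size-estimation cases raised in the review correspond to a label that has pending branches when it goes out of scope; calling reset() (or binding to a dummy offset) before destruction is what keeps the check from firing.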
>>>>>>> Thanks,
>>>>>>> --lx
>>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>>
>>>

From coleen.phillimore at oracle.com Fri Jul 20 16:28:11 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 20 Jul 2018 12:28:11 -0400
Subject: [11] RFR 8203820: [TESTBUG] vmTestbase/metaspace/staticReferences/StaticReferences.java timed out
In-Reply-To: <84318482-cd2d-9643-2605-e6ebc63f988a@oracle.com>
References: <84318482-cd2d-9643-2605-e6ebc63f988a@oracle.com>
Message-ID: <74f74ca9-dfd6-903f-3b2a-52cbf78db2f2@oracle.com>

Thank you for the quick review!
Coleen

On 7/20/18 12:24 PM, Vicente Romero wrote:
> looks good to me,
>
> Thanks,
> Vicente
>
> On 07/20/2018 12:12 PM, coleen.phillimore at oracle.com wrote:
>> Summary: Moved InMemoryJavaCompiler out of loops or reduced loops
>> with InMemoryJavaCompiler
>>
>> I also reformatted StressRedefine.java which had the same problem as
>> the two in the bug report.
>>
>> These tests were timing out in test runs in the javac compiler. See
>> bug for more detail.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8203820.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8203820
>>
>> Thanks,
>> Coleen
>

From vaibhav.x.choudhary at oracle.com Thu Jul 12 15:40:31 2018
From: vaibhav.x.choudhary at oracle.com (Vaibhav Choudhary)
Date: Thu, 12 Jul 2018 21:10:31 +0530
Subject: RFR:8189762: [TESTBUG] Create tests for JDK-8146115 container awareness and resource configuration
Message-ID: <1AC35B08-0127-469D-B2B4-0EDDC8B2CEFF@oracle.com>

Hi,

Please review the following backport of a test enhancement for JDK8u, written for container awareness.

Webrev: http://cr.openjdk.java.net/~rpatil/8189762/webrev.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8189762 [TESTBUG] Create tests for JDK-8146115 container awareness and resource configuration

It's a backport from JDK10.
JDK10 changeset: http://hg.openjdk.java.net/jdk/jdk/rev/d6d00f785f39
JDK10 review thread: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-November/025086.html

Description:
The tests are very similar to those in JDK10 but differ in the logging mechanism: -XX options such as UseContainerSupport and PrintContainerInfo are used in place of -Xlog. A few changes have been made in the Util files to make the tests compatible.

Testing:
Testing has been done on Ubuntu with and without a Docker environment.

From vicente.romero at oracle.com Fri Jul 20 16:24:03 2018
From: vicente.romero at oracle.com (Vicente Romero)
Date: Fri, 20 Jul 2018 12:24:03 -0400
Subject: [11] RFR 8203820: [TESTBUG] vmTestbase/metaspace/staticReferences/StaticReferences.java timed out
In-Reply-To:
References:
Message-ID: <84318482-cd2d-9643-2605-e6ebc63f988a@oracle.com>

looks good to me,

Thanks,
Vicente

On 07/20/2018 12:12 PM, coleen.phillimore at oracle.com wrote:
> Summary: Moved InMemoryJavaCompiler out of loops or reduced loops with
> InMemoryJavaCompiler
>
> I also reformatted StressRedefine.java which had the same problem as
> the two in the bug report.
>
> These tests were timing out in test runs in the javac compiler. See bug
> for more detail.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8203820.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8203820
>
> Thanks,
> Coleen

From ioi.lam at oracle.com Fri Jul 20 17:44:36 2018
From: ioi.lam at oracle.com (Ioi Lam)
Date: Fri, 20 Jul 2018 10:44:36 -0700
Subject: RFR 12 (XXS) 8203382 Rename SystemDictionary::initialize_wk_klass to resolve_wk_klass
Message-ID: <733ed870-a7c1-b0c5-ec03-5b5bdae478e8@oracle.com>

Hi,

Please review this very simple renaming change:

https://bugs.openjdk.java.net/browse/JDK-8203382
http://cr.openjdk.java.net/~iklam/jdk12/8203382_rename_initialize_wk_klass.v01/

    initialize_wk_klass -> resolve_wk_klass
    initialize_preloaded_classes -> resolve_preloaded_classes

because Java class initialization is not actually happening inside these functions.

Thanks
- Ioi

From vladimir.kozlov at oracle.com Fri Jul 20 17:52:56 2018
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 20 Jul 2018 10:52:56 -0700
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
In-Reply-To: <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com>
References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com>
Message-ID: <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com>

I asked Igor V. to look.

It seems the review is being done in another thread that does not have the bug ID in its subject. The current webrev is webrev.03.

Vladimir

On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote:
> Thanks, Rahul!
> In fact, there no good experts for this area in the serviceability team.
> It would be much better if anyone from the Compiler team could do it.
>
> Vladimir K.,
>
> Is there anyone from the Compiler team available to review this?
> Otherwise, I could try to review it but am not sure about my review
> quality.
>
> Thanks,
> Serguei
>
>
> On 7/19/18 00:48, Rahul Raghavan wrote:
>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
>>
>> (just adding + hotspot-compiler-dev also)
>>
>>
>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote:
>> Subject Was:
>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled
>>
>> + serviceability-dev
>>
>> Hi all,
>>
>> Could anyone else give me a review of this webrev and check/test the
>> various architecture changes?
>>
>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>>
>>
>> Thanks for all your help!
>> Jc >> >> >>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler wrote: >>> >>>> Hi all, >>>> >>>> Here is a webrev that does all the architectures in the same way: >>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >>>> >>>> Could anyone review the other architectures and test? >>>> ?? - arm, sparc & aarch64 are also modified now to follow the same >>>> "if no >>>> tlab, then consider eden space allocation" logic. >>>> >>>> Thanks for your help! >>>> Jc >>>> >>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler wrote: >>>> >>>>> Hi Kim, >>>>> >>>>> I opened this bug >>>>> https://bugs.openjdk.java.net/browse/JDK-8190862 >>>>> >>>>> and now I've done an update: >>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ >>>>> >>>>> I basically have done your nits but also removed the try_eden (it was >>>>> used to bind a label but was not used). I updated the comments to >>>>> use the >>>>> one you preferred. >>>>> >>>>> I still have to do the other architectures though but at least we >>>>> seem to >>>>> have a consensus on this architecture, correct? >>>>> >>>>> Thanks for the review, >>>>> Jc >>>>> >>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett >>>>> wrote: >>>>> >>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler wrote: >>>>>>> >>>>>>> Yes, you are right, I did those changes due to: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084 >>>>>>> >>>>>>> If Robbin agrees to this change, and if no one sees an issue, >>>>>>> I'll go >>>>>> ahead >>>>>>> and propagate the change across architectures. >>>>>>> >>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's >>>>>>> comment >>>>>> and >>>>>>> review) :) >>>>>>> Jc >>>>>>> >>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose >>>>>> wrote: >>>>>>> >>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler >>>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> I'm not sure if we had left this case intentionally or not but, >>>>>>>> if we >>>>>> want >>>>>>>> it all to be consistent, we should perhaps fix it. 
>>>>>>>> >>>>>>>> >>>>>>>> Well, you put in that logic last February, so unless somebody >>>>>>>> speaks >>>>>> up >>>>>>>> quickly, I support your adjusting it to be the way you want it. >>>>>>>> >>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I >>>>>>>> src/hotspot/share" >>>>>>>> suggests that the GC group is most active in touching this feature. >>>>>>>> If Robbin is OK with it, there's your reviewer. >>>>>>>> >>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person >>>>>>>> working on the GC to OK it. >>>>>>>> >>>>>>>> ? John >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> Thanks, >>>>>>> Jc >>>>>> >>>>>> Robbin is on vacation; you might not hear from him for a while. >>>>>> >>>>>> I'm assuming you'll open a new bug for this? >>>>>> >>>>>> Except for a few minor nits (below), this looks okay to me. >>>>>> >>>>>> The comment at line 1052 needs updating. >>>>>> >>>>>> pre-existing: The retry_tlab label declared on line 1054 is unused. >>>>>> >>>>>> pre-existing: The try_eden label declared on line 1054 is bound at >>>>>> line 1058, but unreferenced. >>>>>> >>>>>> I like the wording of the comment at 1139 better than the wording at >>>>>> 1016. >>>>>> >>>>>> >>>>> >>>>> -- >>>>> >>>>> Thanks, >>>>> Jc >>>>> >>>> >>>> >>>> -- >>>> >>>> Thanks, >>>> Jc >>>> >>> >>> > From vladimir.kozlov at oracle.com Fri Jul 20 17:57:11 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 20 Jul 2018 10:57:11 -0700 Subject: RFR (S): C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> Message-ID: <22211468-5b15-e6a8-be6b-7ce5d2fbdf27@oracle.com> Please, don't do review in 2 mailing threads. Thanks, Vladimir On 7/20/18 8:30 AM, JC Beyler wrote: > Awesome thanks Thomas! 
>
> Here is the webrev with the extra information then:
> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/
>
> Thanks again for all the reviews everyone!
> Jc
>
> On Fri, Jul 20, 2018 at 3:23 AM Thomas Schatzl
> wrote:
>
>> Hi,
>>
>> On Mon, 2018-07-16 at 14:58 -0700, JC Beyler wrote:
>>> Hi all,
>>>
>>> Here is a webrev that does all the architectures in the same way:
>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>>>
>>> Could anyone review the other architectures and test?
>>> - arm, sparc & aarch64 are also modified now to follow the same "if
>>> no
>>> tlab, then consider eden space allocation" logic.
>>>
>>> Thanks for your help!
>>> Jc
>>>
>>
>> looks good.
>>
>> I ran the change through hs-tier1-3 with no issues. It only tests on
>> sparc and x64 though.
>>
>> I do not expect issues on the other platforms though :)
>>
>> Thanks,
>> Thomas
>>
>>
>

From jiangli.zhou at oracle.com Fri Jul 20 17:57:16 2018
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Fri, 20 Jul 2018 10:57:16 -0700
Subject: RFR 12 (XXS) 8203382 Rename SystemDictionary::initialize_wk_klass to resolve_wk_klass
In-Reply-To: <733ed870-a7c1-b0c5-ec03-5b5bdae478e8@oracle.com>
References: <733ed870-a7c1-b0c5-ec03-5b5bdae478e8@oracle.com>
Message-ID:

Looks good and trivial.

Thanks,
Jiangli

On 7/20/18 10:44 AM, Ioi Lam wrote:
> Hi,
>
> Please review this very simple renaming change:
>
> https://bugs.openjdk.java.net/browse/JDK-8203382
> http://cr.openjdk.java.net/~iklam/jdk12/8203382_rename_initialize_wk_klass.v01/
>
>
>     initialize_wk_klass -> resolve_wk_klass
>     initialize_preloaded_classes -> resolve_preloaded_classes
>
> because Java class initialization is not actually happening inside these
> functions.
>
> Thanks
> - Ioi
>
>

From navy.xliu at gmail.com Fri Jul 20 18:11:50 2018
From: navy.xliu at gmail.com (Liu Xin)
Date: Fri, 20 Jul 2018 11:11:50 -0700
Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86
In-Reply-To: <782c616f-128f-fadc-99e2-f74fe360567a@oracle.com>
References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com> <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com> <782c616f-128f-fadc-99e2-f74fe360567a@oracle.com>
Message-ID:

Thanks, Vladimir and Goetz. Could you approve what you tested?

As for the additional patch, I think it's another story. I am *NOT* sure whether we need it. It's about the C++ object model. I feel hotspot is using C++ in a non-standard way, and I am confused about how C++ is used in hotspot. In regular C++, we should manage the life cycle of objects carefully. If you take a look at the usages of this macro, some non-POD classes are used directly without being constructed.

#define NEW_RESOURCE_ARRAY(type, size)\
  (type*) resource_allocate_bytes((size) * sizeof(type))

e.g.
VMRegPair* out_regs = NEW_RESOURCE_ARRAY(VMRegPair, total_c_args);

May I create a new RFR to enhance it? I want to introduce a meta-programming template like boost's is_pod.
https://www.boost.org/doc/libs/1_44_0/libs/type_traits/doc/html/boost_typetraits/reference/is_pod.html

NEW_RESOURCE_ARRAY should call constructors for those classes which are not POD.

thanks,
--lx

On Fri, Jul 20, 2018 at 9:18 AM, Vladimir Kozlov wrote:
> My testing also passed clean. I tested next patch:
>
> https://s3-us-west-2.amazonaws.com/openjdk-webrevs/jdk/
> label_bugfix/index.html
>
> Please, post it on cr.openjdk server for final review. We can't review and
> use patches from other places.
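Regarding the NEW_RESOURCE_ARRAY idea in Liu Xin's message above (construct the elements when the type is not POD), a minimal sketch of that shape might look like the following. Everything here is an assumption made for illustration: raw_allocate_bytes is a hypothetical stand-in for resource_allocate_bytes and simply calls malloc, new_array_constructed and Counted are invented names, and the standard trait std::is_trivially_default_constructible is swapped in for the boost is_pod trait mentioned in the message. It is not the HotSpot macro and not a proposed patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <type_traits>

// Hypothetical stand-in for an arena allocator; plain malloc here.
inline void* raw_allocate_bytes(std::size_t bytes) {
  return std::malloc(bytes);
}

// Allocate raw storage for `count` objects of type T and run the default
// constructor on each element unless T is trivially default constructible.
// Destructors are never run, mirroring arena-style bulk release.
template <typename T>
T* new_array_constructed(std::size_t count) {
  assert(count > 0 && "sketch only handles non-empty arrays");
  void* raw = raw_allocate_bytes(count * sizeof(T));
  assert(raw != nullptr && "allocation failed");
  T* elems = static_cast<T*>(raw);
  if (!std::is_trivially_default_constructible<T>::value) {
    for (std::size_t i = 0; i < count; ++i) {
      ::new (static_cast<void*>(elems + i)) T();  // placement new
    }
  }
  return elems;
}

// Hypothetical non-trivial element type used to exercise the helper.
struct Counted {
  int value;
  Counted() : value(42) {}
};
```

For example, new_array_constructed<Counted>(3) would yield three elements whose value field has been set by the constructor, while new_array_constructed<int>(4) would skip the construction loop and leave the storage uninitialized, as the plain macro does today. One limitation of the sketch: a type without a default constructor would not compile with it.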
> > Thanks, > Vladimir > > > On 7/20/18 12:29 AM, Lindenmaier, Goetz wrote: > >> Hi Liu, >> >> Martin had put the patch into our testing queue. >> All the platforms we build are fine. >> This are: windows x86_64, linux: ppc64, ppc64le, x86_64, s390x, >> aix ppc64, solaris sparcv9, mac. >> >> Best regards, >> Goetz. >> >> -----Original Message----- >>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>> bounces at openjdk.java.net] On Behalf Of Liu Xin >>> Sent: Freitag, 20. Juli 2018 09:16 >>> To: Vladimir Kozlov >>> Cc: hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels >>> for >>> x86 >>> >>> Hello, Vladimir, >>> Could you run on other platform on behalf of Martin? >>> I locally tested on x86_64. I hope the Reviewer can help me verify it >>> works >>> on other platforms. >>> >>> >>> Furthermore, I am sure if we should add this additional patch. >>> Label class is not POD, we should properly call constructor /destructor >>> even though those labels are allocated on arena. >>> >>> >>> thanks, >>> --lx >>> >>> On Wed, Jul 18, 2018 at 4:07 AM, Doerr, Martin >>> wrote: >>> >>> Hi Liu Xin, >>>> >>>> >>>> >>>> thanks for understanding my point and checking other places. >>>> >>>> >>>> >>>> The templateTable_x86.cpp was reviewed by me. >>>> >>>> I can?t review the label assertion before my vacation. If other >>>> reviewers >>>> are convinced that the it is correct, ok. >>>> >>>> >>>> >>>> Would be great if somebody could assist with testing other platforms. >>>> >>>> >>>> >>>> Best regards, >>>> >>>> Martin >>>> >>>> >>>> >>>> >>>> >>>> *From:* Liu Xin [mailto:navy.xliu at gmail.com] >>>> *Sent:* Dienstag, 17. Juli 2018 19:17 >>>> >>>> *To:* Doerr, Martin >>>> *Cc:* hotspot-runtime-dev at openjdk.java.net >>>> *Subject:* Re: RFR(S): 8206075: add assertion for unbound assembler >>>> Labels for x86 >>>> >>>> >>>> >>>> Hi, Martin, >>>> >>>> >>>> >>>> Thank you for the feedback. 
>>>> >>>> >>>> >>>> I totally agree with you that we shouldn?t introduce false positive >>>> assertion. Let?s insist on the high bar here. >>>> >>>> I browsed many sources in hotspot recently. Hotspot is the most >>>> monolithic >>>> software I ever seen. I am glad to be directed by a guidance and clear >>>> target. >>>> >>>> >>>> >>>> I think I dealt with c1 bailout case. This case triggers "codebuffer >>>> overflow" in middle of c1 compilation. >>>> >>>> compiler/codegen/TestCharVect2.java >>>> >>>> >>>> >>>> I am still not sure about c2 bailout case. Let me try to make one. >>>> >>>> >>>> >>>> For case #2, I got what you concerned. Indeed, the generated ad_x86.cpp >>>> contains many emits methods for MachNode. I will double-check if they >>>> >>> could >>> >>>> leave unused labels. >>>> >>>> >>>> >>>> Thanks, >>>> >>>> ?lx >>>> >>>> >>>> >>>> >>>> >>>> On Jul 16, 2018, at 2:09 PM, Liu Xin wrote: >>>> >>>> >>>> >>>> Hi, List, >>>> >>>> >>>> >>>> Could you review this new revision? >>>> >>>> https://s3-us-west-2.amazonaws.com/openjdk-webrevs/ >>>> jdk/label_bugfix/index.html >>>> >>>> >>>> >>>> >>>> >>>> i) I took a look at all architectures, arm/aarch64/ppc64/sparc/x86. I >>>> don?t understand all the assemblies, but I think they are guarded >>>> for UseOnStackReplacement >>>> >>>> in templateTable_xxx.cpp ::branch(bool is_jsr, bool is_wide). >>>> >>>> >>>> >>>> TemplateTable_arm.cpp is a little different. It explicitly binds it >>>> later. >>>> >>>> if (!UseOnStackReplacement) { >>>> >>>> __ bind(backedge_counter_overflow); >>>> >>>> } >>>> >>>> >>>> >>>> i) I checked the Compile::scratch_emit_size. It only uses the label >>>> fakeL >>>> for those MachBranch nodes. >>>> >>>> Because fakeL will be bound to a trivial address if the nodes are >>>> MachBranch, It?s also safe for the assertion. 
>>>> >>>> >>>> >>>> bool is_branch = n->is_MachBranch(); >>>> >>>> if (is_branch) { >>>> >>>> MacroAssembler masm(&buf); >>>> >>>> masm.bind(fakeL); >>>> >>>> n->as_MachBranch()->save_label(&saveL, &save_bnum); >>>> >>>> n->as_MachBranch()->label_set(&fakeL, 0); >>>> >>>> } >>>> >>>> >>>> >>>> Thanks, >>>> >>>> ?lx >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> On Jul 16, 2018, at 1:30 AM, Doerr, Martin >>>> wrote: >>>> >>>> >>>> >>>> Hi Liu Xin, >>>> >>>> >>>> >>>> thanks for changing. >>>> >>>> >>>> >>>> The background of this Assertion is that our engineer used to spend >>>>> >>>> many >>> >>>> hour to trace down a corner case. >>>> >>>> it's trivial if fastdebug/slowdebug stop and tell you immediately. >>>>> >>>> >>>> >>>> >>>> I understand that. But an assertion should only get added when we are >>>> convinced that it won?t produce false positives. >>>> >>>> It?s very annoying if long running tests break due to an incorrect >>>> assertion after running many days. >>>> >>>> >>>> >>>> I am curious about this "We also may generate code with the purpose to >>>>> >>>> determine its size.". >>>> >>>> Could you tell me where is it? it looks quite slow to get buffer size in >>>>> >>>> this way. >>>> >>>> >>>> >>>> C2 Compiler does that in Compile::scratch_emit_size. >>>> >>>> >>>> >>>> Please note that I?ll be on vacation soon, so other people will have to >>>> review. >>>> >>>> Thanks again for fixing the -XX:-UseOnStackReplacement issue. >>>> >>>> >>>> >>>> Best regards, >>>> >>>> Martin >>>> >>>> >>>> >>>> >>>> >>>> *From:* Liu Xin [mailto:navy.xliu at gmail.com ] >>>> *Sent:* Freitag, 13. Juli 2018 22:30 >>>> *To:* Doerr, Martin >>>> *Cc:* hotspot-runtime-dev at openjdk.java.net >>>> *Subject:* Re: RFR(S): 8206075: add assertion for unbound assembler >>>> Labels for x86 >>>> >>>> >>>> >>>> Hello, Martin, >>>> >>>> >>>> >>>> Thanks for reviewing it. >>>> >>>> >>>> >>>> I got your point. I made it "if (where != NULL) { jcc(cond, *where); }" >>>> and is running tests. 
>>>> >>>> >>>> >>>> The background of this Assertion is that our engineer used to spend many >>>> hour to trace down a corner case. it's trivial if fastdebug/slowdebug >>>> stop >>>> and tell you immediately. >>>> >>>> >>>> >>>> I am curious about this "We also may generate code with the purpose to >>>> determine its size.". Could you tell me where is it? it looks quite >>>> slow >>>> to get buffer size in this way. >>>> >>>> >>>> >>>> thanks, >>>> >>>> --lx >>>> >>>> >>>> >>>> >>>> >>>> On Fri, Jul 13, 2018 at 2:54 AM, Doerr, Martin >>>> wrote: >>>> >>>> Hi, >>>> >>>> thanks for fixing the issue in templateTable_x86. It looks correct. >>>> I think even better would be >>>> "UseOnStackReplacement ? &backedge_counter_overflow : NULL" >>>> and >>>> "if (where != NULL) { jcc(cond, *where); }" in interp_masm_x86.cpp. >>>> But I leave it up to you if you want to change it. I'm also ok with your >>>> version. >>>> >>>> I'm not convinced that the label assertion is reliable. There may be >>>> many >>>> more places in hotspot where we bail out having an unbound label. >>>> >>> Running a >>> >>>> few tests on x86 is by far not sufficient. The assertion may fire >>>> sporadically in large scenarios on some platforms. >>>> >>>> Best regards, >>>> Martin >>>> >>>> >>>> -----Original Message----- >>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>>> bounces at openjdk.java.net] On Behalf Of Liu Xin >>>> Sent: Donnerstag, 12. Juli 2018 22:51 >>>> To: hotspot-runtime-dev at openjdk.java.net >>>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels >>>> for x86 >>>> >>>> Could you review this patch again? >>>> >>>> Revision #2. >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 >>> ttps://bugs.openjdk.java.net/browse/JDK-8206075> >>>> CR: https://s3-us-west-2.amazonaws.com/openjdk-webrevs/ >>>> openjdk8u/webrev/index.html >>> com/openjdk-webrevs/openjdk8u/webrev/index.html> >>>> >>>> >>>> >>>> The idea is simple. 
I just reset the problematic label when c1 >>>> compilation >>>> bailout happen. >>>> I manually ran tier1 on my laptop. it can pass all of them. >>>> Paul help me submit the patch to submit and here is the run result. >>>> Build Details: 2018-07-12-1736388.hohensee.source >>>> >>>> 0 Failed Tests >>>> >>>> Mach5 Tasks Results Summary >>>> >>>> PASSED: 75 >>>> UNABLE_TO_RUN: 0 >>>> KILLED: 0 >>>> NA: 0 >>>> FAILED: 0 >>>> EXECUTED_WITH_FAILURE: 0 >>>> >>>> >>>> Thanks, >>>> ?lx >>>> >>>>> On Jul 11, 2018, at 10:35 AM, Liu Xin wrote: >>>>> >>>>> Thank you for your reviews. Indeed, I didn?t deal with bailout >>>>> >>>> situation. "compiler/codegen/TestCharVect2.java? is the case of >>>> codeBuffer overflow and leave a unbound label behind. >>>> >>>>> I made another revision. I will run tests thoroughly. >>>>> >>>>> Thanks, >>>>> ?lx >>>>> >>>>> On Jul 11, 2018, at 7:49 AM, Hohensee, Paul >>>>>> >>>>> wrote: >>>> >>>>> >>>>>> Imo it's still good hygiene to require that Labels be bound if they're >>>>>> >>>>> used, even if the generated code will never be executed. E.g., code >>>> that >>>> generates code for sizing purposes may be repurposed to generate >>>> >>> executable >>> >>>> code, in which case an unbound label may be a lurking bug. Also, I'm >>>> unaware (I may be corrected!) of any situation where bailing out happens >>>> >>> in >>> >>>> such a way as to both leave a Label unbound and execute its destructor. >>>> Even if there are, I'd say that'd be indicative of another real problem, >>>> such as code buffer overflow, so no harm would result. >>>> >>>>> >>>>>> Thanks, >>>>>> >>>>>> Paul >>>>>> >>>>>> ?On 7/11/18, 3:41 AM, "hotspot-runtime-dev on behalf of Doerr, Martin" >>>>>> >>>>> < >>> >>>> hotspot-runtime-dev-bounces at openjdk.java.net on behalf of >>>> martin.doerr at sap.com> wrote: >>>> >>>>> >>>>>> Hi, >>>>>> >>>>>> I think the idea is good, but doesn't work in all cases. 
>>>>>> We may bail out from code generation and discard the generated code >>>>>> >>>>> leaving the label unbound. >>>> >>>>> We also may generate code with the purpose to determine its size. We >>>>>> >>>>> don't need to bind labels because the code will never get executed. >>>> >>>>> >>>>>> Best regards, >>>>>> Martin >>>>>> >>>>>> >>>>>> -----Original Message----- >>>>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>>>>> >>>>> bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov >>>> >>>>> Sent: Mittwoch, 11. Juli 2018 03:34 >>>>>> To: Liu Xin ; hotspot >>>>>> >>>>> -runtime-dev at openjdk.java.net >>>> >>>>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler >>>>>> >>>>> Labels for x86 >>>> >>>>> >>>>>> I hit new assert in few other tests: >>>>>> >>>>>> compiler/codegen/TestCharVect2.java >>>>>> compiler/c2/cr6340864/* >>>>>> >>>>>> Regards, >>>>>> Vladimir >>>>>> >>>>>> On 7/10/18 5:08 PM, Vladimir Kozlov wrote: >>>>>> >>>>>>> Fix looks reasonable. I will test it in our framework. >>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>> On 7/10/18 9:50 AM, Liu Xin wrote: >>>>>>> >>>>>>>> Hi, Community, >>>>>>>> Could you please review this small patch? >>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 >>>>>>>> >>>>>>>> CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/ >>>>>>>> >>>>>>>> Problem: >>>>>>>> X86-32/64 will leave an unbound label if UseOnStackReplacement is >>>>>>>> >>>>>>> OFF. >>> >>>> This patch align up x86 with other architectures(ppc, arm). >>>>>>>> Add an assertion to the destructor of Label. It will be wiped out >>>>>>>> in >>>>>>>> >>>>>>> release build. >>>> >>>>> Previously, hotspot cannot pass this test with assertion on x86-64. >>>>>>>> make run-test >>>>>>>> >>>>>>> TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java >>> >>>> If this CR is approved, Paul Hohensee will push it. 
>>>>>>>> Thanks,
>>>>>>>> --lx
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>

From serguei.spitsyn at oracle.com Fri Jul 20 18:18:20 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 20 Jul 2018 11:18:20 -0700
Subject: RFR (S): 8207252: C1 still does eden allocations when TLAB is enabled
In-Reply-To:
References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
Message-ID:

Restored the bug number and added back the hotspot-dev and serviceability-dev mailing lists.

Thanks,
Serguei

On 7/20/18 08:30, JC Beyler wrote:
> Awesome thanks Thomas!
>
> Here is the webrev with the extra information then:
> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/
>
> Thanks again for all the reviews everyone!
> Jc
>
> On Fri, Jul 20, 2018 at 3:23 AM Thomas Schatzl
> wrote:
>
>> Hi,
>>
>> On Mon, 2018-07-16 at 14:58 -0700, JC Beyler wrote:
>>> Hi all,
>>>
>>> Here is a webrev that does all the architectures in the same way:
>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>>>
>>> Could anyone review the other architectures and test?
>>> - arm, sparc & aarch64 are also modified now to follow the same "if
>>> no
>>> tlab, then consider eden space allocation" logic.
>>>
>>> Thanks for your help!
>>> Jc
>>>
>> looks good.
>>
>> I ran the change through hs-tier1-3 with no issues. It only tests on
>> sparc and x64 though.
>>
>> I do not expect issues on the other platforms though :)
>>
>> Thanks,
>> Thomas
>>
>>

From ioi.lam at oracle.com Fri Jul 20 18:18:37 2018
From: ioi.lam at oracle.com (Ioi Lam)
Date: Fri, 20 Jul 2018 11:18:37 -0700
Subject: RFR 12 (XXS) 8203382 Rename SystemDictionary::initialize_wk_klass to resolve_wk_klass
In-Reply-To:
References: <733ed870-a7c1-b0c5-ec03-5b5bdae478e8@oracle.com>
Message-ID: <4472c545-310c-70db-9e26-5b7174126e16@oracle.com>

Thanks Jiangli!

- Ioi

On 7/20/18 10:57 AM, Jiangli Zhou wrote:
> Looks good and trivial.
>
> Thanks,
>
> Jiangli
>
>
> On 7/20/18 10:44 AM, Ioi Lam wrote:
>> Hi,
>>
>> Please review this very simple renaming change:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8203382
>> http://cr.openjdk.java.net/~iklam/jdk12/8203382_rename_initialize_wk_klass.v01/
>>
>>
>>     initialize_wk_klass -> resolve_wk_klass
>>     initialize_preloaded_classes -> resolve_preloaded_classes
>>
>> because Java class initialization is not actually happening inside these
>> functions.
>>
>> Thanks
>> - Ioi
>>
>>
>

From serguei.spitsyn at oracle.com Fri Jul 20 18:21:56 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 20 Jul 2018 11:21:56 -0700
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
In-Reply-To: <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com>
References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com> <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com>
Message-ID:

Thanks a lot, Vladimir!

Yes, webrev.03 is the latest. Jc will correct us if that is not right.

Thanks,
Serguei

On 7/20/18 10:52, Vladimir Kozlov wrote:
> I asked Igor V. to look.
>
> Seems like review is done in an other thread which does not have bug
> id in subject. Currently webrev.03
>
> Vladimir
>
> On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote:
>> Thanks, Rahul!
>> In fact, there no good experts for this area in the serviceability team.
>> It would be much better if anyone from the Compiler team could do it.
>>
>> Vladimir K.,
>>
>> Is there anyone from the Compiler team available to review this?
>> Otherwise, I could try to review it but am not sure about my review
>> quality.
>> >> Thanks, >> Serguei >> >> >> On 7/19/18 00:48, Rahul Raghavan wrote: >>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled >>> >>> (just adding + hotspot-compiler-dev also) >>> >>> >>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote: >>> Subject Was: >>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled >>> >>> + serviceability-dev >>> >>> Hi all, >>> >>> Could anyone else give me a review of this webrev and check/test the >>> various architecture changes? >>> >>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >>> >>> >>> Thanks for all your help! >>> Jc >>> >>> >>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler wrote: >>>> >>>>> Hi all, >>>>> >>>>> Here is a webrev that does all the architectures in the same way: >>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >>>>> >>>>> Could anyone review the other architectures and test? >>>>> - arm, sparc & aarch64 are also modified now to follow the same >>>>> "if no >>>>> tlab, then consider eden space allocation" logic. >>>>> >>>>> Thanks for your help! >>>>> Jc >>>>> >>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler >>>>> wrote: >>>>> >>>>>> Hi Kim, >>>>>> >>>>>> I opened this bug >>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862 >>>>>> >>>>>> and now I've done an update: >>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ >>>>>> >>>>>> I basically have done your nits but also removed the try_eden (it >>>>>> was >>>>>> used to bind a label but was not used). I updated the comments to >>>>>> use the >>>>>> one you preferred. >>>>>> >>>>>> I still have to do the other architectures though but at least we >>>>>> seem to >>>>>> have a consensus on this architecture, correct?
>>>>>> >>>>>> Thanks for the review, >>>>>> Jc >>>>>> >>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett >>>>>> wrote: >>>>>> >>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler >>>>>>>> wrote: >>>>>>>> >>>>>>>> Yes, you are right, I did those changes due to: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084 >>>>>>>> >>>>>>>> If Robbin agrees to this change, and if no one sees an issue, >>>>>>>> I'll go >>>>>>> ahead >>>>>>>> and propagate the change across architectures. >>>>>>>> >>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's >>>>>>>> comment >>>>>>> and >>>>>>>> review) :) >>>>>>>> Jc >>>>>>>> >>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose >>>>>>> wrote: >>>>>>>> >>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> I'm not sure if we had left this case intentionally or not >>>>>>>>> but, if we >>>>>>> want >>>>>>>>> it all to be consistent, we should perhaps fix it. >>>>>>>>> >>>>>>>>> >>>>>>>>> Well, you put in that logic last February, so unless somebody >>>>>>>>> speaks >>>>>>> up >>>>>>>>> quickly, I support your adjusting it to be the way you want it. >>>>>>>>> >>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I >>>>>>>>> src/hotspot/share" >>>>>>>>> suggests that the GC group is most active in touching this >>>>>>>>> feature. >>>>>>>>> If Robbin is OK with it, there's your reviewer. >>>>>>>>> >>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person >>>>>>>>> working on the GC to OK it. >>>>>>>>> >>>>>>>>> - John >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Jc >>>>>>> >>>>>>> Robbin is on vacation; you might not hear from him for a while. >>>>>>> >>>>>>> I'm assuming you'll open a new bug for this? >>>>>>> >>>>>>> Except for a few minor nits (below), this looks okay to me. >>>>>>> >>>>>>> The comment at line 1052 needs updating. >>>>>>> >>>>>>> pre-existing: The retry_tlab label declared on line 1054 is unused.
>>>>>>> >>>>>>> pre-existing: The try_eden label declared on line 1054 is bound at >>>>>>> line 1058, but unreferenced. >>>>>>> >>>>>>> I like the wording of the comment at 1139 better than the >>>>>>> wording at >>>>>>> 1016. >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> >>>>>> Thanks, >>>>>> Jc >>>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> Thanks, >>>>> Jc >>>>> >>>> >>>> >> From kim.barrett at oracle.com Fri Jul 20 18:24:37 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 20 Jul 2018 14:24:37 -0400 Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero In-Reply-To: <8a37d7da-5547-75b3-aec3-fd3bbe8e6a78@oracle.com> References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com> <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com> <30cd0f77-3d62-2867-2a37-f68d6a1a401f@oracle.com> <01234404-22a6-f02b-20b3-55e059deabab@oracle.com> <5a8a9837-48d5-683a-271b-ba6ff27369df@oracle.com> <3c50d9f1-bb92-1fbb-db44-e5d154a59d5d@oracle.com> <3efca5db-b37a-0afd-e560-50ebfd93e638@oracle.com> <8a37d7da-5547-75b3-aec3-fd3bbe8e6a78@oracle.com> Message-ID: > On Jul 19, 2018, at 6:14 PM, coleen.phillimore at oracle.com wrote: > > > Hi, There is a closed test that does 100,000 lookups on a class that fails resolution, so creates 100,000 Symbols with TempNewSymbol. This results in many zeroed refcounted Symbols in the table which increases lookup time with the current SymbolTable. With the new concurrent symbol table, which this change is intended to support, the zero refcount symbols are cleaned up on insert and concurrently. > > I have a workaround so that this test doesn't time out. These are the times for this test on my machine. 
> > old hashtable no patch: 7.32 seconds > without workaround: 367 seconds (which can time out on a slow machine) > with workaround: 61.075 seconds > with new hashtable: 9.135 seconds > > There are several ways to fix the old hashtable so that it cleans more frequently for this situation but it's not worth doing with the new concurrent hashtable coming. > > open webrev at http://cr.openjdk.java.net/~coleenp/03.incr/webrev > > Thanks, > Coleen Looks good with Gerard's suggested improvement. From vladimir.kozlov at oracle.com Fri Jul 20 18:31:56 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 20 Jul 2018 11:31:56 -0700 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com> <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com> <782c616f-128f-fadc-99e2-f74fe360567a@oracle.com> Message-ID: On 7/20/18 11:11 AM, Liu Xin wrote: > Thanks, Vladimir and Goetz. Could you approve what you tested? I am fine with your latest changes but you need to post webrev on cr.openjdk. I will review it then. > > > For the patch, I think it's another story. I am *NOT* sure if we should > need it. It's about C++ object model. I feel hotspot is using C++ in > non-standard way. I am confused about C++ in hotspot. > In regular C++, we should manage the life cycle of objects carefully. > > If you take a look at usage of this macro, some non-pod classes don't > construct but use directly. > #define NEW_RESOURCE_ARRAY(type, size)\ > (type*) resource_allocate_bytes((size) * sizeof(type)) > > eg. > VMRegPair* out_regs = NEW_RESOURCE_ARRAY(VMRegPair, total_c_args); > > May I create a new RFR to enhance it? > I want to introduce a meta-programming template like boost's is_pod.
> https://www.boost.org/doc/libs/1_44_0/libs/type_traits/doc/html/boost_typetraits/reference/is_pod.html Be careful. Hotspot have to be compiled by big variety of C++ compilers and not all of them support latest features. Regards, Vladimir > > NEW_RESOURCE_ARRAY should call constructors for those classes which are not > pod. > > thanks, > --lx > > > > > On Fri, Jul 20, 2018 at 9:18 AM, Vladimir Kozlov > wrote: > >> My testing also passed clean. I tested next patch: >> >> https://s3-us-west-2.amazonaws.com/openjdk-webrevs/jdk/ >> label_bugfix/index.html >> >> Please, post it on cr.openjdk server for final review. We can't review and >> use patches from other places. >> >> Thanks, >> Vladimir >> >> >> On 7/20/18 12:29 AM, Lindenmaier, Goetz wrote: >> >>> Hi Liu, >>> >>> Martin had put the patch into our testing queue. >>> All the platforms we build are fine. >>> This are: windows x86_64, linux: ppc64, ppc64le, x86_64, s390x, >>> aix ppc64, solaris sparcv9, mac. >>> >>> Best regards, >>> Goetz. >>> >>> -----Original Message----- >>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>>> bounces at openjdk.java.net] On Behalf Of Liu Xin >>>> Sent: Freitag, 20. Juli 2018 09:16 >>>> To: Vladimir Kozlov >>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels >>>> for >>>> x86 >>>> >>>> Hello, Vladimir, >>>> Could you run on other platform on behalf of Martin? >>>> I locally tested on x86_64. I hope the Reviewer can help me verify it >>>> works >>>> on other platforms. >>>> >>>> >>>> Furthermore, I am sure if we should add this additional patch. >>>> Label class is not POD, we should properly call constructor /destructor >>>> even though those labels are allocated on arena. >>>> >>>> >>>> thanks, >>>> --lx >>>> >>>> On Wed, Jul 18, 2018 at 4:07 AM, Doerr, Martin >>>> wrote: >>>> >>>> Hi Liu Xin, >>>>> >>>>> >>>>> >>>>> thanks for understanding my point and checking other places. 
>>>>> >>>>> >>>>> >>>>> The templateTable_x86.cpp was reviewed by me. >>>>> >>>>> I can't review the label assertion before my vacation. If other >>>>> reviewers >>>>> are convinced that it is correct, ok. >>>>> >>>>> >>>>> >>>>> Would be great if somebody could assist with testing other platforms. >>>>> >>>>> >>>>> >>>>> Best regards, >>>>> >>>>> Martin >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> *From:* Liu Xin [mailto:navy.xliu at gmail.com] >>>>> *Sent:* Dienstag, 17. Juli 2018 19:17 >>>>> >>>>> *To:* Doerr, Martin >>>>> *Cc:* hotspot-runtime-dev at openjdk.java.net >>>>> *Subject:* Re: RFR(S): 8206075: add assertion for unbound assembler >>>>> Labels for x86 >>>>> >>>>> >>>>> >>>>> Hi, Martin, >>>>> >>>>> >>>>> >>>>> Thank you for the feedback. >>>>> >>>>> >>>>> >>>>> I totally agree with you that we shouldn't introduce false positive >>>>> assertion. Let's insist on the high bar here. >>>>> >>>>> I browsed many sources in hotspot recently. Hotspot is the most >>>>> monolithic >>>>> software I have ever seen. I am glad to be directed by a guidance and clear >>>>> target. >>>>> >>>>> >>>>> >>>>> I think I dealt with c1 bailout case. This case triggers "codebuffer >>>>> overflow" in middle of c1 compilation. >>>>> >>>>> compiler/codegen/TestCharVect2.java >>>>> >>>>> >>>>> >>>>> I am still not sure about c2 bailout case. Let me try to make one. >>>>> >>>>> >>>>> >>>>> For case #2, I got what you concerned. Indeed, the generated ad_x86.cpp >>>>> contains many emits methods for MachNode. I will double-check if they >>>>> >>>> could >>>> >>>>> leave unused labels. >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> --lx >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Jul 16, 2018, at 2:09 PM, Liu Xin wrote: >>>>> >>>>> >>>>> >>>>> Hi, List, >>>>> >>>>> >>>>> >>>>> Could you review this new revision?
>>>>> >>>>> https://s3-us-west-2.amazonaws.com/openjdk-webrevs/ >>>>> jdk/label_bugfix/index.html >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> i) I took a look at all architectures, arm/aarch64/ppc64/sparc/x86. I >>>>> don't understand all the assemblies, but I think they are guarded >>>>> for UseOnStackReplacement >>>>> >>>>> in templateTable_xxx.cpp ::branch(bool is_jsr, bool is_wide). >>>>> >>>>> >>>>> >>>>> TemplateTable_arm.cpp is a little different. It explicitly binds it >>>>> later. >>>>> >>>>> if (!UseOnStackReplacement) { >>>>> >>>>> __ bind(backedge_counter_overflow); >>>>> >>>>> } >>>>> >>>>> >>>>> >>>>> ii) I checked the Compile::scratch_emit_size. It only uses the label >>>>> fakeL >>>>> for those MachBranch nodes. >>>>> >>>>> Because fakeL will be bound to a trivial address if the nodes are >>>>> MachBranch, it's also safe for the assertion. >>>>> >>>>> >>>>> >>>>> bool is_branch = n->is_MachBranch(); >>>>> >>>>> if (is_branch) { >>>>> >>>>> MacroAssembler masm(&buf); >>>>> >>>>> masm.bind(fakeL); >>>>> >>>>> n->as_MachBranch()->save_label(&saveL, &save_bnum); >>>>> >>>>> n->as_MachBranch()->label_set(&fakeL, 0); >>>>> >>>>> } >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> --lx >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Jul 16, 2018, at 1:30 AM, Doerr, Martin >>>>> wrote: >>>>> >>>>> >>>>> >>>>> Hi Liu Xin, >>>>> >>>>> >>>>> >>>>> thanks for changing. >>>>> >>>>> >>>>> >>>>> The background of this Assertion is that our engineer used to spend >>>>>> >>>>> many >>>> >>>>> hour to trace down a corner case. >>>>> >>>>> it's trivial if fastdebug/slowdebug stop and tell you immediately. >>>>>> >>>>> >>>>> >>>>> >>>>> I understand that. But an assertion should only get added when we are >>>>> convinced that it won't produce false positives. >>>>> >>>>> It's very annoying if long running tests break due to an incorrect >>>>> assertion after running many days.
>>>>> >>>>> >>>>> >>>>> I am curious about this "We also may generate code with the purpose to >>>>>> >>>>> determine its size.". >>>>> >>>>> Could you tell me where is it? it looks quite slow to get buffer size in >>>>>> >>>>> this way. >>>>> >>>>> >>>>> >>>>> C2 Compiler does that in Compile::scratch_emit_size. >>>>> >>>>> >>>>> >>>>> Please note that I'll be on vacation soon, so other people will have to >>>>> review. >>>>> >>>>> Thanks again for fixing the -XX:-UseOnStackReplacement issue. >>>>> >>>>> >>>>> >>>>> Best regards, >>>>> >>>>> Martin >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> *From:* Liu Xin [mailto:navy.xliu at gmail.com] >>>>> *Sent:* Freitag, 13. Juli 2018 22:30 >>>>> *To:* Doerr, Martin >>>>> *Cc:* hotspot-runtime-dev at openjdk.java.net >>>>> *Subject:* Re: RFR(S): 8206075: add assertion for unbound assembler >>>>> Labels for x86 >>>>> >>>>> >>>>> >>>>> Hello, Martin, >>>>> >>>>> >>>>> >>>>> Thanks for reviewing it. >>>>> >>>>> >>>>> >>>>> I got your point. I made it "if (where != NULL) { jcc(cond, *where); }" >>>>> and is running tests. >>>>> >>>>> >>>>> >>>>> The background of this Assertion is that our engineer used to spend many >>>>> hour to trace down a corner case. it's trivial if fastdebug/slowdebug >>>>> stop >>>>> and tell you immediately. >>>>> >>>>> >>>>> >>>>> I am curious about this "We also may generate code with the purpose to >>>>> determine its size.". Could you tell me where is it? it looks quite >>>>> slow >>>>> to get buffer size in this way. >>>>> >>>>> >>>>> >>>>> thanks, >>>>> >>>>> --lx >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Fri, Jul 13, 2018 at 2:54 AM, Doerr, Martin >>>>> wrote: >>>>> >>>>> Hi, >>>>> >>>>> thanks for fixing the issue in templateTable_x86. It looks correct. >>>>> I think even better would be >>>>> "UseOnStackReplacement ? &backedge_counter_overflow : NULL" >>>>> and >>>>> "if (where != NULL) { jcc(cond, *where); }" in interp_masm_x86.cpp. >>>>> But I leave it up to you if you want to change it.
I'm also ok with your >>>>> version. >>>>> >>>>> I'm not convinced that the label assertion is reliable. There may be >>>>> many >>>>> more places in hotspot where we bail out having an unbound label. >>>>> >>>> Running a >>>> >>>>> few tests on x86 is by far not sufficient. The assertion may fire >>>>> sporadically in large scenarios on some platforms. >>>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>>>> bounces at openjdk.java.net] On Behalf Of Liu Xin >>>>> Sent: Donnerstag, 12. Juli 2018 22:51 >>>>> To: hotspot-runtime-dev at openjdk.java.net >>>>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels >>>>> for x86 >>>>> >>>>> Could you review this patch again? >>>>> >>>>> Revision #2. >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 >>>>> CR: https://s3-us-west-2.amazonaws.com/openjdk-webrevs/openjdk8u/webrev/index.html >>>>> >>>>> >>>>> >>>>> The idea is simple. I just reset the problematic label when c1 >>>>> compilation >>>>> bailout happen. >>>>> I manually ran tier1 on my laptop. it can pass all of them. >>>>> Paul help me submit the patch to submit and here is the run result. >>>>> Build Details: 2018-07-12-1736388.hohensee.source >>>>> >>>>> 0 Failed Tests >>>>> >>>>> Mach5 Tasks Results Summary >>>>> >>>>> PASSED: 75 >>>>> UNABLE_TO_RUN: 0 >>>>> KILLED: 0 >>>>> NA: 0 >>>>> FAILED: 0 >>>>> EXECUTED_WITH_FAILURE: 0 >>>>> >>>>> >>>>> Thanks, >>>>> --lx >>>>> >>>>>> On Jul 11, 2018, at 10:35 AM, Liu Xin wrote: >>>>>> >>>>>> Thank you for your reviews. Indeed, I didn't deal with bailout >>>>>> >>>>> situation. "compiler/codegen/TestCharVect2.java" is the case of >>>>> codeBuffer overflow and leave a unbound label behind. >>>>> >>>>>> I made another revision. I will run tests thoroughly.
>>>>>> >>>>>> Thanks, >>>>>> --lx >>>>>> >>>>>> On Jul 11, 2018, at 7:49 AM, Hohensee, Paul >>>>>>> >>>>>> wrote: >>>>> >>>>>> >>>>>>> Imo it's still good hygiene to require that Labels be bound if they're >>>>>>> >>>>>> used, even if the generated code will never be executed. E.g., code >>>>> that >>>>> generates code for sizing purposes may be repurposed to generate >>>>> >>>> executable >>>> >>>>> code, in which case an unbound label may be a lurking bug. Also, I'm >>>>> unaware (I may be corrected!) of any situation where bailing out happens >>>>> >>>> in >>>> >>>>> such a way as to both leave a Label unbound and execute its destructor. >>>>> Even if there are, I'd say that'd be indicative of another real problem, >>>>> such as code buffer overflow, so no harm would result. >>>>> >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Paul >>>>>>> >>>>>>> On 7/11/18, 3:41 AM, "hotspot-runtime-dev on behalf of Doerr, Martin" >>>>>>> >>>>>> < >>>> >>>>> hotspot-runtime-dev-bounces at openjdk.java.net on behalf of >>>>> martin.doerr at sap.com> wrote: >>>>> >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I think the idea is good, but doesn't work in all cases. >>>>>>> We may bail out from code generation and discard the generated code >>>>>>> >>>>>> leaving the label unbound. >>>>> >>>>>> We also may generate code with the purpose to determine its size. We >>>>>>> >>>>>> don't need to bind labels because the code will never get executed. >>>>> >>>>>> >>>>>>> Best regards, >>>>>>> Martin >>>>>>> >>>>>>> >>>>>>> -----Original Message----- >>>>>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>>>>>> >>>>>> bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov >>>>> >>>>>> Sent: Mittwoch, 11.
Juli 2018 03:34 >>>>>>> To: Liu Xin ; hotspot >>>>>>> >>>>>> -runtime-dev at openjdk.java.net >>>>> >>>>>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler >>>>>>> >>>>>> Labels for x86 >>>>> >>>>>> >>>>>>> I hit new assert in few other tests: >>>>>>> >>>>>>> compiler/codegen/TestCharVect2.java >>>>>>> compiler/c2/cr6340864/* >>>>>>> >>>>>>> Regards, >>>>>>> Vladimir >>>>>>> >>>>>>> On 7/10/18 5:08 PM, Vladimir Kozlov wrote: >>>>>>> >>>>>>>> Fix looks reasonable. I will test it in our framework. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Vladimir >>>>>>>> >>>>>>>> On 7/10/18 9:50 AM, Liu Xin wrote: >>>>>>>> >>>>>>>>> Hi, Community, >>>>>>>>> Could you please review this small patch? >>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 >>>>>>>>> >>>>>>>>> CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/ >>>>>>>>> >>>>>>>>> Problem: >>>>>>>>> X86-32/64 will leave an unbound label if UseOnStackReplacement is >>>>>>>>> >>>>>>>> OFF. >>>> >>>>> This patch align up x86 with other architectures(ppc, arm). >>>>>>>>> Add an assertion to the destructor of Label. It will be wiped out >>>>>>>>> in >>>>>>>>> >>>>>>>> release build. >>>>> >>>>>> Previously, hotspot cannot pass this test with assertion on x86-64. >>>>>>>>> make run-test >>>>>>>>> >>>>>>>> TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java >>>> >>>>> If this CR is approved, Paul Hohensee will push it. >>>>>>>>> Thanks, >>>>>>>>> --lx >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>>> >>>>> >>>>> From jiangli.zhou at oracle.com Fri Jul 20 18:31:57 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 20 Jul 2018 11:31:57 -0700 Subject: RFR: 8207263: Store the Configuration for system modules into CDS archive Message-ID: Please review the following webrev that archives the system module boot layer Configuration (including all java objects reachable from the Configuration) in CDS archive. 
This is built on top of the earlier change for JDK-8202035 (https://bugs.openjdk.java.net/browse/JDK-8202035), which provides a framework for object sub-graph archiving. The boot layer Configuration is created in ModuleBootstrap.boot() (similar to the archived system ModuleDescriptor objects, etc) and is unchanged after construction. With archived boot layer Configuration, it allows runtime to bypass the work for creating the configuration. Currently, this is only supported when the initial module is unnamed module. Measurements indicate archiving the boot layer Configuration improves the startup time by 1% ~ 1.5% (on linux-x64) when running HelloWorld from -cp at runtime. Many thanks to Alan and Claes for discussions and contributions to this change! Webrev: http://cr.openjdk.java.net/~jiangli/8207263/webrev.00/ RFE: https://bugs.openjdk.java.net/browse/JDK-8207263 Tested with tier1 - tier5 tests via mach5. Thanks, Jiangli From hohensee at amazon.com Fri Jul 20 18:37:13 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Fri, 20 Jul 2018 18:37:13 +0000 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com> <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com> <782c616f-128f-fadc-99e2-f74fe360567a@oracle.com> Message-ID: <9EB54488-C95A-4A2F-99E3-410545DB1824@amazon.com> New webrev: http://cr.openjdk.java.net/~phh/8206075/webrev.01/ Thanks, Paul On 7/20/18, 11:32 AM, "hotspot-runtime-dev on behalf of Vladimir Kozlov" wrote: On 7/20/18 11:11 AM, Liu Xin wrote: > Thanks, Vladimir and Goetz. Could you approve what you tested? I am fine with your latest changes but you need to post webrev on cr.openjdk. I will review it then.
From serguei.spitsyn at oracle.com Fri Jul 20 18:46:46 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 20 Jul 2018 11:46:46 -0700 Subject: RFR (S): C1 still does eden allocations when TLAB is enabled In-Reply-To: <22211468-5b15-e6a8-be6b-7ce5d2fbdf27@oracle.com> References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <22211468-5b15-e6a8-be6b-7ce5d2fbdf27@oracle.com> Message-ID: <76c36ed6-a073-be55-fdfd-36855b4340f4@oracle.com> Vladimir, It was my fault as well, sorry. Initially, this review was posted on runtime and hotspot. I also asked to add the serviceability-dev. Probably, the hotspot-dev has to be enough in this case. Thanks, Serguei On 7/20/18 10:57, Vladimir Kozlov wrote: > Please, don't do review in 2 mailing threads. > > Thanks, > Vladimir > > On 7/20/18 8:30 AM, JC Beyler wrote: >> Awesome thanks Thomas! >> >> Here is the webrev with the extra information then: >> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/ >> >> Thanks again for all the reviews everyone! >> Jc >> >> On Fri, Jul 20, 2018 at 3:23 AM Thomas Schatzl >> >> wrote: >> >>> Hi, >>> >>> On Mon, 2018-07-16 at 14:58 -0700, JC Beyler wrote: >>>> Hi all, >>>> >>>> Here is a webrev that does all the architectures in the same way: >>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >>>> >>>> Could anyone review the other architectures and test? >>>>
- arm, sparc & aarch64 are also modified now to follow the same "if >>>> no >>>> tlab, then consider eden space allocation" logic. >>>> >>>> Thanks for your help! >>>> Jc >>>> >>> >>> ?? looks good. >>> >>> I ran the change through hs-tier1-3 with no issues. It only tests on >>> sparc and x64 though. >>> >>> I do not expect issues on the other platforms though :) >>> >>> Thanks, >>> ?? Thomas >>> >>> >> From vladimir.kozlov at oracle.com Fri Jul 20 18:46:59 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 20 Jul 2018 11:46:59 -0700 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: <9EB54488-C95A-4A2F-99E3-410545DB1824@amazon.com> References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com> <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com> <782c616f-128f-fadc-99e2-f74fe360567a@oracle.com> <9EB54488-C95A-4A2F-99E3-410545DB1824@amazon.com> Message-ID: <84f48d03-f2a5-de45-dfe8-c971a3389577@oracle.com> This looks good. I will sponsor it. Thanks, Vladimir On 7/20/18 11:37 AM, Hohensee, Paul wrote: > New webrev: http://cr.openjdk.java.net/~phh/8206075/webrev.01/ > > Thanks, > > Paul > > ?On 7/20/18, 11:32 AM, "hotspot-runtime-dev on behalf of Vladimir Kozlov" wrote: > > On 7/20/18 11:11 AM, Liu Xin wrote: > > Thanks, Vladimir and Goetz. Could yo approve what you tested? > > I am fine with your latest changes but you need to post webrev on > cr.openjdk. I will review it then. > > > > > > > For the patch, I think it's another story. I am *NOT* sure if we should > > need it. It's about C++ object model. I feel hotspot is using C++ in > > non-standard way. I am confusing about C++ in hotspot. > > In regular C++ , we should manage the life cycle of objects carefully. 
> > > > If you take a look at usage of this macro, some non-pod classes don't > > construct but use directly. > > #define NEW_RESOURCE_ARRAY(type, size)\ > > (type*) resource_allocate_bytes((size) * sizeof(type)) > > > > eg. > > VMRegPair* out_regs = NEW_RESOURCE_ARRAY(VMRegPair, total_c_args); > > > > May I create a new RFR to enhance it? > > I want to introduce a meta-programming template like boost's is_pod. > > https://www.boost.org/doc/libs/1_44_0/libs/type_traits/doc/html/boost_typetraits/reference/is_pod.html > > Be careful. Hotspot have to be compiled by big variety of C++ compilers > and not all of them support latest features. > > Regards, > Vladimir > > > > > NEW_RESOURCE_ARRAY should call constructors for those classes which are not > > pod. > > > > thanks, > > --lx > > > > > > > > > > On Fri, Jul 20, 2018 at 9:18 AM, Vladimir Kozlov >> wrote: > > > >> My testing also passed clean. I tested next patch: > >> > >> https://s3-us-west-2.amazonaws.com/openjdk-webrevs/jdk/ > >> label_bugfix/index.html > >> > >> Please, post it on cr.openjdk server for final review. We can't review and > >> use patches from other places. > >> > >> Thanks, > >> Vladimir > >> > >> > >> On 7/20/18 12:29 AM, Lindenmaier, Goetz wrote: > >> > >>> Hi Liu, > >>> > >>> Martin had put the patch into our testing queue. > >>> All the platforms we build are fine. > >>> This are: windows x86_64, linux: ppc64, ppc64le, x86_64, s390x, > >>> aix ppc64, solaris sparcv9, mac. > >>> > >>> Best regards, > >>> Goetz. > >>> > >>> -----Original Message----- > >>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > >>>> bounces at openjdk.java.net] On Behalf Of Liu Xin > >>>> Sent: Freitag, 20. Juli 2018 09:16 > >>>> To: Vladimir Kozlov > >>>> Cc: hotspot-runtime-dev at openjdk.java.net > >>>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels > >>>> for > >>>> x86 > >>>> > >>>> Hello, Vladimir, > >>>> Could you run on other platform on behalf of Martin? 
> >>>> I locally tested on x86_64. I hope the Reviewer can help me verify it > >>>> works > >>>> on other platforms. > >>>> > >>>> > >>>> Furthermore, I am not sure if we should add this additional patch. > >>>> Label class is not POD, we should properly call constructor/destructor > >>>> even though those labels are allocated on arena. > >>>> > >>>> > >>>> thanks, > >>>> --lx > >>>> > >>>> On Wed, Jul 18, 2018 at 4:07 AM, Doerr, Martin > >>>> wrote: > >>>> > >>>> Hi Liu Xin, > >>>>> > >>>>> > >>>>> > >>>>> thanks for understanding my point and checking other places. > >>>>> > >>>>> > >>>>> > >>>>> The templateTable_x86.cpp was reviewed by me. > >>>>> > >>>>> I can't review the label assertion before my vacation. If other > >>>>> reviewers > >>>>> are convinced that it is correct, ok. > >>>>> > >>>>> > >>>>> > >>>>> Would be great if somebody could assist with testing other platforms. > >>>>> > >>>>> > >>>>> > >>>>> Best regards, > >>>>> > >>>>> Martin > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> *From:* Liu Xin [mailto:navy.xliu at gmail.com] > >>>>> *Sent:* Tuesday, 17 July 2018 19:17 > >>>>> > >>>>> *To:* Doerr, Martin > >>>>> *Cc:* hotspot-runtime-dev at openjdk.java.net > >>>>> *Subject:* Re: RFR(S): 8206075: add assertion for unbound assembler > >>>>> Labels for x86 > >>>>> > >>>>> > >>>>> > >>>>> Hi, Martin, > >>>>> > >>>>> > >>>>> > >>>>> Thank you for the feedback. > >>>>> > >>>>> > >>>>> > >>>>> I totally agree with you that we shouldn't introduce false positive > >>>>> assertion. Let's insist on the high bar here. > >>>>> > >>>>> I browsed many sources in hotspot recently. Hotspot is the most > >>>>> monolithic > >>>>> software I have ever seen. I am glad to be directed by a guidance and clear > >>>>> target. > >>>>> > >>>>> > >>>>> > >>>>> I think I dealt with c1 bailout case. This case triggers "codebuffer > >>>>> overflow" in middle of c1 compilation.
> >>>>> > >>>>> compiler/codegen/TestCharVect2.java > >>>>> > >>>>> > >>>>> > >>>>> I am still not sure about c2 bailout case. Let me try to make one. > >>>>> > >>>>> > >>>>> > >>>>> For case #2, I got what you concerned. Indeed, the generated ad_x86.cpp > >>>>> contains many emits methods for MachNode. I will double-check if they > >>>>> > >>>> could > >>>> > >>>>> leave unused labels. > >>>>> > >>>>> > >>>>> > >>>>> Thanks, > >>>>> > >>>>> --lx > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> On Jul 16, 2018, at 2:09 PM, Liu Xin wrote: > >>>>> > >>>>> > >>>>> > >>>>> Hi, List, > >>>>> > >>>>> > >>>>> > >>>>> Could you review this new revision? > >>>>> > >>>>> https://s3-us-west-2.amazonaws.com/openjdk-webrevs/jdk/label_bugfix/index.html > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> i) I took a look at all architectures, arm/aarch64/ppc64/sparc/x86. I > >>>>> don't understand all the assemblies, but I think they are guarded > >>>>> for UseOnStackReplacement > >>>>> > >>>>> in templateTable_xxx.cpp ::branch(bool is_jsr, bool is_wide). > >>>>> > >>>>> > >>>>> > >>>>> TemplateTable_arm.cpp is a little different. It explicitly binds it > >>>>> later. > >>>>> > >>>>> if (!UseOnStackReplacement) { > >>>>> > >>>>> __ bind(backedge_counter_overflow); > >>>>> > >>>>> } > >>>>> > >>>>> > >>>>> > >>>>> ii) I checked the Compile::scratch_emit_size. It only uses the label > >>>>> fakeL > >>>>> for those MachBranch nodes. > >>>>> > >>>>> Because fakeL will be bound to a trivial address if the nodes are > >>>>> MachBranch, it's also safe for the assertion.
> >>>>> > >>>>> > >>>>> > >>>>> bool is_branch = n->is_MachBranch(); > >>>>> > >>>>> if (is_branch) { > >>>>> > >>>>> MacroAssembler masm(&buf); > >>>>> > >>>>> masm.bind(fakeL); > >>>>> > >>>>> n->as_MachBranch()->save_label(&saveL, &save_bnum); > >>>>> > >>>>> n->as_MachBranch()->label_set(&fakeL, 0); > >>>>> > >>>>> } > >>>>> > >>>>> > >>>>> > >>>>> Thanks, > >>>>> > >>>>> --lx > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> On Jul 16, 2018, at 1:30 AM, Doerr, Martin > >>>>> wrote: > >>>>> > >>>>> > >>>>> > >>>>> Hi Liu Xin, > >>>>> > >>>>> > >>>>> > >>>>> thanks for changing. > >>>>> > >>>>> > >>>>> > >>>>> The background of this Assertion is that our engineer used to spend > >>>>>> > >>>>> many > >>>> > >>>>> hour to trace down a corner case. > >>>>> > >>>>> it's trivial if fastdebug/slowdebug stop and tell you immediately. > >>>>>> > >>>>> > >>>>> > >>>>> > >>>>> I understand that. But an assertion should only get added when we are > >>>>> convinced that it won't produce false positives. > >>>>> > >>>>> It's very annoying if long running tests break due to an incorrect > >>>>> assertion after running many days. > >>>>> > >>>>> > >>>>> > >>>>> I am curious about this "We also may generate code with the purpose to > >>>>>> > >>>>> determine its size.". > >>>>> > >>>>> Could you tell me where is it? it looks quite slow to get buffer size in > >>>>>> > >>>>> this way. > >>>>> > >>>>> > >>>>> > >>>>> C2 Compiler does that in Compile::scratch_emit_size. > >>>>> > >>>>> > >>>>> > >>>>> Please note that I'll be on vacation soon, so other people will have to > >>>>> review. > >>>>> > >>>>> Thanks again for fixing the -XX:-UseOnStackReplacement issue. > >>>>> > >>>>> > >>>>> > >>>>> Best regards, > >>>>> > >>>>> Martin > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> *From:* Liu Xin [mailto:navy.xliu at gmail.com ] > >>>>> *Sent:* Freitag, 13.
Juli 2018 22:30 > >>>>> *To:* Doerr, Martin > >>>>> *Cc:* hotspot-runtime-dev at openjdk.java.net > >>>>> *Subject:* Re: RFR(S): 8206075: add assertion for unbound assembler > >>>>> Labels for x86 > >>>>> > >>>>> > >>>>> > >>>>> Hello, Martin, > >>>>> > >>>>> > >>>>> > >>>>> Thanks for reviewing it. > >>>>> > >>>>> > >>>>> > >>>>> I got your point. I made it "if (where != NULL) { jcc(cond, *where); }" > >>>>> and is running tests. > >>>>> > >>>>> > >>>>> > >>>>> The background of this Assertion is that our engineer used to spend many > >>>>> hour to trace down a corner case. it's trivial if fastdebug/slowdebug > >>>>> stop > >>>>> and tell you immediately. > >>>>> > >>>>> > >>>>> > >>>>> I am curious about this "We also may generate code with the purpose to > >>>>> determine its size.". Could you tell me where is it? it looks quite > >>>>> slow > >>>>> to get buffer size in this way. > >>>>> > >>>>> > >>>>> > >>>>> thanks, > >>>>> > >>>>> --lx > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> On Fri, Jul 13, 2018 at 2:54 AM, Doerr, Martin > >>>>> wrote: > >>>>> > >>>>> Hi, > >>>>> > >>>>> thanks for fixing the issue in templateTable_x86. It looks correct. > >>>>> I think even better would be > >>>>> "UseOnStackReplacement ? &backedge_counter_overflow : NULL" > >>>>> and > >>>>> "if (where != NULL) { jcc(cond, *where); }" in interp_masm_x86.cpp. > >>>>> But I leave it up to you if you want to change it. I'm also ok with your > >>>>> version. > >>>>> > >>>>> I'm not convinced that the label assertion is reliable. There may be > >>>>> many > >>>>> more places in hotspot where we bail out having an unbound label. > >>>>> > >>>> Running a > >>>> > >>>>> few tests on x86 is by far not sufficient. The assertion may fire > >>>>> sporadically in large scenarios on some platforms. 
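The optional-target pattern quoted above ("UseOnStackReplacement ? &backedge_counter_overflow : NULL" together with "if (where != NULL) { jcc(cond, *where); }") can be sketched outside HotSpot as follows. This is only an illustration under stated assumptions: ToyLabel, ToyAssembler, increment_counter_and_branch and label_used_after_emit are hypothetical stand-ins, not the real interp_masm_x86.cpp code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an assembler label; it only remembers
// whether some branch has targeted it.
struct ToyLabel {
  bool used;
  ToyLabel() : used(false) {}
};

// Hypothetical toy assembler that records mnemonics instead of machine code.
class ToyAssembler {
 public:
  // Emits a conditional jump and marks the target label as used.
  void jcc(const std::string& cond, ToyLabel& target) {
    target.used = true;
    _code.push_back("j" + cond);
  }

  // Emits the counter increment, and the overflow branch only when the
  // caller supplied a target. Passing NULL means "no branch wanted".
  void increment_counter_and_branch(const std::string& cond, ToyLabel* where) {
    _code.push_back("inc");
    if (where != NULL) { jcc(cond, *where); }
  }

  std::size_t size() const { return _code.size(); }

 private:
  std::vector<std::string> _code;
};

// Returns whether the overflow label ended up referenced by a branch.
bool label_used_after_emit(bool use_osr) {
  ToyAssembler masm;
  ToyLabel backedge_counter_overflow;
  masm.increment_counter_and_branch(
      "ae", use_osr ? &backedge_counter_overflow : NULL);
  return backedge_counter_overflow.used;
}
```

With the flag off, no branch refers to the label at all, so there is nothing left to bind and a destructor check on the label has nothing to complain about.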
> >>>>> > >>>>> Best regards, > >>>>> Martin > >>>>> > >>>>> > >>>>> -----Original Message----- > >>>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Liu Xin > >>>>> Sent: Thursday, 12 July 2018 22:51 > >>>>> To: hotspot-runtime-dev at openjdk.java.net > >>>>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels > >>>>> for x86 > >>>>> > >>>>> Could you review this patch again? > >>>>> > >>>>> Revision #2. > >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 > >>>>> CR: https://s3-us-west-2.amazonaws.com/openjdk-webrevs/openjdk8u/webrev/index.html > >>>>> > >>>>> > >>>>> > >>>>> The idea is simple. I just reset the problematic label when c1 > >>>>> compilation > >>>>> bailout happens. > >>>>> I manually ran tier1 on my laptop. It passes all of them. > >>>>> Paul helped me submit the patch to submit and here is the run result. > >>>>> Build Details: 2018-07-12-1736388.hohensee.source > >>>>> > >>>>> 0 Failed Tests > >>>>> > >>>>> Mach5 Tasks Results Summary > >>>>> > >>>>> PASSED: 75 > >>>>> UNABLE_TO_RUN: 0 > >>>>> KILLED: 0 > >>>>> NA: 0 > >>>>> FAILED: 0 > >>>>> EXECUTED_WITH_FAILURE: 0 > >>>>> > >>>>> > >>>>> Thanks, > >>>>> --lx > >>>>> > >>>>>> On Jul 11, 2018, at 10:35 AM, Liu Xin wrote: > >>>>>> > >>>>>> Thank you for your reviews. Indeed, I didn't deal with the bailout > >>>>>> > >>>>> situation. "compiler/codegen/TestCharVect2.java" is the case of > >>>>> codeBuffer overflow and leaves an unbound label behind. > >>>>> > >>>>>> I made another revision. I will run tests thoroughly.
> >>>>>> > >>>>>> Thanks, > >>>>>> ?lx > >>>>>> > >>>>>> On Jul 11, 2018, at 7:49 AM, Hohensee, Paul > >>>>>>> > >>>>>> wrote: > >>>>> > >>>>>> > >>>>>>> Imo it's still good hygiene to require that Labels be bound if they're > >>>>>>> > >>>>>> used, even if the generated code will never be executed. E.g., code > >>>>> that > >>>>> generates code for sizing purposes may be repurposed to generate > >>>>> > >>>> executable > >>>> > >>>>> code, in which case an unbound label may be a lurking bug. Also, I'm > >>>>> unaware (I may be corrected!) of any situation where bailing out happens > >>>>> > >>>> in > >>>> > >>>>> such a way as to both leave a Label unbound and execute its destructor. > >>>>> Even if there are, I'd say that'd be indicative of another real problem, > >>>>> such as code buffer overflow, so no harm would result. > >>>>> > >>>>>> > >>>>>>> Thanks, > >>>>>>> > >>>>>>> Paul > >>>>>>> > >>>>>>> On 7/11/18, 3:41 AM, "hotspot-runtime-dev on behalf of Doerr, Martin" > >>>>>>> > >>>>>> < > >>>> > >>>>> hotspot-runtime-dev-bounces at openjdk.java.net on behalf of > >>>>> martin.doerr at sap.com> wrote: > >>>>> > >>>>>> > >>>>>>> Hi, > >>>>>>> > >>>>>>> I think the idea is good, but doesn't work in all cases. > >>>>>>> We may bail out from code generation and discard the generated code > >>>>>>> > >>>>>> leaving the label unbound. > >>>>> > >>>>>> We also may generate code with the purpose to determine its size. We > >>>>>>> > >>>>>> don't need to bind labels because the code will never get executed. > >>>>> > >>>>>> > >>>>>>> Best regards, > >>>>>>> Martin > >>>>>>> > >>>>>>> > >>>>>>> -----Original Message----- > >>>>>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > >>>>>>> > >>>>>> bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov > >>>>> > >>>>>> Sent: Mittwoch, 11. 
Juli 2018 03:34 > >>>>>>> To: Liu Xin ; hotspot > >>>>>>> > >>>>>> -runtime-dev at openjdk.java.net > >>>>> > >>>>>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler > >>>>>>> > >>>>>> Labels for x86 > >>>>> > >>>>>> > >>>>>>> I hit new assert in few other tests: > >>>>>>> > >>>>>>> compiler/codegen/TestCharVect2.java > >>>>>>> compiler/c2/cr6340864/* > >>>>>>> > >>>>>>> Regards, > >>>>>>> Vladimir > >>>>>>> > >>>>>>> On 7/10/18 5:08 PM, Vladimir Kozlov wrote: > >>>>>>> > >>>>>>>> Fix looks reasonable. I will test it in our framework. > >>>>>>>> > >>>>>>>> Thanks, > >>>>>>>> Vladimir > >>>>>>>> > >>>>>>>> On 7/10/18 9:50 AM, Liu Xin wrote: > >>>>>>>> > >>>>>>>>> Hi, Community, > >>>>>>>>> Could you please review this small patch? > >>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 > >>>>>>>>> > >>>>>>>>> CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/ > >>>>>>>>> > >>>>>>>>> Problem: > >>>>>>>>> X86-32/64 will leave an unbound label if UseOnStackReplacement is > >>>>>>>>> > >>>>>>>> OFF. > >>>> > >>>>> This patch align up x86 with other architectures(ppc, arm). > >>>>>>>>> Add an assertion to the destructor of Label. It will be wiped out > >>>>>>>>> in > >>>>>>>>> > >>>>>>>> release build. > >>>>> > >>>>>> Previously, hotspot cannot pass this test with assertion on x86-64. > >>>>>>>>> make run-test > >>>>>>>>> > >>>>>>>> TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java > >>>> > >>>>> If this CR is approved, Paul Hohensee will push it. 
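The destructor check described in the original request (a Label that was used as a jump target must have been bound before it goes away) can be sketched like this. It is a minimal model, not the HotSpot Label class: SketchLabel and count_violations are hypothetical names, and a counter stands in for the debug-only assert so that the example runs to completion.

```cpp
#include <cassert>

// Hypothetical stand-in for an assembler Label.
class SketchLabel {
 public:
  SketchLabel() : _bound(false), _uses(0) {}

  // Record a branch that targets this label before its address is known.
  void add_use() { ++_uses; }

  // Fix the label to a code position; pending uses could now be patched.
  void bind() { _bound = true; }

  // Forget all state, as one would after a compilation bailout
  // discards the generated code.
  void reset() { _bound = false; _uses = 0; }

  bool is_bound() const { return _bound; }
  bool is_unused() const { return _uses == 0; }

  // The property checked at destruction time.
  bool ok_to_destroy() const { return is_bound() || is_unused(); }

  ~SketchLabel() {
    // A debug build would assert here; this sketch counts violations instead.
    if (!ok_to_destroy()) ++unbound_destroyed;
  }

  static int unbound_destroyed;

 private:
  bool _bound;
  int _uses;
};

int SketchLabel::unbound_destroyed = 0;

// Returns how many labels were destroyed while used but never bound.
int count_violations() {
  SketchLabel::unbound_destroyed = 0;
  { SketchLabel ok; ok.add_use(); ok.bind(); }               // used and bound
  { SketchLabel unused; }                                    // never used
  { SketchLabel bad; bad.add_use(); }                        // used, never bound
  { SketchLabel bailed; bailed.add_use(); bailed.reset(); }  // reset on bailout
  return SketchLabel::unbound_destroyed;
}
```

Only the third case trips the check. The fourth shows why resetting a label on bailout, as discussed in the replies, keeps the check quiet when the generated code is thrown away.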
> >>>>>>>>> Thanks, > >>>>>>>>> --lx > >>>>>>>>> > >>>>>>>>> > >>>>>>> > >>>>>>> > >>>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > > From navy.xliu at gmail.com Fri Jul 20 18:49:00 2018 From: navy.xliu at gmail.com (Liu Xin) Date: Fri, 20 Jul 2018 11:49:00 -0700 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: <84f48d03-f2a5-de45-dfe8-c971a3389577@oracle.com> References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com> <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com> <782c616f-128f-fadc-99e2-f74fe360567a@oracle.com> <9EB54488-C95A-4A2F-99E3-410545DB1824@amazon.com> <84f48d03-f2a5-de45-dfe8-c971a3389577@oracle.com> Message-ID: Cool. thanks. On Fri, Jul 20, 2018 at 11:46 AM, Vladimir Kozlov < vladimir.kozlov at oracle.com> wrote: > This looks good. I will sponsor it. > > Thanks, > Vladimir > > > On 7/20/18 11:37 AM, Hohensee, Paul wrote: > >> New webrev: http://cr.openjdk.java.net/~phh/8206075/webrev.01/ >> >> Thanks, >> >> Paul
>> >>>>>>>>> Thanks, >> >>>>>>>>> --lx >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>> >> >>>>> >> >>>>> >> >>>>> >> >>>>> >> >>>>> >> >> > From coleen.phillimore at oracle.com Fri Jul 20 19:04:39 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 20 Jul 2018 15:04:39 -0400 Subject: RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero In-Reply-To: References: <4987630f-7aff-246a-22c7-af70a8636feb@oracle.com> <2ED64747-2FFF-4181-8931-B2EB5CD7EECF@oracle.com> <30cd0f77-3d62-2867-2a37-f68d6a1a401f@oracle.com> <01234404-22a6-f02b-20b3-55e059deabab@oracle.com> <5a8a9837-48d5-683a-271b-ba6ff27369df@oracle.com> <3c50d9f1-bb92-1fbb-db44-e5d154a59d5d@oracle.com> <3efca5db-b37a-0afd-e560-50ebfd93e638@oracle.com> <8a37d7da-5547-75b3-aec3-fd3bbe8e6a78@oracle.com> Message-ID: <2ca444dd-da69-ae8f-4cd8-e35ca47477bb@oracle.com> On 7/20/18 2:24 PM, Kim Barrett wrote: >> On Jul 19, 2018, at 6:14 PM, coleen.phillimore at oracle.com wrote: >> >> >> Hi, There is a closed test that does 100,000 lookups on a class that fails resolution, so creates 100,000 Symbols with TempNewSymbol. This results in many zeroed refcounted Symbols in the table which increases lookup time with the current SymbolTable. With the new concurrent symbol table, which this change is intended to support, the zero refcount symbols are cleaned up on insert and concurrently. >> >> I have a workaround so that this test doesn't time out. These are the times for this test on my machine. >> >> old hashtable no patch: 7.32 seconds >> without workaround: 367 seconds (which can time out on a slow machine) >> with workaround: 61.075 seconds >> with new hashtable: 9.135 seconds >> >> There are several ways to fix the old hashtable so that it cleans more frequently for this situation but it's not worth doing with the new concurrent hashtable coming. 
>> >> open webrev at http://cr.openjdk.java.net/~coleenp/03.incr/webrev >> >> Thanks, >> Coleen > Looks good with Gerard's suggested improvement. Thanks, Kim! Coleen > From coleen.phillimore at oracle.com Fri Jul 20 19:29:42 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 20 Jul 2018 15:29:42 -0400 Subject: RFR 12 (XXS) 8203382 Rename SystemDictionary::initialize_wk_klass to resolve_wk_klass In-Reply-To: <4472c545-310c-70db-9e26-5b7174126e16@oracle.com> References: <733ed870-a7c1-b0c5-ec03-5b5bdae478e8@oracle.com> <4472c545-310c-70db-9e26-5b7174126e16@oracle.com> Message-ID: +1 "resolve" is more accurate than "initialize". thanks, Coleen On 7/20/18 2:18 PM, Ioi Lam wrote: > Thanks Jiangli! > > - Ioi > > > On 7/20/18 10:57 AM, Jiangli Zhou wrote: >> Looks good and trivial. >> >> Thanks, >> >> Jiangli >> >> >> On 7/20/18 10:44 AM, Ioi Lam wrote: >>> Hi, >>> >>> Please review this very simple renaming change: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8203382 >>> http://cr.openjdk.java.net/~iklam/jdk12/8203382_rename_initialize_wk_klass.v01/ >>> >>> >>> initialize_wk_klass -> resolve_wk_klass >>> initialize_preloaded_classes -> resolve_preloaded_classes >>> >>> because Java class initialization is not actually happening inside >>> these >>> functions.
>>> >>> Thanks >>> - Ioi >>> >>> >> > From jcbeyler at google.com Fri Jul 20 19:37:56 2018 From: jcbeyler at google.com (JC Beyler) Date: Fri, 20 Jul 2018 12:37:56 -0700 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com> <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com> Message-ID: Yes that is right, this is the latest: http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/ I apologize for the multiple threads and confusion, Jc On Fri, Jul 20, 2018 at 11:22 AM serguei.spitsyn at oracle.com < serguei.spitsyn at oracle.com> wrote: > Thank you a lot, Vladimir! > Yes, the webrev.03 is the latest. > Jc, will correct us if it is not right. > > Thanks, > Serguei > > > On 7/20/18 10:52, Vladimir Kozlov wrote: > > I asked Igor V. to look. > > > > Seems like review is done in an other thread which does not have bug > > id in subject. Currently webrev.03 > > > > Vladimir > > > > On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote: > >> Thanks, Rahul! > >> In fact, there no good experts for this area in the serviceability team. > >> It would be much better if anyone from the Compiler team could do it. > >> > >> Vladimir K., > >> > >> Is there anyone from the Compiler team available to review this? > >> Otherwise, I could try to review it but am not sure about my review > >> quality. 
> >> > >> Thanks, > >> Serguei > >> > >> > >> On 7/19/18 00:48, Rahul Raghavan wrote: > >>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled > >>> > >>> (just adding + hotspot-compiler-dev also) > >>> > >>> > >>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote: > >>> Subject Was: > >>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled > >>> > >>> + serviceability-dev > >>> > >>> Hi all, > >>> > >>> Could anyone else give me a review of this webrev and check/test the > >>> various architecture changes? > >>> > >>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ > >>> > >>> > >>> Thanks for all your help! > >>> Jc > >>> > >>> > >>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler > wrote: > >>>> > >>>>> Hi all, > >>>>> > >>>>> Here is a webrev that does all the architectures in the same way: > >>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ > >>>>> > >>>>> Could anyone review the other architectures and test? > >>>>> - arm, sparc & aarch64 are also modified now to follow the same > >>>>> "if no > >>>>> tlab, then consider eden space allocation" logic. > >>>>> > >>>>> Thanks for your help! > >>>>> Jc > >>>>> > >>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler > >>>>> wrote: > >>>>> > >>>>>> Hi Kim, > >>>>>> > >>>>>> I opened this bug > >>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862 > >>>>>> > >>>>>> and now I've done an update: > >>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ > >>>>>> > >>>>>> I basically have done your nits but also removed the try_eden (it > >>>>>> was > >>>>>> used to bind a label but was not used). I updated the comments to > >>>>>> use the > >>>>>> one you preferred. > >>>>>> > >>>>>> I still have to do the other architectures though but at least we > >>>>>> seem to > >>>>>> have a consensus on this architecture, correct? 
> >>>>>> > >>>>>> Thanks for the review, > >>>>>> Jc > >>>>>> > >>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett > > >>>>>> wrote: > >>>>>> > >>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler > >>>>>>>> wrote: > >>>>>>>> > >>>>>>>> Yes, you are right, I did those changes due to: > >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084 > >>>>>>>> > >>>>>>>> If Robbin agrees to this change, and if no one sees an issue, > >>>>>>>> I'll go > >>>>>>> ahead > >>>>>>>> and propagate the change across architectures. > >>>>>>>> > >>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's > >>>>>>>> comment > >>>>>>> and > >>>>>>>> review) :) > >>>>>>>> Jc > >>>>>>>> > >>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose > > >>>>>>> wrote: > >>>>>>>> > >>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler > >>>>>>>>> wrote: > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> I'm not sure if we had left this case intentionally or not > >>>>>>>>> but, if we > >>>>>>> want > >>>>>>>>> it all to be consistent, we should perhaps fix it. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Well, you put in that logic last February, so unless somebody > >>>>>>>>> speaks > >>>>>>> up > >>>>>>>>> quickly, I support your adjusting it to be the way you want it. > >>>>>>>>> > >>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I > >>>>>>>>> src/hotspot/share" > >>>>>>>>> suggests that the GC group is most active in touching this > >>>>>>>>> feature. > >>>>>>>>> If Robbin is OK with it, there's your reviewer. > >>>>>>>>> > >>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person > >>>>>>>>> working on the GC to OK it. > >>>>>>>>> > >>>>>>>>> ? John > >>>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> -- > >>>>>>>> > >>>>>>>> Thanks, > >>>>>>>> Jc > >>>>>>> > >>>>>>> Robbin is on vacation; you might not hear from him for a while. > >>>>>>> > >>>>>>> I'm assuming you'll open a new bug for this? > >>>>>>> > >>>>>>> Except for a few minor nits (below), this looks okay to me. 
> >>>>>>> > >>>>>>> The comment at line 1052 needs updating. > >>>>>>> > >>>>>>> pre-existing: The retry_tlab label declared on line 1054 is unused. > >>>>>>> > >>>>>>> pre-existing: The try_eden label declared on line 1054 is bound at > >>>>>>> line 1058, but unreferenced. > >>>>>>> > >>>>>>> I like the wording of the comment at 1139 better than the > >>>>>>> wording at > >>>>>>> 1016. > >>>>>>> > >>>>>>> > >>>>>> > >>>>>> -- > >>>>>> > >>>>>> Thanks, > >>>>>> Jc > >>>>>> > >>>>> > >>>>> > >>>>> -- > >>>>> > >>>>> Thanks, > >>>>> Jc > >>>>> > >>>> > >>>> > >> > > -- Thanks, Jc From kim.barrett at oracle.com Fri Jul 20 19:47:49 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 20 Jul 2018 15:47:49 -0400 Subject: RFR (S): C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> Message-ID: <8780C8BC-033A-49A8-B721-3943C22CB035@oracle.com> > On Jul 20, 2018, at 11:30 AM, JC Beyler wrote: > > Awesome thanks Thomas! > > Here is the webrev with the extra information then: > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/ > > Thanks again for all the reviews everyone! > Jc This version looks good to me. From hohensee at amazon.com Fri Jul 20 21:12:37 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Fri, 20 Jul 2018 21:12:37 +0000 Subject: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 In-Reply-To: References: <4ffed082-946d-1f7b-698e-ba180df8963e@oracle.com> <01f5cada-3f0c-12fe-d130-efaf529b0cd7@oracle.com> <63920997-A885-471E-88D6-A70A902F22F1@gmail.com> <448D23F6-AE68-4D40-A605-DB8A092C5F43@gmail.com> <4d861aa62585483b8f2c9f626406e346@sap.com> <69D49C0A-27DA-4E33-95C2-2FF6BFBCB754@gmail.com> <782c616f-128f-fadc-99e2-f74fe360567a@oracle.com> <9EB54488-C95A-4A2F-99E3-410545DB1824@amazon.com> <84f48d03-f2a5-de45-dfe8-c971a3389577@oracle.com> Message-ID: <8C0CFD8F-6B5C-48F2-B3F6-B079A7FCC3D5@amazon.com> +1. 
:) From: Liu Xin Date: Friday, July 20, 2018 at 11:49 AM To: Vladimir Kozlov Cc: "Hohensee, Paul" , "hotspot-runtime-dev at openjdk.java.net" Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels for x86 Cool. thanks. On Fri, Jul 20, 2018 at 11:46 AM, Vladimir Kozlov > wrote: This looks good. I will sponsor it. Thanks, Vladimir On 7/20/18 11:37 AM, Hohensee, Paul wrote: New webrev: http://cr.openjdk.java.net/~phh/8206075/webrev.01/ Thanks, Paul On 7/20/18, 11:32 AM, "hotspot-runtime-dev on behalf of Vladimir Kozlov" on behalf of vladimir.kozlov at oracle.com> wrote: On 7/20/18 11:11 AM, Liu Xin wrote: > Thanks, Vladimir and Goetz. Could yo approve what you tested? I am fine with your latest changes but you need to post webrev on cr.openjdk. I will review it then. > > > For the patch, I think it's another story. I am *NOT* sure if we should > need it. It's about C++ object model. I feel hotspot is using C++ in > non-standard way. I am confusing about C++ in hotspot. > In regular C++ , we should manage the life cycle of objects carefully. > > If you take a look at usage of this macro, some non-pod classes don't > construct but use directly. > #define NEW_RESOURCE_ARRAY(type, size)\ > (type*) resource_allocate_bytes((size) * sizeof(type)) > > eg. > VMRegPair* out_regs = NEW_RESOURCE_ARRAY(VMRegPair, total_c_args); > > May I create a new RFR to enhance it? > I want to introduce a meta-programming template like boost's is_pod. > https://www.boost.org/doc/libs/1_44_0/libs/type_traits/doc/html/boost_typetraits/reference/is_pod.html Be careful. Hotspot have to be compiled by big variety of C++ compilers and not all of them support latest features. Regards, Vladimir > > NEW_RESOURCE_ARRAY should call constructors for those classes which are not > pod. > > thanks, > --lx > > > > > On Fri, Jul 20, 2018 at 9:18 AM, Vladimir Kozlov >> wrote: > >> My testing also passed clean. 
I tested next patch: >> >> https://s3-us-west-2.amazonaws.com/openjdk-webrevs/jdk/label_bugfix/index.html >> >> Please, post it on cr.openjdk server for final review. We can't review and >> use patches from other places. >> >> Thanks, >> Vladimir >> >> >> On 7/20/18 12:29 AM, Lindenmaier, Goetz wrote: >> >>> Hi Liu, >>> >>> Martin had put the patch into our testing queue. >>> All the platforms we build are fine. >>> These are: windows x86_64, linux: ppc64, ppc64le, x86_64, s390x, >>> aix ppc64, solaris sparcv9, mac. >>> >>> Best regards, >>> Goetz. >>> >>> -----Original Message----- >>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Liu Xin >>>> Sent: Friday, July 20, 2018 09:16 >>>> To: Vladimir Kozlov >>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels >>>> for >>>> x86 >>>> >>>> Hello, Vladimir, >>>> Could you run on other platform on behalf of Martin? >>>> I locally tested on x86_64. I hope the Reviewer can help me verify it >>>> works >>>> on other platforms. >>>> >>>> >>>> Furthermore, I am not sure if we should add this additional patch. >>>> Label class is not POD, we should properly call constructor /destructor >>>> even though those labels are allocated on arena. >>>> >>>> >>>> thanks, >>>> --lx >>>> >>>> On Wed, Jul 18, 2018 at 4:07 AM, Doerr, Martin >>>> wrote: >>>> >>>> Hi Liu Xin, >>>>> >>>>> >>>>> >>>>> thanks for understanding my point and checking other places. >>>>> >>>>> >>>>> >>>>> The templateTable_x86.cpp was reviewed by me. >>>>> >>>>> I can't review the label assertion before my vacation. If other >>>>> reviewers >>>>> are convinced that it is correct, ok. >>>>> >>>>> >>>>> >>>>> Would be great if somebody could assist with testing other platforms.
>>>>> >>>>> >>>>> >>>>> Best regards, >>>>> >>>>> Martin >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> *From:* Liu Xin [mailto:navy.xliu at gmail.com] >>>>> *Sent:* Tuesday, July 17, 2018 19:17 >>>>> >>>>> *To:* Doerr, Martin >>>>> *Cc:* hotspot-runtime-dev at openjdk.java.net >>>>> *Subject:* Re: RFR(S): 8206075: add assertion for unbound assembler >>>>> Labels for x86 >>>>> >>>>> >>>>> >>>>> Hi, Martin, >>>>> >>>>> >>>>> >>>>> Thank you for the feedback. >>>>> >>>>> >>>>> >>>>> I totally agree with you that we shouldn't introduce false positive >>>>> assertion. Let's insist on the high bar here. >>>>> >>>>> I browsed many sources in hotspot recently. Hotspot is the most >>>>> monolithic >>>>> software I have ever seen. I am glad to be directed by a guidance and clear >>>>> target. >>>>> >>>>> >>>>> >>>>> I think I dealt with c1 bailout case. This case triggers "codebuffer >>>>> overflow" in middle of c1 compilation. >>>>> >>>>> compiler/codegen/TestCharVect2.java >>>>> >>>>> >>>>> >>>>> I am still not sure about c2 bailout case. Let me try to make one. >>>>> >>>>> >>>>> >>>>> For case #2, I got what you concerned. Indeed, the generated ad_x86.cpp >>>>> contains many emit methods for MachNode. I will double-check if they >>>>> >>>> could >>>> >>>>> leave unused labels. >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> --lx >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Jul 16, 2018, at 2:09 PM, Liu Xin wrote: >>>>> >>>>> >>>>> >>>>> Hi, List, >>>>> >>>>> >>>>> >>>>> Could you review this new revision? >>>>> >>>>> https://s3-us-west-2.amazonaws.com/openjdk-webrevs/jdk/label_bugfix/index.html >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> i) I took a look at all architectures, arm/aarch64/ppc64/sparc/x86. I >>>>> don't understand all the assemblies, but I think they are guarded >>>>> for UseOnStackReplacement >>>>> >>>>> in templateTable_xxx.cpp ::branch(bool is_jsr, bool is_wide). >>>>> >>>>> >>>>> >>>>> TemplateTable_arm.cpp is a little different. It explicitly binds it >>>>> later.
>>>>> >>>>> if (!UseOnStackReplacement) { >>>>> >>>>> __ bind(backedge_counter_overflow); >>>>> >>>>> } >>>>> >>>>> >>>>> >>>>> ii) I checked the Compile::scratch_emit_size. It only uses the label >>>>> fakeL >>>>> for those MachBranch nodes. >>>>> >>>>> Because fakeL will be bound to a trivial address if the nodes are >>>>> MachBranch, it's also safe for the assertion. >>>>> >>>>> >>>>> >>>>> bool is_branch = n->is_MachBranch(); >>>>> >>>>> if (is_branch) { >>>>> >>>>> MacroAssembler masm(&buf); >>>>> >>>>> masm.bind(fakeL); >>>>> >>>>> n->as_MachBranch()->save_label(&saveL, &save_bnum); >>>>> >>>>> n->as_MachBranch()->label_set(&fakeL, 0); >>>>> >>>>> } >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> --lx >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Jul 16, 2018, at 1:30 AM, Doerr, Martin >>>>> wrote: >>>>> >>>>> >>>>> >>>>> Hi Liu Xin, >>>>> >>>>> >>>>> >>>>> thanks for changing. >>>>> >>>>> >>>>> >>>>> The background of this Assertion is that our engineer used to spend >>>>>> >>>>> many >>>> >>>>> hour to trace down a corner case. >>>>> >>>>> it's trivial if fastdebug/slowdebug stop and tell you immediately. >>>>>> >>>>> >>>>> >>>>> >>>>> I understand that. But an assertion should only get added when we are >>>>> convinced that it won't produce false positives. >>>>> >>>>> It's very annoying if long running tests break due to an incorrect >>>>> assertion after running many days. >>>>> >>>>> >>>>> >>>>> I am curious about this "We also may generate code with the purpose to >>>>>> >>>>> determine its size.". >>>>> >>>>> Could you tell me where is it? it looks quite slow to get buffer size in >>>>>> >>>>> this way. >>>>> >>>>> >>>>> >>>>> C2 Compiler does that in Compile::scratch_emit_size. >>>>> >>>>> >>>>> >>>>> Please note that I'll be on vacation soon, so other people will have to >>>>> review. >>>>> >>>>> Thanks again for fixing the -XX:-UseOnStackReplacement issue.
>>>>> >>>>> >>>>> >>>>> Best regards, >>>>> >>>>> Martin >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> *From:* Liu Xin [mailto:navy.xliu at gmail.com >] >>>>> *Sent:* Freitag, 13. Juli 2018 22:30 >>>>> *To:* Doerr, Martin > >>>>> *Cc:* hotspot-runtime-dev at openjdk.java.net >>>>> *Subject:* Re: RFR(S): 8206075: add assertion for unbound assembler >>>>> Labels for x86 >>>>> >>>>> >>>>> >>>>> Hello, Martin, >>>>> >>>>> >>>>> >>>>> Thanks for reviewing it. >>>>> >>>>> >>>>> >>>>> I got your point. I made it "if (where != NULL) { jcc(cond, *where); }" >>>>> and is running tests. >>>>> >>>>> >>>>> >>>>> The background of this Assertion is that our engineer used to spend many >>>>> hour to trace down a corner case. it's trivial if fastdebug/slowdebug >>>>> stop >>>>> and tell you immediately. >>>>> >>>>> >>>>> >>>>> I am curious about this "We also may generate code with the purpose to >>>>> determine its size.". Could you tell me where is it? it looks quite >>>>> slow >>>>> to get buffer size in this way. >>>>> >>>>> >>>>> >>>>> thanks, >>>>> >>>>> --lx >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Fri, Jul 13, 2018 at 2:54 AM, Doerr, Martin > >>>>> wrote: >>>>> >>>>> Hi, >>>>> >>>>> thanks for fixing the issue in templateTable_x86. It looks correct. >>>>> I think even better would be >>>>> "UseOnStackReplacement ? &backedge_counter_overflow : NULL" >>>>> and >>>>> "if (where != NULL) { jcc(cond, *where); }" in interp_masm_x86.cpp. >>>>> But I leave it up to you if you want to change it. I'm also ok with your >>>>> version. >>>>> >>>>> I'm not convinced that the label assertion is reliable. There may be >>>>> many >>>>> more places in hotspot where we bail out having an unbound label. >>>>> >>>> Running a >>>> >>>>> few tests on x86 is by far not sufficient. The assertion may fire >>>>> sporadically in large scenarios on some platforms. 
>>>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Liu Xin >>>>> Sent: Thursday, July 12, 2018 22:51 >>>>> To: hotspot-runtime-dev at openjdk.java.net >>>>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler Labels >>>>> for x86 >>>>> >>>>> Could you review this patch again? >>>>> >>>>> Revision #2. >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 >>>>> CR: https://s3-us-west-2.amazonaws.com/openjdk-webrevs/openjdk8u/webrev/index.html >>>>> >>>>> >>>>> >>>>> The idea is simple. I just reset the problematic label when c1 >>>>> compilation >>>>> bailout happens. >>>>> I manually ran tier1 on my laptop. it can pass all of them. >>>>> Paul help me submit the patch to submit and here is the run result. >>>>> Build Details: 2018-07-12-1736388.hohensee.source >>>>> >>>>> 0 Failed Tests >>>>> >>>>> Mach5 Tasks Results Summary >>>>> >>>>> PASSED: 75 >>>>> UNABLE_TO_RUN: 0 >>>>> KILLED: 0 >>>>> NA: 0 >>>>> FAILED: 0 >>>>> EXECUTED_WITH_FAILURE: 0 >>>>> >>>>> >>>>> Thanks, >>>>> --lx >>>>> >>>>>> On Jul 11, 2018, at 10:35 AM, Liu Xin wrote: >>>>>> >>>>>> Thank you for your reviews. Indeed, I didn't deal with bailout >>>>>> >>>>> situation. "compiler/codegen/TestCharVect2.java" is the case of >>>>> codeBuffer overflow and leaves an unbound label behind. >>>>> >>>>>> I made another revision. I will run tests thoroughly. >>>>>> >>>>>> Thanks, >>>>>> --lx >>>>>> >>>>>> On Jul 11, 2018, at 7:49 AM, Hohensee, Paul >>>>>>> >>>>>> wrote: >>>>> >>>>>> >>>>>>> Imo it's still good hygiene to require that Labels be bound if they're >>>>>>> >>>>>> used, even if the generated code will never be executed.
E.g., code >>>>> that >>>>> generates code for sizing purposes may be repurposed to generate >>>>> >>>> executable >>>> >>>>> code, in which case an unbound label may be a lurking bug. Also, I'm >>>>> unaware (I may be corrected!) of any situation where bailing out happens >>>>> >>>> in >>>> >>>>> such a way as to both leave a Label unbound and execute its destructor. >>>>> Even if there are, I'd say that'd be indicative of another real problem, >>>>> such as code buffer overflow, so no harm would result. >>>>> >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Paul >>>>>>> >>>>>>> On 7/11/18, 3:41 AM, "hotspot-runtime-dev on behalf of Doerr, Martin" >>>>>>> >>>>>> < >>>> >>>>> hotspot-runtime-dev-bounces at openjdk.java.net on behalf of >>>>> martin.doerr at sap.com> wrote: >>>>> >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I think the idea is good, but doesn't work in all cases. >>>>>>> We may bail out from code generation and discard the generated code >>>>>>> >>>>>> leaving the label unbound. >>>>> >>>>>> We also may generate code with the purpose to determine its size. We >>>>>>> >>>>>> don't need to bind labels because the code will never get executed. >>>>> >>>>>> >>>>>>> Best regards, >>>>>>> Martin >>>>>>> >>>>>>> >>>>>>> -----Original Message----- >>>>>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>>>>>> >>>>>> bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov >>>>> >>>>>> Sent: Mittwoch, 11. Juli 2018 03:34 >>>>>>> To: Liu Xin >; hotspot >>>>>>> >>>>>> -runtime-dev at openjdk.java.net >>>>> >>>>>> Subject: Re: RFR(S): 8206075: add assertion for unbound assembler >>>>>>> >>>>>> Labels for x86 >>>>> >>>>>> >>>>>>> I hit new assert in few other tests: >>>>>>> >>>>>>> compiler/codegen/TestCharVect2.java >>>>>>> compiler/c2/cr6340864/* >>>>>>> >>>>>>> Regards, >>>>>>> Vladimir >>>>>>> >>>>>>> On 7/10/18 5:08 PM, Vladimir Kozlov wrote: >>>>>>> >>>>>>>> Fix looks reasonable. I will test it in our framework. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> Vladimir >>>>>>>> >>>>>>>> On 7/10/18 9:50 AM, Liu Xin wrote: >>>>>>>> >>>>>>>>> Hi, Community, >>>>>>>>> Could you please review this small patch? >>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206075 >>>>>>>>> >>>>>>>>> CR: http://cr.openjdk.java.net/~phh/8206075/webrev.00/ >>>>>>>>> >>>>>>>>> Problem: >>>>>>>>> X86-32/64 will leave an unbound label if UseOnStackReplacement is >>>>>>>>> >>>>>>>> OFF. >>>> >>>>> This patch align up x86 with other architectures(ppc, arm). >>>>>>>>> Add an assertion to the destructor of Label. It will be wiped out >>>>>>>>> in >>>>>>>>> >>>>>>>> release build. >>>>> >>>>>> Previously, hotspot cannot pass this test with assertion on x86-64. >>>>>>>>> make run-test >>>>>>>>> >>>>>>>> TEST=test/hotspot/jtreg/compiler/c1/Test7090976.java >>>> >>>>> If this CR is approved, Paul Hohensee will push it. >>>>>>>>> Thanks, >>>>>>>>> --lx >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>>> >>>>> >>>>> From jiangli.zhou at oracle.com Fri Jul 20 22:26:33 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 20 Jul 2018 15:26:33 -0700 Subject: [11] RFR 8203820: [TESTBUG] vmTestbase/metaspace/staticReferences/StaticReferences.java timed out In-Reply-To: References: Message-ID: <577cdd92-609c-fa5d-5514-6f101893dd78@oracle.com> Looks good to me! The changes reduce the test execution duration and still inline with the original intention. Thanks, Jiangli On 7/20/18 9:12 AM, coleen.phillimore at oracle.com wrote: > Summary: Moved InMemoryJavaCompiler out of loops or reduced loops with > InMemoryJavaCompiler > > I also reformatted StressRedefine.java which had the same problem as > the two in the bug report. > > These test were timing out in test runs in the javac compiler. See bug > for more detail. 
> > open webrev at http://cr.openjdk.java.net/~coleenp/8203820.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8203820 > > Thanks, > Coleen From coleen.phillimore at oracle.com Fri Jul 20 22:27:54 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 20 Jul 2018 18:27:54 -0400 Subject: [11] RFR 8203820: [TESTBUG] vmTestbase/metaspace/staticReferences/StaticReferences.java timed out In-Reply-To: <577cdd92-609c-fa5d-5514-6f101893dd78@oracle.com> References: <577cdd92-609c-fa5d-5514-6f101893dd78@oracle.com> Message-ID: <25fecfea-a08b-2831-5d2d-ba95dcdca3ca@oracle.com> Thanks for reviewing! Coleen On 7/20/18 6:26 PM, Jiangli Zhou wrote: > Looks good to me! The changes reduce the test execution duration and > still inline with the original intention. > > Thanks, > > Jiangli > > > On 7/20/18 9:12 AM, coleen.phillimore at oracle.com wrote: >> Summary: Moved InMemoryJavaCompiler out of loops or reduced loops >> with InMemoryJavaCompiler >> >> I also reformatted StressRedefine.java which had the same problem as >> the two in the bug report. >> >> These test were timing out in test runs in the javac compiler. See >> bug for more detail. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8203820.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8203820 >> >> Thanks, >> Coleen > From igor.veresov at oracle.com Sat Jul 21 20:47:34 2018 From: igor.veresov at oracle.com (Igor Veresov) Date: Sat, 21 Jul 2018 13:47:34 -0700 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com> <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com> Message-ID: <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com> I think you can just predicate the emission of these stubs for !UseTLAB, and not mess with the CPU-specific code. What do you think? 
diff --git a/src/hotspot/share/c1/c1_LIRGenerator.cpp b/src/hotspot/share/c1/c1_LIRGenerator.cpp
--- a/src/hotspot/share/c1/c1_LIRGenerator.cpp
+++ b/src/hotspot/share/c1/c1_LIRGenerator.cpp
@@ -674,7 +674,7 @@
 void LIRGenerator::new_instance(LIR_Opr dst, ciInstanceKlass* klass, bool is_unresolved, LIR_Opr scratch1, LIR_Opr scratch2, LIR_Opr scratch3, LIR_Opr scratch4, LIR_Opr klass_reg, CodeEmitInfo* info) {
   klass2reg_with_patching(klass_reg, klass, info, is_unresolved);
   // If klass is not loaded we do not know if the klass has finalizers:
-  if (UseFastNewInstance && klass->is_loaded()
+  if (UseFastNewInstance && !UseTLAB && klass->is_loaded()
       && !Klass::layout_helper_needs_slow_path(klass->layout_helper())) {

     Runtime1::StubID stub_id = klass->is_initialized() ? Runtime1::fast_new_instance_id : Runtime1::fast_new_instance_init_check_id;

igor

> On Jul 20, 2018, at 12:37 PM, JC Beyler wrote: > > Yes that is right, this is the latest: > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/ > > I apologize for the multiple threads and confusion, > Jc > > On Fri, Jul 20, 2018 at 11:22 AM serguei.spitsyn at oracle.com < > serguei.spitsyn at oracle.com> wrote: > >> Thank you a lot, Vladimir! >> Yes, the webrev.03 is the latest. >> Jc, will correct us if it is not right. >> >> Thanks, >> Serguei >> >> >> On 7/20/18 10:52, Vladimir Kozlov wrote: >>> I asked Igor V. to look. >>> >>> Seems like review is done in an other thread which does not have bug >>> id in subject. Currently webrev.03 >>> >>> Vladimir >>> >>> On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote: >>>> Thanks, Rahul! >>>> In fact, there no good experts for this area in the serviceability team. >>>> It would be much better if anyone from the Compiler team could do it. >>>> >>>> Vladimir K., >>>> >>>> Is there anyone from the Compiler team available to review this? >>>> Otherwise, I could try to review it but am not sure about my review >>>> quality.
>>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 7/19/18 00:48, Rahul Raghavan wrote: >>>>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled >>>>> >>>>> (just adding + hotspot-compiler-dev also) >>>>> >>>>> >>>>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote: >>>>> Subject Was: >>>>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled >>>>> >>>>> + serviceability-dev >>>>> >>>>> Hi all, >>>>> >>>>> Could anyone else give me a review of this webrev and check/test the >>>>> various architecture changes? >>>>> >>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >>>>> >>>>> >>>>> Thanks for all your help! >>>>> Jc >>>>> >>>>> >>>>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler >> wrote: >>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> Here is a webrev that does all the architectures in the same way: >>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >>>>>>> >>>>>>> Could anyone review the other architectures and test? >>>>>>> - arm, sparc & aarch64 are also modified now to follow the same >>>>>>> "if no >>>>>>> tlab, then consider eden space allocation" logic. >>>>>>> >>>>>>> Thanks for your help! >>>>>>> Jc >>>>>>> >>>>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler >>>>>>> wrote: >>>>>>> >>>>>>>> Hi Kim, >>>>>>>> >>>>>>>> I opened this bug >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862 >>>>>>>> >>>>>>>> and now I've done an update: >>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ >>>>>>>> >>>>>>>> I basically have done your nits but also removed the try_eden (it >>>>>>>> was >>>>>>>> used to bind a label but was not used). I updated the comments to >>>>>>>> use the >>>>>>>> one you preferred. >>>>>>>> >>>>>>>> I still have to do the other architectures though but at least we >>>>>>>> seem to >>>>>>>> have a consensus on this architecture, correct? 
>>>>>>>> >>>>>>>> Thanks for the review, >>>>>>>> Jc >>>>>>>> >>>>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett >> >>>>>>>> wrote: >>>>>>>> >>>>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Yes, you are right, I did those changes due to: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084 >>>>>>>>>> >>>>>>>>>> If Robbin agrees to this change, and if no one sees an issue, >>>>>>>>>> I'll go >>>>>>>>> ahead >>>>>>>>>> and propagate the change across architectures. >>>>>>>>>> >>>>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's >>>>>>>>>> comment >>>>>>>>> and >>>>>>>>>> review) :) >>>>>>>>>> Jc >>>>>>>>>> >>>>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose >> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I'm not sure if we had left this case intentionally or not >>>>>>>>>>> but, if we >>>>>>>>> want >>>>>>>>>>> it all to be consistent, we should perhaps fix it. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Well, you put in that logic last February, so unless somebody >>>>>>>>>>> speaks >>>>>>>>> up >>>>>>>>>>> quickly, I support your adjusting it to be the way you want it. >>>>>>>>>>> >>>>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I >>>>>>>>>>> src/hotspot/share" >>>>>>>>>>> suggests that the GC group is most active in touching this >>>>>>>>>>> feature. >>>>>>>>>>> If Robbin is OK with it, there's your reviewer. >>>>>>>>>>> >>>>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person >>>>>>>>>>> working on the GC to OK it. >>>>>>>>>>> >>>>>>>>>>> ? John >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Jc >>>>>>>>> >>>>>>>>> Robbin is on vacation; you might not hear from him for a while. >>>>>>>>> >>>>>>>>> I'm assuming you'll open a new bug for this? >>>>>>>>> >>>>>>>>> Except for a few minor nits (below), this looks okay to me. 
>>>>>>>>> >>>>>>>>> The comment at line 1052 needs updating. >>>>>>>>> >>>>>>>>> pre-existing: The retry_tlab label declared on line 1054 is unused. >>>>>>>>> >>>>>>>>> pre-existing: The try_eden label declared on line 1054 is bound at >>>>>>>>> line 1058, but unreferenced. >>>>>>>>> >>>>>>>>> I like the wording of the comment at 1139 better than the >>>>>>>>> wording at >>>>>>>>> 1016. >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Jc >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> Thanks, >>>>>>> Jc >>>>>>> >>>>>> >>>>>> >>>> >> >> > > -- > > Thanks, > Jc From jcbeyler at google.com Sun Jul 22 02:06:26 2018 From: jcbeyler at google.com (JC Beyler) Date: Sat, 21 Jul 2018 19:06:26 -0700 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com> References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com> <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com> <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com> Message-ID: Hi Igor, Thanks for looking at it! I don't know the code paths enough to know if that is sufficient (I'll trust you evidently). I can run the tests next week if we prefer that route. Were I to choose, I would prefer that interpreter/c1/c2 all follow the same kind of paths, which would be my fix I believe: 1) If TLAB, allocate there or slowpath 2) Else If contiguous inline allocations are enabled, try that 3) Goto Slowpath With your fix, even if we do not have the issue anymore, it still keeps code that is not consistent but perhaps I'm missing something? Jc On Sat, Jul 21, 2018 at 1:47 PM Igor Veresov wrote: > I think you can just predicate the emission of these stubs for !UseTLAB, > and not mess with the CPU-specific code. What do you think? 
> > diff --git a/src/hotspot/share/c1/c1_LIRGenerator.cpp > b/src/hotspot/share/c1/c1_LIRGenerator.cpp > --- a/src/hotspot/share/c1/c1_LIRGenerator.cpp > +++ b/src/hotspot/share/c1/c1_LIRGenerator.cpp > @@ -674,7 +674,7 @@ > void LIRGenerator::new_instance(LIR_Opr dst, ciInstanceKlass* klass, bool > is_unresolved, LIR_Opr scratch1, LIR_Opr scratch2, LIR_Opr scratch3, > LIR_Opr scratch4, LIR_Opr klass_reg, CodeEmitInfo* info) { > klass2reg_with_patching(klass_reg, klass, info, is_unresolved); > // If klass is not loaded we do not know if the klass has finalizers: > - if (UseFastNewInstance && klass->is_loaded() > + if (UseFastNewInstance && !UseTLAB && klass->is_loaded() > && !Klass::layout_helper_needs_slow_path(klass->layout_helper())) { > > Runtime1::StubID stub_id = klass->is_initialized() ? > Runtime1::fast_new_instance_id : Runtime1::fast_new_instance_init_check_id; > > > igor > > > On Jul 20, 2018, at 12:37 PM, JC Beyler wrote: > > > > Yes that is right, this is the latest: > > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/ > > > > I apologize for the multiple threads and confusion, > > Jc > > > > On Fri, Jul 20, 2018 at 11:22 AM serguei.spitsyn at oracle.com < > > serguei.spitsyn at oracle.com> wrote: > > > >> Thank you a lot, Vladimir! > >> Yes, the webrev.03 is the latest. > >> Jc, will correct us if it is not right. > >> > >> Thanks, > >> Serguei > >> > >> > >> On 7/20/18 10:52, Vladimir Kozlov wrote: > >>> I asked Igor V. to look. > >>> > >>> Seems like review is done in an other thread which does not have bug > >>> id in subject. Currently webrev.03 > >>> > >>> Vladimir > >>> > >>> On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote: > >>>> Thanks, Rahul! > >>>> In fact, there no good experts for this area in the serviceability > team. > >>>> It would be much better if anyone from the Compiler team could do it. > >>>> > >>>> Vladimir K., > >>>> > >>>> Is there anyone from the Compiler team available to review this? 
> >>>> Otherwise, I could try to review it but am not sure about my review > >>>> quality. > >>>> > >>>> Thanks, > >>>> Serguei > >>>> > >>>> > >>>> On 7/19/18 00:48, Rahul Raghavan wrote: > >>>>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled > >>>>> > >>>>> (just adding + hotspot-compiler-dev also) > >>>>> > >>>>> > >>>>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote: > >>>>> Subject Was: > >>>>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled > >>>>> > >>>>> + serviceability-dev > >>>>> > >>>>> Hi all, > >>>>> > >>>>> Could anyone else give me a review of this webrev and check/test the > >>>>> various architecture changes? > >>>>> > >>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ > >>>>> > >>>>> > >>>>> Thanks for all your help! > >>>>> Jc > >>>>> > >>>>> > >>>>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler > >> wrote: > >>>>>> > >>>>>>> Hi all, > >>>>>>> > >>>>>>> Here is a webrev that does all the architectures in the same way: > >>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ > >>>>>>> > >>>>>>> Could anyone review the other architectures and test? > >>>>>>> - arm, sparc & aarch64 are also modified now to follow the same > >>>>>>> "if no > >>>>>>> tlab, then consider eden space allocation" logic. > >>>>>>> > >>>>>>> Thanks for your help! > >>>>>>> Jc > >>>>>>> > >>>>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler > >>>>>>> wrote: > >>>>>>> > >>>>>>>> Hi Kim, > >>>>>>>> > >>>>>>>> I opened this bug > >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862 > >>>>>>>> > >>>>>>>> and now I've done an update: > >>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ > >>>>>>>> > >>>>>>>> I basically have done your nits but also removed the try_eden (it > >>>>>>>> was > >>>>>>>> used to bind a label but was not used). I updated the comments to > >>>>>>>> use the > >>>>>>>> one you preferred. 
> >>>>>>>> > >>>>>>>> I still have to do the other architectures though but at least we > >>>>>>>> seem to > >>>>>>>> have a consensus on this architecture, correct? > >>>>>>>> > >>>>>>>> Thanks for the review, > >>>>>>>> Jc > >>>>>>>> > >>>>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett < > kim.barrett at oracle.com > >>> > >>>>>>>> wrote: > >>>>>>>> > >>>>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler > >>>>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>> Yes, you are right, I did those changes due to: > >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084 > >>>>>>>>>> > >>>>>>>>>> If Robbin agrees to this change, and if no one sees an issue, > >>>>>>>>>> I'll go > >>>>>>>>> ahead > >>>>>>>>>> and propagate the change across architectures. > >>>>>>>>>> > >>>>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's > >>>>>>>>>> comment > >>>>>>>>> and > >>>>>>>>>> review) :) > >>>>>>>>>> Jc > >>>>>>>>>> > >>>>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose < > john.r.rose at oracle.com > >>> > >>>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler > >>>>>>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> I'm not sure if we had left this case intentionally or not > >>>>>>>>>>> but, if we > >>>>>>>>> want > >>>>>>>>>>> it all to be consistent, we should perhaps fix it. > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Well, you put in that logic last February, so unless somebody > >>>>>>>>>>> speaks > >>>>>>>>> up > >>>>>>>>>>> quickly, I support your adjusting it to be the way you want it. > >>>>>>>>>>> > >>>>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I > >>>>>>>>>>> src/hotspot/share" > >>>>>>>>>>> suggests that the GC group is most active in touching this > >>>>>>>>>>> feature. > >>>>>>>>>>> If Robbin is OK with it, there's your reviewer. > >>>>>>>>>>> > >>>>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other > person > >>>>>>>>>>> working on the GC to OK it. > >>>>>>>>>>> > >>>>>>>>>>> ? 
John > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> -- > >>>>>>>>>> > >>>>>>>>>> Thanks, > >>>>>>>>>> Jc > >>>>>>>>> > >>>>>>>>> Robbin is on vacation; you might not hear from him for a while. > >>>>>>>>> > >>>>>>>>> I'm assuming you'll open a new bug for this? > >>>>>>>>> > >>>>>>>>> Except for a few minor nits (below), this looks okay to me. > >>>>>>>>> > >>>>>>>>> The comment at line 1052 needs updating. > >>>>>>>>> > >>>>>>>>> pre-existing: The retry_tlab label declared on line 1054 is > unused. > >>>>>>>>> > >>>>>>>>> pre-existing: The try_eden label declared on line 1054 is bound > at > >>>>>>>>> line 1058, but unreferenced. > >>>>>>>>> > >>>>>>>>> I like the wording of the comment at 1139 better than the > >>>>>>>>> wording at > >>>>>>>>> 1016. > >>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>>> -- > >>>>>>>> > >>>>>>>> Thanks, > >>>>>>>> Jc > >>>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> -- > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Jc > >>>>>>> > >>>>>> > >>>>>> > >>>> > >> > >> > > > > -- > > > > Thanks, > > Jc > > -- Thanks, Jc From igor.veresov at oracle.com Sun Jul 22 02:39:21 2018 From: igor.veresov at oracle.com (Igor Veresov) Date: Sat, 21 Jul 2018 19:39:21 -0700 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com> <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com> <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com> Message-ID: <719DC045-4311-499E-9F7D-784096A044C6@oracle.com> Yeah, the fix I proposed doesn't do exactly what we'd want. Sorry for the confusion. Your fix is fine. Reviewed. igor > On Jul 21, 2018, at 7:06 PM, JC Beyler wrote: > > Hi Igor, > > Thanks for looking at it! I don't know the code paths enough to know if that is sufficient (I'll trust you evidently). I can run the tests next week if we prefer that route.
> > Were I to choose, I would prefer that interpreter/c1/c2 all follow the same kind of paths, which would be my fix I believe: > 1) If TLAB, allocate there or slowpath > 2) Else If contiguous inline allocations are enabled, try that > 3) Goto Slowpath > > With your fix, even if we do not have the issue anymore, it still keeps code that is not consistent but perhaps I'm missing something? > Jc > > On Sat, Jul 21, 2018 at 1:47 PM Igor Veresov > wrote: > I think you can just predicate the emission of these stubs for !UseTLAB, and not mess with the CPU-specific code. What do you think? > > diff --git a/src/hotspot/share/c1/c1_LIRGenerator.cpp b/src/hotspot/share/c1/c1_LIRGenerator.cpp > --- a/src/hotspot/share/c1/c1_LIRGenerator.cpp > +++ b/src/hotspot/share/c1/c1_LIRGenerator.cpp > @@ -674,7 +674,7 @@ > void LIRGenerator::new_instance(LIR_Opr dst, ciInstanceKlass* klass, bool is_unresolved, LIR_Opr scratch1, LIR_Opr scratch2, LIR_Opr scratch3, LIR_Opr scratch4, LIR_Opr klass_reg, CodeEmitInfo* info) { > klass2reg_with_patching(klass_reg, klass, info, is_unresolved); > // If klass is not loaded we do not know if the klass has finalizers: > - if (UseFastNewInstance && klass->is_loaded() > + if (UseFastNewInstance && !UseTLAB && klass->is_loaded() > && !Klass::layout_helper_needs_slow_path(klass->layout_helper())) { > > Runtime1::StubID stub_id = klass->is_initialized() ? Runtime1::fast_new_instance_id : Runtime1::fast_new_instance_init_check_id; > > > igor > > > On Jul 20, 2018, at 12:37 PM, JC Beyler > wrote: > > > > Yes that is right, this is the latest: > > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/ > > > > I apologize for the multiple threads and confusion, > > Jc > > > > On Fri, Jul 20, 2018 at 11:22 AM serguei.spitsyn at oracle.com < > > serguei.spitsyn at oracle.com > wrote: > > > >> Thank you a lot, Vladimir! > >> Yes, the webrev.03 is the latest. > >> Jc, will correct us if it is not right. 
> >> > >> Thanks, > >> Serguei > >> > >> > >> On 7/20/18 10:52, Vladimir Kozlov wrote: > >>> I asked Igor V. to look. > >>> > >>> Seems like review is done in an other thread which does not have bug > >>> id in subject. Currently webrev.03 > >>> > >>> Vladimir > >>> > >>> On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote: > >>>> Thanks, Rahul! > >>>> In fact, there no good experts for this area in the serviceability team. > >>>> It would be much better if anyone from the Compiler team could do it. > >>>> > >>>> Vladimir K., > >>>> > >>>> Is there anyone from the Compiler team available to review this? > >>>> Otherwise, I could try to review it but am not sure about my review > >>>> quality. > >>>> > >>>> Thanks, > >>>> Serguei > >>>> > >>>> > >>>> On 7/19/18 00:48, Rahul Raghavan wrote: > >>>>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled > >>>>> > >>>>> (just adding + hotspot-compiler-dev also) > >>>>> > >>>>> > >>>>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote: > >>>>> Subject Was: > >>>>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled > >>>>> > >>>>> + serviceability-dev > >>>>> > >>>>> Hi all, > >>>>> > >>>>> Could anyone else give me a review of this webrev and check/test the > >>>>> various architecture changes? > >>>>> > >>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ > >>>>> > >>>>> > >>>>> Thanks for all your help! > >>>>> Jc > >>>>> > >>>>> > >>>>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler > > >> wrote: > >>>>>> > >>>>>>> Hi all, > >>>>>>> > >>>>>>> Here is a webrev that does all the architectures in the same way: > >>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ > >>>>>>> > >>>>>>> Could anyone review the other architectures and test? > >>>>>>> - arm, sparc & aarch64 are also modified now to follow the same > >>>>>>> "if no > >>>>>>> tlab, then consider eden space allocation" logic. > >>>>>>> > >>>>>>> Thanks for your help! 
> >>>>>>> Jc > >>>>>>> > >>>>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler > > >>>>>>> wrote: > >>>>>>> > >>>>>>>> Hi Kim, > >>>>>>>> > >>>>>>>> I opened this bug > >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862 > >>>>>>>> > >>>>>>>> and now I've done an update: > >>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ > >>>>>>>> > >>>>>>>> I basically have done your nits but also removed the try_eden (it > >>>>>>>> was > >>>>>>>> used to bind a label but was not used). I updated the comments to > >>>>>>>> use the > >>>>>>>> one you preferred. > >>>>>>>> > >>>>>>>> I still have to do the other architectures though but at least we > >>>>>>>> seem to > >>>>>>>> have a consensus on this architecture, correct? > >>>>>>>> > >>>>>>>> Thanks for the review, > >>>>>>>> Jc > >>>>>>>> > >>>>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett > >>> > >>>>>>>> wrote: > >>>>>>>> > >>>>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler > > >>>>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>> Yes, you are right, I did those changes due to: > >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084 > >>>>>>>>>> > >>>>>>>>>> If Robbin agrees to this change, and if no one sees an issue, > >>>>>>>>>> I'll go > >>>>>>>>> ahead > >>>>>>>>>> and propagate the change across architectures. > >>>>>>>>>> > >>>>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's > >>>>>>>>>> comment > >>>>>>>>> and > >>>>>>>>>> review) :) > >>>>>>>>>> Jc > >>>>>>>>>> > >>>>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose > >>> > >>>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler > > >>>>>>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> I'm not sure if we had left this case intentionally or not > >>>>>>>>>>> but, if we > >>>>>>>>> want > >>>>>>>>>>> it all to be consistent, we should perhaps fix it. 
> >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Well, you put in that logic last February, so unless somebody > >>>>>>>>>>> speaks > >>>>>>>>> up > >>>>>>>>>>> quickly, I support your adjusting it to be the way you want it. > >>>>>>>>>>> > >>>>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I > >>>>>>>>>>> src/hotspot/share" > >>>>>>>>>>> suggests that the GC group is most active in touching this > >>>>>>>>>>> feature. > >>>>>>>>>>> If Robbin is OK with it, there's your reviewer. > >>>>>>>>>>> > >>>>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person > >>>>>>>>>>> working on the GC to OK it. > >>>>>>>>>>> > >>>>>>>>>>> ? John > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> -- > >>>>>>>>>> > >>>>>>>>>> Thanks, > >>>>>>>>>> Jc > >>>>>>>>> > >>>>>>>>> Robbin is on vacation; you might not hear from him for a while. > >>>>>>>>> > >>>>>>>>> I'm assuming you'll open a new bug for this? > >>>>>>>>> > >>>>>>>>> Except for a few minor nits (below), this looks okay to me. > >>>>>>>>> > >>>>>>>>> The comment at line 1052 needs updating. > >>>>>>>>> > >>>>>>>>> pre-existing: The retry_tlab label declared on line 1054 is unused. > >>>>>>>>> > >>>>>>>>> pre-existing: The try_eden label declared on line 1054 is bound at > >>>>>>>>> line 1058, but unreferenced. > >>>>>>>>> > >>>>>>>>> I like the wording of the comment at 1139 better than the > >>>>>>>>> wording at > >>>>>>>>> 1016. 
> >>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>>> -- > >>>>>>>> > >>>>>>>> Thanks, > >>>>>>>> Jc > >>>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> -- > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Jc > >>>>>>> > >>>>>> > >>>>>> > >>>> > >> > >> > > > > -- > > > > Thanks, > > Jc > > > > -- > > Thanks, > Jc From jcbeyler at google.com Mon Jul 23 03:04:15 2018 From: jcbeyler at google.com (JC Beyler) Date: Sun, 22 Jul 2018 20:04:15 -0700 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: <719DC045-4311-499E-9F7D-784096A044C6@oracle.com> References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com> <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com> <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com> <719DC045-4311-499E-9F7D-784096A044C6@oracle.com> Message-ID: Thanks Igor! http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.04/ now has your name in the reviewers. Would anyone want to push it by chance? Thanks! Jc On Sat, Jul 21, 2018 at 7:39 PM Igor Veresov wrote: > Yeah, the fix I proposed doesn't do exactly what we'd want. Sorry for the > confusion. Your fix is fine. Reviewed. > > igor > > On Jul 21, 2018, at 7:06 PM, JC Beyler wrote: > > Hi Igor, > > Thanks for looking at it! I don't know the code paths enough to know if > that is sufficient (I'll trust you evidently). I can run the tests next > week if we prefer that route. > > Were I to choose, I would prefer that interpreter/c1/c2 all follow the > same kind of paths, which would be my fix I believe: > 1) If TLAB, allocate there or slowpath > 2) Else If contiguous inline allocations are enabled, try that > 3) Goto Slowpath > > With your fix, even if we do not have the issue anymore, it still keeps > code that is not consistent but perhaps I'm missing something?
> Jc > > On Sat, Jul 21, 2018 at 1:47 PM Igor Veresov > wrote: > >> I think you can just predicate the emission of these stubs for !UseTLAB, >> and not mess with the CPU-specific code. What do you think? >> >> diff --git a/src/hotspot/share/c1/c1_LIRGenerator.cpp >> b/src/hotspot/share/c1/c1_LIRGenerator.cpp >> --- a/src/hotspot/share/c1/c1_LIRGenerator.cpp >> +++ b/src/hotspot/share/c1/c1_LIRGenerator.cpp >> @@ -674,7 +674,7 @@ >> void LIRGenerator::new_instance(LIR_Opr dst, ciInstanceKlass* klass, >> bool is_unresolved, LIR_Opr scratch1, LIR_Opr scratch2, LIR_Opr scratch3, >> LIR_Opr scratch4, LIR_Opr klass_reg, CodeEmitInfo* info) { >> klass2reg_with_patching(klass_reg, klass, info, is_unresolved); >> // If klass is not loaded we do not know if the klass has finalizers: >> - if (UseFastNewInstance && klass->is_loaded() >> + if (UseFastNewInstance && !UseTLAB && klass->is_loaded() >> && !Klass::layout_helper_needs_slow_path(klass->layout_helper())) { >> >> Runtime1::StubID stub_id = klass->is_initialized() ? >> Runtime1::fast_new_instance_id : Runtime1::fast_new_instance_init_check_id; >> >> >> igor >> >> > On Jul 20, 2018, at 12:37 PM, JC Beyler wrote: >> > >> > Yes that is right, this is the latest: >> > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/ >> > >> > I apologize for the multiple threads and confusion, >> > Jc >> > >> > On Fri, Jul 20, 2018 at 11:22 AM serguei.spitsyn at oracle.com < >> > serguei.spitsyn at oracle.com> wrote: >> > >> >> Thank you a lot, Vladimir! >> >> Yes, the webrev.03 is the latest. >> >> Jc, will correct us if it is not right. >> >> >> >> Thanks, >> >> Serguei >> >> >> >> >> >> On 7/20/18 10:52, Vladimir Kozlov wrote: >> >>> I asked Igor V. to look. >> >>> >> >>> Seems like review is done in an other thread which does not have bug >> >>> id in subject. Currently webrev.03 >> >>> >> >>> Vladimir >> >>> >> >>> On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote: >> >>>> Thanks, Rahul! 
>> >>>> In fact, there no good experts for this area in the serviceability >> team. >> >>>> It would be much better if anyone from the Compiler team could do it. >> >>>> >> >>>> Vladimir K., >> >>>> >> >>>> Is there anyone from the Compiler team available to review this? >> >>>> Otherwise, I could try to review it but am not sure about my review >> >>>> quality. >> >>>> >> >>>> Thanks, >> >>>> Serguei >> >>>> >> >>>> >> >>>> On 7/19/18 00:48, Rahul Raghavan wrote: >> >>>>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled >> >>>>> >> >>>>> (just adding + hotspot-compiler-dev also) >> >>>>> >> >>>>> >> >>>>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote: >> >>>>> Subject Was: >> >>>>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled >> >>>>> >> >>>>> + serviceability-dev >> >>>>> >> >>>>> Hi all, >> >>>>> >> >>>>> Could anyone else give me a review of this webrev and check/test the >> >>>>> various architecture changes? >> >>>>> >> >>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >> >>>>> >> >>>>> >> >>>>> Thanks for all your help! >> >>>>> Jc >> >>>>> >> >>>>> >> >>>>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler >> >> wrote: >> >>>>>> >> >>>>>>> Hi all, >> >>>>>>> >> >>>>>>> Here is a webrev that does all the architectures in the same way: >> >>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >> >>>>>>> >> >>>>>>> Could anyone review the other architectures and test? >> >>>>>>> - arm, sparc & aarch64 are also modified now to follow the same >> >>>>>>> "if no >> >>>>>>> tlab, then consider eden space allocation" logic. >> >>>>>>> >> >>>>>>> Thanks for your help! 
>> >>>>>>> Jc >> >>>>>>> >> >>>>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler >> >>>>>>> wrote: >> >>>>>>> >> >>>>>>>> Hi Kim, >> >>>>>>>> >> >>>>>>>> I opened this bug >> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862 >> >>>>>>>> >> >>>>>>>> and now I've done an update: >> >>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ >> >>>>>>>> >> >>>>>>>> I basically have done your nits but also removed the try_eden (it >> >>>>>>>> was >> >>>>>>>> used to bind a label but was not used). I updated the comments to >> >>>>>>>> use the >> >>>>>>>> one you preferred. >> >>>>>>>> >> >>>>>>>> I still have to do the other architectures though but at least we >> >>>>>>>> seem to >> >>>>>>>> have a consensus on this architecture, correct? >> >>>>>>>> >> >>>>>>>> Thanks for the review, >> >>>>>>>> Jc >> >>>>>>>> >> >>>>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett < >> kim.barrett at oracle.com >> >>> >> >>>>>>>> wrote: >> >>>>>>>> >> >>>>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler >> >>>>>>>>>> wrote: >> >>>>>>>>>> >> >>>>>>>>>> Yes, you are right, I did those changes due to: >> >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084 >> >>>>>>>>>> >> >>>>>>>>>> If Robbin agrees to this change, and if no one sees an issue, >> >>>>>>>>>> I'll go >> >>>>>>>>> ahead >> >>>>>>>>>> and propagate the change across architectures. >> >>>>>>>>>> >> >>>>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's >> >>>>>>>>>> comment >> >>>>>>>>> and >> >>>>>>>>>> review) :) >> >>>>>>>>>> Jc >> >>>>>>>>>> >> >>>>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose < >> john.r.rose at oracle.com >> >>> >> >>>>>>>>> wrote: >> >>>>>>>>>> >> >>>>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler >> >>>>>>>>>>> wrote: >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> I'm not sure if we had left this case intentionally or not >> >>>>>>>>>>> but, if we >> >>>>>>>>> want >> >>>>>>>>>>> it all to be consistent, we should perhaps fix it. 
>> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> Well, you put in that logic last February, so unless somebody >> >>>>>>>>>>> speaks >> >>>>>>>>> up >> >>>>>>>>>>> quickly, I support your adjusting it to be the way you want >> it. >> >>>>>>>>>>> >> >>>>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I >> >>>>>>>>>>> src/hotspot/share" >> >>>>>>>>>>> suggests that the GC group is most active in touching this >> >>>>>>>>>>> feature. >> >>>>>>>>>>> If Robbin is OK with it, there's your reviewer. >> >>>>>>>>>>> >> >>>>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other >> person >> >>>>>>>>>>> working on the GC to OK it. >> >>>>>>>>>>> >> >>>>>>>>>>> ? John >> >>>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> -- >> >>>>>>>>>> >> >>>>>>>>>> Thanks, >> >>>>>>>>>> Jc >> >>>>>>>>> >> >>>>>>>>> Robbin is on vacation; you might not hear from him for a while. >> >>>>>>>>> >> >>>>>>>>> I'm assuming you'll open a new bug for this? >> >>>>>>>>> >> >>>>>>>>> Except for a few minor nits (below), this looks okay to me. >> >>>>>>>>> >> >>>>>>>>> The comment at line 1052 needs updating. >> >>>>>>>>> >> >>>>>>>>> pre-existing: The retry_tlab label declared on line 1054 is >> unused. >> >>>>>>>>> >> >>>>>>>>> pre-existing: The try_eden label declared on line 1054 is bound >> at >> >>>>>>>>> line 1058, but unreferenced. >> >>>>>>>>> >> >>>>>>>>> I like the wording of the comment at 1139 better than the >> >>>>>>>>> wording at >> >>>>>>>>> 1016. 
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>> >> >>>>>>>> -- >> >>>>>>>> >> >>>>>>>> Thanks, >> >>>>>>>> Jc >> >>>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>> -- >> >>>>>>> >> >>>>>>> Thanks, >> >>>>>>> Jc >> >>>>>>> >> >>>>>> >> >>>>>> >> >>>> >> >> >> >> >> > >> > -- >> > >> > Thanks, >> > Jc >> >> > > -- > > Thanks, > Jc > > > -- Thanks, Jc From felix.yang at huawei.com Mon Jul 23 08:46:19 2018 From: felix.yang at huawei.com (Yangfei (Felix)) Date: Mon, 23 Jul 2018 08:46:19 +0000 Subject: [aarch64-port-dev ] RFR: 8207838: AArch64: fix the order in which float registers are restored in restore_args In-Reply-To: References: Message-ID: Hi, Thanks for reviewing. We plan to integrate the new test into an existing jtreg test: test/hotspot/jtreg/compiler/floatingpoint/TestFloatJNIArgs.java This jtreg test did something similar, but for non-synchronized JNI method. What do you think? Thanks, Felix > > On 07/19/2018 08:39 AM, Yangfei (Felix) wrote: > > Is it OK for jdk/jdk11? > > Great catch! That bug was committed by me on on Tue Apr 30 2013, > which makes it more than five years old. I think that's a record for > AArch64. > > I like the patch, but I think it'll need a proper jtreg test case. > It's useful to test the slow JNI locking path on all arches, not just > AArch64. > > You can make the test case fail more reliably by increasing the > contention like this: > > public void run() { > for (int i = 0; i < 1000; i++) { > float d = JniStaticContextFloat.staticMethodFloat1((float) (1), (float) (2), > (float) (4), (float) (8)); > } > > > Thanks. > > -- > Andrew Haley > Java Platform Lead Engineer > Red Hat UK Ltd. 
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Mon Jul 23 10:02:44 2018 From: aph at redhat.com (Andrew Haley) Date: Mon, 23 Jul 2018 11:02:44 +0100 Subject: [aarch64-port-dev ] RFR: 8207838: AArch64: fix the order in which float registers are restored in restore_args In-Reply-To: References: Message-ID: On 07/23/2018 09:46 AM, Yangfei (Felix) wrote: > We plan to integrate the new test into an existing jtreg test: test/hotspot/jtreg/compiler/floatingpoint/TestFloatJNIArgs.java > This jtreg test did something similar, but for non-synchronized JNI method. What do you think? That's an excellent idea. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From gunter.haug at sap.com Mon Jul 23 10:07:19 2018 From: gunter.haug at sap.com (Haug, Gunter) Date: Mon, 23 Jul 2018 10:07:19 +0000 Subject: PPC64: jfr profiling doesn't work (PPC64 only) In-Reply-To: <51f19327d29c4ce089706c8ec7e3aed9@sap.com> References: <51BCC98D-788C-43BD-A739-A304DB3EA847@sap.com> <51f19327d29c4ce089706c8ec7e3aed9@sap.com> Message-ID: <5B9874C9-5C76-469D-AE2A-A5C85F351E62@sap.com> Hi Goetz, Thanks for taking the time looking into this. > I think instead of JavaThread::stack_red_zone_size() + JavaThread::stack_yellow_zone_size() > you should use JavaThread::stack_red_zone_size() + JavaThread::stack_yellow_reserved_zone_size() You are absolutely correct here, I missed this. The formal stuff: I'll fix that. The empty if was an artefact, as you might have guessed, sorry. Thanks again, Gunter On 20.07.18, 10:24, "Lindenmaier, Goetz" wrote: Hi Gunter, thanks for fixing these issues. frame_ppc.cpp: I think instead of JavaThread::stack_red_zone_size() + JavaThread::stack_yellow_zone_size() you should use JavaThread::stack_red_zone_size() + JavaThread::stack_yellow_reserved_zone_size() Minor stuff: Please add RFR(S): to the Subject of your mail. Also, usually the bug title is prefixed with [ppc].
Please remove redundant spaces from code like address sp = (address)_sp; (fp <= thread->stack_base()) && (fp > sp) as well as double newlines. No space before ) please: if (_cb != NULL ) { Also please break some of the comments to shorter lines. thread_linux_ppc.cpp also just minor stuff: Please fix indentations. Please indent by two with spaces, no tabs. There is an empty if (ProfileInterpreter) { }. Why? I can sponsor this for you. Best regards, Goetz. > -----Original Message----- > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > bounces at openjdk.java.net] On Behalf Of Haug, Gunter > Sent: Donnerstag, 19. Juli 2018 12:53 > To: hotspot-runtime-dev at openjdk.java.net > Subject: [CAUTION] PPC64: jfr profiling doesn't work (PPC64 only) > > Hi all, > > can I please have reviews and a sponsor for the following fix: > > https://bugs.openjdk.java.net/projects/JDK/issues/JDK- > 8207392?filter=allopenissues > http://cr.openjdk.java.net/~ghaug/webrevs/8207392/ > > JFR profiling on linux PPC64 has not been implemented correctly so far, the > VM crashes when it is turned on. Therefore > hotspot/jtreg/runtime/appcds/TestWithProfiler.java fails. With this fix the > test succeeds. I've analyzed a couple of benchmarks with JMC and results > look plausible when compared to linux x86. > > Thanks and best regards, > Gunter > > > > From gunter.haug at sap.com Mon Jul 23 14:27:28 2018 From: gunter.haug at sap.com (Haug, Gunter) Date: Mon, 23 Jul 2018 14:27:28 +0000 Subject: PPC64: jfr profiling doesn't work (PPC64 only) In-Reply-To: References: <51BCC98D-788C-43BD-A739-A304DB3EA847@sap.com> Message-ID: <57916CF7-3FF0-4C9F-B32F-C2193415B434@sap.com> Thanks a lot, Volker for the review! Here is an updated webrev: http://cr.openjdk.java.net/~ghaug/webrevs/8207392.v1 I have incorporated all the suggestions you made. Moreover, Goetz' improvements are in as well. I'll ask Goetz to sponsor it tomorrow if nobody else objects. 
Best regards,
Gunter

On 20.07.18, 14:27, "Volker Simonis" wrote:

    Hi Gunter,

    thanks for fixing this! The change looks good in general. Please find
    some comments and questions below:

    src/hotspot/cpu/ppc/frame_ppc.cpp
    ==========================

      78   // an fp must be within the stack and above (but not equal) sp
      79   bool fp_safe = (fp <= thread->stack_base()) && (fp > sp) && ((fp - sp) >= (ijava_state_size + top_ijava_frame_abi_size));

    Is this check for interpreter frames only? Then better name it
    'fp_interp_safe' and adapt the comment. Otherwise, why does the 'fp -
    sp' have to be larger than the java interpreter state?
    Also the line is quite long. Better break it after '&&'

      81   // We know sp/unextended_sp are safe only fp is questionable here

    Better put a comma (or period) after 'safe' to make it more readable.

      88   // First check if frame is complete and tester is reliable
      89   // Unfortunately we can only check frame complete for runtime stubs and nmethod
      90   // other generic buffer blobs are more problematic so we just assume they are
      91   // ok. adapter blobs never have a frame complete and are never ok.

    Better: "First check if the frame is complete and the test is
    reliable. Unfortunately we can only check frame completeness for
    runtime stubs and nmethods. Other generic buffer blobs are more
    problematic so we just assume they are OK. Adapter blobs never have a
    complete frame and are never OK."

    In general please start comments with an uppercase letter and use a
    period at the end of sentences.

      98   // Could just be some random pointer within the codeBlob
      99   if (!_cb->code_contains(_pc)) {

    Shouldn't this be the first, basic check after we know that '_cb !=
    NULL' (i.e. even before we check for frame completeness)?

     103   // Entry frame checks
     104   if (is_entry_frame()) {
     105     // an entry frame must have a valid fp.
106 return fp_safe && is_entry_frame_valid(thread); 107 } An entry frame is not an interpreter frame but you use 'fp_safe' as computed for interpreter frames which is probably too conservative. Maybe the check in 'is_entry_frame_valid()' is sufficient already? 118 CodeBlob* sender_blob = CodeCache::find_blob_unsafe(sender_pc); 119 if (sender_pc == NULL || sender_blob == NULL) { 120 return false; 121 } 'find_blob_unsafe()' returns NULL if the 'sender_pc' is NULL so there's no need for the extra 'sender_pc == NULL' check in the if-clause. 135 // an fp must be within the stack and above (but not equal) current frame's _FP 136 137 bool sender_fp_safe = (sender_fp <= thread->stack_base()) && (sender_fp > fp); 138 139 if (!sender_fp_safe) { 140 return false; 141 } Shorter: 135 // sender_fp must be within the stack and above (but not equal) current frame's fp 137 if (sender_fp > thread->stack_base() || sender_fp <= fp) { 140 return false; 141 } 158 if (sender.is_entry_frame()) { 159 // Validate the JavaCallWrapper an entry frame must have 160 161 address jcw = (address)sender.entry_frame_call_wrapper(); 162 163 bool jcw_safe = (jcw <= thread->stack_base()) && (jcw > sender_fp); 164 return jcw_safe; 165 } Why don't you use 'sender.is_entry_frame_valid()' valid instead of duplicating that code here? 173 // Could put some more validation for the potential non-interpreted sender 174 // frame we'd create by calling sender if I could think of any. Wait for next crash in forte... 175 176 // One idea is seeing if the sender_pc we have is one that we'd expect to call to current cb I think these comments are leftovers from other architectures which can be removed (we don't support 'forte' on ppc :) 184 // Must be native-compiled frame. Since sender will try and use fp to find 185 // linkages it must be safe 186 187 if (!fp_safe) return false; If it's a native compiled frame the 'fp_safe' check is too strict because it was computed for interpreter frames. 
189 // could try and do some more potential verification of native frame if we could think of some... Useless comment - can be removed. src/hotspot/os_cpu/linux_ppc/thread_linux_ppc.cpp ===================================== 45 assert(this->is_Java_thread(), "must be JavaThread"); 46 JavaThread* jt = (JavaThread *)this; 'this' is already a 'JavaThread' so no need for the assertion and the new local variable 'jt'. 81 if (ProfileInterpreter) { 82 } Unused - can be deleted. Regards, Volker On Thu, Jul 19, 2018 at 12:53 PM, Haug, Gunter wrote: > Hi all, > > can I please have reviews and a sponsor for the following fix: > > https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8207392?filter=allopenissues > http://cr.openjdk.java.net/~ghaug/webrevs/8207392/ > > JFR profiling on linux PPC64 has not been implemented correctly so far, the VM crashes when it is turned on. Therefore hotspot/jtreg/runtime/appcds/TestWithProfiler.java fails. With this fix the test succeeds. I've analyzed a couple of benchmarks with JMC and results look plausible when compared to linux x86. > > Thanks and best regards, > Gunter > > > > > From jonathan.gibbons at oracle.com Mon Jul 23 21:48:10 2018 From: jonathan.gibbons at oracle.com (Jonathan Gibbons) Date: Mon, 23 Jul 2018 14:48:10 -0700 Subject: [11] RFR(S): 8206998: [test] runtime/ElfDecoder/TestElfDirectRead.java requires longer timeout on ppc64 In-Reply-To: References: <578bb78a-8d62-ebd9-84e7-8ce37da77fbe@oracle.com> Message-ID: <4e671fef-bcc5-0508-b734-befe1d1686b4@oracle.com> Volker, Sorry I missed this before my recent vacation. Yes, it is now "officially" supported that you can have multiple test descriptions (comment blocks beginning '@test'). Since forever, it was unofficially allowed but not well integrated as a feature. 
This was fixed a while back: 7901940: support multiple @test in one test file The issue which needed to be fixed was in the naming of the individual tests within the file, such that you could distinguish the individual tests from the overall collection of tests in the file. -- Jon On 7/11/18 12:34 AM, Volker Simonis wrote: > Hi David, > > so it obviously works and as Goetz mentioned there are already other, > existing tests which use this feature. > > Do you want me to get a formal review which confirms this from > somebody from the JTreg team? > > I've CC-ed jtreg-use and Jonathan in the hope that they can confirm this. > > Regards, > Volker > > On Tue, Jul 10, 2018 at 11:24 PM, David Holmes wrote: >> Hi Volker, >> >> On 11/07/2018 3:52 AM, Volker Simonis wrote: >>> Hi, >>> >>> can I please get a review for the following test-only change: >>> >>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8206998/ >>> https://bugs.openjdk.java.net/browse/JDK-8206998 >>> >>> The problem is that the test runtime/ElfDecoder/TestElfDirectRead.java >>> intentionally disables caching of Elf sections during symbol lookup >>> with WhiteBox.disableElfSectionCache(). On platforms which do not use >>> file descriptors instead of plain function pointers this slows down >>> the lookup just a little bit, because all the symbols from an Elf file >>> are still read consecutively after one 'fseek()' call. But on >>> platforms with file descriptors like ppc64 big-endian, we get two >>> 'fseek()' calls for each symbol read from the Elf file because reading >>> the file descriptor table is nested inside the loop which reads the >>> symbols. This really trashes the I/O system and considerable slows >>> down the test, so we need an extra long timeout setting. >>> >>> The fix is trivial - simply provide two test versions (i.e. comments): >>> the first one for all Linux flavors which are not ppc64 and a second, >>> new one for Linux/ppc64 which simply has a bigger timeout. 
>> >> I was not aware that this was a valid way of defining a test! This suggests >> there can only be one "leading comment" per "defining file: >> >> http://openjdk.java.net/jtreg/tag-spec.html >> >> Need to verify this with the jtreg folk: jtreg-use at openjdk.java.net >> >> Thanks, >> David >> >> >>> Thank you and best regards, >>> Volker >>> From volker.simonis at gmail.com Tue Jul 24 07:29:23 2018 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 24 Jul 2018 09:29:23 +0200 Subject: [11] RFR(S): 8206998: [test] runtime/ElfDecoder/TestElfDirectRead.java requires longer timeout on ppc64 In-Reply-To: <4e671fef-bcc5-0508-b734-befe1d1686b4@oracle.com> References: <578bb78a-8d62-ebd9-84e7-8ce37da77fbe@oracle.com> <4e671fef-bcc5-0508-b734-befe1d1686b4@oracle.com> Message-ID: Hi Jonathan, thanks for following up on this issue. Would it be possible to also update the corresponding documentation [1] ? It still has the following information which is simply wrong now: INFORMATIONAL TAGS ... @test * Defining-file identifier; * is typically SCCS identification info ... Any particular informational tag, except the @comment tag, may occur at most once in a given test. The @comment tag may be used multiple times. Thank you and best regards, Volker [1] http://openjdk.java.net/jtreg/tag-spec.html On Mon, Jul 23, 2018 at 11:48 PM, Jonathan Gibbons wrote: > Volker, > > Sorry I missed this before my recent vacation. > > Yes, it is now "officially" supported that you can have multiple test > descriptions (comment blocks beginning '@test'). > > Since forever, it was unofficially allowed but not well integrated as a > feature. This was fixed a while back: > > 7901940: support multiple @test in one test file > > The issue which needed to be fixed was in the naming of the individual tests > within the file, such that you could distinguish the individual tests from > the overall collection of tests in the file. 
> > -- Jon > > > On 7/11/18 12:34 AM, Volker Simonis wrote: > > Hi David, > > so it obviously works and as Goetz mentioned there are already other, > existing tests which use this feature. > > Do you want me to get a formal review which confirms this from > somebody from the JTreg team? > > I've CC-ed jtreg-use and Jonathan in the hope that they can confirm this. > > Regards, > Volker > > On Tue, Jul 10, 2018 at 11:24 PM, David Holmes > wrote: > > Hi Volker, > > On 11/07/2018 3:52 AM, Volker Simonis wrote: > > Hi, > > can I please get a review for the following test-only change: > > http://cr.openjdk.java.net/~simonis/webrevs/2018/8206998/ > https://bugs.openjdk.java.net/browse/JDK-8206998 > > The problem is that the test runtime/ElfDecoder/TestElfDirectRead.java > intentionally disables caching of Elf sections during symbol lookup > with WhiteBox.disableElfSectionCache(). On platforms which do not use > file descriptors instead of plain function pointers this slows down > the lookup just a little bit, because all the symbols from an Elf file > are still read consecutively after one 'fseek()' call. But on > platforms with file descriptors like ppc64 big-endian, we get two > 'fseek()' calls for each symbol read from the Elf file because reading > the file descriptor table is nested inside the loop which reads the > symbols. This really trashes the I/O system and considerable slows > down the test, so we need an extra long timeout setting. > > The fix is trivial - simply provide two test versions (i.e. comments): > the first one for all Linux flavors which are not ppc64 and a second, > new one for Linux/ppc64 which simply has a bigger timeout. > > I was not aware that this was a valid way of defining a test! 
This suggests > there can only be one "leading comment" per "defining file: > > http://openjdk.java.net/jtreg/tag-spec.html > > Need to verify this with the jtreg folk: jtreg-use at openjdk.java.net > > Thanks, > David > > > Thank you and best regards, > Volker > > From volker.simonis at gmail.com Tue Jul 24 09:06:44 2018 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 24 Jul 2018 11:06:44 +0200 Subject: PPC64: jfr profiling doesn't work (PPC64 only) In-Reply-To: <57916CF7-3FF0-4C9F-B32F-C2193415B434@sap.com> References: <51BCC98D-788C-43BD-A739-A304DB3EA847@sap.com> <57916CF7-3FF0-4C9F-B32F-C2193415B434@sap.com> Message-ID: Hi Gunter, looks pretty good now :) Just some minor issues (no need for a new webrev): - Indentation is wrong (do you have tabs in your file?). I think jcheck should normally detect that, but I'm not sure. 96 // Now check if the frame is complete and the test is 97 // reliable. Unfortunately we can only check frame completeness for 98 // runtime stubs and nmethods. Other generic buffer blobs are more 99 // problematic so we just assume they are OK. Adapter blobs never have a 100 // complete frame and are never OK 101 if (!_cb->is_frame_complete_at(_pc)) { - Also the following comment is still weird - maybe you can reword it to something more understandable ? 163 // If the frame size is 0 something (or less) is bad because every nmethod has a non-zero frame size 164 // because you must allocate window space. And finally I just realized that in thread_linux_ppc.cpp the method 'JavaThread::pd_get_top_frame_for_signal_handler()' is still unimplemented. Looking at the signature (and the other platforms) I think it can be simply implemented by forwarding the call to 'pd_get_top_frame_for_profiling()'. With that simple change we should get support for the async profile [1] on ppc which I think would be cool. 
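A minimal sketch of that forwarding, assuming both methods share one parameter list as they appear to on the other platforms. The `frame` and `JavaThread` types below are simplified stand-ins, and the body of `pd_get_top_frame_for_profiling()` is only a placeholder, not the ppc implementation:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for HotSpot's frame class (assumption, heavily simplified).
struct frame {
  void* sp;
  void* pc;
};

// Hypothetical stand-in for JavaThread; only the two methods under discussion.
class JavaThread {
 public:
  // Placeholder profiling entry point: fills *fr_addr and returns true when a
  // top frame could be derived from the signal context. A real implementation
  // would read sp/pc from the ucontext and validate the resulting frame.
  bool pd_get_top_frame_for_profiling(frame* fr_addr, void* ucontext, bool is_in_java) {
    (void)is_in_java;  // Unused in this placeholder.
    if (fr_addr == NULL || ucontext == NULL) {
      return false;
    }
    fr_addr->sp = ucontext;
    fr_addr->pc = ucontext;
    return true;
  }

  // The suggested change: the signal-handler variant simply delegates.
  bool pd_get_top_frame_for_signal_handler(frame* fr_addr, void* ucontext, bool is_in_java) {
    return pd_get_top_frame_for_profiling(fr_addr, ucontext, is_in_java);
  }
};
```

With this shape, both entry points accept and reject exactly the same inputs, so any later hardening of the profiling path applies to the signal-handler path as well.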
Regards, Volker [1] https://github.com/jvm-profiling-tools/async-profiler On Mon, Jul 23, 2018 at 4:27 PM, Haug, Gunter wrote: > Thanks a lot, Volker for the review! > > Here is an updated webrev: > > http://cr.openjdk.java.net/~ghaug/webrevs/8207392.v1 > > I have incorporated all the suggestions you made. Moreover, Goetz' improvements are in as well. > > I'll ask Goetz to sponsor it tomorrow if nobody else objects. > > Best regards, > Gunter > > > ?On 20.07.18, 14:27, "Volker Simonis" wrote: > > Hi Gunter, > > thanks for fixing this! The change looks god in general. Please find > some comments questions below: > > > src/hotspot/cpu/ppc/frame_ppc.cpp > ========================== > > 78 // an fp must be within the stack and above (but not equal) sp > 79 bool fp_safe = (fp <= thread->stack_base()) && (fp > sp) && > ((fp - sp) >= (ijava_state_size + top_ijava_frame_abi_size)); > > Is this check for interpreter frames only? Then better name it > 'fp_interp_safe' and adapt the comment. Otherwise, why does the 'fp - > sp' have to be larger than the java interpreter state? > Also the line is quite long. Better break it after '&&' > > 81 // We know sp/unextended_sp are safe only fp is questionable here > > Better put a comma (or period) after 'safe' to make it more readable. > > 88 // First check if frame is complete and tester is reliable > 89 // Unfortunately we can only check frame complete for runtime > stubs and nmethod > 90 // other generic buffer blobs are more problematic so we just > assume they are > 91 // ok. adapter blobs never have a frame complete and are never ok. > > Better: "First check if the frame is complete and the test is > reliable. Unfortunately we can only check frame completeness for > runtime stubs and nmethods. Other generic buffer blobs are more > problematic so we just assume they are OK. Adapter blobs never have a > complete frame and are never OK." 
> > > > Thanks and best regards, > > Gunter > > > > > > > > > > > > From goetz.lindenmaier at sap.com Tue Jul 24 13:42:40 2018 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 24 Jul 2018 13:42:40 +0000 Subject: PPC64: jfr profiling doesn't work (PPC64 only) In-Reply-To: References: <51BCC98D-788C-43BD-A739-A304DB3EA847@sap.com> <57916CF7-3FF0-4C9F-B32F-C2193415B434@sap.com> Message-ID: <9c7eb412aacc4e7a90cb8fb32f2bc7af@sap.com> Hi Gunter, I'll sponsor this for you. I'll change the bug to [ppc64] Implement JFR profiling. I think this better expresses what's going on. The bug still states that this fixes crashes, so this still is a bugfix acceptable for RDP1. Best regards, Goetz. > -----Original Message----- > From: Volker Simonis > Sent: Dienstag, 24. Juli 2018 11:07 > To: Haug, Gunter > Cc: hotspot-runtime-dev at openjdk.java.net; Lindenmaier, Goetz > > Subject: Re: PPC64: jfr profiling doesn't work (PPC64 only) > > Hi Gunter, > > looks pretty good now :) > > Just some minor issues (no need for a new webrev): > > - Indentation is wrong (do you have tabs in your file?). I think > jcheck should normally detect that, but I'm not sure. > > 96 // Now check if the frame is complete and the test is > 97 // reliable. Unfortunately we can only check frame completeness for > 98 // runtime stubs and nmethods. Other generic buffer blobs are more > 99 // problematic so we just assume they are OK. Adapter > blobs never have a > 100 // complete frame and are never OK > 101 if (!_cb->is_frame_complete_at(_pc)) { > > - Also the following comment is still weird - maybe you can reword it > to something more understandable ? > > 163 // If the frame size is 0 something (or less) is bad because > every nmethod has a non-zero frame size > 164 // because you must allocate window space. > > And finally I just realized that in thread_linux_ppc.cpp the method > 'JavaThread::pd_get_top_frame_for_signal_handler()' is still > unimplemented. 
From felix.yang at linaro.org Tue Jul 24 14:02:03 2018 From: felix.yang at linaro.org (Felix Yang) Date: Tue, 24 Jul 2018 22:02:03 +0800 Subject: [aarch64-port-dev ] RFR: 8207838: AArch64: fix the order in which float registers are restored in restore_args In-Reply-To: References: Message-ID: Webrev: http://cr.openjdk.java.net/~fyang/8207838/webrev.00/ As I have some issues with my network, please help push this if it is OK to go. Thanks, Felix On 23 July 2018 at 18:02, Andrew Haley wrote: > On 07/23/2018 09:46 AM, Yangfei (Felix) wrote: > > We plan to integrate the new test into an existing jtreg test: > test/hotspot/jtreg/compiler/floatingpoint/TestFloatJNIArgs.java > > This jtreg test did something similar, but for non-synchronized JNI > method. What do you think? > > That's an excellent idea.
>
> --
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd.
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>

From mikhailo.seledtsov at oracle.com Tue Jul 24 16:25:19 2018
From: mikhailo.seledtsov at oracle.com (mikhailo)
Date: Tue, 24 Jul 2018 09:25:19 -0700
Subject: RFR:8189762: [TESTBUG] Create tests for JDK-8146115 container awareness and resource configuration
In-Reply-To: <6CF7A513-90FA-4BC9-8652-A889A687BF50@oracle.com>
References: <6D1F8A5A-769C-499C-B647-26DE01D072EA@oracle.com> <6CF7A513-90FA-4BC9-8652-A889A687BF50@oracle.com>
Message-ID: <023a791d-38d9-da20-3d16-ded36b0ae95e@oracle.com>

The changes look good to me,

Misha

On 07/19/2018 11:45 AM, Bob Vandette wrote:
> Could you try to ask Misha to review these changes (mikhailo.seledtsov at oracle.com )
> since he wrote these tests?
>
> It would be helpful to have a webrev comparing the JDK11 test sources against yours.
>
> In JDK 10, we are using @requires docker.support. Is this not possible in JDK8?
>
> There have been a few fixes to the docker tests in JDK 11. You should make sure to get the
> latest versions of these tests.
>
> We have also re-worked some of these tests during the addition of the Container Metrics
> API and associated tests in JDK 11 to move out common utility classes.
>
> I try to add the "docker" label to any tests and improvements related to cgroups or docker.
> Here's a query for JDK11 AND Label == docker.
>
> https://bugs.openjdk.java.net/issues/?filter=33939&jql=project%20%3D%20JDK%20AND%20fixVersion%20%3D%20%2211%22%20AND%20labels%20%3D%20docker%20ORDER%20BY%20priority%20DESC
>
> Bob.
>
>
>> On Jul 17, 2018, at 10:31 AM, Vaibhav Choudhary wrote:
>>
>> Hi,
>>
>> Please review the following backport test enhancement for JDK8u written for container awareness.
>> Webrev : http://cr.openjdk.java.net/~rpatil/8189762/webrev.00/
>>
>> Bug https://bugs.openjdk.java.net/browse/JDK-8189762
>> [TESTBUG] Create tests for JDK-8146115 container awareness and resource configuration
>>
>> It's a backport from JDK10.
>>
>> JDK10 changeset: http://hg.openjdk.java.net/jdk/jdk/rev/d6d00f785f39
>> JDK10 review thread : http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-November/025086.html
>>
>> Description: The tests are very similar to JDK10, but differ in the logging mechanism. -XX options like UseContainerSupport and PrintContainerInfo have been used in place of -Xlog. A few changes have been made in the Util files to make the tests compatible.
>>
>> Testing: Testing has been done on Ubuntu with and without a Docker environment.
>>
>> Thanks,
>> Vaibhav C

From harold.seigel at oracle.com Wed Jul 25 15:47:46 2018
From: harold.seigel at oracle.com (Harold David Seigel)
Date: Wed, 25 Jul 2018 11:47:46 -0400
Subject: RFR 8207779: Method::is_valid_method() compares 'this' with NULL
Message-ID: <19825af4-87c0-f7ef-c48e-9f83edc2187e@oracle.com>

Hi,

Please review this JDK-12 fix for bug JDK-8207779. The fix changes function is_valid_method() into a static function to avoid comparisons between 'this' and NULL and to avoid accessing the validating method through the Method being validated.

Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8207779/webrev/index.html

JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207779

This fix was regression tested by running Mach5 tiers 1 and 2 tests and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, running tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and VM tests on Linux-x64.
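The idea behind the change can be illustrated with a small standalone sketch. It is not the HotSpot code: `Method` here is a hypothetical stand-in, and the single `_magic` comparison is an assumed placeholder for whatever validation the real function performs. The point is only that a static function receives the candidate pointer as an ordinary argument and can test it for NULL before touching the object, whereas a member function comparing 'this' with NULL has already been called through a possibly invalid pointer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a VM metadata type; not the real HotSpot Method.
class Method {
 public:
  explicit Method(int magic) : _magic(magic) {}

  // Static validator: the pointer under test is a plain argument, so it can
  // be compared with NULL without ever invoking a member function on it.
  static bool is_valid_method(const Method* m) {
    if (m == NULL) {
      return false;
    }
    // Assumed placeholder check; the real validation is more involved.
    return m->_magic == kExpectedMagic;
  }

 private:
  static const int kExpectedMagic = 0x4D45;  // Arbitrary illustrative value.
  int _magic;
};

// Call pattern after the change: callers pass the pointer in.
inline bool check_candidate(const Method* candidate) {
  return Method::is_valid_method(candidate);
}
```

Callers that previously wrote `m->is_valid_method()` would, under this pattern, write `Method::is_valid_method(m)` instead.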

Thanks, Harold

From lois.foltan at oracle.com Wed Jul 25 16:04:09 2018
From: lois.foltan at oracle.com (Lois Foltan)
Date: Wed, 25 Jul 2018 12:04:09 -0400
Subject: RFR 8207779: Method::is_valid_method() compares 'this' with NULL
In-Reply-To: <19825af4-87c0-f7ef-c48e-9f83edc2187e@oracle.com>
References: <19825af4-87c0-f7ef-c48e-9f83edc2187e@oracle.com>
Message-ID: <43c43a6e-92e6-f894-58b5-2d32b7bacb31@oracle.com>

Looks good.
Lois

On 7/25/2018 11:47 AM, Harold David Seigel wrote:
> Hi,
>
> Please review this JDK-12 fix for bug JDK-8207779. The fix changes
> function is_valid_method() into a static function to avoid comparisons
> between 'this' and NULL and to avoid accessing the validating method
> through the Method being validated.
>
> Open Webrev:
> http://cr.openjdk.java.net/~hseigel/bug_8207779/webrev/index.html
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207779
>
> This fix was regression tested by running Mach5 tiers 1 and 2 tests
> and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, running
> tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and VM tests
> on Linux-x64.
>
> Thanks, Harold
>

From harold.seigel at oracle.com Wed Jul 25 17:12:13 2018
From: harold.seigel at oracle.com (Harold David Seigel)
Date: Wed, 25 Jul 2018 13:12:13 -0400
Subject: RFR 8207779: Method::is_valid_method() compares 'this' with NULL
In-Reply-To: <43c43a6e-92e6-f894-58b5-2d32b7bacb31@oracle.com>
References: <19825af4-87c0-f7ef-c48e-9f83edc2187e@oracle.com> <43c43a6e-92e6-f894-58b5-2d32b7bacb31@oracle.com>
Message-ID:

Thanks Lois!

Harold

On 7/25/2018 12:04 PM, Lois Foltan wrote:
> Looks good.
> Lois
>
> On 7/25/2018 11:47 AM, Harold David Seigel wrote:
>> Hi,
>>
>> Please review this JDK-12 fix for bug JDK-8207779. The fix changes
>> function is_valid_method() into a static function to avoid
>> comparisons between 'this' and NULL and to avoid accessing the
>> validating method through the Method being validated.
>>
>> Open Webrev:
>> http://cr.openjdk.java.net/~hseigel/bug_8207779/webrev/index.html
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207779
>>
>> This fix was regression tested by running Mach5 tiers 1 and 2 tests
>> and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X,
>> running tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and
>> VM tests on Linux-x64.
>>
>> Thanks, Harold
>>
>

From harold.seigel at oracle.com Wed Jul 25 18:56:47 2018
From: harold.seigel at oracle.com (Harold David Seigel)
Date: Wed, 25 Jul 2018 14:56:47 -0400
Subject: RFR 8202171: Some oopDesc functions compare this with NULL
In-Reply-To: <37e16040-a847-eefe-fe41-088891d4ae07@oracle.com>
References: <733004c3-2cd6-c700-adba-f4305f3de8a9@oracle.com> <37e16040-a847-eefe-fe41-088891d4ae07@oracle.com>
Message-ID:

Hi,

Please review this updated webrev:

http://cr.openjdk.java.net/~hseigel/bug_8202171.2/webrev/index.html

This includes null checks when needed for callers to nonstatic
oopDesc::print() and oopDesc::print_on() functions and changes the
oopDesc verify() functions to static.

Thanks, Harold

On 7/18/2018 8:44 AM, Harold David Seigel wrote:
> Hi Coleen, Kim,
>
> Thanks for your comments!
>
> I'll make the changes suggested by Coleen and put out a new webrev.
>
> Thanks, Harold
>
> On 7/17/2018 5:38 PM, coleen.phillimore at oracle.com wrote:
>>
>> Hi Harold,
>>
>> Looking at this change, I would like us to keep the nonstatic print()
>> and print_on(outputStream*) functions because other Metadata and
>> types within the jvm have these functions. I think the few places
>> where the oop can be NULL at the caller should be checked instead and
>> remove the this == NULL check in the oopDesc::print_on() function.
>> Most places already do check for NULL. The verify function seems
>> fine to make a static member function though.
>>
>> I agree with Kim that there are other places where "this" is compared
>> to NULL which shouldn't be done, and we should file separate RFEs to
>> deal with them, specifically Method::is_valid_method() and
>> Metadata::print_{value_}on_maybe_null() functions.
>>
>> Thanks,
>> Coleen
>>
>> On 7/16/18 3:24 PM, Harold David Seigel wrote:
>>> Hi,
>>>
>>> Please review this JDK-12 fix for bug JDK-8202171. The fix changes
>>> a few functions in oop.cpp into static functions to avoid
>>> comparisons between 'this' and NULL.
>>>
>>> Open Webrev:
>>> http://cr.openjdk.java.net/~hseigel/bug_8202171/webrev/index.html
>>>
>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8202171
>>>
>>> This fix was regression tested by running Mach5 tiers 1 and 2 tests
>>> and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X,
>>> running tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and
>>> VM tests on Linux-x64.
>>>
>>> Thanks, Harold
>>>
>>
>

From patricio.chilano.mateo at oracle.com Wed Jul 25 19:45:22 2018
From: patricio.chilano.mateo at oracle.com (patricio.chilano.mateo at oracle.com)
Date: Wed, 25 Jul 2018 15:45:22 -0400
Subject: RFR 8171157: Convert ObjectMonitor_test to GTest
Message-ID: <7555b190-737f-300f-84dc-746ff7207ec3@oracle.com>

Hi all,

Could you please review this change? It's a migration of the
ObjectMonitor test to GTest. Two GTests were actually created, one for
ObjectMonitor and one for ObjectSynchronizer.

Summary: Convert ObjectMonitor_test to GTest

Bug URL: https://bugs.openjdk.java.net/browse/JDK-8171157
Webrev URL:
http://cr.openjdk.java.net/~dcubed/for_patricio/8171157.01/webrev/

The fix was tested with Mach5 on tiers 1-3 on all platforms.

Thanks,
Patricio

From daniel.daugherty at oracle.com Wed Jul 25 20:15:57 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 25 Jul 2018 16:15:57 -0400
Subject: RFR 8171157: Convert ObjectMonitor_test to GTest
In-Reply-To: <7555b190-737f-300f-84dc-746ff7207ec3@oracle.com>
References: <7555b190-737f-300f-84dc-746ff7207ec3@oracle.com>
Message-ID: <70855bf4-8b7e-310b-c5a5-3b9aabb02a23@oracle.com>

On 7/25/18 3:45 PM, patricio.chilano.mateo at oracle.com wrote:
> Hi all,
>
> Could you please review this change? It's a migration of the
> ObjectMonitor test to GTest. Two GTests were actually created, one for
> ObjectMonitor and one for ObjectSynchronizer.
>
> Summary: Convert ObjectMonitor_test to GTest
>
> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8171157
> Webrev URL:
> http://cr.openjdk.java.net/~dcubed/for_patricio/8171157.01/webrev/
>

src/hotspot/share/runtime/objectMonitor.cpp
    No comments.

src/hotspot/share/runtime/objectMonitor.hpp
    No comments.

src/hotspot/share/runtime/synchronizer.cpp
    No comments.

src/hotspot/share/runtime/synchronizer.hpp
    No comments.

src/hotspot/share/utilities/internalVMTests.cpp
    No comments.

test/hotspot/gtest/runtime/test_objectMonitor.cpp
    No comments.

test/hotspot/gtest/runtime/test_synchronizer.cpp
    No comments.

Thumbs up.

Dan

>
> The fix was tested with Mach5 on tiers 1-3 on all platforms.
>
> Thanks,
> Patricio

From patricio.chilano.mateo at oracle.com Wed Jul 25 20:27:51 2018
From: patricio.chilano.mateo at oracle.com (patricio.chilano.mateo at oracle.com)
Date: Wed, 25 Jul 2018 16:27:51 -0400
Subject: RFR 8171157: Convert ObjectMonitor_test to GTest
In-Reply-To: <70855bf4-8b7e-310b-c5a5-3b9aabb02a23@oracle.com>
References: <7555b190-737f-300f-84dc-746ff7207ec3@oracle.com> <70855bf4-8b7e-310b-c5a5-3b9aabb02a23@oracle.com>
Message-ID: <7ee7f445-7d0c-3fae-bb88-4895c9aa54e5@oracle.com>

Thanks Dan!

Patricio

On 7/25/18 4:15 PM, Daniel D. Daugherty wrote:
> On 7/25/18 3:45 PM, patricio.chilano.mateo at oracle.com wrote:
>> Hi all,
>>
>> Could you please review this change?
>> It's a migration of the
>> ObjectMonitor test to GTest. Two GTests were actually created, one
>> for ObjectMonitor and one for ObjectSynchronizer.
>>
>> Summary: Convert ObjectMonitor_test to GTest
>>
>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8171157
>> Webrev URL:
>> http://cr.openjdk.java.net/~dcubed/for_patricio/8171157.01/webrev/
>>
>
> src/hotspot/share/runtime/objectMonitor.cpp
>     No comments.
>
> src/hotspot/share/runtime/objectMonitor.hpp
>     No comments.
>
> src/hotspot/share/runtime/synchronizer.cpp
>     No comments.
>
> src/hotspot/share/runtime/synchronizer.hpp
>     No comments.
>
> src/hotspot/share/utilities/internalVMTests.cpp
>     No comments.
>
> test/hotspot/gtest/runtime/test_objectMonitor.cpp
>     No comments.
>
> test/hotspot/gtest/runtime/test_synchronizer.cpp
>     No comments.
>
> Thumbs up.
>
> Dan
>
>
>>
>> The fix was tested with Mach5 on tiers 1-3 on all platforms.
>>
>> Thanks,
>> Patricio
>

From karen.kinnear at oracle.com Wed Jul 25 21:56:00 2018
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Wed, 25 Jul 2018 17:56:00 -0400
Subject: Patch to inline os::SpinPause() for X86 on non-Windows OS
In-Reply-To:
References:
Message-ID: <91C01F03-ECC3-4E31-89A9-5B8AA489BB3D@oracle.com>

Man,

Thank you for your proposal. The runtime is the correct team.

Could you please file an rfe under hotspot/runtime with the information below and the patch as well as any
tests you have run and any performance results you have?

That will help us track this information and find you a sponsor.

thanks,
Karen

> On Jul 18, 2018, at 9:53 PM, Man Cao wrote:
>
> Hello,
>
> The Java platform team at Google has maintained a local patch to inline os::SpinPause() since 2014. We would like to upstream this patch to OpenJDK. Could someone sponsor this patch?
>
> It is difficult to demonstrate performance improvement in Java benchmarks. It is more of a code refactoring to better utilize modern GCC.
> It partly addresses the comment about inlining SpinPause() above its declaration in os.hpp.
> I found an interesting discussion about PAUSE and a microbenchmark in:
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2012-August/004352.html
> However, the microbenchmark has a large variance in our experiment, making it difficult to tell if there's any benefit from inlining PAUSE. Inlining PAUSE does seem to reduce the variance a bit.
>
> The patch is inlined and attached below:
>
> diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
> --- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
> +++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
> @@ -63,15 +63,6 @@
>          popl %eax
>          ret
>
> -        .globl SYMBOL(SpinPause)
> -        ELF_TYPE(SpinPause,@function)
> -        .p2align 4,,15
> -SYMBOL(SpinPause):
> -        rep
> -        nop
> -        movl $1, %eax
> -        ret
> -
>          # Support for void Copy::conjoint_bytes(void* from,
>          #                                       void* to,
>          #                                       size_t count)
> diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
> --- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
> +++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
> @@ -46,15 +46,6 @@
>
>          .text
>
> -        .globl SYMBOL(SpinPause)
> -        .p2align 4,,15
> -        ELF_TYPE(SpinPause,@function)
> -SYMBOL(SpinPause):
> -        rep
> -        nop
> -        movq $1, %rax
> -        ret
> -
>          # Support for void Copy::arrayof_conjoint_bytes(void* from,
>          #                                               void* to,
>          #                                               size_t count)
> diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
> --- a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
> +++ b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
> @@ -42,15 +42,6 @@
>
>          .text
>
> -        .globl SpinPause
> -        .type SpinPause,@function
> -        .p2align 4,,15
> -SpinPause:
> -        rep
> -        nop
> -        movl $1, %eax
> -        ret
> -
>          # Support for void Copy::conjoint_bytes(void* from,
>          #                                       void* to,
>          #                                       size_t count)
> diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
> --- a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
> +++ b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
> @@ -38,15 +38,6 @@
>
>          .text
>
> -        .globl SpinPause
> -        .align 16
> -        .type SpinPause,@function
> -SpinPause:
> -        rep
> -        nop
> -        movq $1, %rax
> -        ret
> -
>          # Support for void Copy::arrayof_conjoint_bytes(void* from,
>          #                                               void* to,
>          #                                               size_t count)
> diff --git a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
> --- a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
> +++ b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
> @@ -51,15 +51,6 @@
>          movq %fs:0x0,%rax
>          ret
>
> -        .globl SpinPause
> -        .align 16
> -SpinPause:
> -        rep
> -        nop
> -        movq $1, %rax
> -        ret
> -
> -
>          / Support for void Copy::arrayof_conjoint_bytes(void* from,
>          /                                               void* to,
>          /                                               size_t count)
> diff --git a/src/hotspot/share/runtime/os.hpp b/src/hotspot/share/runtime/os.hpp
> --- a/src/hotspot/share/runtime/os.hpp
> +++ b/src/hotspot/share/runtime/os.hpp
> @@ -1031,6 +1031,13 @@
>  // of the global SpinPause() with C linkage.
>  // It'd also be eligible for inlining on many platforms.
>
> +#if defined(X86) && !defined(_WINDOWS)
> +extern "C" int inline SpinPause() {
> +  __asm__ __volatile__ ("pause");
> +  return 1;
> +}
> +#else
>  extern "C" int SpinPause();
> +#endif
>
>  #endif // SHARE_VM_RUNTIME_OS_HPP
>
> -Man
>

From manc at google.com Thu Jul 26 00:08:27 2018
From: manc at google.com (Man Cao)
Date: Wed, 25 Jul 2018 17:08:27 -0700
Subject: Patch to inline os::SpinPause() for X86 on non-Windows OS
In-Reply-To: <91C01F03-ECC3-4E31-89A9-5B8AA489BB3D@oracle.com>
References: <91C01F03-ECC3-4E31-89A9-5B8AA489BB3D@oracle.com>
Message-ID:

Thanks Karen for the response!
I don't have a JBS account currently. I could ask my colleagues with JBS accounts to create an RFE for this issue, but probably I cannot directly post comments or performance results on JBS.

I'm working on upstreaming more Google-local runtime and GC patches, so I can become an Author, according to:
https://wiki.openjdk.java.net/display/general/JBS+Overview
So far I have just contributed one patch: JDK-8193386.

Can I just post performance results on the mailing list and someone could copy the results when creating an RFE?

-Man

On Wed, Jul 25, 2018 at 2:56 PM Karen Kinnear wrote:
> Man,
>
> Thank you for your proposal. The runtime is the correct team.
>
> Could you please file an rfe under hotspot/runtime with the information
> below and the patch as well as any
> tests you have run and any performance results you have?
>
> That will help us track this information and find you a sponsor.
>
> thanks,
> Karen
>
> On Jul 18, 2018, at 9:53 PM, Man Cao wrote:
>
> Hello,
>
> The Java platform team at Google has maintained a local patch to inline
> os::SpinPause() since 2014. We would like to upstream this patch to
> OpenJDK. Could someone sponsor this patch?
>
> It is difficult to demonstrate performance improvement in Java benchmarks.
> It is more of a code refactoring to better utilize modern GCC. It partly
> addresses the comment about inlining SpinPause() above its declaration in
> os.hpp.
> I found an interesting discussion about PAUSE and a microbenchmark in:
>
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2012-August/004352.html
> However, the microbenchmark has a large variance in our experiment, making
> it difficult to tell if there's any benefit from inlining PAUSE. Inlining
> PAUSE does seem to reduce the variance a bit.
>
> The patch is inlined and attached below:
>
> diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
> b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
> --- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
> +++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
> @@ -63,15 +63,6 @@
>          popl %eax
>          ret
>
> -        .globl SYMBOL(SpinPause)
> -        ELF_TYPE(SpinPause,@function)
> -        .p2align 4,,15
> -SYMBOL(SpinPause):
> -        rep
> -        nop
> -        movl $1, %eax
> -        ret
> -
>          # Support for void Copy::conjoint_bytes(void* from,
>          #                                       void* to,
>          #                                       size_t count)
> diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
> b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
> --- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
> +++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
> @@ -46,15 +46,6 @@
>
>          .text
>
> -        .globl SYMBOL(SpinPause)
> -        .p2align 4,,15
> -        ELF_TYPE(SpinPause,@function)
> -SYMBOL(SpinPause):
> -        rep
> -        nop
> -        movq $1, %rax
> -        ret
> -
>          # Support for void Copy::arrayof_conjoint_bytes(void* from,
>          #                                               void* to,
>          #                                               size_t count)
> diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
> b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
> --- a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
> +++ b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
> @@ -42,15 +42,6 @@
>
>          .text
>
> -        .globl SpinPause
> -        .type SpinPause,@function
> -        .p2align 4,,15
> -SpinPause:
> -        rep
> -        nop
> -        movl $1, %eax
> -        ret
> -
>          # Support for void Copy::conjoint_bytes(void* from,
>          #                                       void* to,
>          #                                       size_t count)
> diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
> b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
> --- a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
> +++ b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
> @@ -38,15 +38,6 @@
>
>          .text
>
> -        .globl SpinPause
> -        .align 16
> -        .type SpinPause,@function
> -SpinPause:
> -        rep
> -        nop
> -        movq $1, %rax
> -        ret
> -
>          # Support for void Copy::arrayof_conjoint_bytes(void* from,
>          #                                               void* to,
>          #                                               size_t count)
> diff --git a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
> b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
> --- a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
> +++ b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
> @@ -51,15 +51,6 @@
>          movq %fs:0x0,%rax
>          ret
>
> -        .globl SpinPause
> -        .align 16
> -SpinPause:
> -        rep
> -        nop
> -        movq $1, %rax
> -        ret
> -
> -
>          / Support for void Copy::arrayof_conjoint_bytes(void* from,
>          /                                               void* to,
>          /                                               size_t count)
> diff --git a/src/hotspot/share/runtime/os.hpp
> b/src/hotspot/share/runtime/os.hpp
> --- a/src/hotspot/share/runtime/os.hpp
> +++ b/src/hotspot/share/runtime/os.hpp
> @@ -1031,6 +1031,13 @@
>  // of the global SpinPause() with C linkage.
>  // It'd also be eligible for inlining on many platforms.
>
> +#if defined(X86) && !defined(_WINDOWS)
> +extern "C" int inline SpinPause() {
> +  __asm__ __volatile__ ("pause");
> +  return 1;
> +}
> +#else
>  extern "C" int SpinPause();
> +#endif
>
>  #endif // SHARE_VM_RUNTIME_OS_HPP
>
> -Man
>
>
>

From karen.kinnear at oracle.com Thu Jul 26 00:40:04 2018
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Wed, 25 Jul 2018 20:40:04 -0400
Subject: Patch to inline os::SpinPause() for X86 on non-Windows OS
In-Reply-To:
References: <91C01F03-ECC3-4E31-89A9-5B8AA489BB3D@oracle.com>
Message-ID: <85A4AE3E-62C9-4663-BC3E-BC2C64037FAB@oracle.com>

Man.

> On Jul 25, 2018, at 8:08 PM, Man Cao wrote:
>
> Thanks Karen for the response!
> I don't have a JBS account currently. I could ask my colleagues with JBS accounts to create an RFE for this issue, but probably I cannot directly post comments or performance results on JBS.

Sounds good - why don't you work with a colleague who has a JBS account until you can become an Author.

>
> I'm working on upstreaming more Google-local runtime and GC patches, so I can become an Author, according to:
> https://wiki.openjdk.java.net/display/general/JBS+Overview
> So far I have just contributed one patch: JDK-8193386.
>
> Can I just post performance results on the mailing list and someone could copy the results when creating an RFE?

Sounds like you will be finding a colleague who can help to create the initial RFE. Feel free to work with
that colleague to add updates, or wait until you have the information to only need to ask them a favor once.

thanks,
Karen

>
> -Man
>
>
> On Wed, Jul 25, 2018 at 2:56 PM Karen Kinnear wrote:
> Man,
>
> Thank you for your proposal. The runtime is the correct team.
>
> Could you please file an rfe under hotspot/runtime with the information below and the patch as well as any
> tests you have run and any performance results you have?
>
> That will help us track this information and find you a sponsor.
>
> thanks,
> Karen
>
>> On Jul 18, 2018, at 9:53 PM, Man Cao wrote:
>>
>> Hello,
>>
>> The Java platform team at Google has maintained a local patch to inline os::SpinPause() since 2014. We would like to upstream this patch to OpenJDK. Could someone sponsor this patch?
>>
>> It is difficult to demonstrate performance improvement in Java benchmarks. It is more of a code refactoring to better utilize modern GCC. It partly addresses the comment about inlining SpinPause() above its declaration in os.hpp.
>> I found an interesting discussion about PAUSE and a microbenchmark in:
>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2012-August/004352.html
>> However, the microbenchmark has a large variance in our experiment, making it difficult to tell if there's any benefit from inlining PAUSE. Inlining PAUSE does seem to reduce the variance a bit.
>>
>> The patch is inlined and attached below:
>>
>> diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
>> --- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
>> +++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
>> @@ -63,15 +63,6 @@
>>          popl %eax
>>          ret
>>
>> -        .globl SYMBOL(SpinPause)
>> -        ELF_TYPE(SpinPause,@function)
>> -        .p2align 4,,15
>> -SYMBOL(SpinPause):
>> -        rep
>> -        nop
>> -        movl $1, %eax
>> -        ret
>> -
>>          # Support for void Copy::conjoint_bytes(void* from,
>>          #                                       void* to,
>>          #                                       size_t count)
>> diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
>> --- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
>> +++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
>> @@ -46,15 +46,6 @@
>>
>>          .text
>>
>> -        .globl SYMBOL(SpinPause)
>> -        .p2align 4,,15
>> -        ELF_TYPE(SpinPause,@function)
>> -SYMBOL(SpinPause):
>> -        rep
>> -        nop
>> -        movq $1, %rax
>> -        ret
>> -
>>          # Support for void Copy::arrayof_conjoint_bytes(void* from,
>>          #                                               void* to,
>>          #                                               size_t count)
>> diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
>> --- a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
>> +++ b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
>> @@ -42,15 +42,6 @@
>>
>>          .text
>>
>> -        .globl SpinPause
>> -        .type SpinPause,@function
>> -        .p2align 4,,15
>> -SpinPause:
>> -        rep
>> -        nop
>> -        movl $1, %eax
>> -        ret
>> -
>>          # Support for void Copy::conjoint_bytes(void* from,
>>          #                                       void* to,
>>          #                                       size_t count)
>> diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
>> --- a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
>> +++ b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
>> @@ -38,15 +38,6 @@
>>
>>          .text
>>
>> -        .globl SpinPause
>> -        .align 16
>> -        .type SpinPause,@function
>> -SpinPause:
>> -        rep
>> -        nop
>> -        movq $1, %rax
>> -        ret
>> -
>>          # Support for void Copy::arrayof_conjoint_bytes(void* from,
>>          #                                               void* to,
>>          #                                               size_t count)
>> diff --git a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
>> --- a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
>> +++ b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
>> @@ -51,15 +51,6 @@
>>          movq %fs:0x0,%rax
>>          ret
>>
>> -        .globl SpinPause
>> -        .align 16
>> -SpinPause:
>> -        rep
>> -        nop
>> -        movq $1, %rax
>> -        ret
>> -
>> -
>>          / Support for void Copy::arrayof_conjoint_bytes(void* from,
>>          /                                               void* to,
>>          /                                               size_t count)
>> diff --git a/src/hotspot/share/runtime/os.hpp b/src/hotspot/share/runtime/os.hpp
>> --- a/src/hotspot/share/runtime/os.hpp
>> +++ b/src/hotspot/share/runtime/os.hpp
>> @@ -1031,6 +1031,13 @@
>>  // of the global SpinPause() with C linkage.
>>  // It'd also be eligible for inlining on many platforms.
>>
>> +#if defined(X86) && !defined(_WINDOWS)
>> +extern "C" int inline SpinPause() {
>> +  __asm__ __volatile__ ("pause");
>> +  return 1;
>> +}
>> +#else
>>  extern "C" int SpinPause();
>> +#endif
>>
>>  #endif // SHARE_VM_RUNTIME_OS_HPP
>>
>> -Man
>>

From harold.seigel at oracle.com Thu Jul 26 17:05:24 2018
From: harold.seigel at oracle.com (Harold David Seigel)
Date: Thu, 26 Jul 2018 13:05:24 -0400
Subject: RFR (XS) 8207944: java.lang.ClassFormatError: Extra bytes at the end of class file test" possibly violation of JVMS 4.7.1
Message-ID:

Hi,

Please review this JDK-11 fix for bug JDK-8207944. The fix adds
necessary handling of unknown attributes when the class file version
>= JAVA_11_VERSION. Without this fix, the class file bytes for those
attributes do not get skipped over when parsing the class file,
corrupting further parsing of the class file.

Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8207944_11/webrev/

JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207944

This fix was regression tested by running JCK Lang and VM tests on
Linux-x64 and by running the new test.
Testing that consists of running Mach5 tiers 1 and 2 tests and builds
on Linux-X64, Windows, Solaris Sparc, and Mac OS X, and running tiers
3-5 tests on Linux-x64, is in progress.

Thanks, Harold

From mikhailo.seledtsov at oracle.com Thu Jul 26 17:51:41 2018
From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov)
Date: Thu, 26 Jul 2018 10:51:41 -0700
Subject: RFR(S): 8185531: [TESTBUG] Improve test configuration for shared strings
Message-ID: <5B5A0A2D.7070909@oracle.com>

Please review this simple improvement to add 2 more relevant test
configurations to shared strings tests:
    -XX:+UseStringDeduplication and -XX:-CompactStrings

    JBS: https://bugs.openjdk.java.net/browse/JDK-8185531
    Webrev: http://cr.openjdk.java.net/~mseledtsov/8185531.00/

    Testing:
        1. Locally: ran the affected tests locally (Linux-x64)
            All PASS
        2. Automated multi-platform test system:
            Same set: test/hotspot/jtreg/runtime/appcds/sharedStrings/
            All PASS

Thank you,
Misha

From lois.foltan at oracle.com Thu Jul 26 18:59:11 2018
From: lois.foltan at oracle.com (Lois Foltan)
Date: Thu, 26 Jul 2018 14:59:11 -0400
Subject: RFR (XS) 8207944: java.lang.ClassFormatError: Extra bytes at the end of class file test" possibly violation of JVMS 4.7.1
In-Reply-To:
References:
Message-ID: <56661d83-b8e4-c8ed-0eaf-6027b5504c65@oracle.com>

Hi Harold,

This fix looks good and also I believe trivial enough to push.

Lois

On 7/26/2018 1:05 PM, Harold David Seigel wrote:
> Hi,
>
> Please review this JDK-11 fix for bug JDK-8207944. The fix adds
> necessary handling of unknown attributes when the class file version >=
> JAVA_11_VERSION. Without this fix, the class file bytes for those
> attributes do not get skipped over when parsing the class file,
> corrupting further parsing of the class file.
>
> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8207944_11/webrev/
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207944
>
> This fix was regression tested by running JCK Lang and VM tests on
> Linux-x64 and by running the new test. Testing that consists of
> running Mach5 tiers 1 and 2 tests and builds on Linux-X64, Windows,
> Solaris Sparc, and Mac OS X, and running tiers 3-5 tests on Linux-x64,
> is in progress.
>
> Thanks, Harold
>

From harold.seigel at oracle.com Thu Jul 26 19:00:09 2018
From: harold.seigel at oracle.com (Harold David Seigel)
Date: Thu, 26 Jul 2018 15:00:09 -0400
Subject: RFR (XS) 8207944: java.lang.ClassFormatError: Extra bytes at the end of class file test" possibly violation of JVMS 4.7.1
In-Reply-To: <56661d83-b8e4-c8ed-0eaf-6027b5504c65@oracle.com>
References: <56661d83-b8e4-c8ed-0eaf-6027b5504c65@oracle.com>
Message-ID:

Thanks Lois!

Harold

On 7/26/2018 2:59 PM, Lois Foltan wrote:
> Hi Harold,
>
> This fix looks good and also I believe trivial enough to push.
>
> Lois
>
>
> On 7/26/2018 1:05 PM, Harold David Seigel wrote:
>> Hi,
>>
>> Please review this JDK-11 fix for bug JDK-8207944. The fix adds
>> necessary handling of unknown attributes when the class file version >=
>> JAVA_11_VERSION. Without this fix, the class file bytes for those
>> attributes do not get skipped over when parsing the class file,
>> corrupting further parsing of the class file.
>>
>> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8207944_11/webrev/
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207944
>>
>> This fix was regression tested by running JCK Lang and VM tests on
>> Linux-x64 and by running the new test. Testing that consists of
>> running Mach5 tiers 1 and 2 tests and builds on Linux-X64, Windows,
>> Solaris Sparc, and Mac OS X, and running tiers 3-5 tests on
>> Linux-x64, is in progress.
>>
>> Thanks, Harold
>>
>

From karen.kinnear at oracle.com Thu Jul 26 20:25:22 2018
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Thu, 26 Jul 2018 16:25:22 -0400
Subject: RFR (XS) 8207944: java.lang.ClassFormatError: Extra bytes at the end of class file test" possibly violation of JVMS 4.7.1
In-Reply-To:
References:
Message-ID: <40D821F4-9A18-4F5B-BCCF-46AC92A363B0@oracle.com>

Ship it.

thanks,
Karen

> On Jul 26, 2018, at 1:05 PM, Harold David Seigel wrote:
>
> Hi,
>
> Please review this JDK-11 fix for bug JDK-8207944. The fix adds necessary handling of unknown attributes when the class file version >= JAVA_11_VERSION. Without this fix, the class file bytes for those attributes do not get skipped over when parsing the class file, corrupting further parsing of the class file.
>
> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8207944_11/webrev/
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207944
>
> This fix was regression tested by running JCK Lang and VM tests on Linux-x64 and by running the new test. Testing that consists of running Mach5 tiers 1 and 2 tests and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, and running tiers 3-5 tests on Linux-x64, is in progress.
>
> Thanks, Harold
>

From harold.seigel at oracle.com Fri Jul 27 11:31:51 2018
From: harold.seigel at oracle.com (Harold David Seigel)
Date: Fri, 27 Jul 2018 07:31:51 -0400
Subject: RFR (XS) 8207944: java.lang.ClassFormatError: Extra bytes at the end of class file test" possibly violation of JVMS 4.7.1
In-Reply-To: <40D821F4-9A18-4F5B-BCCF-46AC92A363B0@oracle.com>
References: <40D821F4-9A18-4F5B-BCCF-46AC92A363B0@oracle.com>
Message-ID:

Thanks Karen!

Harold

On 7/26/2018 4:25 PM, Karen Kinnear wrote:
> Ship it.
>
> thanks,
> Karen
>
>> On Jul 26, 2018, at 1:05 PM, Harold David Seigel wrote:
>>
>> Hi,
>>
>> Please review this JDK-11 fix for bug JDK-8207944. The fix adds necessary handling of unknown attributes when the class file version >= JAVA_11_VERSION. Without this fix, the class file bytes for those attributes do not get skipped over when parsing the class file, corrupting further parsing of the class file.
>>
>> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8207944_11/webrev/
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207944
>>
>> This fix was regression tested by running JCK Lang and VM tests on Linux-x64 and by running the new test. Testing that consists of running Mach5 tiers 1 and 2 tests and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, and running tiers 3-5 tests on Linux-x64, is in progress.
>>
>> Thanks, Harold
>>

From harold.seigel at oracle.com Fri Jul 27 17:15:10 2018
From: harold.seigel at oracle.com (Harold David Seigel)
Date: Fri, 27 Jul 2018 13:15:10 -0400
Subject: RFR 8208399: Metadata methods print_(value_)on_maybe_null() compare 'this' to NULL
Message-ID: <0a0f9fb7-bc30-5ec7-268a-cee3953c3d06@oracle.com>

Hi,

Please review this JDK-12 fix for bug JDK-8208399. The fix changes
function print_value_on_maybe_null() into a static function to avoid
comparisons between 'this' and NULL. It also deletes unused methods
print_maybe_null() and print_on_maybe_null().

Open Webrev:
http://cr.openjdk.java.net/~hseigel/bug_8208399/webrev/index.html

JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8208399

This fix was regression tested by running Mach5 tiers 1 and 2 tests
and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, running
tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and VM tests
on Linux-x64.

Thanks, Harold

From manc at google.com Fri Jul 27 17:27:46 2018
From: manc at google.com (Man Cao)
Date: Fri, 27 Jul 2018 10:27:46 -0700
Subject: RFR 8208458: Simplify and inline os::SpinPause() for non-Windows OS on X86
In-Reply-To: <85A4AE3E-62C9-4663-BC3E-BC2C64037FAB@oracle.com>
References: <91C01F03-ECC3-4E31-89A9-5B8AA489BB3D@oracle.com> <85A4AE3E-62C9-4663-BC3E-BC2C64037FAB@oracle.com>
Message-ID:

Hi all,

JC kindly created an RFE and webrev for me. Could someone review and sponsor this change?
http://cr.openjdk.java.net/~jcbeyler/8208458/webrev.00/

I also reran the performance experiments on a machine with more cores and less noise. Results are attached in the RFE.

-Man

On Wed, Jul 25, 2018 at 5:40 PM Karen Kinnear wrote:
> Man.
>
> On Jul 25, 2018, at 8:08 PM, Man Cao wrote:
>
> Thanks Karen for the response!
> I don't have a JBS account currently. I could ask my colleagues with JBS
> accounts to create an RFE for this issue, but probably I cannot directly
> post comments or performance results on JBS.
>
> Sounds good - why don't you work with a colleague who has a JBS account
> until you can become an Author.
>
>
> I'm working on upstreaming more Google-local runtime and GC patches, so I
> can become an Author, according to:
> https://wiki.openjdk.java.net/display/general/JBS+Overview
> So far I have just contributed one patch: JDK-8193386.
>
> Can I just post performance results on the mailing list and someone could
> copy the results when creating an RFE?
>
> Sounds like you will be finding a colleague who can help to create the
> initial RFE. Feel free to work with
> that colleague to add updates, or wait until you have the information to
> only need to ask them a favor once.
>
> thanks,
> Karen
>
>
> -Man
>
>
> On Wed, Jul 25, 2018 at 2:56 PM Karen Kinnear wrote:
>
>> Man,
>>
>> Thank you for your proposal. The runtime is the correct team.
>> >> Could you please file an rfe under hotspot/runtime with the information >> below and the patch as well as any >> tests you have run and any performance results you have? >> >> That will help us track this information and find you a sponsor. >> >> thanks, >> Karen >> >> On Jul 18, 2018, at 9:53 PM, Man Cao wrote: >> >> Hello, >> >> The Java platform team at Google has maintained a local patch to inline >> os::SpinPause() since 2014. We would like to upstream this patch to >> OpenJDK. Could someone sponsor this patch? >> >> It is difficult to demonstrate performance improvement in Java >> benchmarks. It is more of a code refactoring to better utilize modern GCC. >> It partly addresses the comment about inlining SpinPause() above its >> declaration in os.hpp. >> I found an interesting discussion about PAUSE and a microbenchmark in: >> >> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2012-August/004352.html >> However, the microbenchmark has a large variance in our experiment, >> making it difficult to tell if there's any benefit from inlining PAUSE. >> Inlining PAUSE does seem to reduce the variance a bit. 
>> >> The patch is inlined and attached below: >> >> diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s >> b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s >> --- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s >> +++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s >> @@ -63,15 +63,6 @@ >> popl %eax >> ret >> >> - .globl SYMBOL(SpinPause) >> - ELF_TYPE(SpinPause,@function) >> - .p2align 4,,15 >> -SYMBOL(SpinPause): >> - rep >> - nop >> - movl $1, %eax >> - ret >> - >> # Support for void Copy::conjoint_bytes(void* from, >> # void* to, >> # size_t count) >> diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s >> b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s >> --- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s >> +++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s >> @@ -46,15 +46,6 @@ >> >> .text >> >> - .globl SYMBOL(SpinPause) >> - .p2align 4,,15 >> - ELF_TYPE(SpinPause,@function) >> -SYMBOL(SpinPause): >> - rep >> - nop >> - movq $1, %rax >> - ret >> - >> # Support for void Copy::arrayof_conjoint_bytes(void* from, >> # void* to, >> # size_t count) >> diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s >> b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s >> --- a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s >> +++ b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s >> @@ -42,15 +42,6 @@ >> >> .text >> >> - .globl SpinPause >> - .type SpinPause,@function >> - .p2align 4,,15 >> -SpinPause: >> - rep >> - nop >> - movl $1, %eax >> - ret >> - >> # Support for void Copy::conjoint_bytes(void* from, >> # void* to, >> # size_t count) >> diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s >> b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s >> --- a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s >> +++ b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s >> @@ -38,15 +38,6 @@ >> >> .text >> >> - .globl SpinPause >> - .align 16 >> - .type SpinPause,@function >> -SpinPause: >> - rep >> - nop >> - movq $1, %rax >> - ret >> - >> # Support for void Copy::arrayof_conjoint_bytes(void* from, >> # void* to, >> #
size_t count) >> diff --git a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s >> b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s >> --- a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s >> +++ b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s >> @@ -51,15 +51,6 @@ >> movq %fs:0x0,%rax >> ret >> >> - .globl SpinPause >> - .align 16 >> -SpinPause: >> - rep >> - nop >> - movq $1, %rax >> - ret >> - >> - >> / Support for void Copy::arrayof_conjoint_bytes(void* from, >> / void* to, >> / size_t count) >> diff --git a/src/hotspot/share/runtime/os.hpp >> b/src/hotspot/share/runtime/os.hpp >> --- a/src/hotspot/share/runtime/os.hpp >> +++ b/src/hotspot/share/runtime/os.hpp >> @@ -1031,6 +1031,13 @@ >> // of the global SpinPause() with C linkage. >> // It'd also be eligible for inlining on many platforms. >> >> +#if defined(X86) && !defined(_WINDOWS) >> +extern "C" int inline SpinPause() { >> + __asm__ __volatile__ ("pause"); >> + return 1; >> +} >> +#else >> extern "C" int SpinPause(); >> +#endif >> >> #endif // SHARE_VM_RUNTIME_OS_HPP >> >> -Man >> >> >> > From harold.seigel at oracle.com Fri Jul 27 19:30:16 2018 From: harold.seigel at oracle.com (Harold David Seigel) Date: Fri, 27 Jul 2018 15:30:16 -0400 Subject: RFR 8171157: Convert ObjectMonitor_test to GTest In-Reply-To: <7555b190-737f-300f-84dc-746ff7207ec3@oracle.com> References: <7555b190-737f-300f-84dc-746ff7207ec3@oracle.com> Message-ID: Hi Patricio, These changes look good. Thanks, Harold On 7/25/2018 3:45 PM, patricio.chilano.mateo at oracle.com wrote: > Hi all, > > Could you please review this change? It's a migration of the > ObjectMonitor test to GTest. Two GTests were actually created, one for > ObjectMonitor and one for ObjectSynchronizer. > > Summary: Convert ObjectMonitor_test to GTest > > Bug URL: https://bugs.openjdk.java.net/browse/JDK-8171157 > Webrev URL: > http://cr.openjdk.java.net/~dcubed/for_patricio/8171157.01/webrev/ > > > The fix was tested with Mach5 on tiers 1-3 on all platforms.
> > Thanks, > Patricio From patricio.chilano.mateo at oracle.com Fri Jul 27 19:53:50 2018 From: patricio.chilano.mateo at oracle.com (patricio.chilano.mateo at oracle.com) Date: Fri, 27 Jul 2018 15:53:50 -0400 Subject: RFR 8171157: Convert ObjectMonitor_test to GTest In-Reply-To: References: <7555b190-737f-300f-84dc-746ff7207ec3@oracle.com> Message-ID: <7542399f-c79c-cc53-aa97-ce93d98e379c@oracle.com> Thanks Harold! Patricio On 7/27/18 3:30 PM, Harold David Seigel wrote: > Hi Patricio, > > These changes look good. > > Thanks, Harold > > > On 7/25/2018 3:45 PM, patricio.chilano.mateo at oracle.com wrote: >> Hi all, >> >> Could you please review this change? It's a migration of the >> ObjectMonitor test to GTest. Two GTests were actually created, one >> for ObjectMonitor and one for ObjectSynchronizer. >> >> Summary: Convert ObjectMonitor_test to GTest >> >> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8171157 >> Webrev URL: >> http://cr.openjdk.java.net/~dcubed/for_patricio/8171157.01/webrev/ >> >> >> The fix was tested with Mach5 on tiers 1-3 on all platforms. >> >> Thanks, >> Patricio > From david.holmes at oracle.com Mon Jul 30 06:47:07 2018 From: david.holmes at oracle.com (David Holmes) Date: Mon, 30 Jul 2018 16:47:07 +1000 Subject: RFR (XS) 8207944: java.lang.ClassFormatError: Extra bytes at the end of class file test" possibly violation of JVMS 4.7.1 In-Reply-To: References: Message-ID: <96dd1db5-bd1d-f8f6-30a9-3ddf56351b3e@oracle.com> Thanks for fixing this nestmates-induced bug Harold! Cheers, David On 27/07/2018 3:05 AM, Harold David Seigel wrote: > Hi, > > Please review this JDK-11 fix for bug JDK-8207944. The fix adds > necessary handling of unknown attributes when the class file version >= > JAVA_11_VERSION. Without this fix, the class file bytes for those > attributes do not get skipped over when parsing the class file, > corrupting further parsing of the class file.
> > Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8207944_11/webrev/ > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207944 > > This fix was regression tested by running JCK Lang and VM tests on > Linux-x64 and by running the new test. Testing that consists of running > Mach5 tiers 1 and 2 tests and builds on Linux-X64, Windows, Solaris > Sparc, and Mac OS X, and running tiers 3-5 tests on Linux-x64, is in > progress. > > Thanks, Harold > From harold.seigel at oracle.com Mon Jul 30 12:47:43 2018 From: harold.seigel at oracle.com (Harold David Seigel) Date: Mon, 30 Jul 2018 08:47:43 -0400 Subject: RFR (XS) 8207944: java.lang.ClassFormatError: Extra bytes at the end of class file test" possibly violation of JVMS 4.7.1 In-Reply-To: <96dd1db5-bd1d-f8f6-30a9-3ddf56351b3e@oracle.com> References: <96dd1db5-bd1d-f8f6-30a9-3ddf56351b3e@oracle.com> Message-ID: You're welcome! I wish the code was clearer about the need to handle unknown attributes. Harold On 7/30/2018 2:47 AM, David Holmes wrote: > Thanks for fixing this nestmates-induced bug Harold! > > Cheers, > David > > On 27/07/2018 3:05 AM, Harold David Seigel wrote: >> Hi, >> >> Please review this JDK-11 fix for bug JDK-8207944. The fix adds >> necessary handling of unknown attributes when the class file version >> >= JAVA_11_VERSION. Without this fix, the class file bytes for those >> attributes do not get skipped over when parsing the class file, >> corrupting further parsing of the class file. >> >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8207944_11/webrev/ >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207944 >> >> This fix was regression tested by running JCK Lang and VM tests on >> Linux-x64 and by running the new test. Testing that consists of >> running Mach5 tiers 1 and 2 tests and builds on Linux-X64, Windows, >> Solaris Sparc, and Mac OS X, and running tiers 3-5 tests on >> Linux-x64, is in progress.
>> >> Thanks, Harold >> From lois.foltan at oracle.com Mon Jul 30 14:01:35 2018 From: lois.foltan at oracle.com (Lois Foltan) Date: Mon, 30 Jul 2018 10:01:35 -0400 Subject: RFR 8208399: Metadata methods print_(value_)on_maybe_null() compare 'this' to NULL In-Reply-To: <0a0f9fb7-bc30-5ec7-268a-cee3953c3d06@oracle.com> References: <0a0f9fb7-bc30-5ec7-268a-cee3953c3d06@oracle.com> Message-ID: <5f2c4bab-d8d8-7b7c-9efc-73add0701e84@oracle.com> Looks good. Lois On 7/27/2018 1:15 PM, Harold David Seigel wrote: > Hi, > > Please review this JDK-12 fix for bug JDK-8208399. The fix changes > function print_value_on_maybe_null() into a static function to avoid > comparisons between 'this' and NULL. It also deletes unused methods > print_maybe_null() and print_on_maybe_null(). > > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8208399/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8208399 > > This fix was regression tested by running Mach5 tiers 1 and 2 tests > and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, running > tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and VM tests > on Linux-x64. > > Thanks, Harold > From harold.seigel at oracle.com Mon Jul 30 14:05:05 2018 From: harold.seigel at oracle.com (Harold David Seigel) Date: Mon, 30 Jul 2018 10:05:05 -0400 Subject: RFR 8208399: Metadata methods print_(value_)on_maybe_null() compare 'this' to NULL In-Reply-To: <5f2c4bab-d8d8-7b7c-9efc-73add0701e84@oracle.com> References: <0a0f9fb7-bc30-5ec7-268a-cee3953c3d06@oracle.com> <5f2c4bab-d8d8-7b7c-9efc-73add0701e84@oracle.com> Message-ID: Thanks Lois! Harold On 7/30/2018 10:01 AM, Lois Foltan wrote: > Looks good. > Lois > > On 7/27/2018 1:15 PM, Harold David Seigel wrote: >> Hi, >> >> Please review this JDK-12 fix for bug JDK-8208399. The fix changes >> function print_value_on_maybe_null() into a static function to avoid >> comparisons between 'this' and NULL.
It also deletes unused methods >> print_maybe_null() and print_on_maybe_null(). >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8208399/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8208399 >> >> This fix was regression tested by running Mach5 tiers 1 and 2 tests >> and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, >> running tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and >> VM tests on Linux-x64. >> >> Thanks, Harold >> > From zgu at redhat.com Mon Jul 30 14:58:44 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 30 Jul 2018 10:58:44 -0400 Subject: [12] RFR: 8208499 NMT: Missing memory tag for Safepoint polling page Message-ID: <6bb6e3aa-5bc1-a3e8-8eae-da1da9e7a4aa@redhat.com> Hi, Could I have reviews for this small fix to add missing memory tag for global safepoint polling page? I added "Safepoint" memory type, as a part of effort to fine grain the memory types [1] Bug: https://bugs.openjdk.java.net/browse/JDK-8208499 Webrev: http://cr.openjdk.java.net/~zgu/8208499/webrev.00/index.html Test: hotspot_nmt on Linux 64 (fastdebug and release) [1] JDK-8199746 NMT: Breakup overused mtInternal into more fine grained categories. Thanks, -Zhengyu From shade at redhat.com Mon Jul 30 15:05:16 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 30 Jul 2018 17:05:16 +0200 Subject: [12] RFR: 8208499 NMT: Missing memory tag for Safepoint polling page In-Reply-To: <6bb6e3aa-5bc1-a3e8-8eae-da1da9e7a4aa@redhat.com> References: <6bb6e3aa-5bc1-a3e8-8eae-da1da9e7a4aa@redhat.com> Message-ID: On 07/30/2018 04:58 PM, Zhengyu Gu wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8208499 > Webrev: http://cr.openjdk.java.net/~zgu/8208499/webrev.00/index.html *) memory/allocation.hpp, is it really necessary to drop values from enum?
*) services/memTracker.cpp, not sure why you need to assert this: 54 // memory type occupies a byte 55 STATIC_ASSERT(mt_number_of_types <= max_jubyte); *) runtime/NMT/SafepointPollingPages.java, excess newline: 48 49 Otherwise looks good. -Aleksey From zgu at redhat.com Mon Jul 30 15:20:09 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 30 Jul 2018 11:20:09 -0400 Subject: [12] RFR: 8208499 NMT: Missing memory tag for Safepoint polling page In-Reply-To: References: <6bb6e3aa-5bc1-a3e8-8eae-da1da9e7a4aa@redhat.com> Message-ID: Thanks for the quick review. On 07/30/2018 11:05 AM, Aleksey Shipilev wrote: > On 07/30/2018 04:58 PM, Zhengyu Gu wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8208499 >> Webrev: http://cr.openjdk.java.net/~zgu/8208499/webrev.00/index.html > > *) memory/allocation.hpp, is it really necessary to drop values from enum? The values are unnecessary. Every time we add a new tag, we have to shift/update values, which is inconvenience and might introduce bugs. > > *) services/memTracker.cpp, not sure why you need to assert this: > > 54 // memory type occupies a byte > 55 STATIC_ASSERT(mt_number_of_types <= max_jubyte); memory type is encoded into tracking header as a byte field. The assertion ensures that we don't introduce more types that can overflow a byte. > > *) runtime/NMT/SafepointPollingPages.java, excess newline: > > 48 > 49 Fixed. http://cr.openjdk.java.net/~zgu/8208499/webrev.01/ -Zhengyu > > Otherwise looks good. > > -Aleksey > From zgu at redhat.com Mon Jul 30 15:34:26 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 30 Jul 2018 11:34:26 -0400 Subject: [12] RFR: 8208499 NMT: Missing memory tag for Safepoint polling page In-Reply-To: References: <6bb6e3aa-5bc1-a3e8-8eae-da1da9e7a4aa@redhat.com> Message-ID: <022b106c-b34b-9404-5c67-0a41fbef6100@redhat.com> On 07/30/2018 11:20 AM, Zhengyu Gu wrote: > Thanks for the quick review. 
> > On 07/30/2018 11:05 AM, Aleksey Shipilev wrote: >> On 07/30/2018 04:58 PM, Zhengyu Gu wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8208499 >>> Webrev: http://cr.openjdk.java.net/~zgu/8208499/webrev.00/index.html >> >> *) memory/allocation.hpp, is it really necessary to drop values from >> enum? > > The values are unnecessary. Every time we add a new tag, we have to > shift/update values, which is inconvenience and might introduce bugs. > >> >> *) services/memTracker.cpp, not sure why you need to assert this: >> >> 54 // memory type occupies a byte >> 55 STATIC_ASSERT(mt_number_of_types <= max_jubyte); > > memory type is encoded into tracking header as a byte field. The > assertion ensures that we don't introduce more types that can overflow a > byte. Updated comment to clarify the assertion. Thanks, -Zhengyu > >> >> *) runtime/NMT/SafepointPollingPages.java, excess newline: >> >> 48 >> 49 > > Fixed. > > http://cr.openjdk.java.net/~zgu/8208499/webrev.01/ > > -Zhengyu > >> >> Otherwise looks good. >> >> -Aleksey >> From gerard.ziemski at oracle.com Mon Jul 30 16:19:18 2018 From: gerard.ziemski at oracle.com (Gerard Ziemski) Date: Mon, 30 Jul 2018 11:19:18 -0500 Subject: RFR (L) JDK-8195100: Use a low latency hashtable for SymbolTable Message-ID: Please review this Enhancement, which uses the new concurrent hash table for SymbolTable. This is an effort similar to the one behind JDK-8195097 "Make it possible to process StringTable outside safepoint" from a while ago. The main expected goal here is to eliminate safepoint pauses needed to cleanup the table. This goal was achieved by using "Service Thread" to do the cleaning. Checking whether we need to clean is performed on "VM Thread", after class unloading (we check the entire table). We also check the bucket into which we happen to be inserting a new item.
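The division of labour described above can be illustrated with a toy model: an insert notices dead entries in the one bucket it touches and only raises a flag, while the actual reclamation happens in a separate cleaning step that a background thread would run. Everything in this sketch (ToySymbolTable, Entry, the refcount field, needs_cleaning(), clean()) is hypothetical and single-threaded for brevity; it is not the HotSpot concurrent hash table API and makes no claim about how the webrev implements the check.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy illustration only; all names here are hypothetical.
struct Entry {
  std::string name;
  int refcount;  // 0 means the entry is dead and may be reclaimed
};

class ToySymbolTable {
 public:
  explicit ToySymbolTable(std::size_t buckets)
      : _buckets(buckets), _needs_cleaning(false) {}

  // Insert a symbol. While visiting the target bucket, note whether it
  // already holds a dead entry and, if so, request a later cleanup.
  void insert(const std::string& name) {
    std::vector<Entry>& b = bucket_for(name);
    for (const Entry& e : b) {
      if (e.refcount == 0) {
        _needs_cleaning.store(true);
        break;
      }
    }
    b.push_back(Entry{name, 1});
  }

  // Mark every entry with this name as dead; nothing is removed yet.
  void release(const std::string& name) {
    for (Entry& e : bucket_for(name)) {
      if (e.name == name) e.refcount = 0;
    }
  }

  bool needs_cleaning() const { return _needs_cleaning.load(); }

  // The work a background service thread would do: walk all buckets,
  // drop dead entries, and clear the request flag. Returns entries removed.
  std::size_t clean() {
    std::size_t removed = 0;
    for (std::vector<Entry>& b : _buckets) {
      for (std::vector<Entry>::iterator it = b.begin(); it != b.end();) {
        if (it->refcount == 0) {
          it = b.erase(it);
          ++removed;
        } else {
          ++it;
        }
      }
    }
    _needs_cleaning.store(false);
    return removed;
  }

 private:
  std::vector<Entry>& bucket_for(const std::string& name) {
    return _buckets[std::hash<std::string>()(name) % _buckets.size()];
  }

  std::vector<std::vector<Entry> > _buckets;
  std::atomic<bool> _needs_cleaning;
};
```

The point of the split is that the inserting thread pays only for a scan of one bucket, and the expensive table-wide walk is moved off the path that previously required a safepoint.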
A few things to note: - The SymbolTable implementation follows closely that of StringTable - we might be able to factor out common code later - There are a few small, but statistically significant, regressions in startup benchmarks (around 2-5%) that will be addressed later, tracked by JDK-8208142 - There is an outstanding question about whether we can safely walk the table during a safepoint using do_scan, but without locking, tracked by JDK-8208462 - There is a cleanup opportunity presented now to remove rehashable hash table, tracked by JDK-8208519 - There is a new test that validates that we free dead entries when we insert new symbols (using short lived symbols via Class.forName() API) Tested using Mach5 hs-tier1,2,3,4,5 (final test running right now...) Webrev: http://cr.openjdk.java.net/~gziemski/8195100_rev1/ Bug: https://bugs.openjdk.java.net/browse/JDK-8195100 Cheers From shade at redhat.com Mon Jul 30 17:28:07 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 30 Jul 2018 19:28:07 +0200 Subject: [12] RFR: 8208499 NMT: Missing memory tag for Safepoint polling page In-Reply-To: <022b106c-b34b-9404-5c67-0a41fbef6100@redhat.com> References: <6bb6e3aa-5bc1-a3e8-8eae-da1da9e7a4aa@redhat.com> <022b106c-b34b-9404-5c67-0a41fbef6100@redhat.com> Message-ID: <9f879b17-c62a-8123-ae47-b3e1535c6cb5@redhat.com> On 07/30/2018 05:34 PM, Zhengyu Gu wrote: >>> *) memory/allocation.hpp, is it really necessary to drop values from enum? >> >> The values are unnecessary. Every time we add a new tag, we have to shift/update values, which is >> inconvenience and might introduce bugs. Right. What I meant was the apparent disconnect between bug synopsis and the actual change. I would have expected only Safepoint-related changes, not collateral refactoring. It is your call if you want to conflate the two.
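The enum refactoring under discussion can be sketched as follows: enumerators carry no explicit values, so a new tag can be added without renumbering, and a compile-time check guards the one-byte encoding mentioned earlier in the thread. This is a shortened, hypothetical tag list; the helper functions are illustrative, and C++11 static_assert with UCHAR_MAX is used here as an assumed stand-in for HotSpot's STATIC_ASSERT macro and max_jubyte.

```cpp
#include <climits>

// Hypothetical, shortened tag list; the real enum has many more entries.
enum MemoryType {
  mtJavaHeap,         // implicitly 0
  mtClass,            // implicitly 1, and so on
  mtThread,
  mtSafepoint,        // a new tag is added without touching its neighbours
  mtNone,
  mt_number_of_types  // sentinel: equals the number of tags above
};

// The tag is stored in a single byte, so the tag count must not exceed
// what a byte can represent.
static_assert(mt_number_of_types <= UCHAR_MAX,
              "memory type must fit in one byte");

// Illustration-only helpers for packing a tag into one byte and back.
inline unsigned char encode_type(MemoryType t) {
  return static_cast<unsigned char>(t);
}

inline MemoryType decode_type(unsigned char b) {
  return static_cast<MemoryType>(b);
}
```

With explicit values, inserting mtSafepoint in the middle would have required shifting every later enumerator by hand; with implicit values the compiler does the renumbering and the assertion catches growth past the byte limit.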
-Aleksey From gerard.ziemski at oracle.com Mon Jul 30 18:05:44 2018 From: gerard.ziemski at oracle.com (Gerard Ziemski) Date: Mon, 30 Jul 2018 13:05:44 -0500 Subject: RFR 8208399: Metadata methods print_(value_)on_maybe_null() compare 'this' to NULL In-Reply-To: <0a0f9fb7-bc30-5ec7-268a-cee3953c3d06@oracle.com> References: <0a0f9fb7-bc30-5ec7-268a-cee3953c3d06@oracle.com> Message-ID: Looks good. In src/hotspot/share/oops/metadata.hpp tiny nitpick - I was once told that instead of: + if (m == NULL) we should be using: + if (NULL == m) but I guess the entire file would have to use that syntax. cheers > On Jul 27, 2018, at 12:15 PM, Harold David Seigel wrote: > > Hi, > > Please review this JDK-12 fix for bug JDK-8208399. The fix changes function print_value_on_maybe_null() into a static function to avoid comparisons between 'this' and NULL. It also deletes unused methods print_maybe_null() and print_on_maybe_null(). > > Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8208399/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8208399 > > This fix was regression tested by running Mach5 tiers 1 and 2 tests and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, running tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and VM tests on Linux-x64. > > Thanks, Harold > From coleen.phillimore at oracle.com Mon Jul 30 18:14:46 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 30 Jul 2018 14:14:46 -0400 Subject: RFR 8207779: Method::is_valid_method() compares 'this' with NULL In-Reply-To: <19825af4-87c0-f7ef-c48e-9f83edc2187e@oracle.com> References: <19825af4-87c0-f7ef-c48e-9f83edc2187e@oracle.com> Message-ID: Looks good, thank you for fixing this! Coleen On 7/25/18 11:47 AM, Harold David Seigel wrote: > Hi, > > Please review this JDK-12 fix for bug JDK-8207779.
The fix changes > function is_valid_method() into a static function to avoid comparisons > between 'this' and NULL and to avoid accessing the validating method > through the Method being validated. > > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8207779/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207779 > > This fix was regression tested by running Mach5 tiers 1 and 2 tests > and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, running > tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and VM tests > on Linux-x64. > > Thanks, Harold > From harold.seigel at oracle.com Mon Jul 30 18:16:35 2018 From: harold.seigel at oracle.com (Harold David Seigel) Date: Mon, 30 Jul 2018 14:16:35 -0400 Subject: RFR 8208399: Metadata methods print_(value_)on_maybe_null() compare 'this' to NULL In-Reply-To: References: <0a0f9fb7-bc30-5ec7-268a-cee3953c3d06@oracle.com> Message-ID: <16fb57bd-cfe3-0875-2446-e39a01bb92e3@oracle.com> Thanks Gerard! I'll make that change before pushing the fix. Harold On 7/30/2018 2:05 PM, Gerard Ziemski wrote: > Looks good. > > In src/hotspot/share/oops/metadata.hpp tiny nitpick - I was once told that instead of: > > + if (m == NULL) > > we should be using: > > + if (NULL == m) > > but I guess the entire file would have to use that syntax. > > > > cheers > >> On Jul 27, 2018, at 12:15 PM, Harold David Seigel wrote: >> >> Hi, >> >> Please review this JDK-12 fix for bug JDK-8208399. The fix changes function print_value_on_maybe_null() into a static function to avoid comparisons between 'this' and NULL. It also deletes unused methods print_maybe_null() and print_on_maybe_null().
>> >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8208399/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8208399 >> >> This fix was regression tested by running Mach5 tiers 1 and 2 tests and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, running tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and VM tests on Linux-x64. >> >> Thanks, Harold >> From coleen.phillimore at oracle.com Mon Jul 30 18:18:48 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 30 Jul 2018 14:18:48 -0400 Subject: RFR 8208399: Metadata methods print_(value_)on_maybe_null() compare 'this' to NULL In-Reply-To: References: <0a0f9fb7-bc30-5ec7-268a-cee3953c3d06@oracle.com> Message-ID: <70834371-4379-25ba-c282-b0af64fe4b01@oracle.com> I agree, this looks good. I think starting to use (NULL == m) is a good idea, starting with this one line. Not for lines that haven't been touched, though. Thanks! Coleen On 7/30/18 2:05 PM, Gerard Ziemski wrote: > Looks good. > > In src/hotspot/share/oops/metadata.hpp tiny nitpick - I was once told that instead of: > > + if (m == NULL) > > we should be using: > > + if (NULL == m) > > but I guess the entire file would have to use that syntax. > > > > cheers > >> On Jul 27, 2018, at 12:15 PM, Harold David Seigel wrote: >> >> Hi, >> >> Please review this JDK-12 fix for bug JDK-8208399. The fix changes function print_value_on_maybe_null() into a static function to avoid comparisons between 'this' and NULL. It also deletes unused methods print_maybe_null() and print_on_maybe_null(). >> >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8208399/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8208399 >> >> This fix was regression tested by running Mach5 tiers 1 and 2 tests and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, running tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and VM tests on Linux-x64.
>> >> Thanks, Harold >> From coleen.phillimore at oracle.com Mon Jul 30 18:23:25 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 30 Jul 2018 14:23:25 -0400 Subject: RFR 8202171: Some oopDesc functions compare this with NULL In-Reply-To: References: <733004c3-2cd6-c700-adba-f4305f3de8a9@oracle.com> <37e16040-a847-eefe-fe41-088891d4ae07@oracle.com> Message-ID: <8592acab-9aca-4ea9-48b8-a9805964093d@oracle.com> This change looks good also. Thanks, Coleen On 7/25/18 2:56 PM, Harold David Seigel wrote: > Hi, > > Please review this updated webrev: > > http://cr.openjdk.java.net/~hseigel/bug_8202171.2/webrev/index.html > > This includes null checks when needed for callers to nonstatic > oopDesc::print() and oopDesc::print_on() functions and changes the > oopDesc verify() functions to static. > > Thanks, Harold > > > On 7/18/2018 8:44 AM, Harold David Seigel wrote: >> Hi Coleen, Kim, >> >> Thanks for your comments! >> >> I'll make the changes suggested by Coleen and put out a new webrev. >> >> Thanks, Harold >> >> On 7/17/2018 5:38 PM, coleen.phillimore at oracle.com wrote: >>> >>> Hi Harold, >>> >>> Looking at this change, I would like us to keep the nonstatic >>> print() and print_on(outputStream*) functions because other Metadata >>> and types within the jvm have these functions. I think the few >>> places where the oop can be NULL at the caller should be checked >>> instead and remove the this == NULL check in the oopDesc::print_on() >>> function. Most places already do check for NULL. The verify >>> function seems fine to make a static member function though. >>> >>> I agree with Kim that there are other places where "this" is >>> compared to NULL which shouldn't be done, and we should file >>> separate RFEs to deal with them, specifically >>> Method::is_valid_method() and >>> Metadata::print_{value_}on_maybe_null() functions.
>>> >>> Thanks, >>> Coleen >>> >>> On 7/16/18 3:24 PM, Harold David Seigel wrote: >>>> Hi, >>>> >>>> Please review this JDK-12 fix for bug JDK-8202171. The fix changes >>>> a few functions in oop.cpp into static functions to avoid >>>> comparisons between 'this' and NULL. >>>> >>>> Open Webrev: >>>> http://cr.openjdk.java.net/~hseigel/bug_8202171/webrev/index.html >>>> >>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8202171 >>>> >>>> This fix was regression tested by running Mach5 tiers 1 and 2 tests >>>> and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, >>>> running tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang >>>> and VM tests on Linux-x64. >>>> >>>> Thanks, Harold >>>> >>> >> > From harold.seigel at oracle.com Mon Jul 30 18:34:07 2018 From: harold.seigel at oracle.com (Harold David Seigel) Date: Mon, 30 Jul 2018 14:34:07 -0400 Subject: RFR 8202171: Some oopDesc functions compare this with NULL In-Reply-To: <8592acab-9aca-4ea9-48b8-a9805964093d@oracle.com> References: <733004c3-2cd6-c700-adba-f4305f3de8a9@oracle.com> <37e16040-a847-eefe-fe41-088891d4ae07@oracle.com> <8592acab-9aca-4ea9-48b8-a9805964093d@oracle.com> Message-ID: <6b261ea9-a1cf-cbf0-e47d-bac6472cfa7c@oracle.com> Hi Coleen, Thanks for the reviews! Harold On 7/30/2018 2:23 PM, coleen.phillimore at oracle.com wrote: > > This change looks good also. > Thanks, > Coleen > > On 7/25/18 2:56 PM, Harold David Seigel wrote: >> Hi, >> >> Please review this updated webrev: >> >> http://cr.openjdk.java.net/~hseigel/bug_8202171.2/webrev/index.html >> >> This includes null checks when needed for callers to nonstatic >> oopDesc::print() and oopDesc::print_on() functions and changes the >> oopDesc verify() functions to static. >> >> Thanks, Harold >> >> >> On 7/18/2018 8:44 AM, Harold David Seigel wrote: >>> Hi Coleen, Kim, >>> >>> Thanks for your comments! >>> >>> I'll make the changes suggested by Coleen and put out a new webrev.
>>> >>> Thanks, Harold >>> >>> On 7/17/2018 5:38 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> Hi Harold, >>>> >>>> Looking at this change, I would like us to keep the nonstatic >>>> print() and print_on(outputStream*) functions because other >>>> Metadata and types within the jvm have these functions. I think >>>> the few places where the oop can be NULL at the caller should be >>>> checked instead and remove the this == NULL check in the >>>> oopDesc::print_on() function. Most places already do check for >>>> NULL. The verify function seems fine to make a static member >>>> function though. >>>> >>>> I agree with Kim that there are other places where "this" is >>>> compared to NULL which shouldn't be done, and we should file >>>> separate RFEs to deal with them, specifically >>>> Method::is_valid_method() and >>>> Metadata::print_{value_}on_maybe_null() functions. >>>> >>>> Thanks, >>>> Coleen >>>> >>>> On 7/16/18 3:24 PM, Harold David Seigel wrote: >>>>> Hi, >>>>> >>>>> Please review this JDK-12 fix for bug JDK-8202171. The fix >>>>> changes a few functions in oop.cpp into static functions to avoid >>>>> comparisons between 'this' and NULL. >>>>> >>>>> Open Webrev: >>>>> http://cr.openjdk.java.net/~hseigel/bug_8202171/webrev/index.html >>>>> >>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8202171 >>>>> >>>>> This fix was regression tested by running Mach5 tiers 1 and 2 >>>>> tests and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS >>>>> X, running tiers 3-5 tests on Linux-x64, and by running JCK-11 >>>>> Lang and VM tests on Linux-x64.
>>>>> >>>>> Thanks, Harold >>>>> >>>> >>> >> > From coleen.phillimore at oracle.com Mon Jul 30 18:40:12 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 30 Jul 2018 14:40:12 -0400 Subject: [12] RFR: 8208499 NMT: Missing memory tag for Safepoint polling page In-Reply-To: <6bb6e3aa-5bc1-a3e8-8eae-da1da9e7a4aa@redhat.com> References: <6bb6e3aa-5bc1-a3e8-8eae-da1da9e7a4aa@redhat.com> Message-ID: <35a25492-878e-ea38-58e6-4382e6016118@oracle.com> This looks good. I'm fine with dropping the values for the mt enum with this change, since it now makes it easier to add values. Thanks, Coleen On 7/30/18 10:58 AM, Zhengyu Gu wrote: > Hi, > > Could I have reviews for this small fix to add missing memory tag for > global safepoint polling page? > > I added "Safepoint" memory type, as a part of effort to fine grain the > memory types [1] > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8208499 > Webrev: http://cr.openjdk.java.net/~zgu/8208499/webrev.00/index.html > > > Test: > > hotspot_nmt on Linux 64 (fastdebug and release) > > [1] JDK-8199746 NMT: Breakup overused mtInternal into more fine > grained categories. > > Thanks, > > -Zhengyu From zgu at redhat.com Mon Jul 30 18:52:39 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 30 Jul 2018 14:52:39 -0400 Subject: [12] RFR: 8208499 NMT: Missing memory tag for Safepoint polling page In-Reply-To: <35a25492-878e-ea38-58e6-4382e6016118@oracle.com> References: <6bb6e3aa-5bc1-a3e8-8eae-da1da9e7a4aa@redhat.com> <35a25492-878e-ea38-58e6-4382e6016118@oracle.com> Message-ID: <67e8ce8c-388a-76e0-34ed-2c0ce1bef798@redhat.com> Thanks, Coleen. -Zhengyu On 07/30/2018 02:40 PM, coleen.phillimore at oracle.com wrote: > > This looks good. I'm fine with dropping the values for the mt enum with > this change, since it now makes it easier to add values.
> > Thanks, > Coleen > > On 7/30/18 10:58 AM, Zhengyu Gu wrote: >> Hi, >> >> Could I have reviews for this small fix to add missing memory tag for >> global safepoint polling page? >> >> I added "Safepoint" memory type, as a part of effort to fine grain the >> memory types [1] >> >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8208499 >> Webrev: http://cr.openjdk.java.net/~zgu/8208499/webrev.00/index.html >> >> >> Test: >> >> hotspot_nmt on Linux 64 (fastdebug and release) >> >> [1] JDK-8199746 NMT: Breakup overused mtInternal into more fine >> grained categories. >> >> Thanks, >> >> -Zhengyu > From coleen.phillimore at oracle.com Mon Jul 30 20:49:57 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 30 Jul 2018 16:49:57 -0400 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException Message-ID: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> Summary: fixed refactoring caused by JDK-8203820 open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8208074 Ran the test in mach5 on all Oracle supported platforms. Also took the test out of ProblemList.txt because JDK-8203820 fixes https://bugs.openjdk.java.net/browse/JDK-8202896.
Thanks, Coleen From david.holmes at oracle.com Mon Jul 30 21:46:21 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 31 Jul 2018 07:46:21 +1000 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> Message-ID: <08cdf2d7-975a-de61-9ae8-08c1f9bfbf7f@oracle.com> On 31/07/2018 6:49 AM, coleen.phillimore at oracle.com wrote: > Summary: fixed refactoring caused by JDK-8203820 > > open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8208074 For the sake of other readers who don't want to have to reverse engineer the actual cause of the problem, the original code has two Method.invoke sequences: one for a static method and which passed a null receiver; one for a non-static method which passed a non-null receiver. The refactoring extracted the invoke logic but always passed a null receiver - which was wrong for the non-static case. The fix always passes a non-null receiver to fix the non-static case, and which is ignored in the static case. Reviewed. Trivial. Thanks, David > Ran the test in mach5 on all Oracle supported platforms. Also took the > test out of ProblemList.txt because JDK-8203820 fixes > https://bugs.openjdk.java.net/browse/JDK-8202896.
> > Thanks, > Coleen From david.holmes at oracle.com Mon Jul 30 21:52:48 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 31 Jul 2018 07:52:48 +1000 Subject: RFR 8208399: Metadata methods print_(value_)on_maybe_null() compare 'this' to NULL In-Reply-To: <16fb57bd-cfe3-0875-2446-e39a01bb92e3@oracle.com> References: <0a0f9fb7-bc30-5ec7-268a-cee3953c3d06@oracle.com> <16fb57bd-cfe3-0875-2446-e39a01bb92e3@oracle.com> Message-ID: <4d1cc088-1499-4596-bbae-ce0539c61bea@oracle.com> On 31/07/2018 4:16 AM, Harold David Seigel wrote: > Thanks Gerard! > > I'll make that change before pushing the fix. Hmmm is that part of our coding guidelines? It is not something we generally follow. Personally I've never liked the "if (constant == variable)" form as it just reads wrong to me. David > Harold > > > On 7/30/2018 2:05 PM, Gerard Ziemski wrote: >> Looks good. >> >> In src/hotspot/share/oops/metadata.hpp tiny nitpick - I was once told >> that instead of: >> >> +    if (m == NULL) >> >> we should be using: >> >> +    if (NULL == m) >> >> but I guess the entire file would have to use that syntax. >> >> >> >> cheers >> >>> On Jul 27, 2018, at 12:15 PM, Harold David Seigel >>> wrote: >>> >>> Hi, >>> >>> Please review this JDK-12 fix for bug JDK-8208399. The fix changes >>> function print_value_on_maybe_null() into a static function to avoid >>> comparisons between 'this' and NULL. It also deletes unused methods >>> print_maybe_null() and print_on_maybe_null(). >>> >>> Open Webrev: >>> http://cr.openjdk.java.net/~hseigel/bug_8208399/webrev/index.html >>> >>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8208399 >>> >>> This fix was regression tested by running Mach5 tiers 1 and 2 tests >>> and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, >>> running tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang and >>> VM tests on Linux-x64.
>>> >>> Thanks, Harold >>> > From ioi.lam at oracle.com Tue Jul 31 01:00:22 2018 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 30 Jul 2018 18:00:22 -0700 Subject: RFR 8208399: Metadata methods print_(value_)on_maybe_null() compare 'this' to NULL In-Reply-To: <4d1cc088-1499-4596-bbae-ce0539c61bea@oracle.com> References: <0a0f9fb7-bc30-5ec7-268a-cee3953c3d06@oracle.com> <16fb57bd-cfe3-0875-2446-e39a01bb92e3@oracle.com> <4d1cc088-1499-4596-bbae-ce0539c61bea@oracle.com> Message-ID: <2f03ca3a-bd88-e3c3-a4ba-b82b817350fc@oracle.com> There are 92 cases of '[(]NULL ==' vs 3403 cases of '== NULL[)]' in the hotspot source code. It's been claimed that (NULL == variable) is superior because if you type "=" by mistake, or delete one of the equal signs by mistake, the compiler will catch you. - Ioi On 7/30/18 2:52 PM, David Holmes wrote: > On 31/07/2018 4:16 AM, Harold David Seigel wrote: >> Thanks Gerard! >> >> I'll make that change before pushing the fix. > > Hmmm is that part of our coding guidelines? It is not something we > generally follow. Personally I've never liked the "if (constant == > variable)" form as it just reads wrong to me. > > David > >> Harold >> >> >> On 7/30/2018 2:05 PM, Gerard Ziemski wrote: >>> Looks good. >>> >>> In src/hotspot/share/oops/metadata.hpp tiny nitpick - I was once >>> told that instead of: >>> >>> +    if (m == NULL) >>> >>> we should be using: >>> >>> +    if (NULL == m) >>> >>> but I guess the entire file would have to use that syntax. >>> >>> >>> >>> cheers >>> >>>> On Jul 27, 2018, at 12:15 PM, Harold David Seigel >>>> wrote: >>>> >>>> Hi, >>>> >>>> Please review this JDK-12 fix for bug JDK-8208399. The fix changes >>>> function print_value_on_maybe_null() into a static function to >>>> avoid comparisons between 'this' and NULL. It also deletes unused >>>> methods print_maybe_null() and print_on_maybe_null(). >>>> >>>> Open Webrev: >>>> http://cr.openjdk.java.net/~hseigel/bug_8208399/webrev/index.html >>>> >>>> JBS Bug:
https://bugs.openjdk.java.net/browse/JDK-8208399 >>>> >>>> This fix was regression tested by running Mach5 tiers 1 and 2 tests >>>> and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, >>>> running tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang >>>> and VM tests on Linux-x64. >>>> >>>> Thanks, Harold >>>> >> From david.holmes at oracle.com Tue Jul 31 01:15:27 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 31 Jul 2018 11:15:27 +1000 Subject: RFR 8208399: Metadata methods print_(value_)on_maybe_null() compare 'this' to NULL In-Reply-To: <2f03ca3a-bd88-e3c3-a4ba-b82b817350fc@oracle.com> References: <0a0f9fb7-bc30-5ec7-268a-cee3953c3d06@oracle.com> <16fb57bd-cfe3-0875-2446-e39a01bb92e3@oracle.com> <4d1cc088-1499-4596-bbae-ce0539c61bea@oracle.com> <2f03ca3a-bd88-e3c3-a4ba-b82b817350fc@oracle.com> Message-ID: On 31/07/2018 11:00 AM, Ioi Lam wrote: > There are 92 cases of '[(]NULL ==' vs 3403 cases of '== NULL[)]' in the > hotspot source code. > > It's been claimed that (NULL == variable) is superior because if you > type "=" by mistake, or delete one of the equal signs by mistake, the > compiler will catch you. Yes well aware of the reason behind it :) I just find it jarring to read code expressed that way. If I'd been taught it from the beginning, or if hotspot had always used it ... But unless we're planning on updating the coding guidelines to require this I don't see we should be making such changes "just because" Cheers, David > - Ioi > > > On 7/30/18 2:52 PM, David Holmes wrote: >> On 31/07/2018 4:16 AM, Harold David Seigel wrote: >>> Thanks Gerard! >>> >>> I'll make that change before pushing the fix. >> >> Hmmm is that part of our coding guidelines? It is not something we >> generally follow. Personally I've never liked the "if (constant == >> variable)" form as it just reads wrong to me. >> >> David >> >>> Harold >>> >>> >>> On 7/30/2018 2:05 PM, Gerard Ziemski wrote: >>>> Looks good.
>>>> >>>> In src/hotspot/share/oops/metadata.hpp tiny nitpick - I was once >>>> told that instead of: >>>> >>>> +    if (m == NULL) >>>> >>>> we should be using: >>>> >>>> +    if (NULL == m) >>>> >>>> but I guess the entire file would have to use that syntax. >>>> >>>> >>>> >>>> cheers >>>> >>>>> On Jul 27, 2018, at 12:15 PM, Harold David Seigel >>>>> wrote: >>>>> >>>>> Hi, >>>>> >>>>> Please review this JDK-12 fix for bug JDK-8208399. The fix changes >>>>> function print_value_on_maybe_null() into a static function to >>>>> avoid comparisons between 'this' and NULL. It also deletes unused >>>>> methods print_maybe_null() and print_on_maybe_null(). >>>>> >>>>> Open Webrev: >>>>> http://cr.openjdk.java.net/~hseigel/bug_8208399/webrev/index.html >>>>> >>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8208399 >>>>> >>>>> This fix was regression tested by running Mach5 tiers 1 and 2 tests >>>>> and builds on Linux-X64, Windows, Solaris Sparc, and Mac OS X, >>>>> running tiers 3-5 tests on Linux-x64, and by running JCK-11 Lang >>>>> and VM tests on Linux-x64. >>>>> >>>>> Thanks, Harold >>>>> >>> > From chris.plummer at oracle.com Tue Jul 31 01:34:21 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 30 Jul 2018 18:34:21 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> Message-ID: Hi Coleen, Now that this had been pushed, I assume JDK-8202896 should be closed as a dup. And what about JDK-8206076? Is it fixed by this change also?
thanks, Chris On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: > Summary: fixed refactoring caused by JDK-8203820 > > open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8208074 > > Ran the test in mach5 on all Oracle supported platforms. Also took > the test out of ProblemList.txt because JDK-8203820 fixes > https://bugs.openjdk.java.net/browse/JDK-8202896. > > Thanks, > Coleen From chris.plummer at oracle.com Tue Jul 31 07:16:03 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 31 Jul 2018 00:16:03 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> Message-ID: <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> Sorry, I thought this had been pushed already, but it hasn't. But it still looks like JDK-8202896 should be closed as a dup, and it's unclear to me if JDK-8206076 has been fixed and this test can be removed from the problem list. Chris On 7/30/18 6:34 PM, Chris Plummer wrote: > Hi Coleen, > > Now that this had been pushed, I assume JDK-8202896 should be closed > as a dup. And what about JDK-8206076? Is it fixed by this change also? > > thanks, > > Chris > > On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >> Summary: fixed refactoring caused by JDK-8203820 >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >> >> Ran the test in mach5 on all Oracle supported platforms. Also took >> the test out of ProblemList.txt because JDK-8203820 fixes >> https://bugs.openjdk.java.net/browse/JDK-8202896.
>> >> Thanks, >> Coleen > > > From serguei.spitsyn at oracle.com Tue Jul 31 07:20:54 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 31 Jul 2018 00:20:54 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <08cdf2d7-975a-de61-9ae8-08c1f9bfbf7f@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <08cdf2d7-975a-de61-9ae8-08c1f9bfbf7f@oracle.com> Message-ID: Hi Coleen, The explanation from David is very helpful - thanks! So the fix looks good to me as well. We still need to answer questions from Chris though. Thanks, Serguei On 7/30/18 14:46, David Holmes wrote: > On 31/07/2018 6:49 AM, coleen.phillimore at oracle.com wrote: >> Summary: fixed refactoring caused by JDK-8203820 >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 > > For the sake of other readers who don't want to have to reverse > engineer the actual cause of the problem, the original code has two > Method.invoke sequences: one for a static method and which passed a > null receiver; one for a non-static method which passed a non-null > receiver. The refactoring extracted the invoke logic but always passed > a null receiver - which was wrong for the non-static case. The fix > always passes a non-null receiver to fix the non-static case, and > which is ignored in the static case. > > Reviewed. Trivial. > > Thanks, > David > >> Ran the test in mach5 on all Oracle supported platforms. Also took >> the test out of ProblemList.txt because JDK-8203820 fixes >> https://bugs.openjdk.java.net/browse/JDK-8202896.
>> >> Thanks, >> Coleen From serguei.spitsyn at oracle.com Tue Jul 31 07:29:59 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 31 Jul 2018 00:29:59 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> Message-ID: <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> Hi Chris, Good catch. It is possible that this webrev does not fix the JDK-8202896. The JDK-8202896 is about timeouts which are normally intermittent (is it right?). There are two options here: A: close 8202896 as a dup of 8208074 B: keep the test problem listed and labeled with 8202896 Let's wait for Coleen's answer. Thanks, Serguei On 7/31/18 00:16, Chris Plummer wrote: > Sorry, I thought this had been pushed already, but it hasn't. But it > still looks like JDK-8202896 should be closed as a dup, and it's > unclear to me if JDK-8206076 has been fixed and this test can be > removed from the problem list. > > Chris > > On 7/30/18 6:34 PM, Chris Plummer wrote: >> Hi Coleen, >> >> Now that this had been pushed, I assume JDK-8202896 should be closed >> as a dup. And what about JDK-8206076? Is it fixed by this change also? >> >> thanks, >> >> Chris >> >> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>> Summary: fixed refactoring caused by JDK-8203820 >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>> >>> Ran the test in mach5 on all Oracle supported platforms. Also took >>> the test out of ProblemList.txt because JDK-8203820 fixes >>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>> >>> Thanks, >>> Coleen >> >> >> > > From coleen.phillimore at oracle.com Tue Jul 31 11:56:15 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 31 Jul 2018 07:56:15 -0400 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <08cdf2d7-975a-de61-9ae8-08c1f9bfbf7f@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <08cdf2d7-975a-de61-9ae8-08c1f9bfbf7f@oracle.com> Message-ID: <72b2618f-ec62-00ff-8af5-53dbc67156ef@oracle.com> On 7/30/18 5:46 PM, David Holmes wrote: > On 31/07/2018 6:49 AM, coleen.phillimore at oracle.com wrote: >> Summary: fixed refactoring caused by JDK-8203820 >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 > > For the sake of other readers who don't want to have to reverse > engineer the actual cause of the problem, the original code has two > Method.invoke sequences: one for a static method and which passed a > null receiver; one for a non-static method which passed a non-null > receiver. The refactoring extracted the invoke logic but always passed > a null receiver - which was wrong for the non-static case. The fix > always passes a non-null receiver to fix the non-static case, and > which is ignored in the static case. Thank you David for summarizing the bug(s) and the review. Coleen > > Reviewed. Trivial. > > Thanks, > David > >> Ran the test in mach5 on all Oracle supported platforms. Also took >> the test out of ProblemList.txt because JDK-8203820 fixes >> https://bugs.openjdk.java.net/browse/JDK-8202896.
>> >> Thanks, >> Coleen From coleen.phillimore at oracle.com Tue Jul 31 12:01:08 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 31 Jul 2018 08:01:08 -0400 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> Message-ID: <2f6bf712-594c-d859-128d-cb30343ec591@oracle.com> On 7/30/18 9:34 PM, Chris Plummer wrote: > Hi Coleen, > > Now that this had been pushed, I assume JDK-8202896 should be closed > as a dup. And what about JDK-8206076? Is it fixed by this change also? Yes, it should be closed also. I didn't see this bug. When I was fixing the first one: https://bugs.openjdk.java.net/browse/JDK-8203820 , I looked for similar patterns in the vmTestbase tests and found this test also. All of these tests were calling InMemoryJavaCompiler from within a loop and from within multiple threads to get the same result. I can imagine this easily timing out for -Xcomp. I haven't pushed it yet. I was hoping you'd see this and comment on it, since you had comments for the whole set of bugs. Thanks! Coleen > > thanks, > > Chris > > On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >> Summary: fixed refactoring caused by JDK-8203820 >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >> >> Ran the test in mach5 on all Oracle supported platforms. Also took >> the test out of ProblemList.txt because JDK-8203820 fixes >> https://bugs.openjdk.java.net/browse/JDK-8202896.
>> >> Thanks, >> Coleen > > > From coleen.phillimore at oracle.com Tue Jul 31 12:06:35 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 31 Jul 2018 08:06:35 -0400 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> Message-ID: <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote: > Hi Chris, > > Good catch. > It is possible that this webrev does not fix the JDK-8202896. > The JDK-8202896 is about timeouts which are normally intermittent (is > it right?). > > There are two options here: > A: close 8202896 as a dup of 8208074 > B: keep the test problem listed and labeled with 8202896 > > Let's wait for Coleen's answer. I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts with -Xcomp) as a duplicate of https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took InMemoryCompiler out of the threads) because that's where the attempted fix was. I think https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many open files intermittently) should be closed as a duplicate too because it's the same root cause. And this one: https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix) fixes my fix and will remove the test from the ProblemList.txt. I believe it should be removed from the problem list because I don't think it will time out or intermittently fail again for the same reason. If it times out or fails for a different reason, we should file a whole new bug, with that specific analysis.
Thanks, Coleen > > Thanks, > Serguei > > > On 7/31/18 00:16, Chris Plummer wrote: >> Sorry, I thought this had been pushed already, but it hasn't. But it >> still looks like JDK-8202896 should be closed as a dup, and it's >> unclear to me if JDK-8206076 has been fixed and this test can be >> removed from the problem list. >> >> Chris >> >> On 7/30/18 6:34 PM, Chris Plummer wrote: >>> Hi Coleen, >>> >>> Now that this had been pushed, I assume JDK-8202896 should be closed >>> as a dup. And what about JDK-8206076? Is it fixed by this change also? >>> >>> thanks, >>> >>> Chris >>> >>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>>> Summary: fixed refactoring caused by JDK-8203820 >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>>> >>>> Ran the test in mach5 on all Oracle supported platforms. Also took >>>> the test out of ProblemList.txt because JDK-8203820 fixes >>>> https://bugs.openjdk.java.net/browse/JDK-8202896. >>>> >>>> Thanks, >>>> Coleen >>> >>> >>> >> >> > From kim.barrett at oracle.com Tue Jul 31 14:30:59 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 31 Jul 2018 10:30:59 -0400 Subject: RFR 8202171: Some oopDesc functions compare this with NULL In-Reply-To: References: <733004c3-2cd6-c700-adba-f4305f3de8a9@oracle.com> <37e16040-a847-eefe-fe41-088891d4ae07@oracle.com> Message-ID: <1FF97D32-0997-4E1D-9188-6726940C7CEE@oracle.com> > On Jul 25, 2018, at 2:56 PM, Harold David Seigel wrote: > > Hi, > > Please review this updated webrev: > > http://cr.openjdk.java.net/~hseigel/bug_8202171.2/webrev/index.html > > This includes null checks when needed for callers to nonstatic oopDesc::print() and oopDesc::print_on() functions and changes the oopDesc verify() functions to static. > > Thanks, Harold > Looks good. 
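For readers following the two "compare this with NULL" reviews (8202171 and 8208399), here is a minimal sketch of the pattern they describe: the null test moves out of a member function, where calling through a null pointer is undefined behaviour, into a static function that receives the pointer as an ordinary argument. The class `Thing` and the helper `describe` are hypothetical stand-ins invented for illustration; they are not HotSpot's Metadata or oopDesc code, and `nullptr` is used here where HotSpot sources of that era wrote NULL.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a metadata-like class (not HotSpot's Metadata).
class Thing {
 public:
  explicit Thing(int id) : _id(id) {}

  // Non-static printer: only valid to call on a real object.
  void print_value_on(std::ostream& st) const { st << "Thing#" << _id; }

  // Static printer: the possibly-null pointer is a plain argument, so the
  // null test is well defined and no member function runs on a null pointer.
  static void print_value_on_maybe_null(std::ostream& st, const Thing* t) {
    if (t == nullptr) {
      st << "NULL";
    } else {
      t->print_value_on(st);
    }
  }

 private:
  int _id;
};

// Hypothetical helper that captures the printed text so it can be inspected.
std::string describe(const Thing* t) {
  std::ostringstream out;
  Thing::print_value_on_maybe_null(out, t);
  return out.str();
}

// Exercises both the null and the non-null path.
void check_examples() {
  Thing t(7);
  assert(describe(&t) == "Thing#7");
  assert(describe(nullptr) == "NULL");
}
```

Callers that used to write something like `t->print_value_on_maybe_null(st)` would instead pass the pointer explicitly, which is the kind of caller-side null handling the updated 8202171 webrev summary mentions.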
From kim.barrett at oracle.com Tue Jul 31 14:35:38 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 31 Jul 2018 10:35:38 -0400 Subject: RFR 8208399: Metadata methods print_(value_)on_maybe_null() compare 'this' to NULL In-Reply-To: References: <0a0f9fb7-bc30-5ec7-268a-cee3953c3d06@oracle.com> <16fb57bd-cfe3-0875-2446-e39a01bb92e3@oracle.com> <4d1cc088-1499-4596-bbae-ce0539c61bea@oracle.com> <2f03ca3a-bd88-e3c3-a4ba-b82b817350fc@oracle.com> Message-ID: <32E2CC8F-6C16-4540-BF07-6579108252FB@oracle.com> > On Jul 30, 2018, at 9:15 PM, David Holmes wrote: > > On 31/07/2018 11:00 AM, Ioi Lam wrote: >> There are 92 cases of '[(]NULL ==' vs 3403 cases of '== NULL[)]' in the hotspot source code. >> It's been claimed that (NULL == variable) is superior because if you type "=" by mistake, or delete one of the equal signs by mistake, the compiler will catch you. > > Yes well aware of the reason behind it :) I just find it jarring to read code expressed that way. If I'd been taught it from the beginning, or if hotspot had always used it ... > > But unless we're planning on updating the coding guidelines to require this I don't see we should be making such changes "just because" I agree with David here. Also, I recall some compilers warning about a simple assignment as a control expression, specifically to address that mistake. The workaround for the warning might be to parenthesize the expression, or not use the implicit bool conversion (as required by Hotspot style guide). From harold.seigel at oracle.com Tue Jul 31 14:44:35 2018 From: harold.seigel at oracle.com (Harold David Seigel) Date: Tue, 31 Jul 2018 10:44:35 -0400 Subject: RFR 8202171: Some oopDesc functions compare this with NULL In-Reply-To: <1FF97D32-0997-4E1D-9188-6726940C7CEE@oracle.com> References: <733004c3-2cd6-c700-adba-f4305f3de8a9@oracle.com> <37e16040-a847-eefe-fe41-088891d4ae07@oracle.com> <1FF97D32-0997-4E1D-9188-6726940C7CEE@oracle.com> Message-ID: Thanks Kim!
Harold On 7/31/2018 10:30 AM, Kim Barrett wrote: >> On Jul 25, 2018, at 2:56 PM, Harold David Seigel wrote: >> >> Hi, >> >> Please review this updated webrev: >> >> http://cr.openjdk.java.net/~hseigel/bug_8202171.2/webrev/index.html >> >> This includes null checks when needed for callers to nonstatic oopDesc::print() and oopDesc::print_on() functions and changes the oopDesc verify() functions to static. >> >> Thanks, Harold >> > Looks good. > From chris.plummer at oracle.com Tue Jul 31 16:13:19 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 31 Jul 2018 09:13:19 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> Message-ID: On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote: > > > On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote: >> Hi Chris, >> >> Good catch. >> It is possible that this webrev does not fix the JDK-8202896. >> The JDK-8202896 is about timeouts which are normally intermittent (is >> it right?). >> >> There are two options here: >> A: close 8202896 as a dup of 8208074 >> B: keep the test problem listed and labeled with 8202896 >> >> Let's wait for Coleen's answer. > > I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts > with -Xcomp) > as a duplicate of > https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took > InMemoryCompiler out of the threads) > because that's where the attempted fix was. > > I think > https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many > open files intermittently) > should be closed as a duplicate too because it's the same root cause.
> > And this one: > https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix) > fixes my fix and will remove the test from the ProblemList.txt. > > I believe it should be removed from the problem list because I don't > think it will time out or intermittently fail again for the same > reason. If it times out or fails for a different reason, we should > file a whole new bug, with that specific analysis. > > Thanks, > Coleen Hi Coleen, That all sounds reasonable. Thanks for cleaning up the bug situation. Chris > >> >> Thanks, >> Serguei >> >> >> On 7/31/18 00:16, Chris Plummer wrote: >>> Sorry, I thought this had been pushed already, but it hasn't. But it >>> still looks like JDK-8202896 should be closed as a dup, and it's >>> unclear to me if JDK-8206076 has been fixed and this test can be >>> removed from the problem list. >>> >>> Chris >>> >>> On 7/30/18 6:34 PM, Chris Plummer wrote: >>>> Hi Coleen, >>>> >>>> Now that this had been pushed, I assume JDK-8202896 should be >>>> closed as a dup. And what about JDK-8206076? Is it fixed by this >>>> change also? >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>>>> Summary: fixed refactoring caused by JDK-8203820 >>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>>>> >>>>> Ran the test in mach5 on all Oracle supported platforms. Also took >>>>> the test out of ProblemList.txt because JDK-8203820 fixes >>>>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>>>> >>>>> Thanks, >>>>> Coleen >>>> >>>> >>>> >>> >>> >> > From chris.plummer at oracle.com Tue Jul 31 17:43:31 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 31 Jul 2018 10:43:31 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> Message-ID: Hi Coleen, I just realized that there is also https://bugs.openjdk.java.net/browse/JDK-8208234 filed for this test last week. It results in an OOME. I think it's the same issue, but just want check with you. Please close it as a dup if you think it is the same. thanks, Chris On 7/31/18 9:13 AM, Chris Plummer wrote: > On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote: >>> Hi Chris, >>> >>> Good catch. >>> It is possible that this webrev does not fix the JDK-8202896. >>> The JDK-8202896 is about timeouts which are normally intermittent >>> (is it right?). >>> >>> There are two options here: >>> A: close 8202896 as a dup of 8208074 >>> B: keep the test problem listed and labeled with 8202896 >>> >>> Let's wait for Coleen's answer. >> >> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts >> with -Xcomp) >> as a duplicate of >> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took >> InMemoryCompiler out of the threads) >> because that's where the attempted fix was. >> >> I think >> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many >> open files intermittently) >> should be closed as a duplicate too because it's the same root cause.
>> >> And this one: >> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix) >> fixes my fix and will remove the test from the ProblemList.txt. >> >> I believe it should be removed from the problem list because I don't >> think it will time out or intermittently fail again for the same >> reason. If it times out or fails for a different reason, we should >> file a whole new bug, with that specific analysis. >> >> Thanks, >> Coleen > > Hi Coleen, > > That all sounds reasonable. Thanks for cleaning up the bug situation. > > Chris >> >>> >>> Thanks, >>> Serguei >>> >>> >>> On 7/31/18 00:16, Chris Plummer wrote: >>>> Sorry, I thought this had been pushed already, but it hasn't. But >>>> it still looks like JDK-8202896 should be closed as a dup, and it's >>>> unclear to me if JDK-8206076 has been fixed and this test can be >>>> removed from the problem list. >>>> >>>> Chris >>>> >>>> On 7/30/18 6:34 PM, Chris Plummer wrote: >>>>> Hi Coleen, >>>>> >>>>> Now that this had been pushed, I assume JDK-8202896 should be >>>>> closed as a dup. And what about JDK-8206076? Is it fixed by this >>>>> change also? >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>>>>> Summary: fixed refactoring caused by JDK-8203820 >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>>>>> >>>>>> Ran the test in mach5 on all Oracle supported platforms. Also >>>>>> took the test out of ProblemList.txt because JDK-8203820 fixes >>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>> >>>>> >>>>> >>>> >>>> >>> >> > > From serguei.spitsyn at oracle.com Tue Jul 31 18:07:35 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 31 Jul 2018 11:07:35 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> Message-ID: <60b8abbb-f9da-25b5-f488-b7c76896b62d@oracle.com> On 7/31/18 09:13, Chris Plummer wrote: > On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote: >>> Hi Chris, >>> >>> Good catch. >>> It is possible that this webrev does not fix the JDK-8202896. >>> The JDK-8202896 is about timeouts which are normally intermittent >>> (is it right?). >>> >>> There are two options here: >>> A: close 8202896 as a dup of 8208074 >>> B: keep the test problem listed and labeled with 8202896 >>> >>> Let's wait for Coleen's answer. >> >> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts >> with -Xcomp) >> as a duplicate of >> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took >> InMemoryCompiler out of the threads) >> because that's where the attempted fix was. >> >> I think >> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many >> open files intermittently) >> should be closed as a duplicate too because it's the same root cause. >> >> And this one: >> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix) >> fixes my fix and will remove the test from the ProblemList.txt. >> >> I believe it should be removed from the problem list because I don't >> think it will time out or intermittently fail again for the same >> reason.
If it times out or fails for a different reason, we should >> file a whole new bug, with that specific analysis. >> >> Thanks, >> Coleen > > Hi Coleen, > > That all sounds reasonable. Thanks for cleaning up the bug situation. +1 Thanks, Serguei > > Chris >> >>> >>> Thanks, >>> Serguei >>> >>> >>> On 7/31/18 00:16, Chris Plummer wrote: >>>> Sorry, I thought this had been pushed already, but it hasn't. But >>>> it still looks like JDK-8202896 should be closed as a dup, and it's >>>> unclear to me if JDK-8206076 has been fixed and this test can be >>>> removed from the problem list. >>>> >>>> Chris >>>> >>>> On 7/30/18 6:34 PM, Chris Plummer wrote: >>>>> Hi Coleen, >>>>> >>>>> Now that this had been pushed, I assume JDK-8202896 should be >>>>> closed as a dup. And what about JDK-8206076? Is it fixed by this >>>>> change also? >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>>>>> Summary: fixed refactoring caused by JDK-8203820 >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>>>>> >>>>>> Ran the test in mach5 on all Oracle supported platforms. Also >>>>>> took the test out of ProblemList.txt because JDK-8203820 fixes >>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896. 
>>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>> >>>>> >>>>> >>>> >>>> >>> >> > > From coleen.phillimore at oracle.com Tue Jul 31 20:07:25 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 31 Jul 2018 16:07:25 -0400 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> Message-ID: On 7/31/18 1:43 PM, Chris Plummer wrote: > Hi Coleen, > > I just realized that there is also > https://bugs.openjdk.java.net/browse/JDK-8208234 filed for this test > last week. It results in an OOME. I think it's the same issue, but > just want to check with you. Please close it as a dup if you think it is > the same. Yes, I think this is the same thing. One call to InMemoryCompiler shouldn't OOME but multiple concurrent calls could. thanks, Coleen > > thanks, > > Chris > > On 7/31/18 9:13 AM, Chris Plummer wrote: >> On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote: >>>> Hi Chris, >>>> >>>> Good catch. >>>> It is possible that this webrev does not fix the JDK-8202896. >>>> The JDK-8202896 is about timeouts which are normally intermittent >>>> (is it right?). >>>> >>>> There are two options here: >>>> A: close 8202896 as a dup of 8208074 >>>> B: keep the test problem listed and labeled with 8202896 >>>> >>>> Let's wait for Coleen's answer. >>> >>> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts >>> with -Xcomp) >>> as a duplicate of >>> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took >>> InMemoryCompiler out of the threads) >>> because that's where the attempted fix was.
>>> >>> I think >>> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many >>> open files intermittently) >>> should be closed as a duplicate too because it's the same root cause. >>> >>> And this one: >>> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix) >>> fixes my fix and will remove the test from the ProblemList.txt. >>> >>> I believe it should be removed from the problem list because I don't >>> think it will time out or intermittently fail again for the same >>> reason. If it times out or fails for a different reason, we should >>> file a whole new bug, with that specific analysis. >>> >>> Thanks, >>> Coleen >> >> Hi Coleen, >> >> That all sounds reasonable. Thanks for cleaning up the bug situation. >> >> Chris >>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 7/31/18 00:16, Chris Plummer wrote: >>>>> Sorry, I thought this had been pushed already, but it hasn't. But >>>>> it still looks like JDK-8202896 should be closed as a dup, and >>>>> it's unclear to me if JDK-8206076 has been fixed and this test can >>>>> be removed from the problem list. >>>>> >>>>> Chris >>>>> >>>>> On 7/30/18 6:34 PM, Chris Plummer wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> Now that this had been pushed, I assume JDK-8202896 should be >>>>>> closed as a dup. And what about JDK-8206076? Is it fixed by this >>>>>> change also? >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>>>>>> Summary: fixed refactoring caused by JDK-8203820 >>>>>>> >>>>>>> open webrev at >>>>>>> http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>>>>>> >>>>>>> Ran the test in mach5 on all Oracle supported platforms. Also >>>>>>> took the test out of ProblemList.txt because JDK-8203820 fixes >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>> >>> >> >> > > From coleen.phillimore at oracle.com Tue Jul 31 20:09:20 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 31 Jul 2018 16:09:20 -0400 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <60b8abbb-f9da-25b5-f488-b7c76896b62d@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> <60b8abbb-f9da-25b5-f488-b7c76896b62d@oracle.com> Message-ID: <9bdf6cc2-d0e6-dd55-88e6-f3bb45b322ec@oracle.com> On 7/31/18 2:07 PM, serguei.spitsyn at oracle.com wrote: > On 7/31/18 09:13, Chris Plummer wrote: >> On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote: >>>> Hi Chris, >>>> >>>> Good catch. >>>> It is possible that this webrev does not fix the JDK-8202896. >>>> The JDK-8202896 is about timeouts which are normally intermittent >>>> (is it right?). >>>> >>>> There are two options here: >>>> A: close 8202896 as a dup of 8208074 >>>> B: keep the test problem listed and labeled with 8202896 >>>> >>>> Let's wait for Coleen's answer. >>> >>> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts >>> with -Xcomp) >>> as a duplicate of >>> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took >>> InMemoryCompiler out of the threads) >>> because that's where the attempted fix was. >>> >>> I think >>> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many >>> open files intermittently) >>> should be closed as a duplicate too because it's the same root cause.
>>> >>> And this one: >>> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix) >>> fixes my fix and will remove the test from the ProblemList.txt. >>> >>> I believe it should be removed from the problem list because I don't >>> think it will time out or intermittently fail again for the same >>> reason. If it times out or fails for a different reason, we should >>> file a whole new bug, with that specific analysis. >>> >>> Thanks, >>> Coleen >> >> Hi Coleen, >> >> That all sounds reasonable. Thanks for cleaning up the bug situation. > > +1 Thanks Chris and Serguei for your discussion of this bug. Hopefully this test becomes stable and useful now. Coleen > > Thanks, > Serguei >> >> Chris >>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 7/31/18 00:16, Chris Plummer wrote: >>>>> Sorry, I thought this had been pushed already, but it hasn't. But >>>>> it still looks like JDK-8202896 should be closed as a dup, and >>>>> it's unclear to me if JDK-8206076 has been fixed and this test can >>>>> be removed from the problem list. >>>>> >>>>> Chris >>>>> >>>>> On 7/30/18 6:34 PM, Chris Plummer wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> Now that this had been pushed, I assume JDK-8202896 should be >>>>>> closed as a dup. And what about JDK-8206076? Is it fixed by this >>>>>> change also? >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>>>>>> Summary: fixed refactoring caused by JDK-8203820 >>>>>>> >>>>>>> open webrev at >>>>>>> http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>>>>>> >>>>>>> Ran the test in mach5 on all Oracle supported platforms. Also >>>>>>> took the test out of ProblemList.txt because JDK-8203820 fixes >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>> >>> >> >> > From serguei.spitsyn at oracle.com Tue Jul 31 22:55:39 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 31 Jul 2018 15:55:39 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <9bdf6cc2-d0e6-dd55-88e6-f3bb45b322ec@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> <60b8abbb-f9da-25b5-f488-b7c76896b62d@oracle.com> <9bdf6cc2-d0e6-dd55-88e6-f3bb45b322ec@oracle.com> Message-ID: <306e1860-9066-34e3-036e-1ded191d0cd4@oracle.com> On 7/31/18 13:09, coleen.phillimore at oracle.com wrote: > > > On 7/31/18 2:07 PM, serguei.spitsyn at oracle.com wrote: >> On 7/31/18 09:13, Chris Plummer wrote: >>> On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote: >>>>> Hi Chris, >>>>> >>>>> Good catch. >>>>> It is possible that this webrev does not fix the JDK-8202896. >>>>> The JDK-8202896 is about timeouts which are normally intermittent >>>>> (is it right?). >>>>> >>>>> There are two options here: >>>>> A: close 8202896 as a dup of 8208074 >>>>> B: keep the test problem listed and labeled with 8202896 >>>>> >>>>> Let's wait for Coleen's answer. >>>> >>>> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts >>>> with -Xcomp) >>>> as a duplicate of >>>> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took >>>> InMemoryCompiler out of the threads) >>>> because that's where the attempted fix was.
>>>> >>>> I think >>>> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many >>>> open files intermittently) >>>> should be closed as a duplicate too because it's the same root cause. >>>> >>>> And this one: >>>> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix) >>>> fixes my fix and will remove the test from the ProblemList.txt. >>>> >>>> I believe it should be removed from the problem list because I >>>> don't think it will time out or intermittently fail again for the >>>> same reason. If it times out or fails for a different reason, we >>>> should file a whole new bug, with that specific analysis. >>>> >>>> Thanks, >>>> Coleen >>> >>> Hi Coleen, >>> >>> That all sounds reasonable. Thanks for cleaning up the bug situation. >> >> +1 > > Thanks Chris and Serguei for your discussion of this bug. Hopefully > this test becomes stable and useful now. Thanks a lot for taking care of this issue, Coleen! Thanks, Serguei > Coleen > >> >> Thanks, >> Serguei >>> >>> Chris >>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 7/31/18 00:16, Chris Plummer wrote: >>>>>> Sorry, I thought this had been pushed already, but it hasn't. But >>>>>> it still looks like JDK-8202896 should be closed as a dup, and >>>>>> it's unclear to me if JDK-8206076 has been fixed and this test >>>>>> can be removed from the problem list. >>>>>> >>>>>> Chris >>>>>> >>>>>> On 7/30/18 6:34 PM, Chris Plummer wrote: >>>>>>> Hi Coleen, >>>>>>> >>>>>>> Now that this had been pushed, I assume JDK-8202896 should be >>>>>>> closed as a dup. And what about JDK-8206076? Is it fixed by this >>>>>>> change also?
>>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>>>>>>> Summary: fixed refactoring caused by JDK-8203820 >>>>>>>> >>>>>>>> open webrev at >>>>>>>> http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>>>>>>> >>>>>>>> Ran the test in mach5 on all Oracle supported platforms. Also >>>>>>>> took the test out of ProblemList.txt because JDK-8203820 fixes >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >>> >> >