From goetz.lindenmaier at sap.com Fri Sep 1 13:05:35 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 1 Sep 2017 13:05:35 +0000 Subject: RFR(M): 8187045: [linux] Not all libraries in the VM are linked with -z,noexecstack Message-ID: Hi, I found that not all libraries are linked with -z,noexecstack. This led to errors with our linuxppc64 build. The linker omitted the flag altogether, which is interpreted as a lib with execstack. This change contains a small test that scans all libraries in the tested VM to check that the noexecstack flag is set. It uses the ELF parser in the VM for this. Furthermore, -z,noexecstack is now passed to all libraries. Please review this change. I also need a sponsor, please. http://cr.openjdk.java.net/~goetz/wr17/8187045-execstackLink/webrev.01/ http://cr.openjdk.java.net/~goetz/wr17/8187045-execstackLink/webrev.01-hs/ Best regards, Goetz. From mikael.gerdin at oracle.com Fri Sep 1 15:23:54 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Fri, 1 Sep 2017 17:23:54 +0200 Subject: RFR(S) 8187040: ThreadCritical crashes on Solaris if used between os::init and os::init_2 Message-ID: <8cfa75c6-ebbc-07f4-91d7-5aaa111e92ed@oracle.com> Hi, Please review this small fix to ThreadCritical. When working on a piece of code which allocates memory early on I noticed that it crashed if I enabled NMT. The reason is that NMT uses ThreadCritical and os::Solaris sets the ThreadCritical::_initialized flag before it actually sets up the function pointers the flag is supposed to guard. os::Solaris::_mutex_lock is not initialized until the init_2 phase (after command line flag parsing). My suggested fix is to replace the current short-circuit of ThreadCritical with a flag set when the Solaris mutex code is initialized and thereby getting rid of the initialize function on all platforms. Additionally, ThreadCritical::release is unreachable code and from my research has never actually been called, so we might as well get rid of it.
Webrev: http://cr.openjdk.java.net/~mgerdin/8187040/webrev.0/ Bug: https://bugs.openjdk.java.net/browse/JDK-8187040 Testing: JPRT Thanks /Mikael From thomas.stuefe at gmail.com Fri Sep 1 17:31:35 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Fri, 1 Sep 2017 19:31:35 +0200 Subject: RFR(S) 8187040: ThreadCritical crashes on Solaris if used between os::init and os::init_2 In-Reply-To: <8cfa75c6-ebbc-07f4-91d7-5aaa111e92ed@oracle.com> References: <8cfa75c6-ebbc-07f4-91d7-5aaa111e92ed@oracle.com> Message-ID: Hi Mikael, (I never understood why we cannot just use pthread mutexes on Solaris. Why all this dynamic loading magic, are pthread functions not available on all Solaris versions?) Small nit, instead of adding a new variable _synchronization_initialized, how about _mutex_lock != NULL (in ThreadCritical()) and _mutex_unlock != NULL (in ~ThreadCritical())? I am okay with the removal of ::release(). Even if it were used, it is really safer to let the mutex live until process end. I leave it up to you if you take my suggestion above. The patch is fine for me in the current form. Kind Regards, Thomas On Fri, Sep 1, 2017 at 5:23 PM, Mikael Gerdin wrote: > Hi, > > Please review this small fix to ThreadCritical. > When working on a piece of code which allocates memory early on I noticed > that it crashed if I enabled NMT. > The reason is that NMT uses ThreadCritical and os::Solaris sets the > ThreadCritical::_initialized flag before it actually sets up the function > pointers the flag is supposed to guard. > os::Solaris::_mutex_lock is not initialized until the init_2 phase (after > command line flag parsing). > > My suggested fix is to replace the current short-circuit of ThreadCritical > with a flag set when the Solaris mutex code is initialized and thereby > getting rid of the initialize function on all platforms. 
> Additionally, ThreadCritical::release is unreachable code and from my > research has never actually been called, we might as well get rid of it. > > Webrev: http://cr.openjdk.java.net/~mgerdin/8187040/webrev.0/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8187040 > Testing: JPRT > > Thanks > /Mikael > From jiangli.zhou at Oracle.COM Fri Sep 1 19:26:15 2017 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Fri, 1 Sep 2017 12:26:15 -0700 Subject: RFR: 8186789: CDS dump crashes at ConstantPool::resolve_class_constants Message-ID: <3C644DF0-2767-4992-8A82-DEC3786DB90B@oracle.com> Hi, Please review the following fix for 8186789. webrev: http://cr.openjdk.java.net/~jiangli/8186789/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8186789 If a class fails verification due to missing dependencies at dump time, the constant pool _cache may be NULL. ConstantPool::resolve_class_constants() needs to check for that case. Also moved the function under #if INCLUDE_CDS_JAVA_HEAP, since it is only used when INCLUDE_CDS_JAVA_HEAP is enabled. Tested with JPRT and unit test case. Thanks, Jiangli From serguei.spitsyn at oracle.com Sat Sep 2 04:45:31 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 1 Sep 2017 21:45:31 -0700 Subject: RFR: 8186789: CDS dump crashes at ConstantPool::resolve_class_constants In-Reply-To: <3C644DF0-2767-4992-8A82-DEC3786DB90B@oracle.com> References: <3C644DF0-2767-4992-8A82-DEC3786DB90B@oracle.com> Message-ID: Hi Jiangli, It looks good to me. Thanks, Serguei On 9/1/17 12:26, Jiangli Zhou wrote: > Hi, > > Please review the following fix for 8186789. > > webrev: http://cr.openjdk.java.net/~jiangli/8186789/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8186789 > > If a class fails verification due to missing dependencies at dump time, the constant pool _cache may be NULL. ConstantPool::resolve_class_constants() needs to check for that case. 
Also moved the function under #if INCLUDE_CDS_JAVA_HEAP, since it is only used when INCLUDE_CDS_JAVA_HEAP is enabled. > > Tested with JPRT and unit test case. > > Thanks, > Jiangli > From serguei.spitsyn at oracle.com Sat Sep 2 08:34:36 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Sat, 2 Sep 2017 01:34:36 -0700 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> <269e2ea2-c7d9-048f-156b-e12d0238ee14@oracle.com> <660a2310-2e64-fa49-6857-236bb59b6293@oracle.com> <7461b194-8f3c-bd0d-264c-a0806bbcd759@oracle.com> Message-ID: Hi Coleen, It looks good. At least, I do not see any issues with this fix. Thanks, Serguei On 8/30/17 04:14, coleen.phillimore at oracle.com wrote: > > Hi, I changed the edit for David to only use ordering semantics in the > places where needed in the lock free access to pd_set. Since only > contains_protection_domain is read lock free, it should be ok. > > open webrev at http://cr.openjdk.java.net/~coleenp/8164207.04/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8164207 > > Thanks, > Coleen > > On 8/29/17 2:28 AM, David Holmes wrote: >> Hi Coleen, >> >> On 29/08/2017 5:39 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 8/28/17 3:38 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> Here is the third webrev with the names of pd_set and set_pd_set >>>> renamed to pd_set_acquire and release_set_pd_set. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.03/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >> >> This API should also be renamed: >> >> ! ProtectionDomainEntry* pd_set() const { return >> _inner.pd_set_acquire(); } >> ! 
void set_pd_set(ProtectionDomainEntry* new_head) { >> _inner.release_set_pd_set(new_head); } >> >> These are the ones that need to give visibility to the fact we're >> accessing things lock-free (if indeed we are). >> >> More below ... >> >>>> On 8/28/17 8:07 AM, coleen.phillimore at oracle.com wrote: >>>>> On 8/28/17 12:25 AM, David Holmes wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> On 25/08/2017 11:26 PM, coleen.phillimore at oracle.com wrote: >>>>>>> >>>>>>> Thank you Zhengyu for noticing this change was wrong, and >>>>>>> Christian for the idea. New webrev: >>>>>>> >>>>>>> open webrev at >>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.02/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>> >>>>>> The idea of a load-acquire accessor and release_store-setter is >>>>>> fine in principle, but it seems to me that we now use these >>>>>> everywhere, even if we may not need them because there is no >>>>>> concurrent/lock-free access. Overall I find it very difficult to >>>>>> determine what the concurrent access patterns are for a >>>>>> Dictionary versus a DictionaryEntry, and which paths are in fact >>>>>> lock and/or safepoint free, and may be racing with locked or >>>>>> safepointed code. >>>>> >>>>> That's exactly the point of making them accessors. So one doesn't >>>>> have to visit each individual call site and spend time answering >>>>> the question for each case. And probably getting it wrong. The >>>>> performance delta for these accesses is minimal since it's only >>>>> getting the head of the list, not each element. >>>>> >>>>> Then it's also future-proof so that if a lock is removed, then we >>>>> don't miss one of the accessors at a later time. Note that >>>>> observing bugs caused by this is very difficult to do, and can >>>>> only be done by inspection. That's why I erred on the side of >>>>> safety and consistency.
>> >> Sorry, it may sound strange to say that I don't agree with "erring on >> the side of safety and consistency" but I do not agree with just >> using acquire/release semantics everywhere just in case! If we don't >> know the lock-free paths then how can we possibly know things are >> correct. The whole point of these accessors is to make it obvious >> where the lock-free accesses are. >> >>>>>> >>>>>> That aside I don't understand why you added a level of >>>>>> indirection with the ProtectionDomainSet class? >>>>> >>>>> Only the code is a level of indirection not the access. That is to >>>>> avoid what I said above. See Christian's and Zhengyu's comments. >> >> Okay - I see what you did but I would not expect to have to protect >> _pd_set from direct use within its own class - anyone messing with >> that class should be aware of the need to use the accessors. Though I >> suppose this encapsulation is little different to defining the field >> as some kind of "Atomic" type rather than a "raw" type. >> >> Thanks, >> David >> ----- >> >>>>>> >>>>>> Also we have been trying to include release/acquire in the names >>>>>> of such accessors so that it is clear when we are relying on >>>>>> memory ordering properties ie. pd_set_acquire and release_set_pd_set >>>>>> >>>>> >>>>> I will change the names of these functions. >>>>> >>>>> thanks, >>>>> Coleen >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> >>>>>>> I reran parallel class loading tests and jck testing is in >>>>>>> progress, but order access requires inspection. >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>> >>>>>>> >>>>>>> On 8/24/17 5:11 PM, coleen.phillimore at oracle.com wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 8/24/17 5:00 PM, Christian Thalinger wrote: >>>>>>>>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>>>>>>>>>> Hi Coleen, >>>>>>>>>>> >>>>>>>>>>> There are two instances probably overlooked? 
>>>>>>>>>>> >>>>>>>>>>> dictionary.cpp #103 and #124 >>>>>>>>>>> >>>>>>>>>>> for (ProtectionDomainEntry* current = _pd_set; >>>>>>>>>>> => >>>>>>>>>>> for (ProtectionDomainEntry* current = pd_set(); >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> Oh yeah, you're right. That's embarrasing. I'll fix and retest. >>>>>>>>> Which also shows that there is a potential for future >>>>>>>>> mistakes. Can we isolate the field better so it?s only >>>>>>>>> accessible via setter and getter? >>>>>>>> >>>>>>>> Yes, great idea. >>>>>>>> Coleen >>>>>>>> >>>>>>>>>> Thank you!! >>>>>>>>>> Coleen >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> -Zhengyu >>>>>>>>>>> >>>>>>>>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>> Summary: Use load_acquire for accessing >>>>>>>>>>>> DictionaryEntry::_pd_set since it's accessed outside the >>>>>>>>>>>> SystemDictionary_lock >>>>>>>>>>>> >>>>>>>>>>>> Ran parallel class loading tests that we have as well as >>>>>>>>>>>> tier1 tests. See bug for details. >>>>>>>>>>>> >>>>>>>>>>>> open webrev at >>>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Coleen >>>>>>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>> >>> > From jiangli.zhou at oracle.com Sat Sep 2 16:24:19 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Sat, 2 Sep 2017 09:24:19 -0700 Subject: RFR: 8186789: CDS dump crashes at ConstantPool::resolve_class_constants In-Reply-To: References: <3C644DF0-2767-4992-8A82-DEC3786DB90B@oracle.com> Message-ID: <9C39BCF1-558F-41E8-AEC9-1E7997071A6E@oracle.com> Thanks, Serguei! Thanks, Jiangli > On Sep 1, 2017, at 9:45 PM, "serguei.spitsyn at oracle.com" wrote: > > Hi Jiangli, > > It looks good to me. > > Thanks, > Serguei > > >> On 9/1/17 12:26, Jiangli Zhou wrote: >> Hi, >> >> Please review the following fix for 8186789. 
>> >> webrev: http://cr.openjdk.java.net/~jiangli/8186789/webrev.00/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8186789 >> >> If a class fails verification due to missing dependencies at dump time, the constant pool _cache may be NULL. ConstantPool::resolve_class_constants() needs to check for that case. Also moved the function under #if INCLUDE_CDS_JAVA_HEAP, since it is only used when INCLUDE_CDS_JAVA_HEAP is enabled. >> >> Tested with JPRT and unit test case. >> >> Thanks, >> Jiangli > From dmitry.samersoff at bell-sw.com Sun Sep 3 18:12:31 2017 From: dmitry.samersoff at bell-sw.com (Dmitry Samersoff) Date: Sun, 3 Sep 2017 21:12:31 +0300 Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup In-Reply-To: References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com> Message-ID: Andrew, On 08/31/2017 11:56 AM, Andrew Dinn wrote: > On 31/08/17 08:49, dmitry.samersov wrote: >> Please review: >> >> http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.05/ >> >> I would propose different approach to fix JDK-8133740 >> platform-independent way: record all frames but strip unnecessary >> NMT-internal ones on printing. >> >> This approach is safe (we don't depend to compiler inlining and we never >> strip non-NMT frames) and platform independent, but cost us some extra >> memory. > I don't think this is going to work well when symbols are not present > (meaning you cannot resolve return pc addresses to function names). On elf platforms, NMT uses .symtab section of libjvm.so and it's hard to me to imagine the situation where someone has stripped slowdebug build. With the current solution we skip two frames ever if they don't belong to NMT (e.g. call to os::attempt_reserve_memory_at was skipped in the example below). IMHO, it's worse than print unwilling NMT symbols. With current SKIP machinery 1. ReservedHeapSpace::try_reserve_heap(unsigned long, unsigned long, bool, char*)+0x1ed 2. 
ReservedHeapSpace::try_reserve_range(char*, char*, unsigned long, char*, char*, unsigned long, unsigned long, bool)+0x121 3. ReservedHeapSpace::initialize_compressed_heap(unsigned long, unsigned long, bool)+0x3fa 4. ReservedHeapSpace::ReservedHeapSpace(unsigned long, unsigned long, bool)+0xa8 (reserved=946176KB, committed=59392KB) Without current SKIP machinery: 1. NativeCallStack::NativeCallStack(int, bool)+0x68 2. os::attempt_reserve_memory_at(unsigned long, char*)+0x6b 3. ReservedHeapSpace::try_reserve_heap(unsigned long, unsigned long, bool, char*)+0x1ed 4. ReservedHeapSpace::try_reserve_range(char*, char*, unsigned long, char*, char*, unsigned long, unsigned long, bool)+0x121 (reserved=946176KB, committed=59392KB) > In that case the NMT frames will be printed that would otherwise get > skipped, leading to differences in what calls are in the displayed in > the caller stack relative to the case where symbols are present. What is > more these changes would vary across architectures which use different > inlining strategies. > That may seem unimportant; one could take the view that an address which > is not associated with a symbolic name is just a meaningless hex value. > However, even without names it is still possible for someone who > understands the NMT code to correlate allocations which have the same > pattern of caller addresses, including correlation of such patterns > across builds or architectures. Throwing one or more NMT addresses into > the stack in place of a genuine caller will change these call patterns > in ways that might make it impossible to spot such correlations. If different architecture implements different inlining strategy, I would expect difficulty in finding correlations between them with or without this patch. -Dmitry > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 
03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From david.holmes at oracle.com Mon Sep 4 00:57:26 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 4 Sep 2017 10:57:26 +1000 Subject: RFR(S) 8187040: ThreadCritical crashes on Solaris if used between os::init and os::init_2 In-Reply-To: <8cfa75c6-ebbc-07f4-91d7-5aaa111e92ed@oracle.com> References: <8cfa75c6-ebbc-07f4-91d7-5aaa111e92ed@oracle.com> Message-ID: <073b63b3-0532-f096-7984-8221b1bdbdb9@oracle.com> Hi Mikael, This cleanup looks good. Obviously things got out of sync on solaris when we've split os::init into pieces in the past. And this solaris-only requirement should never have leaked through to the other ports. So good to see the cleanup. As Thomas notes there may be something that can be implicitly checked instead of adding the new field, but that is a minor concern. Thanks, David On 2/09/2017 1:23 AM, Mikael Gerdin wrote: > Hi, > > Please review this small fix to ThreadCritical. > When working on a piece of code which allocates memory early on I > noticed that it crashed if I enabled NMT. > The reason is that NMT uses ThreadCritical and os::Solaris sets the > ThreadCritical::_initialized flag before it actually sets up the > function pointers the flag is supposed to guard. > os::Solaris::_mutex_lock is not initialized until the init_2 phase > (after command line flag parsing). > > My suggested fix is to replace the current short-circuit of > ThreadCritical with a flag set when the Solaris mutex code is > initialized and thereby getting rid of the initialize function on all > platforms. > Additionally, ThreadCritical::release is unreachable code and from my > research has never actually been called, we might as well get rid of it. 
> > Webrev: http://cr.openjdk.java.net/~mgerdin/8187040/webrev.0/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8187040 > Testing: JPRT > > Thanks > /Mikael From david.holmes at oracle.com Mon Sep 4 01:04:59 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 4 Sep 2017 11:04:59 +1000 Subject: RFR(S) 8187040: ThreadCritical crashes on Solaris if used between os::init and os::init_2 In-Reply-To: References: <8cfa75c6-ebbc-07f4-91d7-5aaa111e92ed@oracle.com> Message-ID: <78c151dc-3095-15ea-edaf-bcbb8ec30c44@oracle.com> On 2/09/2017 3:31 AM, Thomas St?fe wrote: > Hi Mikael, > > (I never understood why we cannot just use pthread mutexes on Solaris. Why > all this dynamic loading magic, are pthread functions not available on all > Solaris versions?) Depends how far back you want to go with "all" :) This is obviously strongly historical. We have three potential sync API's on Solaris (pthreads, UI threads and kernel LWPs). LWP sync tended to perform better** - and that's what we still use. We had a project a few years back to convert from UI threads to pthreads, but it was too big too manage at the time and was shelved. ** Unfortunately I have no idea how this was actually measured, so can't say whether this is still the case. Cheers, David > Small nit, instead of adding a new variable _synchronization_initialized, > how about _mutex_lock != NULL (in ThreadCritical()) and _mutex_unlock != > NULL (in ~ThreadCritical())? > > I am okay with the removal of ::release(). Even if it were used, it is > really safer to let the mutex live until process end. > > I leave it up to you if you take my suggestion above. The patch is fine for > me in the current form. > > Kind Regards, Thomas > > > On Fri, Sep 1, 2017 at 5:23 PM, Mikael Gerdin > wrote: > >> Hi, >> >> Please review this small fix to ThreadCritical. >> When working on a piece of code which allocates memory early on I noticed >> that it crashed if I enabled NMT. 
>> The reason is that NMT uses ThreadCritical and os::Solaris sets the >> ThreadCritical::_initialized flag before it actually sets up the function >> pointers the flag is supposed to guard. >> os::Solaris::_mutex_lock is not initialized until the init_2 phase (after >> command line flag parsing). >> >> My suggested fix is to replace the current short-circuit of ThreadCritical >> with a flag set when the Solaris mutex code is initialized and thereby >> getting rid of the initialize function on all platforms. >> Additionally, ThreadCritical::release is unreachable code and from my >> research has never actually been called, we might as well get rid of it. >> >> Webrev: http://cr.openjdk.java.net/~mgerdin/8187040/webrev.0/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8187040 >> Testing: JPRT >> >> Thanks >> /Mikael >> From mikael.gerdin at oracle.com Mon Sep 4 07:21:11 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 4 Sep 2017 09:21:11 +0200 Subject: RFR(S) 8187040: ThreadCritical crashes on Solaris if used between os::init and os::init_2 In-Reply-To: <073b63b3-0532-f096-7984-8221b1bdbdb9@oracle.com> References: <8cfa75c6-ebbc-07f4-91d7-5aaa111e92ed@oracle.com> <073b63b3-0532-f096-7984-8221b1bdbdb9@oracle.com> Message-ID: <9df28780-c3ba-534e-f7ee-b014b36611dc@oracle.com> Hi David, On 2017-09-04 02:57, David Holmes wrote: > Hi Mikael, > > This cleanup looks good. Obviously things got out of sync on solaris > when we've split os::init into pieces in the past. And this solaris-only > requirement should never have leaked through to the other ports. So good > to see the cleanup. Thanks for the quick review, David. > > As Thomas notes there may be something that can be implicitly checked > instead of adding the new field, but that is a minor concern. The reason I added the new field was to make it clear to the casual reader of os::Solaris::synchronization_init() that other parts of the code may be depending on the initialization. 
If I just added accessors for the _mutex_lock and _mutex_unlock function pointers I think that would be less clear. I'm not strongly in favor of either approach; let's see what Thomas has to say. /Mikael > > Thanks, > David > > On 2/09/2017 1:23 AM, Mikael Gerdin wrote: >> Hi, >> >> Please review this small fix to ThreadCritical. >> When working on a piece of code which allocates memory early on I >> noticed that it crashed if I enabled NMT. >> The reason is that NMT uses ThreadCritical and os::Solaris sets the >> ThreadCritical::_initialized flag before it actually sets up the >> function pointers the flag is supposed to guard. >> os::Solaris::_mutex_lock is not initialized until the init_2 phase >> (after command line flag parsing). >> >> My suggested fix is to replace the current short-circuit of >> ThreadCritical with a flag set when the Solaris mutex code is >> initialized and thereby getting rid of the initialize function on all >> platforms. >> Additionally, ThreadCritical::release is unreachable code and from my >> research has never actually been called, we might as well get rid of it. >> >> Webrev: http://cr.openjdk.java.net/~mgerdin/8187040/webrev.0/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8187040 >> Testing: JPRT >> >> Thanks >> /Mikael From mikael.gerdin at oracle.com Mon Sep 4 07:26:03 2017 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 4 Sep 2017 09:26:03 +0200 Subject: RFR(S) 8187040: ThreadCritical crashes on Solaris if used between os::init and os::init_2 In-Reply-To: References: <8cfa75c6-ebbc-07f4-91d7-5aaa111e92ed@oracle.com> Message-ID: <7b018037-bfba-6699-a41a-e63fa96c3be5@oracle.com> Hi Thomas, On 2017-09-01 19:31, Thomas Stüfe wrote: > Hi Mikael, > > (I never understood why we cannot just use pthread mutexes on Solaris. > Why all this dynamic loading magic, are pthread functions not available > on all Solaris versions?) Beats me, but I suspect that David is right.
At some point someone thought one was better and now nobody remembers how and nobody wants to put in the performance work to determine if it actually makes a difference. > > Small nit, instead of adding a new variable > _synchronization_initialized, how about _mutex_lock != NULL (in > ThreadCritical()) and _mutex_unlock != NULL (in ~ThreadCritical())? I didn't want to expose the function pointers through accessors in os::Solaris and I'm worried that if we check a different thing in the lock versus unlock paths we can end up with a ThreadCritical which tries to unlock a lock which was never locked (because the TC was created before _mutex_lock was set). Also, I think it's clearer to the reader of os::Solaris::synchronization_init that the "initialization completed" state is exposed to an external caller. > > I am okay with the removal of ::release(). Even if it were used, it is > really safer to let the mutex live until process end. I agree. > > I leave it up to you if you take my suggestion above. The patch is fine > for me in the current form. Thanks for the review, Thomas. /Mikael > > Kind Regards, Thomas > > > On Fri, Sep 1, 2017 at 5:23 PM, Mikael Gerdin > wrote: > > Hi, > > Please review this small fix to ThreadCritical. > When working on a piece of code which allocates memory early on I > noticed that it crashed if I enabled NMT. > The reason is that NMT uses ThreadCritical and os::Solaris sets the > ThreadCritical::_initialized flag before it actually sets up the > function pointers the flag is supposed to guard. > os::Solaris::_mutex_lock is not initialized until the init_2 phase > (after command line flag parsing). > > My suggested fix is to replace the current short-circuit of > ThreadCritical with a flag set when the Solaris mutex code is > initialized and thereby getting rid of the initialize function on > all platforms. 
> Additionally, ThreadCritical::release is unreachable code and from > my research has never actually been called, we might as well get rid > of it. > > Webrev: http://cr.openjdk.java.net/~mgerdin/8187040/webrev.0/ > > Bug: https://bugs.openjdk.java.net/browse/JDK-8187040 > > Testing: JPRT > > Thanks > /Mikael > > From adinn at redhat.com Mon Sep 4 08:11:24 2017 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 4 Sep 2017 09:11:24 +0100 Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup In-Reply-To: References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com> Message-ID: On 03/09/17 19:12, Dmitry Samersoff wrote: > On 08/31/2017 11:56 AM, Andrew Dinn wrote: >> I don't think this is going to work well when symbols are not present >> (meaning you cannot resolve return pc addresses to function names). > > On elf platforms, NMT uses .symtab section of libjvm.so and it's hard to > me to imagine the situation where someone has stripped slowdebug build. Yes, but NMT also works (and is meant to work) on product builds where the required symbols are not available. > If different architecture implements different inlining strategy, I > would expect difficulty in finding correlations between them with or > without this patch. Perhaps, but the status quo is code which avoids any differences in where the stack trace starts caused by inlining. The current problem is that this is fragile wrt to changes in the C++ compiler. Your fix introduces an extra type of variation in the stack traces when symbols are not present (it does not mitigate other effects of different inlining strategies). So, I am merely pointing out that this is a use case that will occur (people /will/ use NMT in production deployments even if only in a sandbox) and identifying the variation in output as a limitation to be weighed in the balance. I think, on balance, I prefer the status quo to your attempt to mitigate a /potential/ future risk. 
I don't suppose most users will care about which hex addresses they see in their backtraces, so to them it will be moot. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From magnus.ihse.bursie at oracle.com Mon Sep 4 08:42:27 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Mon, 4 Sep 2017 10:42:27 +0200 Subject: RFR(M): 8187045: [linux] Not all libraries in the VM are linked with -z,noexecstack In-Reply-To: References: Message-ID: <0dbf4e19-988c-3257-58df-474a51373ed4@oracle.com> Hi Goetz, Since this is mostly a build change, it needs to be reviewed on build-dev. However, it looks good to me from a build perspective. I have not reviewed the hotspot test files. /Magnus On 2017-09-01 15:05, Lindenmaier, Goetz wrote: > Hi, > > I found that not all libraries are linked with -z,noexecstack. > This led to errors with our linuxppc64 build. The linker omitted > the flag altogether, which is interpreted as a lib with execstack. > > This change contains a small test that scans all libraries in the tested VM > to have the noexecstack flag set. It utilizes the elf parser in the VM for this. > Further -z,noexecstack is now passed to all libraries. > > Please review this change. I please need a sponsor. > http://cr.openjdk.java.net/~goetz/wr17/8187045-execstackLink/webrev.01/ > http://cr.openjdk.java.net/~goetz/wr17/8187045-execstackLink/webrev.01-hs/ > > Best regards, > Goetz.
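[Editor's note on the noexecstack thread above] On Linux, whether a library asks for an executable stack is recorded in the flags of its PT_GNU_STACK program header, and a library that lacks the header can be treated as if it had requested one, which is the situation Goetz describes. Below is a minimal sketch of such a check. It is not the test from the webrev and it does not use the VM's ELF parser; it assumes a little-endian ELF64 image that has already been read into memory, and the names classify_stack, StackMarking and read_le are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Constants from the ELF / GNU conventions (normally provided by <elf.h>).
constexpr std::uint32_t kPtGnuStack = 0x6474e551;  // PT_GNU_STACK
constexpr std::uint32_t kPfX = 0x1;                // PF_X

// Hypothetical result type for this sketch.
enum class StackMarking { NoExec, Exec, Missing, NotElf64 };

// Read a little-endian unsigned integer of `size` bytes at offset `off`.
// The caller guarantees that off + size is within the buffer.
static std::uint64_t read_le(const std::vector<unsigned char>& b,
                             std::size_t off, std::size_t size) {
  std::uint64_t v = 0;
  for (std::size_t i = 0; i < size; ++i) {
    v |= static_cast<std::uint64_t>(b[off + i]) << (8 * i);
  }
  return v;
}

// Classify the stack marking of a little-endian ELF64 image held in memory.
StackMarking classify_stack(const std::vector<unsigned char>& img) {
  // An ELF64 file header is 64 bytes: magic, then class and data encoding.
  if (img.size() < 64 || img[0] != 0x7f || img[1] != 'E' || img[2] != 'L' ||
      img[3] != 'F' || img[4] != 2 /* ELFCLASS64 */ ||
      img[5] != 1 /* ELFDATA2LSB */) {
    return StackMarking::NotElf64;
  }
  const std::uint64_t phoff = read_le(img, 32, 8);      // e_phoff
  const std::uint64_t phentsize = read_le(img, 54, 2);  // e_phentsize
  const std::uint64_t phnum = read_le(img, 56, 2);      // e_phnum
  if (phoff > img.size()) return StackMarking::NotElf64;

  for (std::uint64_t i = 0; i < phnum; ++i) {
    const std::uint64_t off = phoff + i * phentsize;
    // Only p_type (4 bytes) and p_flags (4 bytes) are needed per entry.
    if (phentsize < 8 || off > img.size() - 8) return StackMarking::NotElf64;
    const std::size_t o = static_cast<std::size_t>(off);
    if (read_le(img, o, 4) == kPtGnuStack) {
      return (read_le(img, o + 4, 4) & kPfX) ? StackMarking::Exec
                                             : StackMarking::NoExec;
    }
  }
  // No PT_GNU_STACK entry at all: the case reported for linuxppc64.
  return StackMarking::Missing;
}
```

A test in the spirit of the one proposed would presumably flag both Exec and Missing as failures, since the thread's point is that an absent marking is not a safe default.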
From thomas.stuefe at gmail.com Mon Sep 4 09:28:28 2017 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Mon, 4 Sep 2017 11:28:28 +0200 Subject: RFR(S) 8187040: ThreadCritical crashes on Solaris if used between os::init and os::init_2 In-Reply-To: <7b018037-bfba-6699-a41a-e63fa96c3be5@oracle.com> References: <8cfa75c6-ebbc-07f4-91d7-5aaa111e92ed@oracle.com> <7b018037-bfba-6699-a41a-e63fa96c3be5@oracle.com> Message-ID: Hi Mikael, On Mon, Sep 4, 2017 at 9:26 AM, Mikael Gerdin wrote: > Hi Thomas, > > On 2017-09-01 19:31, Thomas Stüfe wrote: > >> Hi Mikael, >> >> (I never understood why we cannot just use pthread mutexes on Solaris. >> Why all this dynamic loading magic, are pthread functions not available on >> all Solaris versions?) >> > > Beats me, but I suspect that David is right. At some point someone thought > one was better and now nobody remembers how and nobody wants to put in the > performance work to determine if it actually makes a difference. > > >> Small nit, instead of adding a new variable _synchronization_initialized, >> how about _mutex_lock != NULL (in ThreadCritical()) and _mutex_unlock != >> NULL (in ~ThreadCritical())? >> > > I didn't want to expose the function pointers through accessors in > os::Solaris and I'm worried that if we check a different thing in the lock > versus unlock paths we can end up with a ThreadCritical which tries to > unlock a lock which was never locked (because the TC was created before > _mutex_lock was set). > Also, I think it's clearer to the reader of os::Solaris::synchronization_init > that the "initialization completed" state is exposed to an external caller. > > That sounds reasonable. > > > >> I am okay with the removal of ::release(). Even if it were used, it is >> really safer to let the mutex live until process end. >> > > I agree. > > >> I leave it up to you if you take my suggestion above. The patch is fine >> for me in the current form. >> > > Thanks for the review, Thomas.
> /Mikael > > Sure! ..Thomas > >> Kind Regards, Thomas >> >> >> On Fri, Sep 1, 2017 at 5:23 PM, Mikael Gerdin > > wrote: >> >> Hi, >> >> Please review this small fix to ThreadCritical. >> When working on a piece of code which allocates memory early on I >> noticed that it crashed if I enabled NMT. >> The reason is that NMT uses ThreadCritical and os::Solaris sets the >> ThreadCritical::_initialized flag before it actually sets up the >> function pointers the flag is supposed to guard. >> os::Solaris::_mutex_lock is not initialized until the init_2 phase >> (after command line flag parsing). >> >> My suggested fix is to replace the current short-circuit of >> ThreadCritical with a flag set when the Solaris mutex code is >> initialized and thereby getting rid of the initialize function on >> all platforms. >> Additionally, ThreadCritical::release is unreachable code and from >> my research has never actually been called, we might as well get rid >> of it. >> >> Webrev: http://cr.openjdk.java.net/~mgerdin/8187040/webrev.0/ >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8187040 >> >> Testing: JPRT >> >> Thanks >> /Mikael >> >> >> From thomas.stuefe at gmail.com Mon Sep 4 09:36:09 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 4 Sep 2017 11:36:09 +0200 Subject: RFR(S) 8187040: ThreadCritical crashes on Solaris if used between os::init and os::init_2 In-Reply-To: <78c151dc-3095-15ea-edaf-bcbb8ec30c44@oracle.com> References: <8cfa75c6-ebbc-07f4-91d7-5aaa111e92ed@oracle.com> <78c151dc-3095-15ea-edaf-bcbb8ec30c44@oracle.com> Message-ID: Hi David, On Mon, Sep 4, 2017 at 3:04 AM, David Holmes wrote: > On 2/09/2017 3:31 AM, Thomas St?fe wrote: > >> Hi Mikael, >> >> (I never understood why we cannot just use pthread mutexes on Solaris. Why >> all this dynamic loading magic, are pthread functions not available on all >> Solaris versions?) >> > > Depends how far back you want to go with "all" :) This is obviously > strongly historical. 
We have three potential sync APIs on Solaris > (pthreads, UI threads and kernel LWPs). LWP sync tended to perform better** > - and that's what we still use. We had a project a few years back to > convert from UI threads to pthreads, but it was too big to manage at the > time and was shelved. > > Thanks for that history lesson! It is one thing to prefer LWP over pthread, but another to link all of that dynamically. I wondered whether we could not just call pthread_mutex_lock() directly, without dlsym'ing it first. This would be a small thing to clean up in comparison to switching to pthread APIs completely. Cheers, Thomas > ** Unfortunately I have no idea how this was actually measured, so can't > say whether this is still the case. > > Cheers, > David > > > Small nit, instead of adding a new variable _synchronization_initialized, >> how about _mutex_lock != NULL (in ThreadCritical()) and _mutex_unlock != >> NULL (in ~ThreadCritical())? >> >> I am okay with the removal of ::release(). Even if it were used, it is >> really safer to let the mutex live until process end. >> >> I leave it up to you if you take my suggestion above. The patch is fine >> for >> me in the current form. >> >> Kind Regards, Thomas >> >> >> On Fri, Sep 1, 2017 at 5:23 PM, Mikael Gerdin >> wrote: >> >> Hi, >>> >>> Please review this small fix to ThreadCritical. >>> When working on a piece of code which allocates memory early on I noticed >>> that it crashed if I enabled NMT. >>> The reason is that NMT uses ThreadCritical and os::Solaris sets the >>> ThreadCritical::_initialized flag before it actually sets up the function >>> pointers the flag is supposed to guard. >>> os::Solaris::_mutex_lock is not initialized until the init_2 phase (after >>> command line flag parsing). 
>>> >>> My suggested fix is to replace the current short-circuit of >>> ThreadCritical >>> with a flag set when the Solaris mutex code is initialized and thereby >>> getting rid of the initialize function on all platforms. >>> Additionally, ThreadCritical::release is unreachable code and from my >>> research has never actually been called, we might as well get rid of it. >>> >>> Webrev: http://cr.openjdk.java.net/~mgerdin/8187040/webrev.0/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8187040 >>> Testing: JPRT >>> >>> Thanks >>> /Mikael >>> >>> From goetz.lindenmaier at sap.com Mon Sep 4 09:47:19 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 4 Sep 2017 09:47:19 +0000 Subject: RFR(M): 8187045: [linux] Not all libraries in the VM are linked with -z,noexecstack In-Reply-To: <0dbf4e19-988c-3257-58df-474a51373ed4@oracle.com> References: <0dbf4e19-988c-3257-58df-474a51373ed4@oracle.com> Message-ID: <07c3435bc97e4119a0a358a0bfc81fe9@sap.com> Hi Magnus, thanks for looking at my change. Thanks for forwarding to the proper list. I'm often not sure what's the right one, and sometimes it's ambiguous anyways. Like this one, which concerns stack overflow handling. The same problem exists with all the categories of the Jira bugs. Best regards, Goetz. > -----Original Message----- > From: Magnus Ihse Bursie [mailto:magnus.ihse.bursie at oracle.com] > Sent: Montag, 4. September 2017 10:42 > To: Lindenmaier, Goetz ; hotspot-runtime- > dev at openjdk.java.net; build-dev > Subject: Re: RFR(M): 8187045: [linux] Not all libraries in the VM are linked > with -z,noexecstack > > Hi Goetz, > > Since this is mostly a build change, it need to be reviewed on build-dev. > > However, it looks good to me from a build perspective. I have not > reviewed the hotspot test files. > > /Magnus > > On 2017-09-01 15:05, Lindenmaier, Goetz wrote: > > Hi, > > > > I found that not all libraries are linked with -z,noexecstack. > > This lead to errors with our linuxppc64 build. 
The linker omitted > > the flag altogether, which is interpreted as a lib with execstack. > > > > This change contains a small test that scans all libraries in the tested VM > > to have the noexecstack flag set. It utilizes the elf parser in the VM for this. > > Further -z,noexecstack is now passed to all libraries. > > > > Please review this change. I please need a sponsor. > > http://cr.openjdk.java.net/~goetz/wr17/8187045- > execstackLink/webrev.01/ > > http://cr.openjdk.java.net/~goetz/wr17/8187045- > execstackLink/webrev.01-hs/ > > > > Best regards, > > Goetz. From goetz.lindenmaier at sap.com Mon Sep 4 09:51:40 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 4 Sep 2017 09:51:40 +0000 Subject: RFR(M): 8186978: Introduce configure argument enable-cds In-Reply-To: References: <7cbe5b02e52e4a2e99512306d79800f3@sap.com> <08f05d7b-14fd-a51a-41e3-2c6d09201cd5@oracle.com> <0cf2865e-bfc0-826f-8c6f-350a70b87ba7@oracle.com> <0de4b0ee7c804280a29b76a6000f95e7@sap.com> Message-ID: <8cade291816042a2b5de899391b99bb3@sap.com> Hi David, thanks for sponsoring the change! Best regards, Goetz. > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Donnerstag, 31. August 2017 23:15 > To: Lindenmaier, Goetz ; 'Magnus Ihse Bursie' > ; hotspot-runtime- > dev at openjdk.java.net; build-dev (build-dev at openjdk.java.net) dev at openjdk.java.net> > Subject: Re: RFR(M): 8186978: Introduce configure argument enable-cds > > Hi Goetz, > > I will sponsor this. > > Thanks, > David > > On 1/09/2017 12:49 AM, Lindenmaier, Goetz wrote: > > Hi, > > > > thanks for reviewing everybody! > > Yes, works fine without that assignment. New webrev: > > http://cr.openjdk.java.net/~goetz/wr17/8186978-disableCDS/webrev.02/ > > > > Could someone please sponsor? I think autogen.sh needs to be run > > before submitting. > > > > Best regards, > > Goetz. 
> > > >> -----Original Message----- > >> From: Magnus Ihse Bursie [mailto:magnus.ihse.bursie at oracle.com] > >> Sent: Thursday, August 31, 2017 3:35 PM > >> To: David Holmes ; Lindenmaier, Goetz > >> ; hotspot-runtime-dev at openjdk.java.net; > >> build-dev (build-dev at openjdk.java.net) > >> Subject: Re: RFR(M): 8186978: Introduce configure argument enable-cds > >> > >> > >> > >> On 2017-08-31 14:47, David Holmes wrote: > >>> Hi Goetz, > >>> > >>> On 31/08/2017 10:29 PM, Lindenmaier, Goetz wrote: > >>>> Hi, > >>>> > >>>> Tests for class data sharing (cds) are enabled if @requires vm.cds is > >>>> true. > >>>> The property vm.cds depends on the preprocessor macro > ENABLE_CDS. > >> ... but you mean INCLUDE_CDS. :-) > >> > >>>> This can not yet be switched by configure. It's only disabled > >>>> automatically > >>>> for the minimal build. > >>>> > >>>> This change introduces enable-cds with default true, which only takes > >>>> effect > >>>> in the non-minimal build. If disabled, generate-classlist is > >>>> disabled, too. > >>>> > >>>> Please review this change. I please need a sponsor. > >>>> http://cr.openjdk.java.net/~goetz/wr17/8186978- > >> disableCDS/webrev.01/index.html > >>>> > >>> > >>> I'll let the build guys comment in detail, but the structure for this > >>> doesn't quite look right to me. I don't understand why you have in > >>> spec.gmk.in: > >>> > >>> + ENABLE_CDS:=@ENABLE_CDS@ > >>> > >>> when in the hotspot build CDS is controlled via the feature setting: > >>> > >>> ifneq ($(call check-jvm-feature, cds), true) > >>> > >>> which you are already handling. ?? > >> > >> Agree, the ENABLE_CDS variable is only used internally in the configure > >> script and need not/should not be exported in spec.gmk.in. As David > >> says, the test ($(call check-jvm-feature, cds), true) is enough to > >> determine if to send the -DINCLUDE_CDS to the compiler. > >> > >> Just remove the changes to spec.gmk.in, and I'm ok with the patch. 
> >> > >> /Magnus > >> > >> > >>> > >>> Thanks, > >>> David > >>> > >>> > >>>> Best regards, > >>>> Goetz. > >>>> > > From magnus.ihse.bursie at oracle.com Mon Sep 4 09:58:51 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Mon, 4 Sep 2017 11:58:51 +0200 Subject: RFR(M): 8187045: [linux] Not all libraries in the VM are linked with -z,noexecstack In-Reply-To: <07c3435bc97e4119a0a358a0bfc81fe9@sap.com> References: <0dbf4e19-988c-3257-58df-474a51373ed4@oracle.com> <07c3435bc97e4119a0a358a0bfc81fe9@sap.com> Message-ID: On 2017-09-04 11:47, Lindenmaier, Goetz wrote: > Hi Magnus, > > thanks for looking at my change. > > Thanks for forwarding to the proper list. I'm often not sure what's the right one, > and sometimes it's ambiguous anyways. Like this one, which concerns stack overflow > handling. The same problem exists with all the categories of the Jira bugs. The rule of thumb is: Does the patch touch files in the make or autoconf directories? If so, cc build-dev. (For really trivial changes you can skip this, but then you must be *certain* that the change is trivial -- not always easy to say when dealing with makefiles) The good thing with mailing lists, as opposed to categories in Jira, is that you can cc multiple lists. /Magnus > > Best regards, > Goetz. > >> -----Original Message----- >> From: Magnus Ihse Bursie [mailto:magnus.ihse.bursie at oracle.com] >> Sent: Monday, 4 September 2017 10:42 >> To: Lindenmaier, Goetz ; hotspot-runtime- >> dev at openjdk.java.net; build-dev >> Subject: Re: RFR(M): 8187045: [linux] Not all libraries in the VM are linked >> with -z,noexecstack >> >> Hi Goetz, >> >> Since this is mostly a build change, it needs to be reviewed on build-dev. >> >> However, it looks good to me from a build perspective. I have not >> reviewed the hotspot test files. >> >> /Magnus >> >> On 2017-09-01 15:05, Lindenmaier, Goetz wrote: >>> Hi, >>> >>> I found that not all libraries are linked with -z,noexecstack. 
>>> This lead to errors with our linuxppc64 build. The linker omitted >>> the flag altogether, which is interpreted as a lib with execstack. >>> >>> This change contains a small test that scans all libraries in the tested VM >>> to have the noexecstack flag set. It utilizes the elf parser in the VM for this. >>> Further -z,noexecstack is now passed to all libraries. >>> >>> Please review this change. I please need a sponsor. >>> http://cr.openjdk.java.net/~goetz/wr17/8187045- >> execstackLink/webrev.01/ >>> http://cr.openjdk.java.net/~goetz/wr17/8187045- >> execstackLink/webrev.01-hs/ >>> Best regards, >>> Goetz. From dmitry.samersoff at bell-sw.com Mon Sep 4 13:49:00 2017 From: dmitry.samersoff at bell-sw.com (dmitry.samersov) Date: Mon, 4 Sep 2017 16:49:00 +0300 Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup In-Reply-To: References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com> Message-ID: <21e5e949-9238-eafc-4080-72b7f29f95c4@bell-sw.com> Andrew, > Yes, but NMT also works (and is meant to work) on product builds where > the required symbols are not available. 1. This patch doesn't affect product build. On product build we have all NMT frames inlined and don't need to skip anything. I'll put these changes under #ifndef PRODUCT to make it clear visible. 2. NMT uses .symtab section and these symbols also available in release build by default, unless someone manually strip libjvm.so -Dmitry On 04.09.2017 11:11, Andrew Dinn wrote: > On 03/09/17 19:12, Dmitry Samersoff wrote: >> On 08/31/2017 11:56 AM, Andrew Dinn wrote: >>> I don't think this is going to work well when symbols are not present >>> (meaning you cannot resolve return pc addresses to function names). >> >> On elf platforms, NMT uses .symtab section of libjvm.so and it's hard to >> me to imagine the situation where someone has stripped slowdebug build. > > Yes, but NMT also works (and is meant to work) on product builds where > the required symbols are not available. 
> >> If different architecture implements different inlining strategy, I >> would expect difficulty in finding correlations between them with or >> without this patch. > Perhaps, but the status quo is code which avoids any differences in > where the stack trace starts caused by inlining. The current problem is > that this is fragile wrt to changes in the C++ compiler. Your fix > introduces an extra type of variation in the stack traces when symbols > are not present (it does not mitigate other effects of different > inlining strategies). > > So, I am merely pointing out that this is a use case that will occur > (people /will/ use NMT in production deployments even if only in a > sandbox) and identifying the variation in output as a limitation to be > weighed in the balance. I think, on balance, I prefer the status quo to > your attempt to mitigate a /potential/ future risk. I don't suppose most > users will care about which hex addresses they see in their backtraces > so to them it will be moot. > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From adinn at redhat.com Mon Sep 4 14:28:13 2017 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 4 Sep 2017 15:28:13 +0100 Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup In-Reply-To: <21e5e949-9238-eafc-4080-72b7f29f95c4@bell-sw.com> References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com> <21e5e949-9238-eafc-4080-72b7f29f95c4@bell-sw.com> Message-ID: <5caa0c53-bf9a-2275-b3ca-d2bd04fa5feb@redhat.com> On 04/09/17 14:49, dmitry.samersov wrote: >> Yes, but NMT also works (and is meant to work) on product builds where >> the required symbols are not available. > > 1. This patch doesn't affect product build. On product build we have all > NMT frames inlined and don't need to skip anything. 
> > I'll put these changes under #ifndef PRODUCT to make it clearly visible. Ok, thank you for correcting my misunderstanding here. > 2. NMT uses the .symtab section and these symbols are also available in release > builds by default, unless someone manually strips libjvm.so Hmm, apologies once again if I have misunderstood how things work in release builds. I have been using NMT in the last 2 weeks, albeit under jdk8, and symbols were *not* displayed when running my local product build (the traces simply displayed hex addresses). I configured the build with --with-debug-level=release --disable-zip-debug-info Am I missing something obvious to account for the lack of symbols in my trace output? regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From goetz.lindenmaier at sap.com Mon Sep 4 15:08:17 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 4 Sep 2017 15:08:17 +0000 Subject: RFR(m): 8185712: [windows] Improve native symbol decoder In-Reply-To: References: Message-ID: <1400c761fcc34c37aa1e374790bb7d39@sap.com> Hi Thomas, I had a look at your change. Great that somebody is finally fixing the windows symbol printing, thanks a lot! The code looks good, I'm just not sure whether you need the new files symbolengine.cpp|hpp. Isn't that just what should go to decoder_windows.h|cpp and class Decoder? You would also get rid of the redirections in decoder_windows.cpp. In shutdown() you comment // There is no reason ever to shut down the decoder. ... I think you can remove that function altogether, i.e. also from the shared code, I don't see where it is ever called. Also, I think, you can just delete Decoder::can_decode_C_frame_in_vm() from the code. The only place where it is used, in frame.cpp, calls dll_address_to_function_name(). 
This returns useful information also in the case of the NullDecoder, which now is the only one to return false in that function. Globals_windows.hpp needs Copyright adaption, please. This is not introduced by your change, but maybe you can also fix the copyright in decoder.hpp, which says " 1997, 2015, 2017" ... should only name two years ... Best regards, Goetz. > -----Original Message----- > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > bounces at openjdk.java.net] On Behalf Of Thomas St?fe > Sent: Mittwoch, 30. August 2017 14:34 > To: hotspot-runtime-dev at openjdk.java.net > Subject: RFR(m): 8185712: [windows] Improve native symbol decoder > > Hi all, > > May I please have reviews for the following change. > > Issue: https://bugs.openjdk.java.net/browse/JDK-8185712 > Webrev: > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve- > native-symbol-resolver/webrev.01/webrev/ > > (This is the followup to: https://bugs.openjdk.java.net/browse/JDK-8186349) > > ------------- > > Basically, this is a reimplementation of the layer around the Windows > Symbol API (the API used to resolve debug symbols). The old > implementation > had a number of errors and shortcomings which together caused the > Windows > native symbol resolution (and hence callstacks in error logs) to be a bit > of a lottery. The aim of this reimplementation is to make the code more > robust and easier to maintain. > > The problems with the existing implementation are listed in detail in the > bug description. > > The new implementation: > > - uses the new centralized WindowsDbgHelper class, which wraps the > dbghelp.dll loading, introduced with JDK-8186349 > > - Completely bypasses the "create two instances of AbstractDecoder class > and synchronize access to them" scheme in decoder.cpp. It does not make > sense for windows, where we have to synchronize each access to the > dbghelp.dll anyway - this is done one layer below in WindowsDbgHelper. 
The > static methods of the shared Decoder class now directly access the static > methods in the new SymbolEngine class, see decoder_windows.cpp. > > - The layer wrapping the Symbol API lives in the new symbolengine.cpp/hpp > files. The coding takes care of properly initializing (once) the symbol API > and of assembling the pdb search path. > > - Pdb search path construction is changed: where before we just added jdk > and jvm bin directories, we now just add all directories of all loaded DLLs > (which, of course, include the jdk and jvm bin directories). That way we > have a high chance of catching pdb files of third party libraries, as long > as they follow the convention of putting the pdb files beside the dlls. > This means it is easier to analyse crashes where third party DLLs are > involved. > > - On Windows, we now have source file and line number in the callstack. > > - There is a new parameter, diagnostic and windows-only, > called "InitializeDbgHelpEarly". That parameter is by default off. If on, > it causes the symbol engine to be initialized early, which increases the > chance of good callstacks later on (because the initialization does not > have to run in an error situation). > > - Added tests: gtests and a jtreg test which tests the callstack printing. > All tests windows only. There is no technical reason for making them > windows only, but I wanted to keep disturbances to other platforms to a > minimum and these kind of tests can be shaky. > > Thanks a lot for reviewing this! 
> > Kind Regards, Thomas From david.holmes at oracle.com Mon Sep 4 21:40:20 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 5 Sep 2017 07:40:20 +1000 Subject: RFR(S) 8187040: ThreadCritical crashes on Solaris if used between os::init and os::init_2 In-Reply-To: References: <8cfa75c6-ebbc-07f4-91d7-5aaa111e92ed@oracle.com> <78c151dc-3095-15ea-edaf-bcbb8ec30c44@oracle.com> Message-ID: On 4/09/2017 7:36 PM, Thomas Stüfe wrote: > Hi David, > > On Mon, Sep 4, 2017 at 3:04 AM, David Holmes > wrote: > > On 2/09/2017 3:31 AM, Thomas Stüfe wrote: > > Hi Mikael, > > (I never understood why we cannot just use pthread mutexes on > Solaris. Why > all this dynamic loading magic, are pthread functions not > available on all > Solaris versions?) > > > Depends how far back you want to go with "all" :) This is obviously > strongly historical. We have three potential sync APIs on Solaris > (pthreads, UI threads and kernel LWPs). LWP sync tended to perform > better** - and that's what we still use. We had a project a few > years back to convert from UI threads to pthreads, but it was too > big to manage at the time and was shelved. > > > Thanks for that history lesson! > > It is one thing to prefer LWP over pthread, but another to link all > of that dynamically. I wondered whether we could not just call > pthread_mutex_lock() directly, without dlsym'ing it first. This would be > a small thing to clean up in comparison to switching to pthread APIs > completely. I agree - the dlsym'ing dates back to separate thread libraries, and is not needed. I also noticed we don't use the proper pthread_cond/mutex_init functions, but have a raw memset - obviously dealing with an early omission from the API! But I don't even know if switching to use pthreads sync even works these days - it is never tested AFAIK. David > Cheers, Thomas > > ** Unfortunately I have no idea how this was actually measured, so > can't say whether this is still the case. 
> > Cheers, > David > > > Small nit, instead of adding a new variable > _synchronization_initialized, > how about _mutex_lock != NULL (in ThreadCritical()) and > _mutex_unlock != > NULL (in ~ThreadCritical())? > > I am okay with the removal of ::release(). Even if it were used, > it is > really safer to let the mutex live until process end. > > I leave it up to you if you take my suggestion above. The patch > is fine for > me in the current form. > > Kind Regards, Thomas > > > On Fri, Sep 1, 2017 at 5:23 PM, Mikael Gerdin > > > wrote: > > Hi, > > Please review this small fix to ThreadCritical. > When working on a piece of code which allocates memory early > on I noticed > that it crashed if I enabled NMT. > The reason is that NMT uses ThreadCritical and os::Solaris > sets the > ThreadCritical::_initialized flag before it actually sets up > the function > pointers the flag is supposed to guard. > os::Solaris::_mutex_lock is not initialized until the init_2 > phase (after > command line flag parsing). > > My suggested fix is to replace the current short-circuit of > ThreadCritical > with a flag set when the Solaris mutex code is initialized > and thereby > getting rid of the initialize function on all platforms. > Additionally, ThreadCritical::release is unreachable code > and from my > research has never actually been called, we might as well > get rid of it. > > Webrev: > http://cr.openjdk.java.net/~mgerdin/8187040/webrev.0/ > > Bug: https://bugs.openjdk.java.net/browse/JDK-8187040 > > Testing: JPRT > > Thanks > /Mikael > > From david.holmes at oracle.com Tue Sep 5 04:27:46 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 5 Sep 2017 14:27:46 +1000 Subject: RFR(M): 8187045: [linux] Not all libraries in the VM are linked with -z,noexecstack In-Reply-To: References: Message-ID: Hi Goetz, On 1/09/2017 11:05 PM, Lindenmaier, Goetz wrote: > Hi, > > I found that not all libraries are linked with -z,noexecstack. 
> This lead to errors with our linuxppc64 build. The linker omitted > the flag altogether, which is interpreted as a lib with execstack. > > This change contains a small test that scans all libraries in the tested VM > to have the noexecstack flag set. It utilizes the elf parser in the VM for this. > Further -z,noexecstack is now passed to all libraries. > > Please review this change. I please need a sponsor. > http://cr.openjdk.java.net/~goetz/wr17/8187045-execstackLink/webrev.01/ So IIUC presently we only set noexecstack for gcc on linux when building libjvm - via the JVM_LDFLAGS settings. With this change we also set it for building JDK libraries via the LDFLAGS_JDKLIB setting. But this seems to be unconditional, not limited to gcc and linux ?? In addition we want to build libjsig with noexecstack, and we do that by exposing LDFLAGS_NO_EXEC_STACK in spec.gmk, and using it in CompileLibjsig.gmk. I don't have an issue with the use of noexecstack but I think it could just have been hard-wired for linux just as the bulk of the flags set in that file are. Granted you copied what is done for LDFLAGS_HASH_STYLE - but in that case I'm assuming it is important that the same hash style be used throughout. Anyway minor stylistic nit which may be moot soon as once we have the consolidated repo I think libjsig could be handled the same as others libs? > http://cr.openjdk.java.net/~goetz/wr17/8187045-execstackLink/webrev.01-hs/ Test changes look okay to me. Thanks, David > Best regards, > Goetz. > From goetz.lindenmaier at sap.com Tue Sep 5 08:04:57 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 5 Sep 2017 08:04:57 +0000 Subject: RFR(M): 8187045: [linux] Not all libraries in the VM are linked with -z,noexecstack In-Reply-To: References: Message-ID: <0559eb3655cc42bb8b6cb37fb4370da8@sap.com> Hi David, thanks for looking at my change! 
> Hi Goetz, > > On 1/09/2017 11:05 PM, Lindenmaier, Goetz wrote: > > Hi, > > > > I found that not all libraries are linked with -z,noexecstack. > > This lead to errors with our linuxppc64 build. The linker omitted > > the flag altogether, which is interpreted as a lib with execstack. > > > > This change contains a small test that scans all libraries in the tested VM > > to have the noexecstack flag set. It utilizes the elf parser in the VM for this. > > Further -z,noexecstack is now passed to all libraries. > > > > Please review this change. I please need a sponsor. > > http://cr.openjdk.java.net/~goetz/wr17/8187045- > execstackLink/webrev.01/ > > So IIUC presently we only set noexecstack for gcc on linux when building > libjvm - via the JVM_LDFLAGS settings. Yes. > With this change we also set it for building JDK libraries via the > LDFLAGS_JDKLIB setting. But this seems to be unconditional, not limited > to gcc and linux ?? LDFLAGS_NO_EXEC_STACK="-Wl,-z,noexecstack" is only assigned on linux, on other platforms its empty. > In addition we want to build libjsig with noexecstack, and we do that by > exposing LDFLAGS_NO_EXEC_STACK in spec.gmk, and using it in > CompileLibjsig.gmk. I don't have an issue with the use of noexecstack > but I think it could just have been hard-wired for linux just as the > bulk of the flags set in that file are. Granted you copied what is done > for LDFLAGS_HASH_STYLE - but in that case I'm assuming it is important > that the same hash style be used throughout. Anyway minor stylistic nit > which may be moot soon as once we have the consolidated repo I think > libjsig could be handled the same as others libs? I had hoped to find a location where flags that should be used in all linking steps are assembled. Noexecstack should really be set in any lib we build. But I didn't find that, so I implemented it as with the HASH_STYLE. I don't really like it this way because if a new lib is added it might be forgotten to add the noexecstack. 
But I assume after the repo consolidation the build will be reshaped, so now is not the right time to seek for optimal setups. Best regards, Goetz. > > > http://cr.openjdk.java.net/~goetz/wr17/8187045-execstackLink/webrev.01- > hs/ > > Test changes look okay to me. > > Thanks, > David > > > Best regards, > > Goetz. > > From dmitry.samersoff at bell-sw.com Tue Sep 5 09:25:58 2017 From: dmitry.samersoff at bell-sw.com (Dmitry Samersov) Date: Tue, 5 Sep 2017 12:25:58 +0300 Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup In-Reply-To: <5caa0c53-bf9a-2275-b3ca-d2bd04fa5feb@redhat.com> References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com> <21e5e949-9238-eafc-4080-72b7f29f95c4@bell-sw.com> <5caa0c53-bf9a-2275-b3ca-d2bd04fa5feb@redhat.com> Message-ID: Andrew, > Am I missing something obvious to account for the lack of symbols in > my trace output? It looks like your libjvm.so get stripped for some reason. I don't have jdk8 build setup but: I'd checked 1. Official jdk8/x86_64 downloaded from oracle site 2. jdk10/x86_64 built with --with-debug-level=release --with-native-debug-symbols=none 3. jdk10/aarch64 built with --with-debug-level=release --with-native-debug-symbols=none In all three cases .symtab is present and output of ${TESTJAVA}/bin/java -XX:NativeMemoryTracking=detail -XX:+UnlockDiagnosticVMOptions -XX:+PrintNMTStatistics -version contains symbols (e.g): [0x0000ffffae39ebf8] ReservedSpace::ReservedSpace(unsigned long, unsigned long)+0x94 [0x0000ffffadf4dd5c] CodeHeap::reserve(ReservedSpace, unsigned long, unsigned long)+0x194 [0x0000ffffaddb2de4] CodeCache::add_heap(ReservedSpace, char const*, int)+0x100 [0x0000ffffaddb3140] CodeCache::initialize_heaps()+0x2fc (reserved=48KB, committed=20KB) -Dmitry On 04.09.2017 17:28, Andrew Dinn wrote: > On 04/09/17 14:49, dmitry.samersov wrote: >>> Yes, but NMT also works (and is meant to work) on product builds where >>> the required symbols are not available. >> >> 1. 
This patch doesn't affect product build. On product build we have all >> NMT frames inlined and don't need to skip anything. >> >> I'll put these changes under #ifndef PRODUCT to make it clear visible. > > Ok, thank you for correcting my misunderstanding here. > >> 2. NMT uses .symtab section and these symbols also available in release >> build by default, unless someone manually strip libjvm.so > > Hmm, apologies once again if I have misunderstood how things work in > release builds. I have been using NMT in the last 2 weeks, albeit under > jdk8, and symbols were *not* displayed when running my local product > build (the traces simply displayed hex addresses). I configured the > build with > > --with-debug-level=release --disable-zip-debug-info > > Am I missing something obvious to account for the lack of symbols in my > trace output? > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From adinn at redhat.com Tue Sep 5 10:04:09 2017 From: adinn at redhat.com (Andrew Dinn) Date: Tue, 5 Sep 2017 11:04:09 +0100 Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup In-Reply-To: References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com> <21e5e949-9238-eafc-4080-72b7f29f95c4@bell-sw.com> <5caa0c53-bf9a-2275-b3ca-d2bd04fa5feb@redhat.com> Message-ID: <4ca0961e-f6fe-77f3-7184-1258c582f777@redhat.com> Hi Dmitry, On 05/09/17 10:25, Dmitry Samersov wrote: >> Am I missing something obvious to account for the lack of symbols in >> my trace output? > > It looks like your libjvm.so get stripped for some reason. > > . . . > In all three cases .symtab is present and output of > > ${TESTJAVA}/bin/java -XX:NativeMemoryTracking=detail > -XX:+UnlockDiagnosticVMOptions -XX:+PrintNMTStatistics -version > > contains symbols (e.g): > . . . 
In which case, my apologies for raising invalid concerns. The patch is a good improvement. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From thomas.stuefe at gmail.com Tue Sep 5 13:05:31 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 5 Sep 2017 15:05:31 +0200 Subject: RFR(m): 8185712: [windows] Improve native symbol decoder In-Reply-To: <1400c761fcc34c37aa1e374790bb7d39@sap.com> References: <1400c761fcc34c37aa1e374790bb7d39@sap.com> Message-ID: Hi Goetz, thank you for your review! New Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.02 Delta to last: http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.01-to-02/webrev/ The only change is that I removed the -XX:InitializeDbgHelpEarly switch to avoid having to file a CSR. Please find further comments inline: On Mon, Sep 4, 2017 at 5:08 PM, Lindenmaier, Goetz < goetz.lindenmaier at sap.com> wrote: > Hi Thomas, > > I had a look at your change. Great somebody finally fixes > the windows symbol printing, thanks a lot! > > The code looks good, I'm just not sure whether you > need new files symbolengine.c|hpp. Isn't that > just what should go to decoder_windows.h|cpp and > class Decoder? > You would also get rid of the redirections in decoder_windows.cpp. > > As we discussed, I see your point, but would prefer to leave the change for the moment as it is. A similar change to this one - doing away with the AbstractDecoder object instantiation layer - will be coming for AIX, where it does not make much sense either, and I propose to do a separate cleanup or simplification change once that is done, merging decoder_windows.cpp and symbolengine.cpp/hpp. 
Unless I hear more objections from other reviewers, I'd prefer to do this in a later patch. > In shutdown() you comment > // There is no reason ever to shut down the decoder. > ... I think you can remove that function altogether, i.e. also > from the shared code, I don't see where it is ever called. > > Totally agree... > Also, I think, you can just delete Decoder::can_decode_C_frame_in_vm() > from the code. The only place where it is used, in frame.cpp, > calls dll_address_to_function_name(). This returns useful information > also in the case of the NullDecoder, which now is the only one to > return false in that function. > Totally agree also here, but would also prefer both issues in a separate change. In fact, Ioi opened a bug for this a while ago: https://bugs.openjdk.java.net/browse/JDK-8144855 - and I would like to fix it under that bug. Reason is, in this change, I'd like to avoid changing shared sources as much as possible and keep this change windows only. > > Globals_windows.hpp needs Copyright adaptation, please. > This is not introduced by your change, but maybe > you can also fix the copyright in decoder.hpp, which > says " 1997, 2015, 2017" ... should only name two > years ... > > Not needed anymore: since I removed the -XX:InitializeDbgHelpEarly switch, globals_windows.hpp is reverted to its original state. Do you still want me to fix the date? Thanks for the review work! ..Thomas > Best regards, > Goetz. > > > > > > > > -----Original Message----- > > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > > bounces at openjdk.java.net] On Behalf Of Thomas Stüfe > > Sent: Wednesday, 30 August 2017 14:34 > > To: hotspot-runtime-dev at openjdk.java.net > > Subject: RFR(m): 8185712: [windows] Improve native symbol decoder > > > > Hi all, > > > > May I please have reviews for the following change. 
> > > > Issue: https://bugs.openjdk.java.net/browse/JDK-8185712 > > Webrev: > > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve- > > native-symbol-resolver/webrev.01/webrev/ > > > > (This is the followup to: https://bugs.openjdk.java.net/ > browse/JDK-8186349) > > > > ------------- > > > > Basically, this is a reimplementation of the layer around the Windows > > Symbol API (the API used to resolve debug symbols). The old > > implementation > > had a number of errors and shortcomings which together caused the > > Windows > > native symbol resolution (and hence callstacks in error logs) to be a bit > > of a lottery. The aim of this reimplementation is to make the code more > > robust and easier to maintain. > > > > The problems with the existing implementation are listed in detail in the > > bug description. > > > > The new implementation: > > > > - uses the new centralized WindowsDbgHelper class, which wraps the > > dbghelp.dll loading, introduced with JDK-8186349 > > > > - Completely bypasses the "create two instances of AbstractDecoder class > > and synchronize access to them" scheme in decoder.cpp. It does not make > > sense for windows, where we have to synchronize each access to the > > dbghelp.dll anyway - this is done one layer below in WindowsDbgHelper. > The > > static methods of the shared Decoder class now directly access the static > > methods in the new SymbolEngine class, see decoder_windows.cpp. > > > > - The layer wrapping the Symbol API lives in the new symbolengine.cpp/hpp > > files. The coding takes care of properly initializing (once) the symbol > API > > and of assembling the pdb search path. > > > > - Pdb search path construction is changed: where before we just added jdk > > and jvm bin directories, we now just add all directories of all loaded > DLLs > > (which, of course, include the jdk and jvm bin directories). 
That way we > > have a high chance of catching pdb files of third party libraries, as > long > > as they follow the convention of putting the pdb files beside the dlls. > > This means it is easier to analyse crashes where third party DLLs are > > involved. > > > > - On Windows, we now have source file and line number in the callstack. > > > > - There is a new parameter, diagnostic and windows-only, > > called "InitializeDbgHelpEarly". That parameter is by default off. If on, > > it causes the symbol engine to be initialized early, which increases the > > chance of good callstacks later on (because the initialization does not > > have to run in an error situation). > > > > - Added tests: gtests and a jtreg test which tests the callstack > printing. > > All tests windows only. There is no technical reason for making them > > windows only, but I wanted to keep disturbances to other platforms to a > > minimum and these kind of tests can be shaky. > > > > Thanks a lot for reviewing this! > > > > Kind Regards, Thomas > From zgu at redhat.com Tue Sep 5 13:11:34 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 5 Sep 2017 09:11:34 -0400 Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup In-Reply-To: <5caa0c53-bf9a-2275-b3ca-d2bd04fa5feb@redhat.com> References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com> <21e5e949-9238-eafc-4080-72b7f29f95c4@bell-sw.com> <5caa0c53-bf9a-2275-b3ca-d2bd04fa5feb@redhat.com> Message-ID: > >> 2. NMT uses .symtab section and these symbols also available in release >> build by default, unless someone manually strip libjvm.so > > Hmm, apologies once again if I have misunderstood how things work in > release builds. I have been using NMT in the last 2 weeks, albeit under > jdk8, and symbols were *not* displayed when running my local product > build (the traces simply displayed hex addresses). I configured the > build with > > --with-debug-level=release --disable-zip-debug-info Humm ... 
this should be enough to give you symbols. -Zhengyu > > Am I missing something obvious to account for the lack of symbols in my > trace output? > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From coleen.phillimore at oracle.com Tue Sep 5 14:13:06 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 5 Sep 2017 10:13:06 -0400 Subject: RFR: 8186789: CDS dump crashes at ConstantPool::resolve_class_constants In-Reply-To: <9C39BCF1-558F-41E8-AEC9-1E7997071A6E@oracle.com> References: <3C644DF0-2767-4992-8A82-DEC3786DB90B@oracle.com> <9C39BCF1-558F-41E8-AEC9-1E7997071A6E@oracle.com> Message-ID: This looks good to me also. Thanks, Coleen On 9/2/17 12:24 PM, Jiangli Zhou wrote: > Thanks, Serguei! > > Thanks, > Jiangli > >> On Sep 1, 2017, at 9:45 PM, "serguei.spitsyn at oracle.com" wrote: >> >> Hi Jiangli, >> >> It looks good to me. >> >> Thanks, >> Serguei >> >> >>> On 9/1/17 12:26, Jiangli Zhou wrote: >>> Hi, >>> >>> Please review the following fix for 8186789. >>> >>> webrev: http://cr.openjdk.java.net/~jiangli/8186789/webrev.00/ >>> bug: https://bugs.openjdk.java.net/browse/JDK-8186789 >>> >>> If a class fails verification due to missing dependencies at dump time, the constant pool _cache may be NULL. ConstantPool::resolve_class_constants() needs to check for that case. Also moved the function under #if INCLUDE_CDS_JAVA_HEAP, since it is only used when INCLUDE_CDS_JAVA_HEAP is enabled. >>> >>> Tested with JPRT and unit test case. 
>>> >>> Thanks, >>> Jiangli From zgu at redhat.com Tue Sep 5 14:22:39 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 5 Sep 2017 10:22:39 -0400 Subject: RFR(m): 8185712: [windows] Improve native symbol decoder In-Reply-To: References: <1400c761fcc34c37aa1e374790bb7d39@sap.com> Message-ID: <57fe156e-3d43-8e64-d6d5-bae9f55fd994@redhat.com> Hi Thomas, Looks good overall. symbolengine.cpp: Is there a reason to use ::malloc()/::free() instead of os::malloc()/os::free()? Line #602: buf might not be null-terminated if filename is longer than buffer len. Thanks, -Zhengyu On 09/05/2017 09:05 AM, Thomas Stüfe wrote: > Hi Goetz, > > thank you for your review! > > New Webrev: > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.02 > > Delta to last: > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.01-to-02/webrev/ > > The only change is that I removed the -XX:InitializeDbgHelpEarly switch to > avoid having to file a CSR. > > Please find further comments inline: > > > On Mon, Sep 4, 2017 at 5:08 PM, Lindenmaier, Goetz < > goetz.lindenmaier at sap.com> wrote: > >> Hi Thomas, >> >> I had a look at your change. Great somebody finally fixes >> the windows symbol printing, thanks a lot! >> >> The code looks good, I'm just not sure whether you >> need new files symbolengine.c|hpp. Isn't that >> just what should go to decoder_windows.h|cpp and >> class Decoder? >> You would also get rid of the redirections in decoder_windows.cpp. >> >> > As we discussed, I see your point, but would prefer to leave the change for > the moment as it is. > > A similar change to this one - doing away with the AbstractDecoder object > instantiation layer - will be coming for AIX, where it does not make much > sense either, and I propose to do a separate cleanup or simplification > change once that is done, merging decoder_windows.cpp and > symbolengine.cpp/hpp. 
Unless I hear more objections from other reviewers, > I'd prefer to do this in a later patch. > > >> In shutdown() you comment >> // There is no reason ever to shut down the decoder. >> ... I think you can remove that function altogether, i.e. also >> from the shared code, I don't see where it is ever called. >> >> > Totally agree... > > >> Also, I think, you can just delete Decoder::can_decode_C_frame_in_vm() >> from the code. The only place where it is used, in frame.cpp, >> calls dll_address_to_function_name(). This returns useful information >> also in the case of the NullDecoder, which now is the only one to >> return false in that function. >> > > Totally agree also here, but would also prefer both issues in a separate > change. In fact, Ioi opened a bug for this a while ago: > https://bugs.openjdk.java.net/browse/JDK-8144855 - and I would like to fix > it under that bug. Reason is, in this change, I'd like to avoid changing > shared sources as much as possible and keep this change windows only. > > >> >> Globals_windows.hpp needs Copyright adaptation, please. >> This is not introduced by your change, but maybe >> you can also fix the copyright in decoder.hpp, which >> says " 1997, 2015, 2017" ... should only name two >> years ... >> >> > Not needed anymore: since I removed the -XX:InitializeDbgHelpEarly switch, > globals_windows.hpp is reverted to its original state. Do you still want me > to fix the date? > > Thanks for the review work! > > ..Thomas > > >> Best regards, >> Goetz. >> >> >> >> >> >> >>> -----Original Message----- >>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>> bounces at openjdk.java.net] On Behalf Of Thomas Stüfe >>> Sent: Wednesday, 30 August 2017 14:34 >>> To: hotspot-runtime-dev at openjdk.java.net >>> Subject: RFR(m): 8185712: [windows] Improve native symbol decoder >>> >>> Hi all, >>> >>> May I please have reviews for the following change. 
>>> >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8185712 >>> Webrev: >>> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve- >>> native-symbol-resolver/webrev.01/webrev/ >>> >>> (This is the followup to: https://bugs.openjdk.java.net/ >> browse/JDK-8186349) >>> >>> ------------- >>> >>> Basically, this is a reimplementation of the layer around the Windows >>> Symbol API (the API used to resolve debug symbols). The old >>> implementation >>> had a number of errors and shortcomings which together caused the >>> Windows >>> native symbol resolution (and hence callstacks in error logs) to be a bit >>> of a lottery. The aim of this reimplementation is to make the code more >>> robust and easier to maintain. >>> >>> The problems with the existing implementation are listed in detail in the >>> bug description. >>> >>> The new implementation: >>> >>> - uses the new centralized WindowsDbgHelper class, which wraps the >>> dbghelp.dll loading, introduced with JDK-8186349 >>> >>> - Completely bypasses the "create two instances of AbstractDecoder class >>> and synchronize access to them" scheme in decoder.cpp. It does not make >>> sense for windows, where we have to synchronize each access to the >>> dbghelp.dll anyway - this is done one layer below in WindowsDbgHelper. >> The >>> static methods of the shared Decoder class now directly access the static >>> methods in the new SymbolEngine class, see decoder_windows.cpp. >>> >>> - The layer wrapping the Symbol API lives in the new symbolengine.cpp/hpp >>> files. The coding takes care of properly initializing (once) the symbol >> API >>> and of assembling the pdb search path. >>> >>> - Pdb search path construction is changed: where before we just added jdk >>> and jvm bin directories, we now just add all directories of all loaded >> DLLs >>> (which, of course, include the jdk and jvm bin directories). 
That way we >>> have a high chance of catching pdb files of third party libraries, as >> long >>> as they follow the convention of putting the pdb files beside the dlls. >>> This means it is easier to analyse crashes where third party DLLs are >>> involved. >>> >>> - On Windows, we now have source file and line number in the callstack. >>> >>> - There is a new parameter, diagnostic and windows-only, >>> called "InitializeDbgHelpEarly". That parameter is by default off. If on, >>> it causes the symbol engine to be initialized early, which increases the >>> chance of good callstacks later on (because the initialization does not >>> have to run in an error situation). >>> >>> - Added tests: gtests and a jtreg test which tests the callstack >> printing. >>> All tests windows only. There is no technical reason for making them >>> windows only, but I wanted to keep disturbances to other platforms to a >>> minimum and these kind of tests can be shaky. >>> >>> Thanks a lot for reviewing this! >>> >>> Kind Regards, Thomas >> From dmitry.samersoff at bell-sw.com Tue Sep 5 14:49:32 2017 From: dmitry.samersoff at bell-sw.com (dmitry.samersov) Date: Tue, 5 Sep 2017 17:49:32 +0300 Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup In-Reply-To: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com> References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com> Message-ID: Everybody, Please, review updated webrev: http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.06/ Only files below different from the previous webrev. src/share/vm/services/nmtCommon.hpp src/share/vm/utilities/nativeCallStack.cpp src/share/vm/utilities/nativeCallStack.hpp 1. Changes guarded by #ifndef PRODUCT 2. 
Addressed Thomas comments -Dmitry On 31.08.2017 10:49, dmitry.samersov wrote: > Everybody, > > Please review: > > http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.05/ > > I would propose different approach to fix JDK-8133740 > platform-independent way: record all frames but strip unnecessary > NMT-internal ones on printing. > > This approach is safe (we don't depend to compiler inlining and we never > strip non-NMT frames) and platform independent, but cost us some extra > memory. > > -Dmitry > > From adinn at redhat.com Tue Sep 5 14:58:57 2017 From: adinn at redhat.com (Andrew Dinn) Date: Tue, 5 Sep 2017 15:58:57 +0100 Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup In-Reply-To: References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com> Message-ID: <6a4b25c4-f841-dd99-1ede-29a42fe34c01@redhat.com> On 05/09/17 15:49, dmitry.samersov wrote: > Please, review updated webrev: > > http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.06/ > > Only files below different from the previous webrev. > > src/share/vm/services/nmtCommon.hpp > src/share/vm/utilities/nativeCallStack.cpp > src/share/vm/utilities/nativeCallStack.hpp > > > 1. Changes guarded by #ifndef PRODUCT > 2. Addressed Thomas comments Looks good. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From adinn at redhat.com Tue Sep 5 15:18:40 2017 From: adinn at redhat.com (Andrew Dinn) Date: Tue, 5 Sep 2017 16:18:40 +0100 Subject: RFR(S) 8186770: NMT: Report metadata information in NMT summary In-Reply-To: References: <83e0586b-aa58-084a-fdcf-428cf55669fe@redhat.com> <2e3ab1a3-03a5-9aa8-47f8-1224a00e9d0f@redhat.com> <72c3f342-654d-2fb2-58ac-959ad7f0de37@redhat.com> <4b0549a0-a4ec-0fce-a39e-d8c60e5d665d@redhat.com> <6f01f8a3-8911-d5fc-9208-7dcac5d1874b@redhat.com> Message-ID: <1e5afb73-8cb3-35aa-dad3-5fc7f8b25a43@redhat.com> On 29/08/17 17:31, Zhengyu Gu wrote: > Okay, I see what you mean. But in this case, capacity = committed. Well, it does not always seem to be exactly the same. If you add up all the pieces to derive the capacity then it sometimes seems to fall short of committed. I looked deeper into this and found that sometimes the difference is down to rounding up/down. However, there also seems occasionally to be more space unaccounted for that cannot be explained by rounding errors. I looked into your suggestion that this might be accounted for by 'dark matter' i.e. tail ends of a chunk left unused when the last block is carved out and the chunk retired because the tail is too small to insert into the block dictionary. However, from my reading of the code I think that any such 'dark matter' will still show up in the waste space count. Rather than hold up this current change I'd prefer to see it pushed and address the arithmetic problem in a follow-up issue. Even with an occasional small disparity in the reported figures I think it is really helpful to have this detailed info available as part of the NMT output. > I wonder if it is cleaner that just reports free, used and waste, e.g. 
> > ( Metadata: ) > ( reserved=22528KB, committed=21504KB) > ( 
used=20654KB) > ( free=786KB) > ( waste=64KB =0.30%) > > where free = (capacity - used) + free_chunks + available > waste = committed - capacity - free_chunks - available > total = committed Yes, I agree that it's ok to leave the available figure implicit -- it is easily computed from the committed total by subtracting used and waste (that's only correct modulo the occasional small disparity between capacity and committed but the difference is small enough not to be significant). So, I'm happy with this version. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From jiangli.zhou at oracle.com Tue Sep 5 15:21:22 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 5 Sep 2017 08:21:22 -0700 Subject: RFR: 8186789: CDS dump crashes at ConstantPool::resolve_class_constants In-Reply-To: References: <3C644DF0-2767-4992-8A82-DEC3786DB90B@oracle.com> <9C39BCF1-558F-41E8-AEC9-1E7997071A6E@oracle.com> Message-ID: Thanks, Coleen! Jiangli > On Sep 5, 2017, at 7:13 AM, coleen.phillimore at oracle.com wrote: > > This looks good to me also. > Thanks, > Coleen > > On 9/2/17 12:24 PM, Jiangli Zhou wrote: >> Thanks, Serguei! >> >> Thanks, >> Jiangli >> >>> On Sep 1, 2017, at 9:45 PM, "serguei.spitsyn at oracle.com" wrote: >>> >>> Hi Jiangli, >>> >>> It looks good to me. >>> >>> Thanks, >>> Serguei >>> >>> >>>> On 9/1/17 12:26, Jiangli Zhou wrote: >>>> Hi, >>>> >>>> Please review the following fix for 8186789. >>>> >>>> webrev: http://cr.openjdk.java.net/~jiangli/8186789/webrev.00/ >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8186789 >>>> >>>> If a class fails verification due to missing dependencies at dump time, the constant pool _cache may be NULL. ConstantPool::resolve_class_constants() needs to check for that case. 
Also moved the function under #if INCLUDE_CDS_JAVA_HEAP, since it is only used when INCLUDE_CDS_JAVA_HEAP is enabled. >>>> >>>> Tested with JPRT and unit test case. >>>> >>>> Thanks, >>>> Jiangli > From coleen.phillimore at oracle.com Tue Sep 5 15:23:15 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 5 Sep 2017 11:23:15 -0400 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> <269e2ea2-c7d9-048f-156b-e12d0238ee14@oracle.com> <660a2310-2e64-fa49-6857-236bb59b6293@oracle.com> <7461b194-8f3c-bd0d-264c-a0806bbcd759@oracle.com> Message-ID: <9a165cd2-b766-2b8d-e37a-424cd1ce2c52@oracle.com> Thanks Serguei, Coleen On 9/2/17 4:34 AM, serguei.spitsyn at oracle.com wrote: > Hi Coleen, > > It looks good. > At least, I do not see any issues with this fix. > > Thanks, > Serguei > > > On 8/30/17 04:14, coleen.phillimore at oracle.com wrote: >> >> Hi, I changed the edit for David to only use ordering semantics in >> the places where needed in the lock free access to pd_set. Since only >> contains_protection_domain is read lock free, it should be ok. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.04/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >> >> Thanks, >> Coleen >> >> On 8/29/17 2:28 AM, David Holmes wrote: >>> Hi Coleen, >>> >>> On 29/08/2017 5:39 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> On 8/28/17 3:38 PM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> Here is the third webrev with the names of pd_set and set_pd_set >>>>> renamed to pd_set_acquire and release_set_pd_set. 
>>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.03/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>> >>> This API should also be renamed: >>> >>> ! ProtectionDomainEntry* pd_set() const { return >>> _inner.pd_set_acquire(); } >>> ! void set_pd_set(ProtectionDomainEntry* new_head) { >>> _inner.release_set_pd_set(new_head); } >>> >>> These are the ones that need to give visibility to the fact we're >>> accessing things lock-free (if indeed we are). >>> >>> More below ... >>> >>>>> On 8/28/17 8:07 AM, coleen.phillimore at oracle.com wrote: >>>>>> On 8/28/17 12:25 AM, David Holmes wrote: >>>>>>> Hi Coleen, >>>>>>> >>>>>>> On 25/08/2017 11:26 PM, coleen.phillimore at oracle.com wrote: >>>>>>>> >>>>>>>> Thank you Zhengyu for noticing this change was wrong, and >>>>>>>> Christian for the idea. New webrev: >>>>>>>> >>>>>>>> open webrev at >>>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.02/webrev >>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>>> >>>>>>> The idea of a load-acquire accessor and release_store-setter is >>>>>>> fine in principle, but it seems to me that we now use these >>>>>>> everywhere, even if we may not need them because there is no >>>>>>> concurrent/lock-free access. Overall I find it very difficult to >>>>>>> determine what the concurrent access patterns are for a >>>>>>> Dictionary versus a DictionaryEntry, and which paths are in fact >>>>>>> lock and/or safepoint free, and may be racing with locked or >>>>>>> safepointed code. >>>>>> >>>>>> That's exactly the point of making them accessors. So one >>>>>> doesn't have to visit each individual call site and spend time >>>>>> answering the question for each case. And probably getting it >>>>>> wrong. The performance delta for these accesses is minimal >>>>>> since it's only getting the head of the list, not each element. 
>>>>>> >>>>>> Then it's also future proof so that if a lock is removed, then we >>>>>> don't miss one of the accessors at a later time. Note that >>>>>> observing bugs caused by this is very difficult to do, and can >>>>>> only be done by inspection. That's why I erred on the side of >>>>>> safety and consistency. >>> >>> Sorry, it may sound strange to say that I don't agree with "erring >>> on the side of safety and consistency" but I do not agree with just >>> using acquire/release semantics everywhere just in case! If we don't >>> know the lock-free paths then how can we possibly know things are >>> correct. The whole point of these accessors is to make it obvious >>> where the lock-free accesses are. >>> >>>>>>> >>>>>>> That aside I don't understand why you added a level of >>>>>>> indirection with the ProtectionDomainSet class? >>>>>> >>>>>> Only the code is a level of indirection not the access. That is >>>>>> to avoid what I said above. See Christian's and Zhengyu's comments. >>> >>> Okay - I see what you did but I would not expect to have to protect >>> _pd_set from direct use within its own class - anyone messing with >>> that class should be aware of the need to use the accessors. Though >>> I suppose this encapsulation is little different to defining the >>> field as some kind of "Atomic" type rather than a "raw" type. >>> >>> Thanks, >>> David >>> ----- >>> >>>>>>> >>>>>>> Also we have been trying to include release/acquire in the names >>>>>>> of such accessors so that it is clear when we are relying on >>>>>>> memory ordering properties ie. pd_set_acquire and >>>>>>> release_set_pd_set >>>>>>> >>>>>> >>>>>> I will change the names of these functions. >>>>>> >>>>>> thanks, >>>>>> Coleen >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> >>>>>>>> I reran parallel class loading tests and jck testing is in >>>>>>>> progress, but order access requires inspection. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>>> >>>>>>>> >>>>>>>> On 8/24/17 5:11 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On 8/24/17 5:00 PM, Christian Thalinger wrote: >>>>>>>>>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>>>>>>>>>>> Hi Coleen, >>>>>>>>>>>> >>>>>>>>>>>> There are two instances probably overlooked? >>>>>>>>>>>> >>>>>>>>>>>> dictionary.cpp #103 and #124 >>>>>>>>>>>> >>>>>>>>>>>> for (ProtectionDomainEntry* current = _pd_set; >>>>>>>>>>>> => >>>>>>>>>>>> for (ProtectionDomainEntry* current = pd_set(); >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> Oh yeah, you're right. That's embarrassing. I'll fix and >>>>>>>>>>> retest. >>>>>>>>>> Which also shows that there is a potential for future >>>>>>>>>> mistakes. Can we isolate the field better so it's only >>>>>>>>>> accessible via setter and getter? >>>>>>>>> >>>>>>>>> Yes, great idea. >>>>>>>>> Coleen >>>>>>>>> >>>>>>>>>>> Thank you!! >>>>>>>>>>> Coleen >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> -Zhengyu >>>>>>>>>>>> >>>>>>>>>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>>> Summary: Use load_acquire for accessing >>>>>>>>>>>>> DictionaryEntry::_pd_set since it's accessed outside the >>>>>>>>>>>>> SystemDictionary_lock >>>>>>>>>>>>> >>>>>>>>>>>>> Ran parallel class loading tests that we have as well as >>>>>>>>>>>>> tier1 tests. See bug for details. 
>>>>>>>>>>>>> >>>>>>>>>>>>> open webrev at >>>>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Coleen >>>>>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>>> >>>> >> > From zgu at redhat.com Tue Sep 5 15:28:52 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 5 Sep 2017 11:28:52 -0400 Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup In-Reply-To: References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com> Message-ID: <1a181bcf-0672-b5ba-3975-f6517a0dd4a9@redhat.com> Hi Dmitry, I have concerns about this change: although you only extend stack tracking for non-production builds, eliminating _NMT_NOINLINE_ actually affects production code. Have you tested a production build? Chris (cc'ed) worked on fixing stack walking before this change; we should get feedback from him. Thanks, -Zhengyu On 09/05/2017 10:49 AM, dmitry.samersov wrote: > Everybody, > > Please, review updated webrev: > > http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.06/ > > Only files below different from the previous webrev. > > src/share/vm/services/nmtCommon.hpp > src/share/vm/utilities/nativeCallStack.cpp > src/share/vm/utilities/nativeCallStack.hpp > > > 1. Changes guarded by #ifndef PRODUCT > 2. Addressed Thomas comments > > -Dmitry > > On 31.08.2017 10:49, dmitry.samersov wrote: >> Everybody, >> >> Please review: >> >> http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.05/ >> >> I would propose different approach to fix JDK-8133740 >> platform-independent way: record all frames but strip unnecessary >> NMT-internal ones on printing. >> >> This approach is safe (we don't depend to compiler inlining and we never >> strip non-NMT frames) and platform independent, but cost us some extra >> memory. 
>> >> -Dmitry >> >> > > From zgu at redhat.com Tue Sep 5 17:43:32 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 5 Sep 2017 13:43:32 -0400 Subject: RFR(S) 8186770: NMT: Report metadata information in NMT summary In-Reply-To: <1e5afb73-8cb3-35aa-dad3-5fc7f8b25a43@redhat.com> References: <83e0586b-aa58-084a-fdcf-428cf55669fe@redhat.com> <2e3ab1a3-03a5-9aa8-47f8-1224a00e9d0f@redhat.com> <72c3f342-654d-2fb2-58ac-959ad7f0de37@redhat.com> <4b0549a0-a4ec-0fce-a39e-d8c60e5d665d@redhat.com> <6f01f8a3-8911-d5fc-9208-7dcac5d1874b@redhat.com> <1e5afb73-8cb3-35aa-dad3-5fc7f8b25a43@redhat.com> Message-ID: Hi Andrew, Thanks for the review and suggestions. The webrev is updated according to the discussions. Webrev: http://cr.openjdk.java.net/~zgu/8186770/webrev.01/index.html The sample outputs: Summary: - Class (reserved=1074360KB, committed=28856KB) (classes #4028) (malloc=1208KB #16218) (mmap: reserved=1073152KB, committed=27648KB) ( Metadata: ) ( reserved=24576KB, committed=24320KB) ( used=20914KB) ( free=3295KB) ( waste=111KB =0.46%) ( Class space:) ( reserved=1048576KB, committed=3328KB) ( used=2649KB) ( free=679KB) ( waste=0KB =0.00%) Summary diff: - Class (reserved=1076455KB +2129KB, committed=29415KB +849KB) (classes #4037 +13) (malloc=1255KB +81KB #17477 +2214) (mmap: reserved=1075200KB +2048KB, committed=28160KB +768KB) ( Metadata: ) ( reserved=26624KB +2048KB, committed=24832KB +768KB) ( used=21368KB +718KB) ( free=3336KB -21KB) ( waste=128KB =0.52% +71KB) ( Class space:) ( reserved=1048576KB, committed=3328KB) ( used=2654KB +7KB) ( free=674KB -7KB) ( waste=0KB =0.00%) Thanks, -Zhengyu On 09/05/2017 11:18 AM, Andrew Dinn wrote: > On 29/08/17 17:31, Zhengyu Gu wrote: >> Okay, I see what you mean. But in this case, capacity = committed. > > Well, it does not always seem to be exactly the same. If you add up all > the pieces to derive the capacity then it sometimes seems to fall short > of committed. 
I looked deeper into this and found that sometimes the > difference is down to rounding up/down. However, there also seems > occasionally to be more space unaccounted for that cannot be explained > by rounding errors. > > I looked into your suggestion that this might be accounted for by 'dark > matter' i.e. tail ends of a chunk left unused when the last block is > carved out and the chunk retired because the tail is too small to insert > into the block dictionary. However, from my reading of the code I think > that any such 'dark matter' will still show up in the waste space count. > > Rather than hold up this current change I'd prefer to see it pushed and > address the arithmetic problem in a follow-up issue. Even with an > occasional small disparity in the reported figures I think it is really > helpful to have this detailed info available as part of the NMT output. > >> I wonder if it is cleaner to just report free, used and waste, e.g. >> >> ( Metadata: ) >> ( reserved=22528KB, committed=21504KB) >> ( used=20654KB) >> ( free=786KB) >> ( waste=64KB =0.30%) >> >> where free = (capacity - used) + free_chunks + available >> waste = committed - capacity - free_chunks - available >> total = committed > Yes, I agree that it's ok to leave the available figure implicit -- it > is easily computed from the committed total by subtracting used and > waste (that's only correct modulo the occasional small disparity between > capacity and committed but the difference is small enough not to be > significant). So, I'm happy with this version. > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 
03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From thomas.stuefe at gmail.com Tue Sep 5 18:20:27 2017 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 5 Sep 2017 20:20:27 +0200 Subject: RFR(m): 8185712: [windows] Improve native symbol decoder In-Reply-To: <57fe156e-3d43-8e64-d6d5-bae9f55fd994@redhat.com> References: <1400c761fcc34c37aa1e374790bb7d39@sap.com> <57fe156e-3d43-8e64-d6d5-bae9f55fd994@redhat.com> Message-ID: Hi Zhengyu, thanks a lot for the review! New webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.03/webrev/ Delta to last: http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.02-to-03/webrev/ Please find comments inline. On Tue, Sep 5, 2017 at 4:22 PM, Zhengyu Gu wrote: > Hi Thomas, > > Looks good overall. > > symbolengine.cpp: > > Is there a reason to use ::malloc()/::free() instead of > os::malloc()/os::free() ? > > Yes, this is deliberate, as is the use of Windows CriticalSection objects instead of os::mutex. This code gets executed during error handling, and I want to prevent circular errors. os::malloc() may crash and cause error handling to be reentered etc. > > Line #602: > buf might not be null-terminated if filename is longer than buffer len. > > > Good catch! Fixed and added regression tests (gtests). Thanks, Thomas > Thanks, > > -Zhengyu > > New Webrev: > > > > > > > > On 09/05/2017 09:05 AM, Thomas Stüfe wrote: > >> Hi Goetz, >> >> thank you for your review! >> >> New Webrev: >> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.02 >> >> Delta to last: >> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.01-to-02/webrev/ >> >> The only change is that I removed the -XX:InitializeDbgHelpEarly switch to >> avoid having to file a CSR. 
>> >> Please find further comments inline: >> >> >> On Mon, Sep 4, 2017 at 5:08 PM, Lindenmaier, Goetz < >> goetz.lindenmaier at sap.com> wrote: >> >> Hi Thomas, >>> >>> I had a look at your change. Great somebody finally fixes >>> the windows symbol printing, thanks a lot! >>> >>> The code looks good, I'm just not sure whether you >>> need new files symbolengine.c|hpp. Isn't that >>> just what should go to decoder_windows.h|cpp and >>> class Decoder? >>> You would also get rid of the redirections in decoder_windows.cpp. >>> >>> >>> As we discussed, I see your point, but would prefer to leave the change >> for >> the moment as it is. >> >> A similar change to this one - doing away with the AbstractDecoder object >> instantiation layer - will be coming for AIX, where it does not make much >> sense either, and I propose to do a separate cleanup or simplification >> change once that is done, merging decoder_windows.cpp and >> symbolengine.cpp/hpp. Unless I hear more objections from other reviewers, >> I'd prefer to do this in a later patch. >> >> >> In shutdown() you comment >>> // There is no reason ever to shut down the decoder. >>> ... I think you can remove that function altogether, i.e. also >>> from the shared code, I don't see where it is ever called. >>> >>> >>> Totally agree... >> >> >> Also, I think, you can just delete Decoder::can_decode_C_frame_in_vm() >>> from the code. The only place where it is used, in frame.cpp, >>> calls dll_address_to_duntion_name(). This returns useful information >>> also in the case of the NullDecoder, which now is the only one to >>> return false in that function. >>> >>> >> totally agree also here, but would also prefer both issues in a separate >> change. In fact, Ioi opened a bug for this a while ago: >> https://bugs.openjdk.java.net/browse/JDK-8144855 - and I would like to >> fix >> it under that bug. 
Reason is, in this change, I'd like to avoid changing >> shared sources as much as possible and keep this change windows only. >> >> >> >>> Globals_windows.hpp needs Copyright adaption, please. >>> This is not introduced by your change, but maybe >>> you can also fix the copyright in decoder.hpp, which >>> says " 1997, 2015, 2017" ... should only name two >>> years ... >>> >>> >>> Not needed anymore: since I removed the -XX:InitializeDbgHelpEarly >> switch, >> globals_windows.hpp is reverted to its original state. Do you still want >> me >> to fix the date? >> >> Thanks for the review work! >> >> ..Thomas >> >> >> Best regards, >>> Goetz. >>> >>> >>> >>> >>> >>> >>> -----Original Message----- >>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>>> bounces at openjdk.java.net] On Behalf Of Thomas St?fe >>>> Sent: Mittwoch, 30. August 2017 14:34 >>>> To: hotspot-runtime-dev at openjdk.java.net >>>> Subject: RFR(m): 8185712: [windows] Improve native symbol decoder >>>> >>>> Hi all, >>>> >>>> May I please have reviews for the following change. >>>> >>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8185712 >>>> Webrev: >>>> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve- >>>> native-symbol-resolver/webrev.01/webrev/ >>>> >>>> (This is the followup to: https://bugs.openjdk.java.net/ >>>> >>> browse/JDK-8186349) >>> >>>> >>>> ------------- >>>> >>>> Basically, this is a reimplementation of the layer around the Windows >>>> Symbol API (the API used to resolve debug symbols). The old >>>> implementation >>>> had a number of errors and shortcomings which together caused the >>>> Windows >>>> native symbol resolution (and hence callstacks in error logs) to be a >>>> bit >>>> of a lottery. The aim of this reimplementation is to make the code more >>>> robust and easier to maintain. >>>> >>>> The problems with the existing implementation are listed in detail in >>>> the >>>> bug description. 
>>>> >>>> The new implementation: >>>> >>>> - uses the new centralized WindowsDbgHelper class, which wraps the >>>> dbghelp.dll loading, introduced with JDK-8186349 >>>> >>>> - Completely bypasses the "create two instances of AbstractDecoder class >>>> and synchronize access to them" scheme in decoder.cpp. It does not make >>>> sense for windows, where we have to synchronize each access to the >>>> dbghelp.dll anyway - this is done one layer below in WindowsDbgHelper. >>>> >>> The >>> >>>> static methods of the shared Decoder class now directly access the >>>> static >>>> methods in the new SymbolEngine class, see decoder_windows.cpp. >>>> >>>> - The layer wrapping the Symbol API lives in the new >>>> symbolengine.cpp/hpp >>>> files. The coding takes care of properly initializing (once) the symbol >>>> >>> API >>> >>>> and of assembling the pdb search path. >>>> >>>> - Pdb search path construction is changed: where before we just added >>>> jdk >>>> and jvm bin directories, we now just add all directories of all loaded >>>> >>> DLLs >>> >>>> (which, of course, include the jdk and jvm bin directories). That way we >>>> have a high chance of catching pdb files of third party libraries, as >>>> >>> long >>> >>>> as they follow the convention of putting the pdb files beside the dlls. >>>> This means it is easier to analyse crashes where third party DLLs are >>>> involved. >>>> >>>> - On Windows, we now have source file and line number in the callstack. >>>> >>>> - There is a new parameter, diagnostic and windows-only, >>>> called "InitializeDbgHelpEarly". That parameter is by default off. If >>>> on, >>>> it causes the symbol engine to be initialized early, which increases the >>>> chance of good callstacks later on (because the initialization does not >>>> have to run in an error situation). >>>> >>>> - Added tests: gtests and a jtreg test which tests the callstack >>>> >>> printing. >>> >>>> All tests windows only. 
There is no technical reason for making them >>>> windows only, but I wanted to keep disturbances to other platforms to a >>>> minimum and these kind of tests can be shaky. >>>> >>>> Thanks a lot for reviewing this! >>>> >>>> Kind Regards, Thomas >>>> >>> >>> From chris.plummer at oracle.com Tue Sep 5 18:34:28 2017 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 5 Sep 2017 11:34:28 -0700 Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup In-Reply-To: <1a181bcf-0672-b5ba-3975-f6517a0dd4a9@redhat.com> References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com> <1a181bcf-0672-b5ba-3975-f6517a0dd4a9@redhat.com> Message-ID: Hi Dmitry, I've looked over the changes and some of the comments so far, and do agree with Zhengyu regarding removal of _NMT_NOINLINE_, but I also have concerns about other platform-dependent code you have removed. _NMT_NOINLINE_ is only defined for slowdebug builds. You now are instead trying to change the frame skipping logic to be based on the PRODUCT flag. fastdebug builds do not set the PRODUCT flag, so you cannot possibly get the frame skipping for both slowdebug and fastdebug builds by checking the PRODUCT flag since they each have different frame skipping requirements. Since we have/had no other way of telling the difference between slowdebug and fastdebug builds, I continued to leverage the _NMT_NOINLINE_ flag for this. NMT_InternalFrames should not always be 3 for non-product builds. It should not only vary between slowdebug and fastdebug, but also vary between platforms. This is due both to compiler differences and code differences. That's why there is code like this:
  // We need to skip the NativeCallStack::NativeCallStack frame if a tail call is NOT used
  // to call os::get_native_stack. A tail call is used if _NMT_NOINLINE_ is not defined
  // (which means this is not a slowdebug build), and we are on 64-bit (except Windows).
  // This is not necessarily a rule, but what has been observed to date.
  #define TAIL_CALL (!defined(_NMT_NOINLINE_) && !defined(WINDOWS) && defined(_LP64))
  #if !TAIL_CALL
    toSkip++;
  #if (defined(_NMT_NOINLINE_) && defined(BSD) && defined(_LP64))
    // Mac OS X slowdebug builds have this odd behavior where NativeCallStack::NativeCallStack
    // appears as two frames, so we need to skip an extra frame.
    toSkip++;
  #endif
  #endif
And also in _get_previous_fp(), which varies in name and implementation for various os/cpu combos, the logic for the number of frames to skip is not always the same. You noted that: > 1. This patch doesn't affect product build. On product build we have all > NMT frames inlined and don't need to skip anything. It's not true that for product builds we don't need to skip anything. The code above indicates the tail call differences on some platforms, which require skipping a frame in product builds on some platforms, but not others. Thomas wrote: > Code is easier to read now and less vulnerable > to compiler decisions. The reality is that with your changes compiler decisions are being ignored, as are implementation decisions. You are always skipping 3 frames when it's a non-product build, and that isn't uniformly correct for all platforms and for slowdebug vs fastdebug. I think your NMT_InternalFrames solution makes parts of the code easier to read, but in order to be correct its computation needs to be platform dependent, and also account for slowdebug/fastdebug differences. I did write an NMT test to try to make sure frame skipping is correct. It's called CheckForProperDetailStackTrace.java. However, I doubt it's 100% reliable. You should run it with all 3 build flavors: release, slowdebug, fastdebug. And you need to run it on all supported platforms. For linux-arm32, it would be good to actually use an -marm build instead of -mthumb since we don't even get stack traces with -mthumb. 
thanks, Chris On 9/5/17 8:28 AM, Zhengyu Gu wrote: > Hi Dmitry, > > I have concerns on this change: > > Although, you only extend tracking stacks for none-production build, > eliminating _NMT_NOINLINE_ actually affect production code. Have you > tested production build? > > Chris (cc'ed) worked on fixing stack walking before this change, we > should get a feedback from him. > > Thanks, > > -Zhengyu > > > On 09/05/2017 10:49 AM, dmitry.samersov wrote: >> Everybody, >> >> Please, review updated webrev: >> >> http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.06/ >> >> Only files below different from the previous webrev. >> >> src/share/vm/services/nmtCommon.hpp >> src/share/vm/utilities/nativeCallStack.cpp >> src/share/vm/utilities/nativeCallStack.hpp >> >> >> 1. Changes guarded by #ifndef PRODUCT >> 2. Addressed Thomas comments >> >> -Dmitry >> >> On 31.08.2017 10:49, dmitry.samersov wrote: >>> Everybody, >>> >>> Please review: >>> >>> http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.05/ >>> >>> I would propose different approach to fix JDK-8133740 >>> platform-independent way: record all frames but strip unnecessary >>> NMT-internal ones on printing. >>> >>> This approach is safe (we don't depend to compiler inlining and we >>> never >>> strip non-NMT frames) and platform independent, but cost us some extra >>> memory. >>> >>> -Dmitry >>> >>> >> >> From dmitry.samersoff at bell-sw.com Wed Sep 6 07:26:44 2017 From: dmitry.samersoff at bell-sw.com (dmitry.samersov) Date: Wed, 6 Sep 2017 10:26:44 +0300 Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup In-Reply-To: References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com> <1a181bcf-0672-b5ba-3975-f6517a0dd4a9@redhat.com> Message-ID: <7f190cb7-c5b8-8f15-f26f-5e558666e7f5@bell-sw.com> Chris, > You are always skipping 3 > frames when it's a non-product build, and that isn't uniformly correct > for all platforms and for slowdebug vs fastdebug. 
NMT_InternalFrames is the *maximum possible* number of internal frames. We check *the name of each frame* before we skip it, so we don't need to know how many frames were inlined in a particular build by a particular compiler; this is the main idea of the fix. I.e., it's safe to set (e.g.) NMT_InternalFrames = 10 and keep this logic in production. I guard the changes with #ifndef PRODUCT to avoid memory overhead in production, but if we can afford the memory/performance penalty of storing 3 or 4 extra frames, it's better to keep this logic enabled even in a product build. *Testing:* I tested 3 builds (release, fastdebug, slowdebug) on two platforms (Linux/x86_64, Linux/aarch64). All hotspot/runtime/NMT tests pass, including CheckForProperDetailStackTrace. Also, a manual comparison of stack traces on Linux/x86_64 doesn't show any changes in output. -Dmitry On 05.09.2017 21:34, Chris Plummer wrote: > Hi Dmitry, > > I've looked over the changes and some of the comments so far, and do > agree with Zhengyu regarding removal of _NMT_NOINLINE_, but I also have > concerns about other platform dependent code you have removed. > > _NMT_NOINLINE_ is only defined for slowdebug builds. You now are instead > trying to change the frame skipping logic to be based on the PRODUCT > flag. fastdebug builds do not set the PRODUCT flag, so you cannot > possibly get the frame skipping for both slowdebug and fastdebug builds > by checking the PRODUCT flag since they each have different frame > skipping requirements. Since we have/had no other way of telling the > difference between slowdebug and fastdebug builds, I continued to > leverage the _NMT_NOINLINE_ flag for this. > > NMT_InternalFrames should not always be 3 for non product builds. > It should not only vary between slowdebug and fastdebug, but also > vary between platforms. This is due both to compiler differences and > code differences. 
That's why there is code like this: > > 36 // We need to skip the NativeCallStack::NativeCallStack frame > if a tail call is NOT used > 37 // to call os::get_native_stack. A tail call is used if > _NMT_NOINLINE_ is not defined > 38 // (which means this is not a slowdebug build), and we are on > 64-bit (except Windows). > 39 // This is not necessarily a rule, but what has been obvserved > to date. > 40 #define TAIL_CALL (!defined(_NMT_NOINLINE_) && !defined(WINDOWS) && > defined(_LP64)) > 41 #if !TAIL_CALL > 42 toSkip++; > 43 #if (defined(_NMT_NOINLINE_) && defined(BSD) && defined(_LP64)) > 44 // Mac OS X slowdebug builds have this odd behavior where > NativeCallStack::NativeCallStack > 45 // appears as two frames, so we need to skip an extra frame. > 46 toSkip++; > 47 #endif > 48 #endif > > And also in _get_previous_fp(), which varies in name and implementation > for various os/cpu combos, the logic for the number of frames to skip is > not always the same. > > You noted that: >> 1. This patch doesn't affect product build. On product build we have all >> NMT frames inlined and don't need to skip anything. > It's not true that for product builds we don't need to skip anything. > The code above indicates the tail call differences on some platforms, > which requires skipping a frame in product builds on some platforms, but > not others. > > Thomas wrote: >> Code is easier to read now and less vulnerable >> to compiler decisions. > When the reality is that with your changes compiler decisions are being > ignored, as are implementation decisions. You are always skipping 3 > frames when it's a non-product build, and that isn't uniformly correct > for all platforms and for slowdebug vs fastdebug. > > I think your NMT_InternalFrames solution makes parts of the code easier > to read, but in order to be correct its computation needs to be platform > dependent, and also account for slowdebug/fastdebug differences. 
> > I did write an NMT test to try to make sure frame skipping is correct. > It's called CheckForProperDetailStackTrace.java. However, I doubt it's > 100% reliable. You should run it with all 3 build flavors: release, > slowdebug, fastdebug. And you need to run it on all supported platforms. > For linux-arm32, it would be good to actually use an -marm build instead > of -mthumb since we don't even get stack traces with -mthumb. > > thanks, > > Chris > > On 9/5/17 8:28 AM, Zhengyu Gu wrote: >> Hi Dmitry, >> >> I have concerns on this change: >> >> Although, you only extend tracking stacks for none-production build, >> eliminating _NMT_NOINLINE_ actually affect production code. Have you >> tested production build? >> >> Chris (cc'ed) worked on fixing stack walking before this change, we >> should get a feedback from him. >> >> Thanks, >> >> -Zhengyu >> >> >> On 09/05/2017 10:49 AM, dmitry.samersov wrote: >>> Everybody, >>> >>> Please, review updated webrev: >>> >>> http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.06/ >>> >>> Only files below different from the previous webrev. >>> >>> src/share/vm/services/nmtCommon.hpp >>> src/share/vm/utilities/nativeCallStack.cpp >>> src/share/vm/utilities/nativeCallStack.hpp >>> >>> >>> 1. Changes guarded by #ifndef PRODUCT >>> 2. Addressed Thomas comments >>> >>> -Dmitry >>> >>> On 31.08.2017 10:49, dmitry.samersov wrote: >>>> Everybody, >>>> >>>> Please review: >>>> >>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.05/ >>>> >>>> I would propose different approach to fix JDK-8133740 >>>> platform-independent way: record all frames but strip unnecessary >>>> NMT-internal ones on printing. >>>> >>>> This approach is safe (we don't depend to compiler inlining and we >>>> never >>>> strip non-NMT frames) and platform independent, but cost us some extra >>>> memory. 
>>>> >>>> -Dmitry >>>> >>>> >>> >>> > From goetz.lindenmaier at sap.com Wed Sep 6 08:18:58 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 6 Sep 2017 08:18:58 +0000 Subject: RFR(m): 8185712: [windows] Improve native symbol decoder In-Reply-To: References: <1400c761fcc34c37aa1e374790bb7d39@sap.com> Message-ID: Hi Thomas, I had a look at the new webrev you sent after Zhengyu's comments. I appreciate the new tests. Looks good. I still think removal of Decoder::can_decode_C_frame_in_vm() should go into this change, because windows was the only platform to use this. If you insist put it in a change of its own, but to me it seems odd to leave this in the code in your change. Best regards, Goetz. > -----Original Message----- > From: Thomas St?fe [mailto:thomas.stuefe at gmail.com] > Sent: Dienstag, 5. September 2017 15:06 > To: Lindenmaier, Goetz > Cc: hotspot-runtime-dev at openjdk.java.net; Ioi Lam > Subject: Re: RFR(m): 8185712: [windows] Improve native symbol decoder > > Hi Goetz, > > thank you for your review! > > New Webrev: > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve- > native-symbol-resolver/webrev.02 > > > Delta to last: > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve- > native-symbol-resolver/webrev.01-to-02/webrev/ > > > The only change is that I removed the -XX:InitializeDbgHelpEarly switch to > avoid having to file a CSR. > > Please find further comments inline: > > > On Mon, Sep 4, 2017 at 5:08 PM, Lindenmaier, Goetz > > > wrote: > > > Hi Thomas, > > I had a look at your change. Great somebody finally fixes > the windows symbol printing, thanks a lot! > > The code looks good, I'm just not sure whether you > need new files symbolengine.c|hpp. Isn't that > just what should go to decoder_windows.h|cpp and > class Decoder? > You would also get rid of the redirections in decoder_windows.cpp. 
> > > > > As we discussed, I see your point, but would prefer to leave the change for > the moment as it is. > > A similar change to this one - doing away with the AbstractDecoder object > instantiation layer - will be coming for AIX, where it does not make much > sense either, and I propose to do a separate cleanup or simplification change > once that is done, merging decoder_windows.cpp and > symbolengine.cpp/hpp. Unless I hear more objections from other reviewers, > I'd prefer to do this in a later patch. > > > > In shutdown() you comment > // There is no reason ever to shut down the decoder. > ... I think you can remove that function altogether, i.e. also > from the shared code, I don't see where it is ever called. > > > > > Totally agree... > > > Also, I think, you can just delete > Decoder::can_decode_C_frame_in_vm() > from the code. The only place where it is used, in frame.cpp, > calls dll_address_to_duntion_name(). This returns useful information > also in the case of the NullDecoder, which now is the only one to > return false in that function. > > > > totally agree also here, but would also prefer both issues in a separate > change. In fact, Ioi opened a bug for this a while ago: > https://bugs.openjdk.java.net/browse/JDK-8144855 - and I would like to fix > it under that bug. Reason is, in this change, I'd like to avoid changing shared > sources as much as possible and keep this change windows only. > > > > Globals_windows.hpp needs Copyright adaption, please. > This is not introduced by your change, but maybe > you can also fix the copyright in decoder.hpp, which > says " 1997, 2015, 2017" ... should only name two > years ... > > > > > Not needed anymore: since I removed the -XX:InitializeDbgHelpEarly switch, > globals_windows.hpp is reverted to its original state. Do you still want me to > fix the date? > > Thanks for the review work! > > ..Thomas > > > Best regards, > Goetz. 
> > > > > > > > > -----Original Message----- > > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > > > bounces at openjdk.java.net ] > On Behalf Of Thomas St?fe > > Sent: Mittwoch, 30. August 2017 14:34 > > To: hotspot-runtime-dev at openjdk.java.net runtime-dev at openjdk.java.net> > > Subject: RFR(m): 8185712: [windows] Improve native symbol > decoder > > > > Hi all, > > > > May I please have reviews for the following change. > > > > Issue: https://bugs.openjdk.java.net/browse/JDK-8185712 > > > Webrev: > > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows- > improve- improve-> > > native-symbol-resolver/webrev.01/webrev/ > > > > (This is the followup to: > https://bugs.openjdk.java.net/browse/JDK-8186349 > ) > > > > ------------- > > > > Basically, this is a reimplementation of the layer around the > Windows > > Symbol API (the API used to resolve debug symbols). The old > > implementation > > had a number of errors and shortcomings which together caused > the > > Windows > > native symbol resolution (and hence callstacks in error logs) to be a > bit > > of a lottery. The aim of this reimplementation is to make the code > more > > robust and easier to maintain. > > > > The problems with the existing implementation are listed in detail > in the > > bug description. > > > > The new implementation: > > > > - uses the new centralized WindowsDbgHelper class, which wraps > the > > dbghelp.dll loading, introduced with JDK-8186349 > > > > - Completely bypasses the "create two instances of > AbstractDecoder class > > and synchronize access to them" scheme in decoder.cpp. It does > not make > > sense for windows, where we have to synchronize each access to > the > > dbghelp.dll anyway - this is done one layer below in > WindowsDbgHelper. The > > static methods of the shared Decoder class now directly access the > static > > methods in the new SymbolEngine class, see > decoder_windows.cpp. 
> > > > - The layer wrapping the Symbol API lives in the new > symbolengine.cpp/hpp > > files. The coding takes care of properly initializing (once) the symbol > API > > and of assembling the pdb search path. > > > > - Pdb search path construction is changed: where before we just > added jdk > > and jvm bin directories, we now just add all directories of all loaded > DLLs > > (which, of course, include the jdk and jvm bin directories). That way > we > > have a high chance of catching pdb files of third party libraries, as > long > > as they follow the convention of putting the pdb files beside the > dlls. > > This means it is easier to analyse crashes where third party DLLs are > > involved. > > > > - On Windows, we now have source file and line number in the > callstack. > > > > - There is a new parameter, diagnostic and windows-only, > > called "InitializeDbgHelpEarly". That parameter is by default off. If > on, > > it causes the symbol engine to be initialized early, which increases > the > > chance of good callstacks later on (because the initialization does > not > > have to run in an error situation). > > > > - Added tests: gtests and a jtreg test which tests the callstack > printing. > > All tests windows only. There is no technical reason for making > them > > windows only, but I wanted to keep disturbances to other > platforms to a > > minimum and these kind of tests can be shaky. > > > > Thanks a lot for reviewing this! > > > > Kind Regards, Thomas > > From zgu at redhat.com Wed Sep 6 12:32:48 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 6 Sep 2017 08:32:48 -0400 Subject: RFR(m): 8185712: [windows] Improve native symbol decoder In-Reply-To: References: <1400c761fcc34c37aa1e374790bb7d39@sap.com> <57fe156e-3d43-8e64-d6d5-bae9f55fd994@redhat.com> Message-ID: <785edd28-0e47-5d2d-26ad-9d63fbabe791@redhat.com> Looks good to me. -Zhengyu On 09/05/2017 02:20 PM, Thomas St?fe wrote: > Hi Zengyu, > > thanks a lot for the review! 
> > New webrev: > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.03/webrev/ > > Delta to last: > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.02-to-03/webrev/ > > Please find comments inline. > > On Tue, Sep 5, 2017 at 4:22 PM, Zhengyu Gu > wrote: > > Hi Thomas, > > Looks good overall. > > symbolengine.cpp: > > Is there a reason to use ::malloc()/::free() instead of > os::malloc()/os::free() ? > > > Yes, this is deliberate, as is the use of Windows CriticalSection > objects instead of os::mutex. This code gets executed during error > handling, and I want to prevent circular errors. os::malloc() may crash > and cause error handling to be reentered etc. > > > Line #602: > buf might not be null-terminated if filename is longer than > buffer len. > > > > Good catch! Fixed and added regression tests (gtests). > > Thanks, Thomas > > Thanks, > > -Zhengyu > > > New Webrev: > > > > > > > > > > On 09/05/2017 09:05 AM, Thomas Stüfe wrote: > > Hi Goetz, > > thank you for your review! > > New Webrev: > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.02 > > > Delta to last: > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.01-to-02/webrev/ > > > The only change is that I removed the -XX:InitializeDbgHelpEarly > switch to > avoid having to file a CSR. > > Please find further comments inline: > > > On Mon, Sep 4, 2017 at 5:08 PM, Lindenmaier, Goetz < > goetz.lindenmaier at sap.com > wrote: > > Hi Thomas, > > I had a look at your change. Great somebody finally fixes > the windows symbol printing, thanks a lot! > > The code looks good, I'm just not sure whether you > need new files symbolengine.c|hpp. Isn't that > just what should go to decoder_windows.h|cpp and > class Decoder? > You would also get rid of the redirections in > decoder_windows.cpp. 
> > > As we discussed, I see your point, but would prefer to leave the > change for > the moment as it is. > > A similar change to this one - doing away with the > AbstractDecoder object > instantiation layer - will be coming for AIX, where it does not > make much > sense either, and I propose to do a separate cleanup or > simplification > change once that is done, merging decoder_windows.cpp and > symbolengine.cpp/hpp. Unless I hear more objections from other > reviewers, > I'd prefer to do this in a later patch. > > > In shutdown() you comment > // There is no reason ever to shut down the decoder. > ... I think you can remove that function altogether, i.e. also > from the shared code, I don't see where it is ever called. > > > Totally agree... > > > Also, I think, you can just delete > Decoder::can_decode_C_frame_in_vm() > from the code. The only place where it is used, in frame.cpp, > calls dll_address_to_duntion_name(). This returns useful > information > also in the case of the NullDecoder, which now is the only > one to > return false in that function. > > > totally agree also here, but would also prefer both issues in a > separate > change. In fact, Ioi opened a bug for this a while ago: > https://bugs.openjdk.java.net/browse/JDK-8144855 > - and I would > like to fix > it under that bug. Reason is, in this change, I'd like to avoid > changing > shared sources as much as possible and keep this change windows > only. > > > > Globals_windows.hpp needs Copyright adaption, please. > This is not introduced by your change, but maybe > you can also fix the copyright in decoder.hpp, which > says " 1997, 2015, 2017" ... should only name two > years ... > > > Not needed anymore: since I removed the > -XX:InitializeDbgHelpEarly switch, > globals_windows.hpp is reverted to its original state. Do you > still want me > to fix the date? > > Thanks for the review work! > > ..Thomas > > > Best regards, > Goetz. 
> > > > > > > -----Original Message----- > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > > bounces at openjdk.java.net > ] On Behalf Of Thomas St?fe > Sent: Mittwoch, 30. August 2017 14:34 > To: hotspot-runtime-dev at openjdk.java.net > > Subject: RFR(m): 8185712: [windows] Improve native > symbol decoder > > Hi all, > > May I please have reviews for the following change. > > Issue: https://bugs.openjdk.java.net/browse/JDK-8185712 > > Webrev: > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve- > > native-symbol-resolver/webrev.01/webrev/ > > (This is the followup to: https://bugs.openjdk.java.net/ > > browse/JDK-8186349) > > > ------------- > > Basically, this is a reimplementation of the layer > around the Windows > Symbol API (the API used to resolve debug symbols). The old > implementation > had a number of errors and shortcomings which together > caused the > Windows > native symbol resolution (and hence callstacks in error > logs) to be a bit > of a lottery. The aim of this reimplementation is to > make the code more > robust and easier to maintain. > > The problems with the existing implementation are listed > in detail in the > bug description. > > The new implementation: > > - uses the new centralized WindowsDbgHelper class, which > wraps the > dbghelp.dll loading, introduced with JDK-8186349 > > - Completely bypasses the "create two instances of > AbstractDecoder class > and synchronize access to them" scheme in decoder.cpp. > It does not make > sense for windows, where we have to synchronize each > access to the > dbghelp.dll anyway - this is done one layer below in > WindowsDbgHelper. > > The > > static methods of the shared Decoder class now directly > access the static > methods in the new SymbolEngine class, see > decoder_windows.cpp. > > - The layer wrapping the Symbol API lives in the new > symbolengine.cpp/hpp > files. 
The coding takes care of properly initializing > (once) the symbol > > API > > and of assembling the pdb search path. > > - Pdb search path construction is changed: where before > we just added jdk > and jvm bin directories, we now just add all directories > of all loaded > > DLLs > > (which, of course, include the jdk and jvm bin > directories). That way we > have a high chance of catching pdb files of third party > libraries, as > > long > > as they follow the convention of putting the pdb files > beside the dlls. > This means it is easier to analyse crashes where third > party DLLs are > involved. > > - On Windows, we now have source file and line number in > the callstack. > > - There is a new parameter, diagnostic and windows-only, > called "InitializeDbgHelpEarly". That parameter is by > default off. If on, > it causes the symbol engine to be initialized early, > which increases the > chance of good callstacks later on (because the > initialization does not > have to run in an error situation). > > - Added tests: gtests and a jtreg test which tests the > callstack > > printing. > > All tests windows only. There is no technical reason for > making them > windows only, but I wanted to keep disturbances to other > platforms to a > minimum and these kind of tests can be shaky. > > Thanks a lot for reviewing this! > > Kind Regards, Thomas > > > From thomas.stuefe at gmail.com Wed Sep 6 12:37:30 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 6 Sep 2017 14:37:30 +0200 Subject: RFR(m): 8185712: [windows] Improve native symbol decoder In-Reply-To: References: <1400c761fcc34c37aa1e374790bb7d39@sap.com> Message-ID: Hi Goetz, On Wed, Sep 6, 2017 at 10:18 AM, Lindenmaier, Goetz < goetz.lindenmaier at sap.com> wrote: > Hi Thomas, > > I had a look at the new webrev you sent after Zhengyu's comments. > I appreciate the new tests. Looks good. 
>
> I still think removal of Decoder::can_decode_C_frame_in_vm() should
> go into this change, because windows was the only platform to use this.
> If you insist, put it in a change of its own, but to me it seems odd to
> leave this in the code in your change.
>
> Best regards,
> Goetz.
>

Okay, you convinced me. I removed both Decoder::can_decode_C_frame_in_vm()
and Decoder::shutdown() as you suggested in your earlier review.

New Webrev:
http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.04/webrev/index.html

Delta:
http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.03-to-04/webrev/index.html

Note to other reviewers: This new webrev just removes dead code, it should
not have any functional change over webrev.03.

I built on Linux x64, AIX, MacOS and Windows (32/64bit) and ran gtests on
these platforms. Will run jtreg tests tonight.

Thanks, Thomas

From goetz.lindenmaier at sap.com Wed Sep 6 13:11:26 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 6 Sep 2017 13:11:26 +0000
Subject: RFR(m): 8185712: [windows] Improve native symbol decoder
In-Reply-To:
References: <1400c761fcc34c37aa1e374790bb7d39@sap.com>
Message-ID: <85c13adbd5564c88b3d4cb70b0523180@sap.com>

Hi Thomas,

thanks for removing all that useless code. Looks perfect now :)

Best regards,
Goetz.

> -----Original Message-----
> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> Sent: Wednesday, 6 September 2017 14:38
> To: Lindenmaier, Goetz
> Cc: hotspot-runtime-dev at openjdk.java.net; Ioi Lam; Zhengyu Gu
> Subject: Re: RFR(m): 8185712: [windows] Improve native symbol decoder
>
> Hi Goetz,
>
> On Wed, Sep 6, 2017 at 10:18 AM, Lindenmaier, Goetz wrote:
>
> Hi Thomas,
>
> I had a look at the new webrev you sent after Zhengyu's comments.
> I appreciate the new tests. Looks good.
>
> I still think removal of Decoder::can_decode_C_frame_in_vm() should
> go into this change, because windows was the only platform to use this.
> If you insist, put it in a change of its own, but to me it seems odd to
> leave this in the code in your change.
>
> Best regards,
> Goetz.
>
> Okay, you convinced me. I removed both
> Decoder::can_decode_C_frame_in_vm() and Decoder::shutdown() as you
> suggested in your earlier review.
>
> New Webrev:
> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.04/webrev/index.html
>
> Delta:
> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.03-to-04/webrev/index.html
>
> Note to other reviewers: This new webrev just removes dead code, it should
> not have any functional change over webrev.03.
>
> I built on Linux x64, AIX, MacOS and Windows (32/64bit) and ran gtests on
> these platforms. Will run jtreg tests tonight.
>
> Thanks, Thomas

From thomas.stuefe at gmail.com Wed Sep 6 13:16:32 2017
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Wed, 6 Sep 2017 15:16:32 +0200
Subject: RFR(m): 8185712: [windows] Improve native symbol decoder
In-Reply-To: <85c13adbd5564c88b3d4cb70b0523180@sap.com>
References: <1400c761fcc34c37aa1e374790bb7d39@sap.com> <85c13adbd5564c88b3d4cb70b0523180@sap.com>
Message-ID:

Great, thank you!

On Wed, Sep 6, 2017 at 3:11 PM, Lindenmaier, Goetz <goetz.lindenmaier at sap.com> wrote:

> Hi Thomas,
>
> thanks for removing all that useless code. Looks perfect now :)
>
> Best regards,
> Goetz.
>
> > -----Original Message-----
> > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> > Sent: Wednesday, 6 September 2017 14:38
> > To: Lindenmaier, Goetz
> > Cc: hotspot-runtime-dev at openjdk.java.net; Ioi Lam; Zhengyu Gu
> > Subject: Re: RFR(m): 8185712: [windows] Improve native symbol decoder
> >
> > Hi Goetz,
> >
> > On Wed, Sep 6, 2017 at 10:18 AM, Lindenmaier, Goetz wrote:
> >
> > Hi Thomas,
> >
> > I had a look at the new webrev you sent after Zhengyu's comments.
> > I appreciate the new tests. Looks good.
> >
> > I still think removal of Decoder::can_decode_C_frame_in_vm() should
> > go into this change, because windows was the only platform to use this.
> > If you insist, put it in a change of its own, but to me it seems odd to
> > leave this in the code in your change.
> >
> > Best regards,
> > Goetz.
> >
> > Okay, you convinced me. I removed both
> > Decoder::can_decode_C_frame_in_vm() and Decoder::shutdown() as you
> > suggested in your earlier review.
> >
> > New Webrev:
> > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.04/webrev/index.html
> >
> > Delta:
> > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.03-to-04/webrev/index.html
> >
> > Note to other reviewers: This new webrev just removes dead code, it
> > should not have any functional change over webrev.03.
> >
> > I built on Linux x64, AIX, MacOS and Windows (32/64bit) and ran gtests
> > on these platforms. Will run jtreg tests tonight.
> >
> > Thanks, Thomas

From coleen.phillimore at oracle.com Wed Sep 6 13:40:46 2017
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 6 Sep 2017 09:40:46 -0400
Subject: RFR(m): 8185712: [windows] Improve native symbol decoder
In-Reply-To:
References: <1400c761fcc34c37aa1e374790bb7d39@sap.com> <85c13adbd5564c88b3d4cb70b0523180@sap.com>
Message-ID: <161eca29-6060-7896-4bf0-d3b334466e4b@oracle.com>

I will sponsor this for you, but remind me.

Thanks,
Coleen

On 9/6/17 9:16 AM, Thomas Stüfe wrote:
> Great, thank you!
>
> On Wed, Sep 6, 2017 at 3:11 PM, Lindenmaier, Goetz <goetz.lindenmaier at sap.com> wrote:
>
>> Hi Thomas,
>>
>> thanks for removing all that useless code. Looks perfect now :)
>>
>> Best regards,
>> Goetz.
>>
>>> -----Original Message-----
>>> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
>>> Sent: Wednesday, 6 September 2017 14:38
>>> To: Lindenmaier, Goetz
>>> Cc: hotspot-runtime-dev at openjdk.java.net; Ioi Lam; Zhengyu Gu
>>> Subject: Re: RFR(m): 8185712: [windows] Improve native symbol decoder
>>>
>>> Hi Goetz,
>>>
>>> On Wed, Sep 6, 2017 at 10:18 AM, Lindenmaier, Goetz wrote:
>>>
>>> Hi Thomas,
>>>
>>> I had a look at the new webrev you sent after Zhengyu's comments.
>>> I appreciate the new tests. Looks good.
From chris.plummer at oracle.com Wed Sep 6 21:20:48 2017
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 6 Sep 2017 14:20:48 -0700
Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup
In-Reply-To: <7f190cb7-c5b8-8f15-f26f-5e558666e7f5@bell-sw.com>
References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com> <1a181bcf-0672-b5ba-3975-f6517a0dd4a9@redhat.com> <7f190cb7-c5b8-8f15-f26f-5e558666e7f5@bell-sw.com>
Message-ID: <11067613-f47e-b1a2-45d7-884ccd01484e@oracle.com>

On 9/6/17 12:26 AM, dmitry.samersov wrote:
> Chris,
>
>> You are always skipping 3
>> frames when it's a non-product build, and that isn't uniformly correct
>> for all platforms and for slowdebug vs fastdebug.
> NMT_InternalFrames means *maximum possible* number of internal frames.
>
> We check *a name of the frame* before we skip it, so we don't need to
> know how many frames got inlined on in particular build by particular
> compiler and this is the main idea of the fix.
>
> i.e. It's safe to set (e.g.) NMT_InternalFrames = 10 and keep this logic
> in production.
>
> I guard the changes by #ifndef PRODUCT to avoid memory overhead in
> production but if we can afford memory/performance penalty of
> storing 3 or 4 extra frames, it's better to keep this logic
> enabled ever in a product build.
>
> *Testing:*
> I'd tested 3 builds (release, fastdebug, slowdebug) on two platforms
> (Linux/x86_64, Linux/aarch64). All hotspot/runtime/NMT tests are passed
> including CheckForProperDetailStackTrace. Also, manual comparison of
> stacktraces on Linux/x86_64 don't show any changes in output.

Hi Dmitry,

So your assumption is that product builds don't need any additional frames pruned, and debug builds at most need os::get_native_stack and NativeCallStack:: pruned.
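
For readers following the thread, the name-based pruning under discussion can be sketched roughly as below. This is only an illustration and not the code in the webrev: it assumes the recorded frames are already available as demangled strings, and the names `kInternalFrameNames`, `kMaxInternalFrames`, and `first_frame_to_print` are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical list of NMT-internal frame name fragments. Substring matching
// tolerates decorations such as argument lists after the function name.
static const char* const kInternalFrameNames[] = {
    "os::get_native_stack",
    "NativeCallStack::NativeCallStack",
};

// Upper bound on how many leading frames may be dropped. It plays the role
// that NMT_InternalFrames has in the thread; the value is illustrative.
static const std::size_t kMaxInternalFrames = 3;

// True if the frame name contains one of the internal name fragments.
static bool is_internal_frame(const std::string& name) {
    for (const char* pattern : kInternalFrameNames) {
        if (name.find(pattern) != std::string::npos) {
            return true;
        }
    }
    return false;
}

// Returns the index of the first frame that should be printed. Leading frames
// are skipped only while their names match, and never more than the bound.
std::size_t first_frame_to_print(const std::vector<std::string>& frames) {
    std::size_t skip = 0;
    while (skip < frames.size() && skip < kMaxInternalFrames &&
           is_internal_frame(frames[skip])) {
        ++skip;
    }
    return skip;
}
```

In this sketch the result does not depend on how many frames a given compiler inlined, because a frame is dropped only when its name matches. A frame whose name is still mangled would not match any pattern and would therefore stay in the printed trace.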
In that case you should add AllocateHeap to the list and simplify CheckForProperDetailStackTrace.java:

         // AllocateHeap shouldn't be in the output because it is supposed to always be inlined.
         // We check for that here, but allow it for Aix, Solaris and Windows slowdebug builds
         // because the compiler ends up not inlining AllocateHeap.
         Boolean okToHaveAllocateHeap =
             Platform.isSlowDebugBuild() &&
             (Platform.isAix() || Platform.isSolaris() || Platform.isWindows());
         if (!okToHaveAllocateHeap) {
             output.shouldNotContain("AllocateHeap");
         }

A couple of questions:

* Are you sure the frame name will always be demangled? I recall either
  macos or solaris not doing this reliably. I think this is why the test
  looks for patterns like " .*ModuleEntryTable.*new_entry.*\n". It looks
  like you didn't test either of these platforms.
* What about platforms that do not do tail calls for
  os::get_native_stack, even for PRODUCT builds? I don't see that you've
  tested any of these either. I think you'll find os::get_native_stack
  is in the NMT backtrace on these platforms after your changes.

I was also going to ask about os::current_frame() and get_previous_fp(), but if I understand correctly, you always limit the frames you print to 4, so you never get to these frames. Is that correct? However I'd still be concerned that your changes cause os::current_frame() to no longer be consistent. I fixed it to always return the frame of whoever calls os::current_frame(). After your changes sometimes it will return the frame for os::current_frame(). This might have unintended side effects for other stack walking/printing code.

thanks,

Chris

>
> -Dmitry
>
> On 05.09.2017 21:34, Chris Plummer wrote:
>> Hi Dmitry,
>>
>> I've looked over the changes and some of the comments so far, and do
>> agree with Zhengyu regarding removal _NMT_NOINLINE_, but I also have
>> concerns about other platform dependent code you have removed.
>>
>> _NMT_NOINLINE is only defined for slowdebug builds.
From thomas.stuefe at gmail.com Thu Sep 7 08:16:33 2017
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 7 Sep 2017 10:16:33 +0200
Subject: RFR(m): 8185712: [windows] Improve native symbol decoder
In-Reply-To: <161eca29-6060-7896-4bf0-d3b334466e4b@oracle.com>
References: <1400c761fcc34c37aa1e374790bb7d39@sap.com> <85c13adbd5564c88b3d4cb70b0523180@sap.com> <161eca29-6060-7896-4bf0-d3b334466e4b@oracle.com>
Message-ID:

Hi Goetz,

as I am gone for vacation the next four weeks, could you please prepare the webrev rebased to the new repo once it is open and give it to Coleen?

Thank you!

(Last valid version was http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.04/webrev/index.html )

On Wed, Sep 6, 2017 at 3:40 PM, wrote:

>
> I will sponsor this for you, but remind me.
> Thanks,
> Coleen
>
> On 9/6/17 9:16 AM, Thomas Stüfe wrote:
>
>> Great, thank you!
>>
>> On Wed, Sep 6, 2017 at 3:11 PM, Lindenmaier, Goetz <
>> goetz.lindenmaier at sap.com> wrote:
>>
>>> HI Thomas,
>>>
>>> thanks for removing all that useless code. Looks perfect now :)
>>>
>>> Best regards,
>>> Goetz.
From zgu at redhat.com Thu Sep 7 17:41:12 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 7 Sep 2017 13:41:12 -0400
Subject: RFR(XXS) 8187331: VirtualSpaceList tracks free space on wrong node
Message-ID:

Please review one line fix on reporting metaspace virtual space's free space (See bug for details)

This bug causes metaspace statistics data not adding up.

Bug: https://bugs.openjdk.java.net/browse/JDK-8187331
Webrev: http://cr.openjdk.java.net/~zgu/8187331/webrev.00/

Test:

hotspot_tier1_runtime on Linux 64 (fastdebug and release)

Thanks,

-Zhengyu

From shade at redhat.com Thu Sep 7 17:46:38 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 7 Sep 2017 19:46:38 +0200
Subject: RFR(XXS) 8187331: VirtualSpaceList tracks free space on wrong node
In-Reply-To:
References:
Message-ID:

On 09/07/2017 07:41 PM, Zhengyu Gu wrote:
> Please review one line fix on reporting metaspace virtual space's free space (See bug for details)
>
> This bug causes metaspace statistics data not adding up.
>
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8187331
> Webrev: http://cr.openjdk.java.net/~zgu/8187331/webrev.00/

Looks good. Seems obvious in hindsight.
-Aleksey

From zgu at redhat.com Thu Sep 7 17:49:20 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 7 Sep 2017 13:49:20 -0400
Subject: RFR(XXS) 8187331: VirtualSpaceList tracks free space on wrong node
In-Reply-To:
References:
Message-ID: <9288781a-bf80-43d7-5588-888d93c571c2@redhat.com>

Thanks for the quick review, Aleksey!

-Zhengyu

On 09/07/2017 01:46 PM, Aleksey Shipilev wrote:
> On 09/07/2017 07:41 PM, Zhengyu Gu wrote:
>> Please review one line fix on reporting metaspace virtual space's free space (See bug for details)
>>
>> This bug causes metaspace statistics data not adding up.
>>
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8187331
>> Webrev: http://cr.openjdk.java.net/~zgu/8187331/webrev.00/
>
> Looks good. Seems obvious in hindsight.
>
> -Aleksey
>

From coleen.phillimore at oracle.com Thu Sep 7 17:54:30 2017
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 7 Sep 2017 13:54:30 -0400
Subject: RFR(XXS) 8187331: VirtualSpaceList tracks free space on wrong node
In-Reply-To:
References:
Message-ID: <26b066f6-650a-62d9-a7f0-8bc348ff2560@oracle.com>

This looks good. Good find. I'll sponsor it and check it in when the repo opens. Can you rebase at that time and send me the export patch?

Thanks,
Coleen

On 9/7/17 1:41 PM, Zhengyu Gu wrote:
> Please review one line fix on reporting metaspace virtual space's free
> space (See bug for details)
>
> This bug causes metaspace statistics data not adding up.
>
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8187331
> Webrev: http://cr.openjdk.java.net/~zgu/8187331/webrev.00/
>
>
> Test:
>
> hotspot_tier1_runtime on Linux 64 (fastdebug and release)
>
> Thanks,
>
> -Zhengyu

From thomas.stuefe at gmail.com Thu Sep 7 17:54:58 2017
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 7 Sep 2017 19:54:58 +0200
Subject: RFR(XXS) 8187331: VirtualSpaceList tracks free space on wrong node
In-Reply-To:
References:
Message-ID:

Hi Zhengyu,

looks good.
Kind Regards, Thomas

On Thu, Sep 7, 2017 at 7:46 PM, Aleksey Shipilev wrote:

> On 09/07/2017 07:41 PM, Zhengyu Gu wrote:
> > Please review one line fix on reporting metaspace virtual space's free space (See bug for details)
> >
> > This bug causes metaspace statistics data not adding up.
> >
> >
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8187331
> > Webrev: http://cr.openjdk.java.net/~zgu/8187331/webrev.00/
>
> Looks good. Seems obvious in hindsight.
>
> -Aleksey
>
>

From zgu at redhat.com Thu Sep 7 17:56:31 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 7 Sep 2017 13:56:31 -0400
Subject: RFR(XXS) 8187331: VirtualSpaceList tracks free space on wrong node
In-Reply-To: <26b066f6-650a-62d9-a7f0-8bc348ff2560@oracle.com>
References: <26b066f6-650a-62d9-a7f0-8bc348ff2560@oracle.com>
Message-ID: <15a6a2bf-08dc-9554-e4b8-31b8d08e9861@redhat.com>

Thanks Coleen!

On 09/07/2017 01:54 PM, coleen.phillimore at oracle.com wrote:
> This looks good. Good find. I'll sponsor it and check it in when the
> repo opens. Can you rebase at that time and send me the export patch?
>

Of course.

-Zhengyu

> Thanks,
> Coleen
>
> On 9/7/17 1:41 PM, Zhengyu Gu wrote:
>> Please review one line fix on reporting metaspace virtual space's free
>> space (See bug for details)
>>
>> This bug causes metaspace statistics data not adding up.
>>
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8187331
>> Webrev: http://cr.openjdk.java.net/~zgu/8187331/webrev.00/
>>
>>
>> Test:
>>
>> hotspot_tier1_runtime on Linux 64 (fastdebug and release)
>>
>> Thanks,
>>
>> -Zhengyu
>

From zgu at redhat.com Thu Sep 7 17:57:06 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 7 Sep 2017 13:57:06 -0400
Subject: RFR(XXS) 8187331: VirtualSpaceList tracks free space on wrong node
In-Reply-To:
References:
Message-ID:

Thanks, Thomas!

-Zhengyu

On 09/07/2017 01:54 PM, Thomas Stüfe wrote:
> Hi Zhengyu,
>
> looks good.
> > Kind Regards, Thomas > > On Thu, Sep 7, 2017 at 7:46 PM, Aleksey Shipilev > wrote: > > On 09/07/2017 07:41 PM, Zhengyu Gu wrote: > > Please review one line fix on reporting metaspace virtual space's free space (See bug for details) > > > > This bug causes metaspace statistics data not adding up. > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8187331 > > > Webrev: http://cr.openjdk.java.net/~zgu/8187331/webrev.00/ > > > Looks good. Seems obvious in hindsight. > > -Aleksey > > From calvin.cheung at oracle.com Thu Sep 7 20:58:12 2017 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Thu, 07 Sep 2017 13:58:12 -0700 Subject: RFR(XS): 8187124: [TESTBUG] TestInterpreterMethodEntries.java: Unable to create shared archive file Message-ID: <59B1B2E4.6000303@oracle.com> bug: https://bugs.openjdk.java.net/browse/JDK-8187124 webrev: http://cr.openjdk.java.net/~ccheung/8187124/webrev.00/ As described in the bug report, the change is to add the current timestamp into the shared archive file name. Ran all tests under hotspot/test/runtime/SharedArchiveFile locally on linux-x64. All generated *.jsa files, except for the SharedBaseAddress.java test which sets the shared archive filenames, have a timestamp in their names. thanks, Calvin From george.triantafillou at oracle.com Thu Sep 7 21:01:41 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Thu, 7 Sep 2017 17:01:41 -0400 Subject: RFR(XS): 8187124: [TESTBUG] TestInterpreterMethodEntries.java: Unable to create shared archive file In-Reply-To: <59B1B2E4.6000303@oracle.com> References: <59B1B2E4.6000303@oracle.com> Message-ID: <26f39815-2e37-8329-9d88-f0702499e472@oracle.com> Hi Calvin, Looks good. -George On 9/7/2017 4:58 PM, Calvin Cheung wrote: > bug: https://bugs.openjdk.java.net/browse/JDK-8187124 > > webrev: http://cr.openjdk.java.net/~ccheung/8187124/webrev.00/ > > As described in the bug report, the change is to add the current > timestamp into the shared archive file name. 
> > Ran all tests under hotspot/test/runtime/SharedArchiveFile locally on > linux-x64. All generated *.jsa files, except for the > SharedBaseAddress.java test which sets the shared archive filenames, > have a timestamp in their names. > > thanks, > Calvin From mikhailo.seledtsov at oracle.com Thu Sep 7 21:05:06 2017 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Thu, 07 Sep 2017 14:05:06 -0700 Subject: RFR(XS): 8187124: [TESTBUG] TestInterpreterMethodEntries.java: Unable to create shared archive file In-Reply-To: <26f39815-2e37-8329-9d88-f0702499e472@oracle.com> References: <59B1B2E4.6000303@oracle.com> <26f39815-2e37-8329-9d88-f0702499e472@oracle.com> Message-ID: <59B1B482.6050700@oracle.com> Looks good, Misha On 9/7/17, 2:01 PM, George Triantafillou wrote: > Hi Calvin, > > Looks good. > > -George > > On 9/7/2017 4:58 PM, Calvin Cheung wrote: >> bug: https://bugs.openjdk.java.net/browse/JDK-8187124 >> >> webrev: http://cr.openjdk.java.net/~ccheung/8187124/webrev.00/ >> >> As described in the bug report, the change is to add the current >> timestamp into the shared archive file name. >> >> Ran all tests under hotspot/test/runtime/SharedArchiveFile locally on >> linux-x64. All generated *.jsa files, except for the >> SharedBaseAddress.java test which sets the shared archive filenames, >> have a timestamp in their names. >> >> thanks, >> Calvin > From calvin.cheung at oracle.com Thu Sep 7 21:44:49 2017 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Thu, 07 Sep 2017 14:44:49 -0700 Subject: RFR(XS): 8187124: [TESTBUG] TestInterpreterMethodEntries.java: Unable to create shared archive file In-Reply-To: <26f39815-2e37-8329-9d88-f0702499e472@oracle.com> References: <59B1B2E4.6000303@oracle.com> <26f39815-2e37-8329-9d88-f0702499e472@oracle.com> Message-ID: <59B1BDD1.10002@oracle.com> Thanks George. Calvin On 9/7/17, 2:01 PM, George Triantafillou wrote: > Hi Calvin, > > Looks good. 
> > -George > > On 9/7/2017 4:58 PM, Calvin Cheung wrote: >> bug: https://bugs.openjdk.java.net/browse/JDK-8187124 >> >> webrev: http://cr.openjdk.java.net/~ccheung/8187124/webrev.00/ >> >> As described in the bug report, the change is to add the current >> timestamp into the shared archive file name. >> >> Ran all tests under hotspot/test/runtime/SharedArchiveFile locally on >> linux-x64. All generated *.jsa files, except for the >> SharedBaseAddress.java test which sets the shared archive filenames, >> have a timestamp in their names. >> >> thanks, >> Calvin > From calvin.cheung at oracle.com Thu Sep 7 21:45:32 2017 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Thu, 07 Sep 2017 14:45:32 -0700 Subject: RFR(XS): 8187124: [TESTBUG] TestInterpreterMethodEntries.java: Unable to create shared archive file In-Reply-To: <59B1B482.6050700@oracle.com> References: <59B1B2E4.6000303@oracle.com> <26f39815-2e37-8329-9d88-f0702499e472@oracle.com> <59B1B482.6050700@oracle.com> Message-ID: <59B1BDFC.4070500@oracle.com> Thanks Misha. Calvin On 9/7/17, 2:05 PM, Mikhailo Seledtsov wrote: > Looks good, > > Misha > > On 9/7/17, 2:01 PM, George Triantafillou wrote: >> Hi Calvin, >> >> Looks good. >> >> -George >> >> On 9/7/2017 4:58 PM, Calvin Cheung wrote: >>> bug: https://bugs.openjdk.java.net/browse/JDK-8187124 >>> >>> webrev: http://cr.openjdk.java.net/~ccheung/8187124/webrev.00/ >>> >>> As described in the bug report, the change is to add the current >>> timestamp into the shared archive file name. >>> >>> Ran all tests under hotspot/test/runtime/SharedArchiveFile locally >>> on linux-x64. All generated *.jsa files, except for the >>> SharedBaseAddress.java test which sets the shared archive filenames, >>> have a timestamp in their names. >>> >>> thanks, >>> Calvin >> From dmitry.samersoff at bell-sw.com Mon Sep 11 18:58:36 2017 From: dmitry.samersoff at bell-sw.com (Dmitry Samersoff) Date: Mon, 11 Sep 2017 21:58:36 +0300 Subject: What does SPARC_WORK define mean? 
Message-ID: <8cff7948-af3b-e934-5a39-7b94b41949d0@bell-sw.com> Everybody, Actually %subj%: What does SPARC_WORKS define mean and why we have it in os_linux_x86.cpp ? Could we narrow function below[1] to just intptr_t **ebp; __asm__ __volatile__ ("mov %%"SPELL_REG_FP", %0":"=r"(ebp):); that works for both gcc and clang? 1. static intptr_t* _get_previous_fp() { #ifdef SPARC_WORKS register intptr_t **ebp; __asm__("mov %%"SPELL_REG_FP", %0":"=r"(ebp)); #elif defined(__clang__) intptr_t **ebp; __asm__ __volatile__ ("mov %%"SPELL_REG_FP", %0":"=r"(ebp):); #else register intptr_t **ebp __asm__ (SPELL_REG_FP); #endif return *ebp; } -Dmitry From vladimir.kozlov at oracle.com Mon Sep 11 19:48:30 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 11 Sep 2017 12:48:30 -0700 Subject: What does SPARC_WORK define mean? In-Reply-To: <8cff7948-af3b-e934-5a39-7b94b41949d0@bell-sw.com> References: <8cff7948-af3b-e934-5a39-7b94b41949d0@bell-sw.com> Message-ID: <960d6d38-36be-7930-be13-21a22652ea5e@oracle.com> SPARC_WORKS is defined by Sun/Oracle Studio native compilers. Changes were added with this: http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/485d403e94e1 Vladimir On 9/11/17 11:58 AM, Dmitry Samersoff wrote: > Everybody, > > Actually %subj%: > > What does SPARC_WORKS define mean and why we have it in os_linux_x86.cpp ? > > Could we narrow function below[1] to just > > intptr_t **ebp; > __asm__ __volatile__ ("mov %%"SPELL_REG_FP", %0":"=r"(ebp):); > > that works for both gcc and clang? > > > 1. 
> static intptr_t* _get_previous_fp() { > #ifdef SPARC_WORKS > register intptr_t **ebp; > __asm__("mov %%"SPELL_REG_FP", %0":"=r"(ebp)); > #elif defined(__clang__) > intptr_t **ebp; > __asm__ __volatile__ ("mov %%"SPELL_REG_FP", %0":"=r"(ebp):); > #else > register intptr_t **ebp __asm__ (SPELL_REG_FP); > #endif > > return *ebp; > } > > -Dmitry > > From david.holmes at oracle.com Mon Sep 11 21:19:45 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 12 Sep 2017 07:19:45 +1000 Subject: What does SPARC_WORK define mean? In-Reply-To: <960d6d38-36be-7930-be13-21a22652ea5e@oracle.com> References: <8cff7948-af3b-e934-5a39-7b94b41949d0@bell-sw.com> <960d6d38-36be-7930-be13-21a22652ea5e@oracle.com> Message-ID: On 12/09/2017 5:48 AM, Vladimir Kozlov wrote: > SPARC_WORKS is defined by Sun/Oracle Studio native compilers. Changes > were added with this: > > http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/485d403e94e1 That's a blast from the past! - and something I had forgotten about. When was the last time somebody actually tried to build linux-x86 with Studio compiler? Who is supposed to keep this working? It may be time to strip this back out. David ----- > Vladimir > > On 9/11/17 11:58 AM, Dmitry Samersoff wrote: >> Everybody, >> >> Actually %subj%: >> >> What does SPARC_WORKS define mean and why we have it in >> os_linux_x86.cpp ? >> >> Could we narrow function below[1] to just >> >> intptr_t **ebp; >> __asm__ __volatile__ ("mov %%"SPELL_REG_FP", %0":"=r"(ebp):); >> >> that works for both gcc and clang? >> >> >> 1. >> static intptr_t* _get_previous_fp() { >> #ifdef SPARC_WORKS >> register intptr_t **ebp; >> __asm__("mov %%"SPELL_REG_FP", %0":"=r"(ebp)); >> #elif defined(__clang__) >> intptr_t **ebp; >> __asm__ __volatile__ ("mov %%"SPELL_REG_FP", %0":"=r"(ebp):); >> #else >> register intptr_t **ebp __asm__ (SPELL_REG_FP); >> #endif >> >>
return *ebp; >> } >> >> -Dmitry >> >> From magnus.ihse.bursie at oracle.com Tue Sep 12 11:53:30 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Tue, 12 Sep 2017 13:53:30 +0200 Subject: What does SPARC_WORK define mean? In-Reply-To: References: <8cff7948-af3b-e934-5a39-7b94b41949d0@bell-sw.com> <960d6d38-36be-7930-be13-21a22652ea5e@oracle.com> Message-ID: <519139c1-9ac0-d38f-3555-0d254d091148@oracle.com> On 2017-09-11 23:19, David Holmes wrote: > On 12/09/2017 5:48 AM, Vladimir Kozlov wrote: >> SPARC_WORKS is defined by Sun/Oracle Studio native compilers. Changes >> were added with this: >> >> http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/485d403e94e1 > > That's a blast from the past! - and something I had forgotten about. > When was the last time somebody actually tried to build linux-x86 with > Studio compiler? Who is supposed to keep this working? I tried it last year when converting the hotspot makefiles, to assess the amount of damage I'd create if I didn't support it. Not surprisingly, it did not work, not even after some quick and ugly fixes to try to get it accepted by the compiler. It's worth noting that there have never been any support for building the JDK native libraries with sunstudio on linux. I would assume that such code can safely be removed, but perhaps some formal decision to do that is required. /Magnus > > It may be time to strip this back out. > > David > ----- > >> Vladimir >> >> On 9/11/17 11:58 AM, Dmitry Samersoff wrote: >>> Everybody, >>> >>> Actually %subj%: >>> >>> What does SPARC_WORKS define mean and why we have it in >>> os_linux_x86.cpp ? >>> >>> Could we narrow function below[1] to just >>> >>> intptr_t **ebp; >>> __asm__ __volatile__ ("mov %%"SPELL_REG_FP", %0":"=r"(ebp):); >>> >>> that works for both gcc and clang? >>> >>> >>> 1. 
>>> static intptr_t* _get_previous_fp() { >>> #ifdef SPARC_WORKS >>> register intptr_t **ebp; >>> __asm__("mov %%"SPELL_REG_FP", %0":"=r"(ebp)); >>> #elif defined(__clang__) >>> intptr_t **ebp; >>> __asm__ __volatile__ ("mov %%"SPELL_REG_FP", %0":"=r"(ebp):); >>> #else >>> register intptr_t **ebp __asm__ (SPELL_REG_FP); >>> #endif >>> >>> return *ebp; >>> } >>> >>> -Dmitry >>> >>> From ioi.lam at oracle.com Tue Sep 12 16:50:54 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 12 Sep 2017 09:50:54 -0700 Subject: RFR: 8186789: CDS dump crashes at ConstantPool::resolve_class_constants In-Reply-To: <3C644DF0-2767-4992-8A82-DEC3786DB90B@oracle.com> References: <3C644DF0-2767-4992-8A82-DEC3786DB90B@oracle.com> Message-ID: Since the function ConstantPool::resolve_class_constants(TRAPS) always returns true, maybe it should be changed to a void function? Thanks - Ioi On 9/1/17 12:26 PM, Jiangli Zhou wrote: > Hi, > > Please review the following fix for 8186789. > > webrev: http://cr.openjdk.java.net/~jiangli/8186789/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8186789 > > If a class fails verification due to missing dependencies at dump time, the constant pool _cache may be NULL. ConstantPool::resolve_class_constants() needs to check for that case. Also moved the function under #if INCLUDE_CDS_JAVA_HEAP, since it is only used when INCLUDE_CDS_JAVA_HEAP is enabled. > > Tested with JPRT and unit test case. > > Thanks, > Jiangli > From jiangli.zhou at oracle.com Tue Sep 12 17:02:17 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 12 Sep 2017 10:02:17 -0700 Subject: RFR: 8186789: CDS dump crashes at ConstantPool::resolve_class_constants In-Reply-To: References: <3C644DF0-2767-4992-8A82-DEC3786DB90B@oracle.com> Message-ID: > On Sep 12, 2017, at 9:50 AM, Ioi Lam wrote: > > Since the function ConstantPool::resolve_class_constants(TRAPS) always returns true, maybe it should be changed to a void function? That sounds ok to me. 
Thanks, Jiangli > > Thanks > > - Ioi > > > On 9/1/17 12:26 PM, Jiangli Zhou wrote: >> Hi, >> >> Please review the following fix for 8186789. >> >> webrev: http://cr.openjdk.java.net/~jiangli/8186789/webrev.00/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8186789 >> >> If a class fails verification due to missing dependencies at dump time, the constant pool _cache may be NULL. ConstantPool::resolve_class_constants() needs to check for that case. Also moved the function under #if INCLUDE_CDS_JAVA_HEAP, since it is only used when INCLUDE_CDS_JAVA_HEAP is enabled. >> >> Tested with JPRT and unit test case. >> >> Thanks, >> Jiangli >> > From calvin.cheung at oracle.com Tue Sep 12 22:14:08 2017 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 12 Sep 2017 15:14:08 -0700 Subject: RFR(S): 8138600: eliminate the need of ModuleLoaderMap.dat for CDS Message-ID: <59B85C30.1000108@oracle.com> With the fix for JDK-8186842 , we no longer need the ModuleLoaderMap to determine the loader type of a class during CDS dump time. This change is for removing the ModuleLoaderMap from the runtime image and the code which references it. bug: https://bugs.openjdk.java.net/browse/JDK-8138600 webrevs: http://cr.openjdk.java.net/~ccheung/8138600/jdk/webrev.00/ http://cr.openjdk.java.net/~ccheung/8138600/hotspot/webrev.00/ Testing: JPRT CDS tests hs-tier3 thanks, Calvin From mandy.chung at oracle.com Tue Sep 12 22:33:05 2017 From: mandy.chung at oracle.com (mandy chung) Date: Tue, 12 Sep 2017 15:33:05 -0700 Subject: RFR(S): 8138600: eliminate the need of ModuleLoaderMap.dat for CDS In-Reply-To: <59B85C30.1000108@oracle.com> References: <59B85C30.1000108@oracle.com> Message-ID: (I move this thread from jdk10-dev to jigsaw-dev which can review the jdk change). The JDK change looks good to me. Mandy On 9/12/17 3:14 PM, Calvin Cheung wrote: > With the fix for JDK-8186842 > , we no longer need > the ModuleLoaderMap to determine the loader type of a class during CDS > dump time. 
This change is for removing the ModuleLoaderMap from the > runtime image and the code which references it. > > bug: https://bugs.openjdk.java.net/browse/JDK-8138600 > > webrevs: > http://cr.openjdk.java.net/~ccheung/8138600/jdk/webrev.00/ > http://cr.openjdk.java.net/~ccheung/8138600/hotspot/webrev.00/ > > Testing: > JPRT > CDS tests > hs-tier3 > > thanks, > Calvin > From jiangli.zhou at Oracle.COM Tue Sep 12 22:53:15 2017 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Tue, 12 Sep 2017 15:53:15 -0700 Subject: RFR(S): 8138600: eliminate the need of ModuleLoaderMap.dat for CDS In-Reply-To: <59B85C30.1000108@oracle.com> References: <59B85C30.1000108@oracle.com> Message-ID: <45691800-61C9-4117-BEFB-993D0159DB33@oracle.com> Hi Calvin, Glad to see those code being eliminated. Please also remove the following in classLoader.hpp. These macros were only used by the module loader map related code. 43 // Initial sizes of the following arrays are based on the generated ModuleLoaderMap.dat 44 #define INITIAL_BOOT_MODULES_ARRAY_SIZE 30 45 #define INITIAL_PLATFORM_MODULES_ARRAY_SIZE 15 Thanks, Jiangli > On Sep 12, 2017, at 3:14 PM, Calvin Cheung wrote: > > With the fix for JDK-8186842 , we no longer need the ModuleLoaderMap to determine the loader type of a class during CDS dump time. This change is for removing the ModuleLoaderMap from the runtime image and the code which references it.
> > bug: https://bugs.openjdk.java.net/browse/JDK-8138600 > > webrevs: > http://cr.openjdk.java.net/~ccheung/8138600/jdk/webrev.00/ > http://cr.openjdk.java.net/~ccheung/8138600/hotspot/webrev.00/ > > Testing: > JPRT > CDS tests > hs-tier3 > > thanks, > Calvin > From calvin.cheung at oracle.com Tue Sep 12 23:28:21 2017 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 12 Sep 2017 16:28:21 -0700 Subject: RFR(S): 8138600: eliminate the need of ModuleLoaderMap.dat for CDS In-Reply-To: <45691800-61C9-4117-BEFB-993D0159DB33@oracle.com> References: <59B85C30.1000108@oracle.com> <45691800-61C9-4117-BEFB-993D0159DB33@oracle.com> Message-ID: <59B86D95.2070903@oracle.com> Hi Jiangli, Thanks for your quick review. On 9/12/17, 3:53 PM, Jiangli Zhou wrote: > Hi Calvin, > > Glad to see those code being eliminated. > > Please also remove the following in classLoader.hpp. These macros were > only used by the module loader map related code. > > 43 // Initial sizes of the following arrays are based on the generated ModuleLoaderMap.dat > 44 #define INITIAL_BOOT_MODULES_ARRAY_SIZE 30 > 45 #define INITIAL_PLATFORM_MODULES_ARRAY_SIZE 15 > I will remove those as well. thanks, Calvin > Thanks, > Jiangli > >> On Sep 12, 2017, at 3:14 PM, Calvin Cheung > > wrote: >> >> With the fix for JDK-8186842 >> , we no longer need >> the ModuleLoaderMap to determine the loader type of a class during >> CDS dump time. This change is for removing the ModuleLoaderMap from >> the runtime image and the code which references it. 
>> >> bug: https://bugs.openjdk.java.net/browse/JDK-8138600 >> >> webrevs: >> http://cr.openjdk.java.net/~ccheung/8138600/jdk/webrev.00/ >> >> http://cr.openjdk.java.net/~ccheung/8138600/hotspot/webrev.00/ >> >> Testing: >> JPRT >> CDS tests >> hs-tier3 >> >> thanks, >> Calvin >> > From calvin.cheung at oracle.com Tue Sep 12 23:29:33 2017 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 12 Sep 2017 16:29:33 -0700 Subject: RFR(S): 8138600: eliminate the need of ModuleLoaderMap.dat for CDS In-Reply-To: References: <59B85C30.1000108@oracle.com> Message-ID: <59B86DDD.9030309@oracle.com> Thanks for your review Mandy. Calvin On 9/12/17, 3:33 PM, mandy chung wrote: > (I move this thread from jdk10-dev to jigsaw-dev which can review the > jdk change). > > The JDK change looks good to me. > > Mandy > > On 9/12/17 3:14 PM, Calvin Cheung wrote: >> With the fix for JDK-8186842 >> , we no longer need >> the ModuleLoaderMap to determine the loader type of a class during >> CDS dump time. This change is for removing the ModuleLoaderMap from >> the runtime image and the code which references it. >> >> bug: https://bugs.openjdk.java.net/browse/JDK-8138600 >> >> webrevs: >> http://cr.openjdk.java.net/~ccheung/8138600/jdk/webrev.00/ >> http://cr.openjdk.java.net/~ccheung/8138600/hotspot/webrev.00/ >> >> Testing: >> JPRT >> CDS tests >> hs-tier3 >> >> thanks, >> Calvin >> > From jamsheed.c.m at oracle.com Thu Sep 14 06:21:12 2017 From: jamsheed.c.m at oracle.com (jamsheed) Date: Thu, 14 Sep 2017 11:51:12 +0530 Subject: [10] RFR: [AOT] assert(false) failed: DEBUG MESSAGE: InterpreterMacroAssembler::call_VM_base: last_sp != NULL In-Reply-To: References: Message-ID: <8eb949bd-fa14-2722-5eaa-21a0a0c95b26@oracle.com> (adding runtime list for inputs) On Monday 11 September 2017 11:43 PM, jamsheed wrote: > brief desc: special handling of Object.<init>
in > TemplateInterpreter::deopt_reexecute_entry > > required last_sp to be reset explicitly in normal return path > > address TemplateInterpreter::deopt_reexecute_entry(Method* method, > address bcp) { > assert(method->contains(bcp), "just checkin'"); > Bytecodes::Code code = Bytecodes::java_code_at(method, bcp); > if (code == Bytecodes::_return) { > // This is used for deopt during registration of finalizers > // during Object.<init>. We simply need to resume execution at > // the standard return vtos bytecode to pop the frame normally. > // reexecuting the real bytecode would cause double registration > // of the finalizable object. > return _normal_table.entry(Bytecodes::_return).entry(vtos); last_sp != NULL is not an issue for this case, so I skip the assert in debug builds http://cr.openjdk.java.net/~jcm/8168712/webrev.01/ Please review. Best Regards, Jamsheed From jamsheed.c.m at oracle.com Thu Sep 14 06:27:22 2017 From: jamsheed.c.m at oracle.com (jamsheed) Date: Thu, 14 Sep 2017 11:57:22 +0530 Subject: [10] RFR: [AOT] assert(false) failed: DEBUG MESSAGE: InterpreterMacroAssembler::call_VM_base: last_sp != NULL In-Reply-To: <8eb949bd-fa14-2722-5eaa-21a0a0c95b26@oracle.com> References: <8eb949bd-fa14-2722-5eaa-21a0a0c95b26@oracle.com> Message-ID: forgot to put bug id : https://bugs.openjdk.java.net/browse/JDK-8168712 On Thursday 14 September 2017 11:51 AM, jamsheed wrote: > (adding runtime list for inputs) > > On Monday 11 September 2017 11:43 PM, jamsheed wrote: >> brief desc: special handling of Object.<init> in >> TemplateInterpreter::deopt_reexecute_entry >> >> required last_sp to be reset explicitly in normal return path >> >> address TemplateInterpreter::deopt_reexecute_entry(Method* method, >> address bcp) { >> assert(method->contains(bcp), "just checkin'"); >> Bytecodes::Code code = Bytecodes::java_code_at(method, bcp); >> if (code == Bytecodes::_return) { >> // This is used for deopt during registration of finalizers >> // during Object.<init>.
We simply need to resume execution at >> // the standard return vtos bytecode to pop the frame normally. >> // reexecuting the real bytecode would cause double registration >> // of the finalizable object. >> return _normal_table.entry(Bytecodes::_return).entry(vtos); > > last_sp != NULL is not an issue for this case, so I skip the assert in > debug builds > > http://cr.openjdk.java.net/~jcm/8168712/webrev.01/ > > Please review. > > Best Regards, > Jamsheed > > > > > From dean.long at oracle.com Thu Sep 14 06:54:20 2017 From: dean.long at oracle.com (Dean Long) Date: Wed, 13 Sep 2017 23:54:20 -0700 Subject: [10] RFR: [AOT] assert(false) failed: DEBUG MESSAGE: InterpreterMacroAssembler::call_VM_base: last_sp != NULL In-Reply-To: <8eb949bd-fa14-2722-5eaa-21a0a0c95b26@oracle.com> References: <8eb949bd-fa14-2722-5eaa-21a0a0c95b26@oracle.com> Message-ID: <069663cb-4b48-483e-7d1b-8619dafe616d@oracle.com> It looks like you accidentally dropped hotspot-compiler-dev at openjdk.java.net when you added runtime. dl On 9/13/2017 11:21 PM, jamsheed wrote: > (adding runtime list for inputs) > > On Monday 11 September 2017 11:43 PM, jamsheed wrote: >> brief desc: special handling of Object.<init> in >> TemplateInterpreter::deopt_reexecute_entry >> >> required last_sp to be reset explicitly in normal return path >> >> address TemplateInterpreter::deopt_reexecute_entry(Method* method, >> address bcp) { >> assert(method->contains(bcp), "just checkin'"); >> Bytecodes::Code code = Bytecodes::java_code_at(method, bcp); >> if (code == Bytecodes::_return) { >> // This is used for deopt during registration of finalizers >> // during Object.<init>. We simply need to resume execution at >> // the standard return vtos bytecode to pop the frame normally. >> // reexecuting the real bytecode would cause double registration >> // of the finalizable object. >> return _normal_table.entry(Bytecodes::_return).entry(vtos); > > last_sp !
= null not an issue for this case, so i skip the assert in > debug build > > http://cr.openjdk.java.net/~jcm/8168712/webrev.01/ > > Please review. > > Best Regards, > Jamsheed > > > > > From martin.doerr at sap.com Thu Sep 14 15:35:45 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 14 Sep 2017 15:35:45 +0000 Subject: RFR(s): 8187547: PPC64: icache invalidation is incorrect in some places In-Reply-To: <07b0d758-40eb-dc1f-e25b-49e031849744@azul.com> References: <07b0d758-40eb-dc1f-e25b-49e031849744@azul.com> Message-ID: <1b20a156de344954a0ffe0bc88a07d6e@sap.com> Hi Anton, thank you very much for providing a fix. Looks correct. In your current version, other_insn_offset is always negative. I'd prefer to make it always positive and simplify the usage like: assert(other_insn_offset > 0, "first instruction must be found"); start = addr - other_insn_offset; range = BytesPerInstWord + other_insn_offset; This would be better readable. Would you agree? At the moment, jdk10 repos are temporarily closed, but we can sponsor the change when it's open again and after a 2nd review. Backports will also need to get addressed. Please note that ppc-aix-port-dev is not appropriate for reviews because the PPC64 port is part of the main repos. Therefore, I've added hotspot-runtime-dev. Best regards, Martin -----Original Message----- From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-bounces at openjdk.java.net] On Behalf Of Anton Kozlov Sent: Donnerstag, 14. September 2017 15:06 To: ppc-aix-port-dev at openjdk.java.net Subject: RFR(s): 8187547: PPC64: icache invalidation is incorrect in some places Hi, All! Icache invalidation range calculation in NativeMovConstReg::set_data_plain and NativeMovConstReg::set_narrow_oop is incorrect and could cause VM crash: https://bugs.openjdk.java.net/browse/JDK-8187547 I suppose the root is in mismatch of supposed and actual return values of MacroAssembler::patch_set_narrow_oop and MacroAssembler::patch_calculate_address_from_global_toc_at. 
These functions take the address of the middle of the sequence and are expected to return the first instruction offset (negative in the current implementation). Instead, they return `-offset == abs(offset)` and the offset to `data`, respectively. Proposed fix: http://cr.openjdk.java.net/~akozlov/8187547/webrev.01/ Thanks, Anton From harold.seigel at oracle.com Thu Sep 14 19:02:02 2017 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 14 Sep 2017 15:02:02 -0400 Subject: RFR 8187436: -Xbootclasspath/a causes sanity check assertion with exploded build Message-ID: <58fa0597-2586-5c5e-2e51-1b3a9e681253@oracle.com> Hi, Please review this JDK-10 change to fix an assertion involving ClassLoader::_num_entries. The assertion gets triggered when running the exploded build. ClassLoader::_num_entries is only used by CDS, which is not supported for exploded builds. So, assertions involving _num_entries should check for a normal build before doing their check involving _num_entries. Note that a new RFE will be filed shortly requesting a re-design of the confusing boot classpath entries code as requested in one of the comments in this JBS bug. Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8187436/webrev/index.html JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8187436 The change was tested with the JCK Lang and VM tests, the JTreg hotspot, java/io, java/lang, java/util, and other tests. The tests were run with both the normal and exploded builds.
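For readers following along, the shape of such a guard can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the webrev: `BootPathState`, `normal_build`, `entries_in_list` and `num_entries_check_ok` are hypothetical names made up for this sketch, standing in for whatever the VM actually uses to tell a normal build from an exploded one and to count class path entries.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the boot class path bookkeeping.
// None of these names are taken from HotSpot or from the webrev.
struct BootPathState {
  bool   normal_build;     // false when running an exploded build
  size_t num_entries;      // counter that is only maintained for CDS
  size_t entries_in_list;  // entries actually linked into the list
};

// Returns true when the sanity check should pass.
// The counter is consulted only for a normal build, because CDS
// (the only user of the counter) is not supported for exploded builds.
inline bool num_entries_check_ok(const BootPathState& s) {
  if (!s.normal_build) {
    return true;  // exploded build: skip the counter comparison entirely
  }
  return s.num_entries == s.entries_in_list;
}
```

An assertion would then be written against the helper, for example `assert(num_entries_check_ok(state));`, so that an exploded build never trips over a counter nobody maintains for it.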
Thanks, Harold From Alan.Bateman at oracle.com Thu Sep 14 19:33:59 2017 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Thu, 14 Sep 2017 20:33:59 +0100 Subject: RFR 8187436: -Xbootclasspath/a causes sanity check assertion with exploded build In-Reply-To: <58fa0597-2586-5c5e-2e51-1b3a9e681253@oracle.com> References: <58fa0597-2586-5c5e-2e51-1b3a9e681253@oracle.com> Message-ID: <6582b1f1-c2cc-4990-1fb6-ff05aa2f5721@oracle.com> On 14/09/2017 20:02, harold seigel wrote: > Hi, > > Please review this JDK-10 change to fix an assertion involving > ClassLoader::_num_entries. The assertion gets triggered when running > the exploded build. ClassLoader::_num_entries is only used by CDS, > which is not supported for exploded builds. So, assertions involving > _num_entries should check for a normal build before doing their check > involving _num_entries. > > Note that a new RFE will be filed shortly requesting a re-design of > the confusing boot classpath entries code as requested in one of the > comments in this JBS bug. > > Open webrev: > http://cr.openjdk.java.net/~hseigel/bug_8187436/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8187436 > > The change was tested with the JCK Lang and VM tests, the JTreg > hotspot, java/io, java/lang, java/util, and other tests. The tests > were run with both the normal and exploded builds. This looks okay. An alternative for the test is to put "@run main/othervm -Xbootclasspath/a:." so that you don't need to generate a source file, compile it, and run in another VM.
-Alan From jiangli.zhou at oracle.com Thu Sep 14 19:49:51 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 14 Sep 2017 12:49:51 -0700 Subject: RFR 8187436: -Xbootclasspath/a causes sanity check assertion with exploded build In-Reply-To: <58fa0597-2586-5c5e-2e51-1b3a9e681253@oracle.com> References: <58fa0597-2586-5c5e-2e51-1b3a9e681253@oracle.com> Message-ID: <4EAC55B4-2AC8-4B11-BDC7-05C887C25393@oracle.com> Hi Harold, The fix looks good. Thanks, Jiangli > On Sep 14, 2017, at 12:02 PM, harold seigel wrote: > > Hi, > > Please review this JDK-10 change to fix an assertion involving ClassLoader::_num_entries. The assertion gets triggered when running the exploded build. ClassLoader::_num_entries is only used by CDS, which is not supported for exploded builds. So, assertions involving _num_entries should check for a normal build before doing their check involving _num_entries. > > Note that a new RFE will be filed shortly requesting a re-design of the confusing boot classpath entries code as requested in one of the comments in this JBS bug. > > Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8187436/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8187436 > > The change was tested with the JCK Lang and VM tests, the JTreg hotspot, java/io, java/lang, java/util, and other tests. The tests were run with both the normal and exploded builds. > > Thanks, Harold > From harold.seigel at oracle.com Thu Sep 14 20:15:53 2017 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 14 Sep 2017 16:15:53 -0400 Subject: RFR 8187436: -Xbootclasspath/a causes sanity check assertion with exploded build In-Reply-To: <4EAC55B4-2AC8-4B11-BDC7-05C887C25393@oracle.com> References: <58fa0597-2586-5c5e-2e51-1b3a9e681253@oracle.com> <4EAC55B4-2AC8-4B11-BDC7-05C887C25393@oracle.com> Message-ID: Hi Jiangli, Thanks for the review! Harold On 9/14/2017 3:49 PM, Jiangli Zhou wrote: > Hi Harold, > > The fix looks good.
> > Thanks, > Jiangli > >> On Sep 14, 2017, at 12:02 PM, harold seigel wrote: >> >> Hi, >> >> Please review this JDK-10 change to fix an assertion involving ClassLoader::_num_entries. The assertion gets triggered when running the exploded build. ClassLoader::_num_entries is only used by CDS, which is not supported for exploded builds. So, assertions involving _num_entries should check for a normal build before doing their check involving _num_entries. >> >> Note that a new RFE will be filed shortly requesting a re-design of the confusing boot classpath entries code as requested in one of the comments in this JBS bug. >> >> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8187436/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8187436 >> >> The change was tested with the JCK Lang and VM tests, the JTreg hotspot, java/io, java/lang, java/util, and other tests. The tests were run with both the normal and exploded builds. >> >> Thanks, Harold >> From harold.seigel at oracle.com Thu Sep 14 20:19:34 2017 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 14 Sep 2017 16:19:34 -0400 Subject: RFR 8187436: -Xbootclasspath/a causes sanity check assertion with exploded build In-Reply-To: <6582b1f1-c2cc-4990-1fb6-ff05aa2f5721@oracle.com> References: <58fa0597-2586-5c5e-2e51-1b3a9e681253@oracle.com> <6582b1f1-c2cc-4990-1fb6-ff05aa2f5721@oracle.com> Message-ID: <5c95cfa2-5900-631a-0783-d43e02804570@oracle.com> Hi Alan, Thanks for the review! I tried using "@run ... -Xbootclasspath/a:. ..." to simplify the test but JTReg adds the location of the test class file to CLASSPATH, causing it to get loaded by the app-class loader, not the boot loader. Harold On 9/14/2017 3:33 PM, Alan Bateman wrote: > > > On 14/09/2017 20:02, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 change to fix an assertion involving >> ClassLoader::_num_entries. The assertion gets triggered when running >> the exploded build.
ClassLoader::_num_entries is only used by CDS, >> which is not supported for exploded builds. So, assertions involving >> _num_entries should check for a normal build before doing their check >> involving _num_entries. >> >> Note that a new RFE will be filed shortly requesting a re-design of >> the confusing boot classpath entries code as requested in one of the >> comments in this JBS bug. >> >> Open webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8187436/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8187436 >> >> The change was tested with the JCK Lang and VM tests, the JTreg >> hotspot, java/io, java/lang, java/util, and other tests. The tests >> were run with both the normal and exploded builds. > This looks okay. An alternative for the test is to put "@run > main/othervm -Xbootclasspath/a:." so that you don't need to generate a > source file, compile it, and run in another VM. > > -Alan From david.holmes at oracle.com Fri Sep 15 01:47:58 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 15 Sep 2017 11:47:58 +1000 Subject: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java In-Reply-To: References: Message-ID: <9ae20280-0f27-1edb-4969-e7c9b40977e1@oracle.com> Hi Adam, I am still very much torn over this one. I think the idea of print-and-exit flags for a potentially hosted library like the JVM is just wrong - we should never have done that, but we did. Fixing that by moving the flags to the launcher is far from trivial**. Endorsing and encouraging these sorts of flags by adding JNI support seems to be sending the wrong message. ** I can envisage a "help xxx" Dcmd that can read back the info from the VM. The launcher can send the Dcmd, print the output and exit. The launcher would not need to know what the xxx values mean, but would have to intercept the existing ones. Another option is just to be aware of these flags (are there more than jdwp and Xlog?)
and deal with them specially in your custom launcher - either filter them out and ignore them, or else launch the VM in its own process to respond to them. Any changes to the JNI specification need to go through the CSR process. Cheers, David On 14/09/2017 6:26 PM, Adam Farley8 wrote: > Hi All, > > I was advised (on the OpenJDK IRC channel) that supplying a fix is > better than > proposing the idea of one, so I've gone ahead and written a "silent > exit" fix for the > example case: > > java -agentlib:jdwp=help > > This fix solves that bug, and also creates the tools for other code, VM and > otherwise, to be able to solve their exit(0) problems as well. > > I've attached the hg diffs to this email, along with a zip containing > the test > I wrote to exercise this fix. The test, once unzipped, can be run via the > TestStart.sh file, and is a bash script intended for linux execution. > Run it like this: > > bash TestStart.sh /location/of/java > > To be clear, the Java directory is the one that contains the bin directory. > > Best Regards > > Adam Farley > > > > P.S. I debated returning a jni return code from the debugInit.c's > parseOptions > method, but elected not to on the basis that (a) a seperate isHelp method > allows us to quit the OnLoad before the agent does anything that would > require us figuring out how to unload half an agent, and (b) I thought > it best to > modify as few apis as possible. This is up for debate if people think a > jni return > code is a better option here. > > P.P.S. I know the files get stripped from list emails. If anyone wants > copies, > email me and they're yours. > > > ------------------ Previous email ------------------ > > Hi All, > > I've included the full text of my reply in-line below. > > A summary is: I continue to support the idea of a new return code on the > basis > that, when the VM is nonusable but no error has occurred, we have no > suitable > option. > > Right now we can: > - Report an error that has not occurred. 
> - Die and take the user's code with us (except for any exit hook code). > - Return a JNI_OK, and allow the user's next action to fail. > > I think that if VM developers concur that the correct action is to leave > the VM > in a nonusable state, but to not throw an error, that this RC gives us a > better option > than exit(0). > > - Adam > > P.S. Apologies for the delay. I was on vacation. :) > > > Hi Alan, David, and Tom, > >> > >> First, thanks again for your efforts on this. As a new guy to OpenJDK > >> contributions, it means a lot to see so much progress on this so > >> quickly. :) > > > >All I see is discussion :) Progress would be something else entirely. > > True. :) > > > > >> > >> >On 24/08/2017 07:33, David Holmes wrote: > >> >> Hi Adam, > >> >> > >> >> cc'ing hotspot runtime dev as runtime own JNI and the invocation > API - > >> >> and some of the problematic code resides in the VM. > >> >Yeah, the hotspot mailing list would be a better place to discuss > this > >> >as there are several issues here and several places where HotSpot > aborts > >> >the process when initialization fails. It's a long standing issue > (going > >> >back 15+ years) that I think is partly because it's not easy to > release > >> >all resources and cleanup before CreateJavaVM returns with an error. > >> > > >> > >> According to the JNI spec, it is not possible (yet) to create a > second VM > >> in the same thread as the first. > >> > >> There is also a bug (dup'd against another bug I don't have the > access for) > >> which states that even a successful VM creation+destruction won't > permit > >> a second VM to be created. > >> > >> https://bugs.openjdk.java.net/browse/JDK-4712793 > >> > >> Both of these seem to imply that making a new VM after a failed > VM-creation > >> (in the same thread) is unsupported behaviour. > >> > >> So is it important to release all resources and cleanup, given that we > >> won't > >> be trying to create a new VM in this thread?
By "important" I mean > "more > >> important than exiting with a return code and allowing the user's code > >> to finish". > > > >Okay, so if there is no intention of attempting to reload the jvm again, > >I'm unclear what the purpose of the hosting process actually is. To me > >it is either a customer launcher - in which case the exit calls are > >"harmless" (and atexit handlers could be used if the process has its own > >clean up) - or it's something multi-purpose part of which is to launch a > >VM. In the latter case given the inability to reload a VM, and assuming > >the process does not what it's java launching powers to be removed, then > >the only real option is to filter out the problematic arguments and > >either ignore them or exec a separate process to handle them. > > My assumption is that the user's code may be doing many things, of which > the Java work is only a part. I'm trying not to be too specific here, as > I don't > know what the user is trying to do, nor what they want their code to do if > Java returns an error. I think we should tell the user what has > happened, and > allow them to act on the information. > > Right now the VM developers don't have that option. They don't have a > mechanism > to tell the user that the VM is not in a usable state, but had found no > errors. Therefore > the VM *must* call exit(0) to indicate "pass", but also to prevent the > user trying to do > anything with the unusable VM. > > I would give them that option. If they can return an RC, they should > have one available > that fits this scenario. > > By providing this negative return code both within and without the VM, > we can give future > VM-upgrade projects the option to indicate an unusable VM with no error, > removing > the need for them to call exit(0) when the VM is unusable despite no > error occurring. 
> > Also, in regards to the example option: I agree that this option should > really be filtered > out before we get to the exit(0)-slash-JNI_SILENT_EXIT RC. Perhaps we > could abstract > the "is this a help option" logic into a shared function, and tie that > into the unrecognised > options logic? > > > > >> >> > >> >> This specific case seems like a bug to me as the logic is > assuming it > >> >> is only ever called by a launcher which it is okay to terminate. > >> >> Though to be honest the very existence of the "help" option > seems to > >> >> me somewhat misguided in a hosted-VM environment. That said, I see > >> >> unified logging in 9 also added a terminating "help" option. > >> >The agent "help" option case is tricky and would likely need an > update > >> >to the JVM TI spec and the Agent_OnLoad return value. > >> > > >> > >> To clarify, the agent "help" option is only an example of this problem. > >> There are 19 locations both within and without hotspot that call > exit(0) > >> directly, plus more places where exit is passed a variable that can be > >> 0 (e.g. the aforementioned agent "help", which calls the forceExit > function > >> with an argument of 0, which calls exit(arg) in turn). > >> > >> I understand that your comment was intended as an effort to effect a > fix > >> for this specific instance of the problem. I wanted to make sure we > kept > >> sight of the wider problem, as ideally we'd come up with an ideal > solution > >> that could be applied to all cases. > > > >The fact there are numerous potential process termination points in the > >VM and JDK native code, is something we just have to live with. I'm only > >considering these kind of "report and terminate" flags to be the problem > >cases that should be handled better. > > A fair statement. I posit that simply having this option available could > prevent > the need for further exit(0)'s in the future.
> > Though I'm certainly not ruling out an entrepreneurial VM developer fixing > these issues in the future. I'm simply agreeing that resolving all of these > issues is outside of this proposal's scope. > > > > >> My thought on this was a unique return code that tells the user's code > >> that the VM is not in a usable state, but that no error has > occurred. This > >> should be a negative code (so the usual x<0 check will prevent the > user's > >> code from using the VM), but it shouldn't be one of the existing JNI > codes; > >> all of which seem to indicate either: > >> > >> a) The VM is fine and can be used (0). > >> or > >> b) The VM is not fine because an error occurred (-1 to -6). > >> > >> Ideally we need a c) The VM is not fine, but no error has occurred. > > > >It's somewhat debatable how to classify the case where you ask the VM to > >load and then perform a one-off action that effectively succeeds but > >leaves the VM unusable. Again ideally, to me, the VM would never do that > >- such actions would occur as part of VM initialization, the VM would be > >usable, but the launcher would do the termination because that is how > >the flag is specified. But that is non-trivial to untangle. > > > >David > > Agreed. > > > > >> Or is there another solution to the exit(0) problem? Other than putting > >> a copy of the rest of your code on the exit hook, I mean. > >> > >> > > >> >> > >> >> Options processed by the VM will be recognized, while options > >> >> processed by the Java launcher will not be. "-version", "-X", > "-help" > >> >> and numerous others are launcher options. Pure VM options are -XX > >> >> options, but the VM also processes some -X flags and, as a > result of > >> >> jigsaw, now also processes a bunch of module-related flags that are > >> >> simple --foo options. > >> >Right because these options need to be passed to CreateJavaVM as they > are > >> >used when initializing the VM.
Using system properties would just > repeat > >> >the issues of past (e.g. java.class.path) and require documenting > a slew > >> >of system properties (which is complicated at repeating options). > >> > > >> >-Alan > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From adam.farley at uk.ibm.com Thu Sep 14 08:26:50 2017 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Thu, 14 Sep 2017 08:26:50 +0000 Subject: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java Message-ID: Hi All, I was advised (on the OpenJDK IRC channel) that supplying a fix is better than proposing the idea of one, so I've gone ahead and written a "silent exit" fix for the example case: java -agentlib:jdwp=help This fix solves that bug, and also creates the tools for other code, VM and otherwise, to be able to solve their exit(0) problems as well. I've attached the hg diffs to this email, along with a zip containing the test I wrote to exercise this fix. The test, once unzipped, can be run via the TestStart.sh file, and is a bash script intended for Linux execution. Run it like this: bash TestStart.sh /location/of/java To be clear, the Java directory is the one that contains the bin directory. Best Regards Adam Farley P.S. I debated returning a JNI return code from the debugInit.c's parseOptions method, but elected not to on the basis that (a) a separate isHelp method allows us to quit the OnLoad before the agent does anything that would require us figuring out how to unload half an agent, and (b) I thought it best to modify as few APIs as possible. This is up for debate if people think a JNI return code is a better option here. P.P.S. I know the files get stripped from list emails. If anyone wants copies, email me and they're yours.
------------------ Previous email ------------------ Hi All, I've included the full text of my reply in-line below. A summary is: I continue to support the idea of a new return code on the basis that, when the VM is nonusable but no error has occurred, we have no suitable option. Right now we can: - Report an error that has not occurred. - Die and take the user's code with us (except for any exit hook code). - Return a JNI_OK, and allow the user's next action to fail. I think that if VM developers concur that the correct action is to leave the VM in a nonusable state, but to not throw an error, that this RC gives us a better option than exit(0). - Adam P.S. Apologies for the delay. I was on vacation. :) > Hi Alan, David, and Tom, >> >> First, thanks again for your efforts on this. As a new guy to OpenJDK >> contributions, it means a lot to see so much progress on this so >> quickly. :) > >All I see is discussion :) Progress would be something else entirely. True. :) > >> >> >On 24/08/2017 07:33, David Holmes wrote: >> >> Hi Adam, >> >> >> >> cc'ing hotspot runtime dev as runtime own JNI and the invocation API - >> >> and some of the problematic code resides in the VM. >> >Yeah, the hotspot mailing list would be a better place to discuss this >> >as there are several issues here and several places where HotSpot aborts >> >the process when initialization fails. It's a long standing issue (going >> >back 15+ years) that I think is partly because it's not easy to release >> >all resources and cleanup before CreateJavaVM returns with an error. >> > >> >> According to the JNI spec, it is not possible (yet) to create a second VM >> in the same thread as the first. >> >> There is also a bug (dup'd against another bug I don't have the access for) >> which states that even a successful VM creation+destruction won't permit >> a second VM to be created. 
>> >> https://bugs.openjdk.java.net/browse/JDK-4712793 >> >> Both of these seem to imply that making a new VM after a failed VM-creation >> (in the same thread) is unsupported behaviour. >> >> So is it important to release all resources and cleanup, given that we >> won't >> be trying to create a new VM in this thread? By "important" I mean "more >> important than exiting with a return code and allowing the user's code >> to finish". > >Okay, so if there is no intention of attempting to reload the jvm again, >I'm unclear what the purpose of the hosting process actually is. To me >it is either a customer launcher - in which case the exit calls are >"harmless" (and atexit handlers could be used if the process has its own >clean up) - or it's something multi-purpose part of which is to launch a >VM. In the latter case given the inability to reload a VM, and assuming >the process does not what it's java launching powers to be removed, then >the only real option is to filter out the problematic arguments and >either ignore them or exec a separate process to handle them. My assumption is that the user's code may be doing many things, of which the Java work is only a part. I'm trying not to be too specific here, as I don't know what the user is trying to do, nor what they want their code to do if Java returns an error. I think we should tell the user what has happened, and allow them to act on the information. Right now the VM developers don't have that option. They don't have a mechanism to tell the user that the VM is not in a usable state, but had found no errors. Therefore the VM *must* call exit(0) to indicate "pass", but also to prevent the user trying to do anything with the unusable VM. I would give them that option. If they can return an RC, they should have one available that fits this scenario. 
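To make the proposal concrete, here is a minimal sketch of how a hosting program might act on such a return code. It is only an illustration under stated assumptions: the numeric value -7 for JNI_SILENT_EXIT is hypothetical (the thread names the code but does not fix a value), the helper names are invented for this example, and the constants are defined locally instead of including jni.h so that the sketch stands alone.

```cpp
#include <string>

// Success code of the JNI invocation API, mirrored from jni.h (JNI_OK).
// The existing error codes occupy -1 (JNI_ERR) through -6 (JNI_EINVAL).
constexpr int kJniOk = 0;

// HYPOTHETICAL: the proposed "VM unusable, but no error occurred" code.
// The name comes from this thread; the value -7 is an assumption made
// for illustration and is not part of any released jni.h.
constexpr int kJniSilentExit = -7;

// What the hosting program should do after JNI_CreateJavaVM returns.
enum class HostAction { UseVm, ContinueWithoutVm, ReportError };

// Maps a JNI_CreateJavaVM return code to the host's next step.
// Hypothetical helper, not an existing JNI function.
HostAction classify_create_vm_result(int rc) {
    if (rc == kJniOk) {
        return HostAction::UseVm;              // a) VM is fine
    }
    if (rc == kJniSilentExit) {
        return HostAction::ContinueWithoutVm;  // c) unusable, no error
    }
    return HostAction::ReportError;            // b) an error occurred
}

// Human-readable text for logging in the host.
std::string describe(HostAction action) {
    switch (action) {
        case HostAction::UseVm:
            return "VM created";
        case HostAction::ContinueWithoutVm:
            return "VM did its one-off work; continuing without it";
        case HostAction::ReportError:
            return "VM creation failed";
    }
    return "unknown";
}
```

Because the new code is negative, a host that only performs the usual rc < 0 check would still refuse to use the VM; only a host that knows about the new code would skip its error reporting.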
By providing this negative return code both within and without the VM, we can give future VM-upgrade projects the option to indicate an unusable VM with no error, removing the need for them to call exit(0) when the VM is unusable despite no error occurring. Also, in regards to the example option: I agree that this option should really be filtered out before we get to the exit(0)-slash-JNI_SILENT_EXIT RC. Perhaps we could abstract the "is this a help option" logic into a shared function, and tie that into the unrecognised options logic? > >> >> >> >> This specific case seems like a bug to me as the logic is assuming it >> >> is only ever called by a launcher which it is okay to terminate. >> >> Though to be honest the very existence of the "help" option seems to >> >> me somewhat misguided in a hosted-VM environment. That said, I see >> >> unified logging in 9 also added a terminating "help" option . >> >The agent "help" option case is tricky and would likely need an update >> >to the JVM TI spec and the Agent_OnLoad return value. >> > >> >> To clarify, the agent "help" option is only an example of this problem. >> There are 19 locations both within and without hotspot that call exit(0) >> directly, plus more places where exit is passed a variable that can be >> 0 (e.g. the aforementioned agent "help", which calls the forceExit function >> with an argument of 0, which calls exit(arg) in turn). >> >> I understand that your comment was intended as an effort to effect a fix >> for this specific instance of the problem. I wanted to make sure we kept >> sight of the wider problem, as ideally we'd come up with an ideal solution >> that could be applied to all cases. > >The fact there are numerous potential process termination points in the >VM and JDK native code, is something we just have to live with. I'm only >considering these kind of "report and terminate" flags to be the problem >cases that should be handled better. A fair statement. 
I posit that simply having this option available could prevent the need for further exit(0)'s in the future. Though I'm certainly not ruling out an entrepreneurial VM developer fixing these issues in the future. I'm simply agreeing that resolving all of these issues are outside of this proposal's scope. > >> My thought on this was a unique return code that tells the user's code >> that the VM is not in a usable state, but that no error has occurred. This >> should be a negative code (so the usual x<0 check will prevent the user's >> code from using the VM), but it shouldn't be one of the existing JNI codes; >> all of which seem to indicate either: >> >> a) The VM is fine and can be used (0). >> or >> b) The VM is not fine because an error occurred (-1 to -6). >> >> Ideally we need a c) The VM is not fine, but no error has occurred. > >It's somewhat debatable how to classify the case where you ask the VM to >load and then perform a one-off action that effectively succeeds but >leaves the VM unusable. Again ideally, to me, the VM would never do that >- such actions would occur as part of VM initialization, the VM would be >usable, but the launcher would do the termination because that is how >the flag is specified. But that is non-trivial to untangle. > >David Agreed. > >> Or is there another solution to the exit(0) problem? Other than putting >> a copy of the rest of your code on the exit hook, I mean. >> >> > >> >> >> >> Options processed by the VM will be recognized, while options >> >> processed by the Java launcher will not be. "-version", "-X", "-help" >> >> and numerous others are launcher options. Pure VM options are -XX >> >> options, but the VM also processes some -X flags and, as a result of >> >> jigsaw, now also processes a bunch of module-related flags that are >> >> simple --foo options. >> >Right because these options need to passed to CreateJavaVM as they are >> >used when initializing the VM. 
Using system properties would just repeat >> >the issues of past (e.g. java.class.path) and require documenting a slew >> >of system properties (which is complicated at repeating options). >> > >> >-Alan Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From akozlov at azul.com Thu Sep 14 18:25:40 2017 From: akozlov at azul.com (Anton Kozlov) Date: Thu, 14 Sep 2017 21:25:40 +0300 Subject: RFR(s): 8187547: PPC64: icache invalidation is incorrect in some places In-Reply-To: <1b20a156de344954a0ffe0bc88a07d6e@sap.com> References: <07b0d758-40eb-dc1f-e25b-49e031849744@azul.com> <1b20a156de344954a0ffe0bc88a07d6e@sap.com> Message-ID: Martin, thanks for review! I tried to keep knowledge of where relocation is pointing to in macroAssembler. But if simplier code is preferable, of course I'm agree with this. Webrev with suggestion applied: http://cr.openjdk.java.net/~akozlov/8187547/webrev.02 I'm not very confident with the process for now (when repo closed), so please do what should be done, I'll try to assist as much as I can. Backport should be easy, as the patch applies to jdk8u as well Thanks, Anton On 14.09.2017 18:35, Doerr, Martin wrote: > Hi Anton, > > thank you very much for providing a fix. Looks correct. > > In your current version, other_insn_offset is always negative. > I'd prefer to make it always positive and simplify the usage like: > assert(other_insn_offset > 0, "first instruction must be found"); > start = addr - other_insn_offset; > range = BytesPerInstWord + other_insn_offset; > This would be better readable. Would you agree? > > At the moment, jdk10 repos are temporarily closed, but we can sponsor the change when it's open again and after a 2nd review. > Backports will also need to get addressed. > > Please note that ppc-aix-port-dev is not appropriate for reviews because the PPC64 port is part of the main repos. 
Therefore, I've added hotspot-runtime-dev. > > Best regards, > Martin > > > -----Original Message----- > From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-bounces at openjdk.java.net] On Behalf Of Anton Kozlov > Sent: Donnerstag, 14. September 2017 15:06 > To: ppc-aix-port-dev at openjdk.java.net > Subject: RFR(s): 8187547: PPC64: icache invalidation is incorrect in some places > > Hi, All! > > Icache invalidation range calculation in NativeMovConstReg::set_data_plain and NativeMovConstReg::set_narrow_oop is incorrect and could cause VM crash: > > https://bugs.openjdk.java.net/browse/JDK-8187547 > > I suppose the root is in mismatch of supposed and actual return values of MacroAssembler::patch_set_narrow_oop and MacroAssembler::patch_calculate_address_from_global_toc_at. > These functions takes address of the middle of sequence and expected to return first instruction offset (negative by current implementation). Instead of this, they return `-offset == abs(offset)` and offset to `data` respectively. > > Supposed fix: http://cr.openjdk.java.net/~akozlov/8187547/webrev.01/ > > Thanks, > Anton > From david.holmes at oracle.com Fri Sep 15 05:29:06 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 15 Sep 2017 15:29:06 +1000 Subject: Fwd: openjdk runtime consumes almost 100% cpu on Thread.sleep call In-Reply-To: References: Message-ID: <1ae4d397-e636-92e2-9c22-e0d01e0508a7@oracle.com> Hi, This email only just turned up on the mailing list despite being sent back in August! On 11/08/2017 8:54 PM, Aditya Ilkal wrote: > Hi, > > We are running openjdk with below version on android os (Linux localhost > 4.9.31-android-x86) > > *openjdk version "9-internal"* > *OpenJDK Runtime Environment (build 9-internal+0-adhoc.aditi.dev)* > *OpenJDK Client VM (build 9-internal+0-adhoc.aditi.dev, mixed mode)* Where did you obtain this from? Did you build this from the mobile-dev project? 
I suggest asking on mobile-dev at openjdk.java.net as that is the only place we provide any level of Android support. I don't know how sleep is implemented on android. David > > The following is the program > > public static void main(String[] args) throws Throwable { > while(true) { > System.out.println("sd123"); > Thread.sleep(30000); > System.out.println("sd12"); > Thread.sleep(10000); > } > } > > The processes details on the OS is as below. > > *Main Process* > PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND > 7206 28501 root S 869m 24.6 0 87.8 /etc/jdk8/images/jre/bin/java > *Threads of above process* > PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND > 7208 28501 root R 869m 24.6 0 13.7 {VM Thread} > 7207 28501 root R 869m 24.6 1 13.3 /etc/jdk8/images/jre/bin/java > 7212 28501 root R 869m 24.6 0 12.8 {C1 CompilerThre} > 7222 28501 root R 869m 24.6 1 12.6 {VM Periodic Task thread} > 7216 28501 root R 869m 24.6 1 12.3 {Sweeper thread} > 7217 28501 root R 869m 24.6 1 11.7 {Common-Cleaner} > The same application when run on Ubuntu linux runs very well and the thread > state of above threads is 'S', where as in android os, it is shown as R and > consuming CPU. > > Also, the output "sd12" which should print after waiting 30 s, is taking > more time. This just means sleep interval is not getting calculated > correctly. > > This can be easily reproducible, hence request you to look in to this issue > and provide possible insights. > > Thanks, > Aditya Ilkal > From Alan.Bateman at oracle.com Fri Sep 15 10:17:25 2017 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 15 Sep 2017 11:17:25 +0100 Subject: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java In-Reply-To: <9ae20280-0f27-1edb-4969-e7c9b40977e1@oracle.com> References: <9ae20280-0f27-1edb-4969-e7c9b40977e1@oracle.com> Message-ID: <1aa2e842-79b3-77a6-c4ab-dbab94ef870f@oracle.com> On 15/09/2017 02:47, David Holmes wrote: > Hi Adam, > > I am still very much torn over this one. 
I think the idea of > print-and-exit flags for a potentially hosted library like the JVM is > just wrong - we should never have done that, but we did. Fixing that > by moving the flags to the launcher is far from trivial**. Endorsing > and encouraging these sorts of flag by adding JNI support seems to be > sending the wrong message. > > ** I can envisage a "help xxx" Dcmd that can read back the info from > the VM. The launcher can send the Dcmd, print the output and exit. The > launcher would not need to know what the xxx values mean, but would > have to intercept the existing ones. > > Another option is just to be aware of these flags (are there more than > jdwp and Xlog?) and deal with them specially in your custom launcher - > either filter them out and ignore them, or else launch the VM in its > own process to respond to them. > > Any changes to the JNI specification need to go through the CSR process. Yes, it would require an update to the JNI spec, also a change to the JVM TI spec where Agent_OnLoad returning a non-0 value is specified to terminates the VM. The name and value needs discussion too, esp. as the JNI spec uses negative values for failure. In any case, I'm also torn over this one as it's a corner case that is only interesting for custom launchers that load agents with options that print usage messages. It wouldn't be hard to have the Agent_OnLoad specify a printf hook that the agent could use for output although there are complications with agents such as JDWP that also announce their transport end point. Beyond that there is still the issue of the custom launcher that would need to know to destroy the VM without reporting an error. So what happened to the more meaty part to this which is fixing the various cases in HotSpot that terminate the process during initialization? I would expect some progress could be made on those cases while trying to decide whether to rev the JNI and JVM TI specs to cover the help case. 
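For the alternative quoted above, where a custom launcher filters the print-and-exit flags itself, a rough sketch could look like the following. The option list is illustrative and not exhaustive: it contains only the jdwp and Xlog help options mentioned in this thread, matched exactly, and the function and struct names are invented for this example.

```cpp
#include <string>
#include <vector>

// Illustrative, not exhaustive: the print-and-exit options named in this
// thread. A real launcher would have to maintain its own list.
bool is_print_and_exit_option(const std::string& opt) {
    return opt == "-agentlib:jdwp=help" || opt == "-Xlog:help";
}

// Result of splitting a launcher's arguments (hypothetical type).
struct SplitOptions {
    std::vector<std::string> vm_options;      // safe to pass to JNI_CreateJavaVM
    std::vector<std::string> print_and_exit;  // ignore, or run in a child process
};

// Separates options that would terminate the process from the rest,
// preserving the original order within each group.
SplitOptions split_options(const std::vector<std::string>& args) {
    SplitOptions out;
    for (const std::string& arg : args) {
        if (is_print_and_exit_option(arg)) {
            out.print_and_exit.push_back(arg);
        } else {
            out.vm_options.push_back(arg);
        }
    }
    return out;
}
```

The launcher would then pass only vm_options to the in-process VM and either drop the print_and_exit entries or hand them to a separately launched VM process, so that an exit(0) cannot take the hosting program down with it.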
-Alan From goetz.lindenmaier at sap.com Fri Sep 15 10:27:59 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 15 Sep 2017 10:27:59 +0000 Subject: RFR(s): 8187547: PPC64: icache invalidation is incorrect in some places In-Reply-To: References: <07b0d758-40eb-dc1f-e25b-49e031849744@azul.com> <1b20a156de344954a0ffe0bc88a07d6e@sap.com> Message-ID: <107e53d21d06483db1b01d0dd91c9cc9@sap.com> Hi Anton, thanks for fixing this issue. Looks good. Nevertheless I would find it more readable if the patch_* functions would just return the address of the first instruction. The code in nativeInst_ppc then would read if (MacroAssembler::get_address_of_calculate_address_from_global_toc_at(addr, cb->content_begin()) != (address)data) { address inst2_addr = addr; const address inst1_addr = MacroAssembler::patch_calculate_address_from_global_toc_at(inst2_addr, cb->content_begin(), (address)data); ICache::ppc64_flush_icache_bytes(inst1_addr, inst2_addr - inst1_addr + BytesPerInstWord); } which I can read much more easily. Similar for patch_set_narrow_oop. You could add // Returns address of first instruction in sequence. in macroAssembler_ppc.hpp. Please update the Oracle copyright in nativeInst_ppc.cpp. Also, jdk10/master is available. Could you please prepare the webrev against that repo? I assume you are allowed to contribute to openJdk being with Azul who have signed the OCA? Best regards, Goetz. > -----Original Message----- > From: ppc-aix-port-dev [mailto:ppc-aix-port-dev- > bounces at openjdk.java.net] On Behalf Of Anton Kozlov > Sent: Donnerstag, 14. September 2017 20:26 > To: Doerr, Martin ; ppc-aix-port- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(s): 8187547: PPC64: icache invalidation is incorrect in some > places > > Martin, > > thanks for review! > > I tried to keep knowledge of where relocation is pointing to in > macroAssembler. > But if simplier code is preferable, of course I'm agree with this. 
> > Webrev with suggestion applied: > http://cr.openjdk.java.net/~akozlov/8187547/webrev.02 > > I'm not very confident with the process for now (when repo closed), so > please do what should be done, I'll try to assist as much as I can. > > Backport should be easy, as the patch applies to jdk8u as well > > Thanks, > Anton > > On 14.09.2017 18:35, Doerr, Martin wrote: > > Hi Anton, > > > > thank you very much for providing a fix. Looks correct. > > > > In your current version, other_insn_offset is always negative. > > I'd prefer to make it always positive and simplify the usage like: > > assert(other_insn_offset > 0, "first instruction must be found"); > > start = addr - other_insn_offset; > > range = BytesPerInstWord + other_insn_offset; > > This would be better readable. Would you agree? > > > > At the moment, jdk10 repos are temporarily closed, but we can sponsor > the change when it's open again and after a 2nd review. > > Backports will also need to get addressed. > > > > Please note that ppc-aix-port-dev is not appropriate for reviews because > the PPC64 port is part of the main repos. Therefore, I've added hotspot- > runtime-dev. > > > > Best regards, > > Martin > > > > > > -----Original Message----- > > From: ppc-aix-port-dev [mailto:ppc-aix-port-dev- > bounces at openjdk.java.net] On Behalf Of Anton Kozlov > > Sent: Donnerstag, 14. September 2017 15:06 > > To: ppc-aix-port-dev at openjdk.java.net > > Subject: RFR(s): 8187547: PPC64: icache invalidation is incorrect in some > places > > > > Hi, All! > > > > Icache invalidation range calculation in NativeMovConstReg::set_data_plain > and NativeMovConstReg::set_narrow_oop is incorrect and could cause VM > crash: > > > > https://bugs.openjdk.java.net/browse/JDK-8187547 > > > > I suppose the root is in mismatch of supposed and actual return values of > MacroAssembler::patch_set_narrow_oop and > MacroAssembler::patch_calculate_address_from_global_toc_at. 
> > These functions takes address of the middle of sequence and expected to > return first instruction offset (negative by current implementation). Instead > of this, they return `-offset == abs(offset)` and offset to `data` respectively. > > > > Supposed fix: http://cr.openjdk.java.net/~akozlov/8187547/webrev.01/ > > > > Thanks, > > Anton > > From david.holmes at oracle.com Fri Sep 15 11:03:32 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 15 Sep 2017 21:03:32 +1000 Subject: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java In-Reply-To: <1aa2e842-79b3-77a6-c4ab-dbab94ef870f@oracle.com> References: <9ae20280-0f27-1edb-4969-e7c9b40977e1@oracle.com> <1aa2e842-79b3-77a6-c4ab-dbab94ef870f@oracle.com> Message-ID: On 15/09/2017 8:17 PM, Alan Bateman wrote: > On 15/09/2017 02:47, David Holmes wrote: >> Hi Adam, >> >> I am still very much torn over this one. I think the idea of >> print-and-exit flags for a potentially hosted library like the JVM is >> just wrong - we should never have done that, but we did. Fixing that >> by moving the flags to the launcher is far from trivial**. Endorsing >> and encouraging these sorts of flag by adding JNI support seems to be >> sending the wrong message. >> >> ** I can envisage a "help xxx" Dcmd that can read back the info from >> the VM. The launcher can send the Dcmd, print the output and exit. The >> launcher would not need to know what the xxx values mean, but would >> have to intercept the existing ones. >> >> Another option is just to be aware of these flags (are there more than >> jdwp and Xlog?) and deal with them specially in your custom launcher - >> either filter them out and ignore them, or else launch the VM in its >> own process to respond to them. >> >> Any changes to the JNI specification need to go through the CSR process. 
> Yes, it would require an update to the JNI spec, also a change to the > JVM TI spec where Agent_OnLoad returning a non-0 value is specified to > terminates the VM. The name and value needs discussion too, esp. as the > JNI spec uses negative values for failure. > > In any case, I'm also torn over this one as it's a corner case that is > only interesting for custom launchers that load agents with options that > print usage messages. It wouldn't be hard to have the Agent_OnLoad > specify a printf hook that the agent could use for output although there > are complications with agents such as JDWP that also announce their > transport end point. Beyond that there is still the issue of the custom > launcher that would need to know to destroy the VM without reporting an > error. > > So what happened to the more meaty part to this which is fixing the > various cases in HotSpot that terminate the process during > initialization? I would expect some progress could be made on those > cases while trying to decide whether to rev the JNI and JVM TI specs to > cover the help case. Trying to eliminate the vm_exit_during_initialization paths in hotspot is a huge undertaking IMHO. David > > -Alan From bob.vandette at oracle.com Fri Sep 15 13:05:29 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Fri, 15 Sep 2017 09:05:29 -0400 Subject: openjdk runtime consumes almost 100% cpu on Thread.sleep call In-Reply-To: References: Message-ID: Are you building OpenJDK from the mobile project repositories? What version of Android are you running on? I just got a report yesterday that Thread.sleep will hang on Android 8 but we have not determined the root cause yet. 
Bob Vandette > On Aug 11, 2017, at 6:54 AM, Aditya Ilkal wrote: > > Hi, > > We are running openjdk with below version on android os (Linux localhost > 4.9.31-android-x86) > > *openjdk version "9-internal"* > *OpenJDK Runtime Environment (build 9-internal+0-adhoc.aditi.dev)* > *OpenJDK Client VM (build 9-internal+0-adhoc.aditi.dev, mixed mode)* > > > The following is the program > > public static void main(String[] args) throws Throwable { > while(true) { > System.out.println("sd123"); > Thread.sleep(30000); > System.out.println("sd12"); > Thread.sleep(10000); > } > } > > The processes details on the OS is as below. > > *Main Process* > PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND > 7206 28501 root S 869m 24.6 0 87.8 /etc/jdk8/images/jre/bin/java > *Threads of above process* > PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND > 7208 28501 root R 869m 24.6 0 13.7 {VM Thread} > 7207 28501 root R 869m 24.6 1 13.3 /etc/jdk8/images/jre/bin/java > 7212 28501 root R 869m 24.6 0 12.8 {C1 CompilerThre} > 7222 28501 root R 869m 24.6 1 12.6 {VM Periodic Task thread} > 7216 28501 root R 869m 24.6 1 12.3 {Sweeper thread} > 7217 28501 root R 869m 24.6 1 11.7 {Common-Cleaner} > The same application when run on Ubuntu linux runs very well and the thread > state of above threads is 'S', where as in android os, it is shown as R and > consuming CPU. > > Also, the output "sd12" which should print after waiting 30 s, is taking > more time. This just means sleep interval is not getting calculated > correctly. > > This can be easily reproducible, hence request you to look in to this issue > and provide possible insights. 
> > Thanks, > Aditya Ilkal From jiangli.zhou at Oracle.COM Fri Sep 15 19:29:47 2017 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Fri, 15 Sep 2017 12:29:47 -0700 Subject: RFR(S): 8068314: "Java fields that are currently set during shared space dumping" comment is incorrect Message-ID: <638AD108-377E-4126-9D55-A0637A563504@oracle.com> Hi, Please review following changes for 8068314. webrev: http://cr.openjdk.java.net/~jiangli/8068314/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8068314 The change cleans up out dated code and comments in universe_post_init(). The preallocated out_of_memory errors could be used during CDS/AppCDS dump time, especially with the use of java class loader instances and executing java code during dump time. Also, initializing the error messages during dump time has no unwanted side effects on the archived data. Tested with CDS/AppCDS related tests. Thanks, Jiangli From ioi.lam at oracle.com Fri Sep 15 19:47:45 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Fri, 15 Sep 2017 12:47:45 -0700 Subject: RFR(S): 8068314: "Java fields that are currently set during shared space dumping" comment is incorrect In-Reply-To: <638AD108-377E-4126-9D55-A0637A563504@oracle.com> References: <638AD108-377E-4126-9D55-A0637A563504@oracle.com> Message-ID: <07fa2111-42e6-2f4b-eac8-fd7750bff429@oracle.com> Looks good. Thanks! - Ioi On 9/15/17 12:29 PM, Jiangli Zhou wrote: > Hi, > > Please review following changes for 8068314. > > webrev: http://cr.openjdk.java.net/~jiangli/8068314/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8068314 > > The change cleans up out dated code and comments in universe_post_init(). The preallocated out_of_memory errors could be used during CDS/AppCDS dump time, especially with the use of java class loader instances and executing java code during dump time. Also, initializing the error messages during dump time has no unwanted side effects on the archived data. > > Tested with CDS/AppCDS related tests. 
> > Thanks, > Jiangli From jiangli.zhou at oracle.com Fri Sep 15 20:30:37 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 15 Sep 2017 13:30:37 -0700 Subject: RFR(S): 8068314: "Java fields that are currently set during shared space dumping" comment is incorrect In-Reply-To: <07fa2111-42e6-2f4b-eac8-fd7750bff429@oracle.com> References: <638AD108-377E-4126-9D55-A0637A563504@oracle.com> <07fa2111-42e6-2f4b-eac8-fd7750bff429@oracle.com> Message-ID: <4E826810-05D1-4AC8-8E95-21BD7E01B4F4@oracle.com> Thanks, Ioi! Jiangli > On Sep 15, 2017, at 12:47 PM, Ioi Lam wrote: > > Looks good. Thanks! > > - Ioi > > > On 9/15/17 12:29 PM, Jiangli Zhou wrote: >> Hi, >> >> Please review following changes for 8068314. >> >> webrev: http://cr.openjdk.java.net/~jiangli/8068314/webrev.00/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8068314 >> >> The change cleans up out dated code and comments in universe_post_init(). The preallocated out_of_memory errors could be used during CDS/AppCDS dump time, especially with the use of java class loader instances and executing java code during dump time. Also, initializing the error messages during dump time has no unwanted side effects on the archived data. >> >> Tested with CDS/AppCDS related tests. >> >> Thanks, >> Jiangli > From harold.seigel at oracle.com Fri Sep 15 20:36:30 2017 From: harold.seigel at oracle.com (harold seigel) Date: Fri, 15 Sep 2017 16:36:30 -0400 Subject: RFR(S): 8068314: "Java fields that are currently set during shared space dumping" comment is incorrect In-Reply-To: <638AD108-377E-4126-9D55-A0637A563504@oracle.com> References: <638AD108-377E-4126-9D55-A0637A563504@oracle.com> Message-ID: <9a477dd4-25b2-c61f-6736-df72559c0710@oracle.com> Hi Jiangli, The changes look good. Thanks, Harold On 9/15/2017 3:29 PM, Jiangli Zhou wrote: > Hi, > > Please review following changes for 8068314. 
> > webrev: http://cr.openjdk.java.net/~jiangli/8068314/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8068314 > > The change cleans up out dated code and comments in universe_post_init(). The preallocated out_of_memory errors could be used during CDS/AppCDS dump time, especially with the use of java class loader instances and executing java code during dump time. Also, initializing the error messages during dump time has no unwanted side effects on the archived data. > > Tested with CDS/AppCDS related tests. > > Thanks, > Jiangli From jiangli.zhou at oracle.com Fri Sep 15 23:00:41 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 15 Sep 2017 16:00:41 -0700 Subject: RFR(S): 8068314: "Java fields that are currently set during shared space dumping" comment is incorrect In-Reply-To: <9a477dd4-25b2-c61f-6736-df72559c0710@oracle.com> References: <638AD108-377E-4126-9D55-A0637A563504@oracle.com> <9a477dd4-25b2-c61f-6736-df72559c0710@oracle.com> Message-ID: <272BCD61-E2CA-47D4-A6F3-905AA5D0723F@oracle.com> Thanks, Harold! Jiangli > On Sep 15, 2017, at 1:36 PM, harold seigel wrote: > > Hi Jiangli, > > The changes look good. > > Thanks, Harold > > > On 9/15/2017 3:29 PM, Jiangli Zhou wrote: >> Hi, >> >> Please review following changes for 8068314. >> >> webrev: http://cr.openjdk.java.net/~jiangli/8068314/webrev.00/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8068314 >> >> The change cleans up out dated code and comments in universe_post_init(). The preallocated out_of_memory errors could be used during CDS/AppCDS dump time, especially with the use of java class loader instances and executing java code during dump time. Also, initializing the error messages during dump time has no unwanted side effects on the archived data. >> >> Tested with CDS/AppCDS related tests. 
>> >> Thanks, >> Jiangli > From george.triantafillou at oracle.com Mon Sep 18 13:33:10 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Mon, 18 Sep 2017 09:33:10 -0400 Subject: RFR 8187436: -Xbootclasspath/a causes sanity check assertion with exploded build In-Reply-To: <58fa0597-2586-5c5e-2e51-1b3a9e681253@oracle.com> References: <58fa0597-2586-5c5e-2e51-1b3a9e681253@oracle.com> Message-ID: <032a660a-53af-83a9-65e5-c97474059228@oracle.com> Hi Harold, Will you move the test to test/runtime/getSysPackage/GetPackageXbootclasspath.java? Otherwise, looks good. -George On 9/14/2017 3:02 PM, harold seigel wrote: > Hi, > > Please review this JDK-10 change to fix an assertion involving > ClassLoader::_num_entries. The assertion gets triggered when running > the exploded build. ClassLoader::_num_entries is only used by CDS, > which is not supported for exploded builds. So, assertions involving > _num_entries should check for a normal build before doing their check > involving _num_entries. > > Note that a new RFE will be filed shortly requesting a re-design of > the confusing boot classpath entries code as requested in one of the > comments in this JBS bug. > > Open webrev: > http://cr.openjdk.java.net/~hseigel/bug_8187436/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8187436 > > The change was tested with the JCK Lang and VM tests, the JTreg > hotspot, java/io, java/lang, java/util, and other tests. The test > were run with both the normal and exploded builds. 
> > Thanks, Harold > From harold.seigel at oracle.com Mon Sep 18 13:34:56 2017 From: harold.seigel at oracle.com (harold seigel) Date: Mon, 18 Sep 2017 09:34:56 -0400 Subject: RFR 8187436: -Xbootclasspath/a causes sanity check assertion with exploded build In-Reply-To: <032a660a-53af-83a9-65e5-c97474059228@oracle.com> References: <58fa0597-2586-5c5e-2e51-1b3a9e681253@oracle.com> <032a660a-53af-83a9-65e5-c97474059228@oracle.com> Message-ID: <309974aa-04fe-6fa4-4b55-e6ce10bd7463@oracle.com> Thanks George! I'll move the test before pushing the change. Harold On 9/18/2017 9:33 AM, George Triantafillou wrote: > Hi Harold, > > Will you move the test to > test/runtime/getSysPackage/GetPackageXbootclasspath.java? Otherwise, > looks good. > > -George > > On 9/14/2017 3:02 PM, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 change to fix an assertion involving >> ClassLoader::_num_entries. The assertion gets triggered when running >> the exploded build. ClassLoader::_num_entries is only used by CDS, >> which is not supported for exploded builds. So, assertions involving >> _num_entries should check for a normal build before doing their check >> involving _num_entries. >> >> Note that a new RFE will be filed shortly requesting a re-design of >> the confusing boot classpath entries code as requested in one of the >> comments in this JBS bug. >> >> Open webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8187436/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8187436 >> >> The change was tested with the JCK Lang and VM tests, the JTreg >> hotspot, java/io, java/lang, java/util, and other tests. The test >> were run with both the normal and exploded builds. 
>> >> Thanks, Harold >> > From akozlov at azul.com Tue Sep 19 11:43:45 2017 From: akozlov at azul.com (Anton Kozlov) Date: Tue, 19 Sep 2017 14:43:45 +0300 Subject: RFR(s): 8187547: PPC64: icache invalidation is incorrect in some places In-Reply-To: <107e53d21d06483db1b01d0dd91c9cc9@sap.com> References: <07b0d758-40eb-dc1f-e25b-49e031849744@azul.com> <1b20a156de344954a0ffe0bc88a07d6e@sap.com> <107e53d21d06483db1b01d0dd91c9cc9@sap.com> Message-ID: <5aac8f0a-b106-b7c7-8974-c4509c821d40@azul.com> Hi Goetz, thanks for the review! I made the suggested changes in the patch and prepared a webrev for jdk10-master: http://cr.openjdk.java.net/~akozlov/8187547/webrev.03/ > I assume you are allowed to contribute to openJdk being with Azul who > have signed the OCA? Yes, Azul has signed the OCA and I work for Azul. Thanks, Anton On 15.09.2017 13:27, Lindenmaier, Goetz wrote: > Hi Anton, > > thanks for fixing this issue. Looks good. > > Nevertheless I would find it more readable if the patch_* functions > would just return the address of the first instruction. > The code in nativeInst_ppc then would read > > if (MacroAssembler::get_address_of_calculate_address_from_global_toc_at(addr, cb->content_begin()) != > (address)data) { > address inst2_addr = addr; > const address inst1_addr = > MacroAssembler::patch_calculate_address_from_global_toc_at(inst2_addr, cb->content_begin(), > (address)data); > ICache::ppc64_flush_icache_bytes(inst1_addr, inst2_addr - inst1_addr + BytesPerInstWord); > } > > which I can read much more easily. > Similar for patch_set_narrow_oop. > You could add > // Returns address of first instruction in sequence. > in macroAssembler_ppc.hpp. > > Please update the Oracle copyright in nativeInst_ppc.cpp. > > Also, jdk10/master is available. Could you please prepare the webrev against > that repo? > > I assume you are allowed to contribute to openJdk being with Azul who > have signed the OCA? > > Best regards, > Goetz. 
> > > > >> -----Original Message----- >> From: ppc-aix-port-dev [mailto:ppc-aix-port-dev- >> bounces at openjdk.java.net] On Behalf Of Anton Kozlov >> Sent: Donnerstag, 14. September 2017 20:26 >> To: Doerr, Martin ; ppc-aix-port- >> dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net >> Subject: Re: RFR(s): 8187547: PPC64: icache invalidation is incorrect in some >> places >> >> Martin, >> >> thanks for review! >> >> I tried to keep knowledge of where relocation is pointing to in >> macroAssembler. >> But if simplier code is preferable, of course I'm agree with this. >> >> Webrev with suggestion applied: >> http://cr.openjdk.java.net/~akozlov/8187547/webrev.02 >> >> I'm not very confident with the process for now (when repo closed), so >> please do what should be done, I'll try to assist as much as I can. >> >> Backport should be easy, as the patch applies to jdk8u as well >> >> Thanks, >> Anton >> >> On 14.09.2017 18:35, Doerr, Martin wrote: >>> Hi Anton, >>> >>> thank you very much for providing a fix. Looks correct. >>> >>> In your current version, other_insn_offset is always negative. >>> I'd prefer to make it always positive and simplify the usage like: >>> assert(other_insn_offset > 0, "first instruction must be found"); >>> start = addr - other_insn_offset; >>> range = BytesPerInstWord + other_insn_offset; >>> This would be better readable. Would you agree? >>> >>> At the moment, jdk10 repos are temporarily closed, but we can sponsor >> the change when it's open again and after a 2nd review. >>> Backports will also need to get addressed. >>> >>> Please note that ppc-aix-port-dev is not appropriate for reviews because >> the PPC64 port is part of the main repos. Therefore, I've added hotspot- >> runtime-dev. >>> >>> Best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: ppc-aix-port-dev [mailto:ppc-aix-port-dev- >> bounces at openjdk.java.net] On Behalf Of Anton Kozlov >>> Sent: Donnerstag, 14. 
September 2017 15:06 >>> To: ppc-aix-port-dev at openjdk.java.net >>> Subject: RFR(s): 8187547: PPC64: icache invalidation is incorrect in some >> places >>> >>> Hi, All! >>> >>> Icache invalidation range calculation in NativeMovConstReg::set_data_plain >> and NativeMovConstReg::set_narrow_oop is incorrect and could cause VM >> crash: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8187547 >>> >>> I suppose the root is in mismatch of supposed and actual return values of >> MacroAssembler::patch_set_narrow_oop and >> MacroAssembler::patch_calculate_address_from_global_toc_at. >>> These functions takes address of the middle of sequence and expected to >> return first instruction offset (negative by current implementation). Instead >> of this, they return `-offset == abs(offset)` and offset to `data` respectively. >>> >>> Supposed fix: http://cr.openjdk.java.net/~akozlov/8187547/webrev.01/ >>> >>> Thanks, >>> Anton >>> From adam.farley at uk.ibm.com Tue Sep 19 12:57:48 2017 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Tue, 19 Sep 2017 13:57:48 +0100 Subject: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java In-Reply-To: References: <9ae20280-0f27-1edb-4969-e7c9b40977e1@oracle.com> <1aa2e842-79b3-77a6-c4ab-dbab94ef870f@oracle.com> Message-ID: Hi David, Alan, You are right in that the changes to HotSpot would be nontrivial. I see a number of places in (e.g.) arguments.cpp that seem to exit in the same manner as Xlog (such as -Xinternalversion). I would advise ploughing through the CSR process to alter the JNI spec, and simultaneously identify some key paths that can be raised as bugs. That way, when people have time to address these issues, the mechanism to handle a silent exit is already in place. The JDWP fix can be raised separately as one of these bugs, if it would make things simpler. As for the name, JNI_SILENT_EXIT is a placeholder, and can be readily changed. Do you have any suggestions? 
Lastly, in an ideal world, the VM initialisation should never exit(#). It should return a return code that tells the caller something, pass or fail, messy or tidy. That way, if someone is using the JNI as part of something bigger (like a database or a web server), one of these scenarios is just a bug, rather than a world-ender like exit(#). And now for the individual messages. :) David: Having help data returned by the launcher seems like a good way to avoid exit(0) calls, but I'm not sure how we'd prevent a JNI-caller using those options. Ultimately, to be sure, we'd have to remove the logic for those options, centralise the data to better enable launcher access, and add some logic in there so it can find any other help data (e.g. from the jdwp agent library). I feel this would be a bigger task than adding the new return code and changing the vm, plus it wouldn't provide for any non-help scenarios where the vm wants to shut down without error during initialisation. Alan: I should mention that the silent exit solution is already in use in the OpenJ9 VM. Not all of the exit paths have been resolved, but many have. The code is open and can be found here: https://github.com/eclipse/openj9 And though the silent exit code is disabled for the time being, it can be re-enabled by entering this class: runtime/vm/jvminit.c and altering line 2343 ( ctrl-f for exit(1) if it's not there). I won't paste the full code here in case people are concerned about contamination, but I would assert that this code (and the associated vm files) prove that the concept is possible. Note that that code should not be enabled until after we've integrated the code that can handle a silent exit. Best Regards Adam Farley P.S. Thank you both for your efforts on this. 
:) From: David Holmes To: Alan Bateman , Adam Farley8 , core-libs-dev at openjdk.java.net, hotspot-runtime-dev at openjdk.java.net, thomas.stuefe at gmail.com Date: 15/09/2017 12:03 Subject: Re: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java On 15/09/2017 8:17 PM, Alan Bateman wrote: > On 15/09/2017 02:47, David Holmes wrote: >> Hi Adam, >> >> I am still very much torn over this one. I think the idea of >> print-and-exit flags for a potentially hosted library like the JVM is >> just wrong - we should never have done that, but we did. Fixing that >> by moving the flags to the launcher is far from trivial**. Endorsing >> and encouraging these sorts of flag by adding JNI support seems to be >> sending the wrong message. >> >> ** I can envisage a "help xxx" Dcmd that can read back the info from >> the VM. The launcher can send the Dcmd, print the output and exit. The >> launcher would not need to know what the xxx values mean, but would >> have to intercept the existing ones. >> >> Another option is just to be aware of these flags (are there more than >> jdwp and Xlog?) and deal with them specially in your custom launcher - >> either filter them out and ignore them, or else launch the VM in its >> own process to respond to them. >> >> Any changes to the JNI specification need to go through the CSR process. > Yes, it would require an update to the JNI spec, also a change to the > JVM TI spec where Agent_OnLoad returning a non-0 value is specified to > terminates the VM. The name and value needs discussion too, esp. as the > JNI spec uses negative values for failure. > > In any case, I'm also torn over this one as it's a corner case that is > only interesting for custom launchers that load agents with options that > print usage messages. 
It wouldn't be hard to have the Agent_OnLoad > specify a printf hook that the agent could use for output although there > are complications with agents such as JDWP that also announce their > transport end point. Beyond that there is still the issue of the custom > launcher that would need to know to destroy the VM without reporting an > error. > > So what happened to the more meaty part to this which is fixing the > various cases in HotSpot that terminate the process during > initialization? I would expect some progress could be made on those > cases while trying to decide whether to rev the JNI and JVM TI specs to > cover the help case. Trying to eliminate the vm_exit_during_initialization paths in hotspot is a huge undertaking IMHO. David > > -Alan Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From yasuenag at gmail.com Wed Sep 20 12:16:45 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 20 Sep 2017 21:16:45 +0900 Subject: RFR: JDK-8087291: InitialBootClassLoaderMetaspaceSize and CompressedClassSpaceSize should be checked consistent from MaxMetaspaceSize In-Reply-To: <19df096b-3243-1ac0-3d3a-e955c63c534d@oracle.com> References: <19df096b-3243-1ac0-3d3a-e955c63c534d@oracle.com> Message-ID: <53fb4559-aeed-da1a-67fe-6cb50fd8e9ce@gmail.com> Hi, (CC'ed hotspot-runtime-dev) > I think the reason is that this bug is a P5. The compressed class space belongs to the runtime code, so you might get more traction for this on the hotspot-runtime-dev list. I will send review request against jdk10/master or jdk10/hs after repos are opened. Thanks, Yasumasa On 2017/09/20 20:53, Stefan Karlsson wrote: > Hi Man, > > On 2017-09-13 20:55, Man Cao wrote: >> Hi Yasumasa, Stefan, >> >> Do you have any thoughts on why this patch has been pending for 2+ years? 
This patch could really save us from some annoying issues since we are automatically monitoring hsperfdata counters. > > I think the reason is that this bug is a P5. The compressed class space belongs to the runtime code, so you might get more traction for this on the hotspot-runtime-dev list. > > StefanK > >> >> -Man >> >> On Mon, Aug 21, 2017 at 3:46 PM, Man Cao > wrote: >> >> Hi all, >> >> I wonder if there is any recent update on the patch for JDK-8087291. >> Is it possible to push this patch into JDK9? Except for its low >> priority (P5), >> is there any complication that prevents this patch getting approved >> (for example, some JVM logic requires CompressedClassSpaceSize to be >> 1GB by default)? >> >> I work in the Java Platform Team at Google. We have encountered >> annoying issues that the hsperfdata counter >> "sun_gc_metaspace_maxCapacity" reporting >> a too large value (about 1GB) even if user sets >> -XX:MaxMetaspaceSize=100m, as well as GC log shows the confusing 1GB >> memory reserved by metaspace, >> regardless of MaxMetaspaceSize value. The root cause for these >> issues is that CompressedClassSpaceSize is not automatically capped >> by MaxMetaspaceSize >> during VM initialization, and this patch seems fix the root cause. >> (I'm aware that even after this patch, the reserved size could still >> be up to 2*MaxMetaspaceSize, >> but it is better than the current situation.) >> >> Thanks, >> Man >> >> On 6/19/2015 00:34, Yasumasa Suenaga wrote: >> >> Thank you for your comment! >> > Try running a debug JVM with your patch with this command line. >> > >> > java -XX:MaxMetaspaceSize=4195328 -version >> Sorry, I've fixed it and uploaded webrev: >> http://cr.openjdk.java.net/~ysuenaga/JDK-8087291/webrev.02/ >> >> It works on fastdebug VM. >> Please review again. >> >> 
Thanks, >> Yasumasa >> >> On 2015/06/18 10:45, Jon Masamitsu wrote: >> > Yasumasa, >> > >> > Try running a debug JVM with your patch with this command line. >> > >> > java -XX:MaxMetaspaceSize=4195328 -version >> > >> > On a linux system I get this when I build with your patch. >> > >> >> java -XX:MaxMetaspaceSize=4195328 -version >> >> # To suppress the following error report, specify this argument >> >> # after -XX: or in .hotspotrc: >> SuppressErrorAt=/metaspace.cpp:2324 >> >> # >> >> # A fatal error has been detected by the Java Runtime >> Environment: >> >> # >> >> # Internal Error >> >> >> (/export/jmasa/java/jdk9-gc-code_review/src/share/vm/memory/metaspace.cpp:2324), >> >> pid=19099, tid=0x00007ff4b9b92700 >> >> # assert(size > MediumChunk || size > ClassMediumChunk) >> failed: Not a >> >> humongous chunk >> >> # >> > >> > >> > Jon >> > >> > >> > On 6/17/2015 7:54 AM, Yasumasa Suenaga wrote: >> >> I want to continue to discuss about CompressedClassSpace and >> MaxMetaspace in this (RFR) thread. >> >> >> >> >> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2015-June/013873.html >> >> >>>> Should I resize CompressedClassSpaceSize than to show >> error message? >> >>> If you add slightly better heuristics for the setup of the >> CompressedClassSpaceSize flag, for example lowering the >> CompressedClassSpaceSize when MaxMetaspaceSize is set, then it >> might be less likely that you'll hit the OutOfMemoryError when >> the system is set up with strict overcommit settings. >> >> >> >> I've uploaded new webrev: >> 
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8087291/webrev.01/ >> >> >> >> >> This patch checkes MaxMetaspaceSize, >> CompressedClassSpaceSize, and >> >> InitialBootClassLoaderMetaspaceSize. >> >> >> >> I add to check CompressedClassSpaceSize in >> Arguments::set_use_compressed_klass_ptrs(). >> >> If InitialBootClassLoaderMetaspaceSize is greater than >> MaxMetaspaceSize, >> >> VM will fail with error message. >> >> >> >> InitialBootClassLoaderMetaspaceSize will be set to >> MaxMetaspaceSize >> >> when UseCompressedClassPointers is not set in >> Metaspace::ergo_initialize(). >> >> >> >> >> >> Thanks, >> >> >> >> Yasumasa >> >> >> >> From manc at google.com Wed Sep 20 18:11:17 2017 From: manc at google.com (Man Cao) Date: Wed, 20 Sep 2017 11:11:17 -0700 Subject: RFR: JDK-8087291: InitialBootClassLoaderMetaspaceSize and CompressedClassSpaceSize should be checked consistent from MaxMetaspaceSize In-Reply-To: <53fb4559-aeed-da1a-67fe-6cb50fd8e9ce@gmail.com> References: <19df096b-3243-1ac0-3d3a-e955c63c534d@oracle.com> <53fb4559-aeed-da1a-67fe-6cb50fd8e9ce@gmail.com> Message-ID: Thank Yasumasa and Stefan for the responses. Good to know that the patch is not blocked due to breaking some internal invariants/assumptions, but just due to its P5 status. Is it possible to push it to P4? -Man On Wed, Sep 20, 2017 at 5:16 AM, Yasumasa Suenaga wrote: > Hi, > > (CC'ed hotspot-runtime-dev) > > I think the reason is that this bug is a P5. The compressed class space >> belongs to the runtime code, so you might get more traction for this on the >> hotspot-runtime-dev list. >> > > I will send review request against jdk10/master or jdk10/hs after repos > are opened. 
> > > Thanks, > > Yasumasa > > > > On 2017/09/20 20:53, Stefan Karlsson wrote: > >> Hi Man, >> >> On 2017-09-13 20:55, Man Cao wrote: >> >>> Hi Yasumasa, Stefan, >>> >>> Do you have any thoughts on why this patch has been pending for 2+ >>> years? This patch could really save us from some annoying issues since we >>> are automatically monitoring hsperfdata counters. >>> >> >> I think the reason is that this bug is a P5. The compressed class space >> belongs to the runtime code, so you might get more traction for this on the >> hotspot-runtime-dev list. >> >> StefanK >> >> >>> -Man >>> >>> On Mon, Aug 21, 2017 at 3:46 PM, Man Cao >> manc at google.com>> wrote: >>> >>> Hi all, >>> >>> I wonder if there is any recent update on the patch for JDK-8087291. >>> Is it possible to push this patch into JDK9? Except for its low >>> priority (P5), >>> is there any complication that prevents this patch getting approved >>> (for example, some JVM logic requires CompressedClassSpaceSize to be >>> 1GB by default)? >>> >>> I work in the Java Platform Team at Google. We have encountered >>> annoying issues that the hsperfdata counter >>> "sun_gc_metaspace_maxCapacity" reporting >>> a too large value (about 1GB) even if user sets >>> -XX:MaxMetaspaceSize=100m, as well as GC log shows the confusing 1GB >>> memory reserved by metaspace, >>> regardless of MaxMetaspaceSize value. The root cause for these >>> issues is that CompressedClassSpaceSize is not automatically capped >>> by MaxMetaspaceSize >>> during VM initialization, and this patch seems fix the root cause. >>> (I'm aware that even after this patch, the reserved size could still >>> be up to 2*MaxMetaspaceSize, >>> but it is better than the current situation.) >>> >>> Thanks, >>> Man >>> >>> On 6/19/2015 00:34, Yasumasa Suenaga wrote: >>> >>> Thank you for your comment! >>> > Try running a debug JVM with your patch with this command >>> line. 
>>> > >>> > java -XX:MaxMetaspaceSize=4195328 -version >>> Sorry, I've fixed it and uploaded webrev: >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8087291/webrev.02/ >>> >>> It works on fastdebug VM. >>> Please review again. >>> >>> Thanks, >>> Yasumasa >>> >>> On 2015/06/18 10:45, Jon Masamitsu wrote: >>> > Yasumasa, >>> > >>> > Try running a debug JVM with your patch with this command >>> line. >>> > >>> > java -XX:MaxMetaspaceSize=4195328 -version >>> > >>> > On a linux system I get this when I build with your patch. >>> > >>> >> java -XX:MaxMetaspaceSize=4195328 -version >>> >> # To suppress the following error report, specify this >>> argument >>> >> # after -XX: or in .hotspotrc: >>> SuppressErrorAt=/metaspace.cpp:2324 >>> >> # >>> >> # A fatal error has been detected by the Java Runtime >>> Environment: >>> >> # >>> >> # Internal Error >>> >> >>> (/export/jmasa/java/jdk9-gc-code_review/src/share/vm/memory/ >>> metaspace.cpp:2324), >>> >> pid=19099, tid=0x00007ff4b9b92700 >>> >> # assert(size > MediumChunk || size > ClassMediumChunk) >>> failed: Not a >>> >> humongous chunk >>> >> # >>> > >>> > >>> > Jon >>> > >>> > >>> > On 6/17/2015 7:54 AM, Yasumasa Suenaga wrote: >>> >> I want to continue to discuss about CompressedClassSpace and >>> MaxMetaspace in this (RFR) thread. >>> >> >>> >> >>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2015-J >>> une/013873.html >>> >> June/013873.html> >>> >>>> Should I resize CompressedClassSpaceSize than to show >>> error message? >>> >>> If you add slightly better heuristics for the setup of the >>> CompressedClassSpaceSize flag, for example lowering the >>> CompressedClassSpaceSize when MaxMetaspaceSize is set, then it >>> might be less likely that you'll hit the OutOfMemoryError when >>> the system is set up with strict overcommit settings. 
>>> >> >>> >> I've uploaded new webrev: >>> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8087291/webrev.01/ >>> >>> >> >>> >> This patch checkes MaxMetaspaceSize, >>> CompressedClassSpaceSize, and >>> >> InitialBootClassLoaderMetaspaceSize. >>> >> >>> >> I add to check CompressedClassSpaceSize in >>> Arguments::set_use_compressed_klass_ptrs(). >>> >> If InitialBootClassLoaderMetaspaceSize is greater than >>> MaxMetaspaceSize, >>> >> VM will fail with error message. >>> >> >>> >> InitialBootClassLoaderMetaspaceSize will be set to >>> MaxMetaspaceSize >>> >> when UseCompressedClassPointers is not set in >>> Metaspace::ergo_initialize(). >>> >> >>> >> >>> >> Thanks, >>> >> >>> >> Yasumasa >>> >> >>> >>> >>> From adam.farley at uk.ibm.com Thu Sep 21 15:11:13 2017 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Thu, 21 Sep 2017 16:11:13 +0100 Subject: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java Message-ID: Hi David and Alan, I'm on vacation for next week and can't guarantee an online presence. Andrew Leonard has volunteered to help out on this while I'm away. I've brought him up-to-date on the core points, and he has the diffs and tests if anyone needs another copy. Best Regards Adam Farley -- Previous Email -- Hi David, Alan, You are right in that the changes to HotSpot would be nontrivial. I see a number of places in (e.g.) arguments.cpp that seem to exit in the same manner as Xlog (such as -Xinternalversion). I would advise ploughing through the CSR process to alter the JNI spec, and simultaneously identify some key paths that can be raised as bugs. That way, when people have time to address these issues, the mechanism to handle a silent exit is already in place. The JDWP fix can be raised separately as one of these bugs, if it would make things simpler. As for the name, JNI_SILENT_EXIT is a placeholder, and can be readily changed. Do you have any suggestions? Lastly, in an ideal world, the VM initialisation should never exit(#). 
It should return a return code that tells the caller something, pass or fail, messy or tidy.

That way, if someone is using the JNI as part of something bigger (like a database or a web server), one of these scenarios is just a bug, rather than a world-ender like exit(#).

And now for the individual messages. :)

David: Having help data returned by the launcher seems like a good way to avoid exit(0) calls, but I'm not sure how we'd prevent a JNI-caller using those options. Ultimately, to be sure, we'd have to remove the logic for those options, centralise the data to better enable launcher access, and add some logic in there so it can find any other help data (e.g. from the jdwp agent library).

I feel this would be a bigger task than adding the new return code and changing the vm, plus it wouldn't provide for any non-help scenarios where the vm wants to shut down without error during initialisation.

Alan: I should mention that the silent exit solution is already in use in the OpenJ9 VM. Not all of the exit paths have been resolved, but many have. The code is open and can be found here: https://github.com/eclipse/openj9

And though the silent exit code is disabled for the time being, it can be re-enabled by entering this class:

runtime/vm/jvminit.c

and altering line 2343 (ctrl-f for exit(1) if it's not there).

I won't paste the full code here in case people are concerned about contamination, but I would assert that this code (and the associated vm files) prove that the concept is possible.

Note that that code should not be enabled until after we've integrated the code that can handle a silent exit.

Best Regards

Adam Farley

P.S. Thank you both for your efforts on this.
:) From: David Holmes To: Alan Bateman , Adam Farley8 , core-libs-dev at openjdk.java.net, hotspot-runtime-dev at openjdk.java.net, thomas.stuefe at gmail.com Date: 15/09/2017 12:03 Subject: Re: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java On 15/09/2017 8:17 PM, Alan Bateman wrote: > On 15/09/2017 02:47, David Holmes wrote: >> Hi Adam, >> >> I am still very much torn over this one. I think the idea of >> print-and-exit flags for a potentially hosted library like the JVM is >> just wrong - we should never have done that, but we did. Fixing that >> by moving the flags to the launcher is far from trivial**. Endorsing >> and encouraging these sorts of flag by adding JNI support seems to be >> sending the wrong message. >> >> ** I can envisage a "help xxx" Dcmd that can read back the info from >> the VM. The launcher can send the Dcmd, print the output and exit. The >> launcher would not need to know what the xxx values mean, but would >> have to intercept the existing ones. >> >> Another option is just to be aware of these flags (are there more than >> jdwp and Xlog?) and deal with them specially in your custom launcher - >> either filter them out and ignore them, or else launch the VM in its >> own process to respond to them. >> >> Any changes to the JNI specification need to go through the CSR process. > Yes, it would require an update to the JNI spec, also a change to the > JVM TI spec where Agent_OnLoad returning a non-0 value is specified to > terminates the VM. The name and value needs discussion too, esp. as the > JNI spec uses negative values for failure. > > In any case, I'm also torn over this one as it's a corner case that is > only interesting for custom launchers that load agents with options that > print usage messages. 
It wouldn't be hard to have the Agent_OnLoad
> specify a printf hook that the agent could use for output although there
> are complications with agents such as JDWP that also announce their
> transport end point. Beyond that there is still the issue of the custom
> launcher that would need to know to destroy the VM without reporting an
> error.
>
> So what happened to the more meaty part to this which is fixing the
> various cases in HotSpot that terminate the process during
> initialization? I would expect some progress could be made on those
> cases while trying to decide whether to rev the JNI and JVM TI specs to
> cover the help case.

Trying to eliminate the vm_exit_during_initialization paths in hotspot is a huge undertaking IMHO.

David

>
> -Alan

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

From mikhailo.seledtsov at oracle.com Fri Sep 22 00:58:30 2017
From: mikhailo.seledtsov at oracle.com (mikhailo)
Date: Thu, 21 Sep 2017 17:58:30 -0700
Subject: RFR(S): 8181592: [TESTBUG] Docker test utils and docker jdk basic test
Message-ID: <6b95b720-a2cc-39e1-c0c1-6885b106ac16@oracle.com>

Please review this initial drop of Docker test utils and a sanity test.
This change lays ground for further test development and test utils improvement in this area.

    JBS: https://bugs.openjdk.java.net/browse/JDK-8181592
    Webrev: http://cr.openjdk.java.net/~mseledtsov/8181592.00/
    Testing:
       - run this test on machine with Docker enabled - works
       - run this test on Linux-x64 with no Docker engine or Docker disabled - test skipped (as expected)
       - run this test on automated system - in progress

Thank you,
Misha

From goetz.lindenmaier at sap.com Fri Sep 22 12:56:47 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 22 Sep 2017 12:56:47 +0000
Subject: RFR(m): 8185712: [windows] Improve native symbol decoder
In-Reply-To:
References: <1400c761fcc34c37aa1e374790bb7d39@sap.com> <85c13adbd5564c88b3d4cb70b0523180@sap.com> <161eca29-6060-7896-4bf0-d3b334466e4b@oracle.com>
Message-ID:

Hi Coleen,

I updated Thomas' webrev to the new repo structure.
I also ran it through our testing again.
Would you mind sponsoring it?
http://cr.openjdk.java.net/~goetz/wr17/8185712-windows-improve-native-symbol-resolver/webrev.05/

Thanks,
  Goetz.

original webrev:
http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.04/webrev/

> -----Original Message-----
> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> Sent: Donnerstag, 7. September 2017 10:17
> To: Lindenmaier, Goetz
> Cc: hotspot-runtime-dev at openjdk.java.net; Coleen Phillmore
> Subject: Re: RFR(m): 8185712: [windows] Improve native symbol decoder
>
> Hi Goetz,
>
> as I am gone for vacation the next four weeks, could you please prepare the
> webrev rebased to the new repo once it is open and give it to Coleen?
>
> Thank you!
>
> (Last valid version was
> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.04/webrev/index.html )
>
> On Wed, Sep 6, 2017 at 3:40 PM, wrote:
>
>> I will sponsor this for you, but remind me.
>> Thanks,
>> Coleen
>>
>> On 9/6/17 9:16 AM, Thomas Stüfe wrote:
>>
>>> Great, thank you!
>>>
>>> On Wed, Sep 6, 2017 at 3:11 PM, Lindenmaier, Goetz <goetz.lindenmaier at sap.com> wrote:
>>>
>>>> Hi Thomas,
>>>>
>>>> thanks for removing all that useless code. Looks perfect now :)
>>>>
>>>> Best regards,
>>>>   Goetz.
>>>>
>>>>> -----Original Message-----
>>>>> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
>>>>> Sent: Mittwoch, 6. September 2017 14:38
>>>>> To: Lindenmaier, Goetz
>>>>> Cc: hotspot-runtime-dev at openjdk.java.net; Ioi Lam; Zhengyu Gu
>>>>> Subject: Re: RFR(m): 8185712: [windows] Improve native symbol decoder
>>>>>
>>>>> Hi Goetz,
>>>>>
>>>>> On Wed, Sep 6, 2017 at 10:18 AM, Lindenmaier, Goetz wrote:
>>>>>
>>>>>> Hi Thomas,
>>>>>>
>>>>>> I had a look at the new webrev you sent after Zhengyu's comments.
>>>>>> I appreciate the new tests. Looks good.
>>>>>>
>>>>>> I still think removal of Decoder::can_decode_C_frame_in_vm() should
>>>>>> go into this change, because windows was the only platform to use this.
>>>>>> If you insist put it in a change of its own, but to me it seems odd to
>>>>>> leave this in the code in your change.
>>>>>>
>>>>>> Best regards,
>>>>>>   Goetz.
>>>>>
>>>>> Okay, you convinced me. I removed both Decoder::can_decode_C_frame_in_vm()
>>>>> and Decoder::shutdown() as you suggested in your earlier review.
>>>>>
>>>>> New Webrev:
>>>>> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.04/webrev/index.html
>>>>>
>>>>> Delta:
>>>>> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.03-to-04/webrev/index.html
>>>>>
>>>>> Note to other reviewers: This new webrev just removes dead code, it should
>>>>> not have any function change over webrev.03.
>>>>>
>>>>> I did build on Linux x64, Aix, MacOS and Windows (32/64bit) and ran gtests on
>>>>> these platforms. Will run jtreg tests tonight.
>>>>>
>>>>> Thanks, Thomas
>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
>>>>>>> Sent: Dienstag, 5. September 2017 15:06
>>>>>>> To: Lindenmaier, Goetz
>>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net; Ioi Lam
>>>>>>> Subject: Re: RFR(m): 8185712: [windows] Improve native symbol decoder
>>>>>>>
>>>>>>> Hi Goetz,
>>>>>>>
>>>>>>> thank you for your review!
>>>>>>>
>>>>>>> New Webrev:
>>>>>>> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.02
>>>>>>>
>>>>>>> Delta to last:
>>>>>>> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.01-to-02/webrev/
>>>>>>>
>>>>>>> The only change is that I removed the -XX:InitializeDbgHelpEarly switch to
>>>>>>> avoid having to file a CSR.
>>>>>>>
>>>>>>> Please find further comments inline:
>>>>>>>
>>>>>>> On Mon, Sep 4, 2017 at 5:08 PM, Lindenmaier, Goetz wrote:
>>>>>>>
>>>>>>>> Hi Thomas,
>>>>>>>>
>>>>>>>> I had a look at your change. Great somebody finally fixes
>>>>>>>> the windows symbol printing, thanks a lot!
>>>>>>>>
>>>>>>>> The code looks good, I'm just not sure whether you
>>>>>>>> need new files symbolengine.c|hpp. Isn't that
>>>>>>>> just what should go to decoder_windows.h|cpp and
>>>>>>>> class Decoder?
>>>>>>>> You would also get rid of the redirections in decoder_windows.cpp.
>>>>>>>
>>>>>>> As we discussed, I see your point, but would prefer to leave the change for
>>>>>>> the moment as it is.
>>>>>>>
>>>>>>> A similar change to this one - doing away with the AbstractDecoder object
>>>>>>> instantiation layer - will be coming for AIX, where it does not make much
>>>>>>> sense either, and I propose to do a separate cleanup or simplification change
>>>>>>> once that is done, merging decoder_windows.cpp and
>>>>>>> symbolengine.cpp/hpp. Unless I hear more objections from other reviewers,
>>>>>>> I'd prefer to do this in a later patch.
>>>>>>>
>>>>>>>> In shutdown() you comment
>>>>>>>> // There is no reason ever to shut down the decoder.
>>>>>>>> ... I think you can remove that function altogether, i.e. also
>>>>>>>> from the shared code, I don't see where it is ever called.
>>>>>>>
>>>>>>> Totally agree...
>>>>>>>
>>>>>>>> Also, I think, you can just delete
>>>>>>>> Decoder::can_decode_C_frame_in_vm()
>>>>>>>> from the code. The only place where it is used, in frame.cpp,
>>>>>>>> calls dll_address_to_function_name(). This returns useful information
>>>>>>>> also in the case of the NullDecoder, which now is the only one to
>>>>>>>> return false in that function.
>>>>>>>
>>>>>>> totally agree also here, but would also prefer both issues in a separate
>>>>>>> change. In fact, Ioi opened a bug for this a while ago:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8144855 - and I would like to fix
>>>>>>> it under that bug. Reason is, in this change, I'd like to avoid changing shared
>>>>>>> sources as much as possible and keep this change windows only.
>>>>>>>
>>>>>>>> Globals_windows.hpp needs Copyright adaption, please.
>>>>>>>> This is not introduced by your change, but maybe
>>>>>>>> you can also fix the copyright in decoder.hpp, which
>>>>>>>> says " 1997, 2015, 2017" ... should only name two
>>>>>>>> years ...
>>>>>>>
>>>>>>> Not needed anymore: since I removed the -XX:InitializeDbgHelpEarly switch,
>>>>>>> globals_windows.hpp is reverted to its original state. Do you still want me to
>>>>>>> fix the date?
>>>>>>>
>>>>>>> Thanks for the review work!
>>>>>>>
>>>>>>> ..Thomas
>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>   Goetz.
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net]
>>>>>>>>> On Behalf Of Thomas Stüfe
>>>>>>>>> Sent: Mittwoch, 30. August 2017 14:34
>>>>>>>>> To: hotspot-runtime-dev at openjdk.java.net
>>>>>>>>> Subject: RFR(m): 8185712: [windows] Improve native symbol decoder
>>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> May I please have reviews for the following change.
>>>>>>>>>
>>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8185712
>>>>>>>>> Webrev:
>>>>>>>>> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.01/webrev/
>>>>>>>>>
>>>>>>>>> (This is the followup to:
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8186349 )
>>>>>>>>>
>>>>>>>>> -------------
>>>>>>>>>
>>>>>>>>> Basically, this is a reimplementation of the layer around the Windows
>>>>>>>>> Symbol API (the API used to resolve debug symbols). The old implementation
>>>>>>>>> had a number of errors and shortcomings which together caused the Windows
>>>>>>>>> native symbol resolution (and hence callstacks in error logs) to be a bit
>>>>>>>>> of a lottery. The aim of this reimplementation is to make the code more
>>>>>>>>> robust and easier to maintain.
>>>>>>>>>
>>>>>>>>> The problems with the existing implementation are listed in detail in the
>>>>>>>>> bug description.
>>>>>>>>>
>>>>>>>>> The new implementation:
>>>>>>>>>
>>>>>>>>> - uses the new centralized WindowsDbgHelper class, which wraps the
>>>>>>>>> dbghelp.dll loading, introduced with JDK-8186349
>>>>>>>>>
>>>>>>>>> - Completely bypasses the "create two instances of AbstractDecoder class
>>>>>>>>> and synchronize access to them" scheme in decoder.cpp. It does not make
>>>>>>>>> sense for windows, where we have to synchronize each access to the
>>>>>>>>> dbghelp.dll anyway - this is done one layer below in WindowsDbgHelper. The
>>>>>>>>> static methods of the shared Decoder class now directly access the static
>>>>>>>>> methods in the new SymbolEngine class, see decoder_windows.cpp.
>>>>>>>>>
>>>>>>>>> - The layer wrapping the Symbol API lives in the new symbolengine.cpp/hpp
>>>>>>>>> files. The coding takes care of properly initializing (once) the symbol API
>>>>>>>>> and of assembling the pdb search path.
>>>>>>>>>
>>>>>>>>> - Pdb search path construction is changed: where before we just added jdk
>>>>>>>>> and jvm bin directories, we now just add all directories of all loaded DLLs
>>>>>>>>> (which, of course, include the jdk and jvm bin directories). That way we
>>>>>>>>> have a high chance of catching pdb files of third party libraries, as long
>>>>>>>>> as they follow the convention of putting the pdb files beside the dlls.
>>>>>>>>> This means it is easier to analyse crashes where third party DLLs are
>>>>>>>>> involved.
>>>>>>>>>
>>>>>>>>> - On Windows, we now have source file and line number in the callstack.
>>>>>>>>>
>>>>>>>>> - There is a new parameter, diagnostic and windows-only,
>>>>>>>>> called "InitializeDbgHelpEarly". That parameter is by default off. If on,
>>>>>>>>> it causes the symbol engine to be initialized early, which increases the
>>>>>>>>> chance of good callstacks later on (because the initialization does not
>>>>>>>>> have to run in an error situation).
>>>>>>>>>
>>>>>>>>> - Added tests: gtests and a jtreg test which tests the callstack printing.
>>>>>>>>> All tests windows only. There is no technical reason for making them
>>>>>>>>> windows only, but I wanted to keep disturbances to other platforms to a
>>>>>>>>> minimum and these kind of tests can be shaky.
>>>>>>>>>
>>>>>>>>> Thanks a lot for reviewing this!
>>>>>>>>>
>>>>>>>>> Kind Regards, Thomas

From goetz.lindenmaier at sap.com Fri Sep 22 13:00:59 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 22 Sep 2017 13:00:59 +0000
Subject: RFR(M): 8187045: [linux] Not all libraries in the VM are linked with -z,noexecstack
In-Reply-To: <0559eb3655cc42bb8b6cb37fb4370da8@sap.com>
References: <0559eb3655cc42bb8b6cb37fb4370da8@sap.com>
Message-ID:

Hi,

I updated my webrev to the directory structure:
http://cr.openjdk.java.net/~goetz/wr17/8187045-execstackLink/webrev.02/
I also ran it through our tests again.

Could someone please sponsor this change?

Thanks,
  Goetz.

> -----Original Message-----
> From: Lindenmaier, Goetz
> Sent: Dienstag, 5.
> > > > > > Kind Regards, Thomas > > > > > > > > > > > From goetz.lindenmaier at sap.com Fri Sep 22 13:00:59 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 22 Sep 2017 13:00:59 +0000 Subject: RFR(M): 8187045: [linux] Not all libraries in the VM are linked with -z,noexecstack In-Reply-To: <0559eb3655cc42bb8b6cb37fb4370da8@sap.com> References: <0559eb3655cc42bb8b6cb37fb4370da8@sap.com> Message-ID: Hi, I updated my webrev to the directory structure: http://cr.openjdk.java.net/~goetz/wr17/8187045-execstackLink/webrev.02/ I also ran it through our tests again. Could someone please sponsor this change? Thanks, Goetz. > -----Original Message----- > From: Lindenmaier, Goetz > Sent: Dienstag, 5. September 2017 10:05 > To: David Holmes ; hotspot-runtime- > dev at openjdk.java.net; build-dev > Subject: RE: RFR(M): 8187045: [linux] Not all libraries in the VM are linked > with -z,noexecstack > > Hi David, > > thanks for looking at my change! > > Hi Goetz, > > > > On 1/09/2017 11:05 PM, Lindenmaier, Goetz wrote: > > > Hi, > > > > > > I found that not all libraries are linked with -z,noexecstack. > > > This lead to errors with our linuxppc64 build. The linker omitted > > > the flag altogether, which is interpreted as a lib with execstack. > > > > > > This change contains a small test that scans all libraries in the tested VM > > > to have the noexecstack flag set. It utilizes the elf parser in the VM for > this. > > > Further -z,noexecstack is now passed to all libraries. > > > > > > Please review this change. I please need a sponsor. > > > http://cr.openjdk.java.net/~goetz/wr17/8187045- > > execstackLink/webrev.01/ > > > > So IIUC presently we only set noexecstack for gcc on linux when building > > libjvm - via the JVM_LDFLAGS settings. > Yes. > > > With this change we also set it for building JDK libraries via the > > LDFLAGS_JDKLIB setting. But this seems to be unconditional, not limited > > to gcc and linux ?? 
> LDFLAGS_NO_EXEC_STACK="-Wl,-z,noexecstack" is only assigned on linux, > on other platforms its empty. > > > In addition we want to build libjsig with noexecstack, and we do that by > > exposing LDFLAGS_NO_EXEC_STACK in spec.gmk, and using it in > > CompileLibjsig.gmk. I don't have an issue with the use of noexecstack > > but I think it could just have been hard-wired for linux just as the > > bulk of the flags set in that file are. Granted you copied what is done > > for LDFLAGS_HASH_STYLE - but in that case I'm assuming it is important > > that the same hash style be used throughout. Anyway minor stylistic nit > > which may be moot soon as once we have the consolidated repo I think > > libjsig could be handled the same as others libs? > I had hoped to find a location where flags that should be used in all linking > steps are assembled. Noexecstack should really be set in any lib we build. > But I didn't find that, so I implemented it as with the HASH_STYLE. I don't > really like it this way because if a new lib is added it might be forgotten > to add the noexecstack. > But I assume after the repo consolidation the build will be reshaped, > so now is not the right time to seek for optimal setups. > > Best regards, > Goetz. > > > > > > http://cr.openjdk.java.net/~goetz/wr17/8187045- > execstackLink/webrev.01- > > hs/ > > > > Test changes look okay to me. > > > > Thanks, > > David > > > > > Best regards, > > > Goetz. > > > From coleen.phillimore at oracle.com Fri Sep 22 15:14:12 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 22 Sep 2017 11:14:12 -0400 Subject: RFR(m): 8185712: [windows] Improve native symbol decoder In-Reply-To: References: <1400c761fcc34c37aa1e374790bb7d39@sap.com> <85c13adbd5564c88b3d4cb70b0523180@sap.com> <161eca29-6060-7896-4bf0-d3b334466e4b@oracle.com> Message-ID: <9169d08a-f699-f65b-1152-bf1235efee18@oracle.com> I would be happy to once the jdk10/hs repo is open.? 
There are still problems with testing it. Thanks, Coleen On 9/22/17 8:56 AM, Lindenmaier, Goetz wrote: > Hi Coleen, > > I updated Thomas' webrev to the new repo structure. > I also ran it through our testing again. > Would you mind sponsoring it? > http://cr.openjdk.java.net/~goetz/wr17/8185712-windows-improve-native-symbol-resolver/webrev.05/ > > Thanks, > Goetz. > > original webrev: > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.04/webrev/ > > >> -----Original Message----- >> From: Thomas St?fe [mailto:thomas.stuefe at gmail.com] >> Sent: Donnerstag, 7. September 2017 10:17 >> To: Lindenmaier, Goetz >> Cc: hotspot-runtime-dev at openjdk.java.net; Coleen Phillmore >> >> Subject: Re: RFR(m): 8185712: [windows] Improve native symbol decoder >> >> Hi Goetz, >> >> as I am gone for vacation the next four weeks, could you please prepare the >> webrev rebased to the new repo once it is open and give it to Coleen? >> >> Thank you! >> >> (Last valid version was >> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve- >> native-symbol-resolver/webrev.04/webrev/index.html >> > native-symbol-resolver/webrev.04/webrev/index.html> ) >> >> On Wed, Sep 6, 2017 at 3:40 PM, > > wrote: >> >> >> >> I will sponsor this for you, but remind me. >> Thanks, >> Coleen >> >> >> >> On 9/6/17 9:16 AM, Thomas St?fe wrote: >> >> >> Great, thank you! >> >> On Wed, Sep 6, 2017 at 3:11 PM, Lindenmaier, Goetz < >> goetz.lindenmaier at sap.com >> > wrote: >> >> >> >> HI Thomas, >> >> thanks for removing all that useless code. Looks >> perfect now :) >> >> Best regards, >> Goetz. >> >> >> >> -----Original Message----- >> From: Thomas St?fe >> [mailto:thomas.stuefe at gmail.com ] >> Sent: Mittwoch, 6. 
September 2017 14:38 >> To: Lindenmaier, Goetz >> > >> Cc: hotspot-runtime-dev at openjdk.java.net >> ; Ioi Lam >> >; >> Zhengyu Gu > > >> Subject: Re: RFR(m): 8185712: [windows] >> Improve native symbol decoder >> >> Hi Goetz, >> >> On Wed, Sep 6, 2017 at 10:18 AM, >> Lindenmaier, Goetz >> > > > > >> wrote: >> >> >> Hi Thomas, >> >> I had a look at the new webrev you sent >> after Zhengyu's comments. >> I appreciate the new tests. Looks good. >> >> I still think removal of >> Decoder::can_decode_C_frame_in_vm() >> should >> go into this change, because windows was >> the only platform to use >> this. >> If you insist put it in a change of its own, >> but to me it seems >> >> >> odd to >> >> >> leave >> this in the code in your change. >> >> Best regards, >> Goetz. >> >> >> >> Okay, you convinced me. I removed both >> Decoder::can_decode_C_frame_in_vm() and >> Decoder::shutdown() as you >> suggested in your earlier review. >> >> New Webrev: >> >> >> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows- >> improve- > improve-> >> native-symbol- >> resolver/webrev.04/webrev/index.html >> >> >> Delta: >> >> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows- >> improve- > improve-> >> native-symbol-resolver/webrev.03-to- >> 04/webrev/index.html >> >> >> Note to other reviewers: This new webrev >> just removes dead code, it >> >> >> should >> >> >> not have any function change over >> webrev.03. >> >> I did build on Linux x64, Aix, MacOS and >> Windows (32/64bit) and ran >> >> >> gtests on >> >> >> these platforms. Will run jtreg tests tonight. >> >> Thanks, Thomas >> >> >> >> >> > -----Original Message----- >> > From: Thomas St?fe >> [mailto:thomas.stuefe at gmail.com >> > > ] >> > Sent: Dienstag, 5. 
September 2017 15:06 >> > To: Lindenmaier, Goetz >> >> > > > >> > Cc: hotspot-runtime- >> dev at openjdk.java.net >> >> runtime-dev at openjdk.java.net >> > ; Ioi Lam > >> > > > >> > Subject: Re: RFR(m): 8185712: [windows] >> Improve native symbol >> decoder >> > >> > Hi Goetz, >> > >> > thank you for your review! >> > >> > New Webrev: >> > >> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows- >> >> improve- >> > >> improve-> >> > native-symbol-resolver/webrev.02 >> > >> > >> > Delta to last: >> > >> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows- >> >> improve- >> > >> improve-> >> > native-symbol-resolver/webrev.01-to- >> 02/webrev/ >> > >> > >> > The only change is that I removed the - >> XX:InitializeDbgHelpEarly >> switch to >> > avoid having to file a CSR. >> > >> > Please find further comments inline: >> > >> > >> > On Mon, Sep 4, 2017 at 5:08 PM, >> Lindenmaier, Goetz >> > > >> > > >> >> > > > > >> >> > wrote: >> > >> > >> > Hi Thomas, >> > >> > I had a look at your change. Great >> somebody finally fixes >> > the windows symbol printing, thanks >> a lot! >> > >> > The code looks good, I'm just not sure >> whether you >> > need new files symbolengine.c|hpp. >> Isn't that >> > just what should go to >> decoder_windows.h|cpp and >> > class Decoder? >> > You would also get rid of the >> redirections in >> decoder_windows.cpp. >> > >> > >> > >> > >> > As we discussed, I see your point, but >> would prefer to leave the >> change for >> > the moment as it is. >> > >> > A similar change to this one - doing away >> with the >> >> >> AbstractDecoder >> >> >> object >> > instantiation layer - will be coming for >> AIX, where it does not >> >> >> make >> >> >> much >> > sense either, and I propose to do a >> separate cleanup or >> simplification change >> > once that is done, merging >> decoder_windows.cpp and >> > symbolengine.cpp/hpp. Unless I hear >> more objections from other >> reviewers, >> > I'd prefer to do this in a later patch. 
>> > >> > >> > >> > In shutdown() you comment >> > // There is no reason ever to shut >> down the decoder. >> > ... I think you can remove that >> function altogether, i.e. >> >> >> also >> >> >> > from the shared code, I don't see >> where it is ever called. >> > >> > >> > >> > >> > Totally agree... >> > >> > >> > Also, I think, you can just delete >> > >> Decoder::can_decode_C_frame_in_vm() >> > from the code. The only place where >> it is used, in >> >> >> frame.cpp, >> >> >> > calls dll_address_to_duntion_name(). >> This returns useful >> information >> > also in the case of the NullDecoder, >> which now is the only >> >> >> one to >> >> >> > return false in that function. >> > >> > >> > >> > totally agree also here, but would also >> prefer both issues in a >> separate >> > change. In fact, Ioi opened a bug for this >> a while ago: >> > >> https://bugs.openjdk.java.net/browse/JDK-8144855 >> >> > 8144855 > - and I >> would like >> >> >> to >> >> >> fix >> > it under that bug. Reason is, in this >> change, I'd like to avoid >> >> >> changing >> >> >> shared >> > sources as much as possible and keep >> this change windows only. >> > >> > >> > >> > Globals_windows.hpp needs >> Copyright adaption, please. >> > This is not introduced by your change, >> but maybe >> > you can also fix the copyright in >> decoder.hpp, which >> > says " 1997, 2015, 2017" ... should only >> name two >> > years ... >> > >> > >> > >> > >> > Not needed anymore: since I removed >> the - >> XX:InitializeDbgHelpEarly switch, >> > globals_windows.hpp is reverted to its >> original state. Do you >> >> >> still >> >> >> want me to >> > fix the date? >> > >> > Thanks for the review work! >> > >> > ..Thomas >> > >> > >> > Best regards, >> > Goetz. 
>> > >> > >> > >> > >> > >> > >> > >> > > -----Original Message----- >> > > From: hotspot-runtime-dev >> [mailto:hotspot-runtime-dev- >> > > >> >> > > > > > >> > > bounces at openjdk.java.net >> >> > > > >> > > > ] >> > On Behalf Of Thomas St?fe >> > > Sent: Mittwoch, 30. August 2017 >> 14:34 >> > > To: hotspot-runtime- >> dev at openjdk.java.net >> > >> >> hotspot- >> >> >> runtime-dev at openjdk.java.net >> > > > >> > runtime-dev at openjdk.java.net >> > dev at openjdk.java.net >> > > >> > > Subject: RFR(m): 8185712: >> [windows] Improve native symbol >> > decoder >> > > >> > > Hi all, >> > > >> > > May I please have reviews for the >> following change. >> > > >> > > Issue: >> https://bugs.openjdk.java.net/browse/JDK-8185712 >> >> > 8185712 > >> > >> > >> > 8185712 > > >> > > Webrev: >> > > >> http://cr.openjdk.java.net/~stuefe/webrevs/8185712- >> >> windows- >> > >> > improve- >> > >> windows- >> > >> > improve-> >> > > native-symbol- >> resolver/webrev.01/webrev/ >> > > >> > > (This is the followup to: >> > >> https://bugs.openjdk.java.net/browse/JDK-8186349 >> >> > 8186349 > >> > >> > >> > 8186349 > > ) >> >> > > >> > > ------------- >> > > >> > > Basically, this is a reimplementation >> of the layer >> >> >> around the >> >> >> > Windows >> > > Symbol API (the API used to resolve >> debug symbols). The >> >> >> old >> >> >> > > implementation >> > > had a number of errors and >> shortcomings which together >> caused >> > the >> > > Windows >> > > native symbol resolution (and >> hence callstacks in error >> >> >> logs) to >> >> >> be a >> > bit >> > > of a lottery. The aim of this >> reimplementation is to >> >> >> make the >> >> >> code >> > more >> > > robust and easier to maintain. >> > > >> > > The problems with the existing >> implementation are listed >> >> >> in >> >> >> detail >> > in the >> > > bug description. 
>> > > >> > > The new implementation: >> > > >> > > - uses the new centralized >> WindowsDbgHelper class, which >> wraps >> > the >> > > dbghelp.dll loading, introduced with >> JDK-8186349 >> > > >> > > - Completely bypasses the "create >> two instances of >> > AbstractDecoder class >> > > and synchronize access to them" >> scheme in decoder.cpp. It >> does >> > not make >> > > sense for windows, where we have >> to synchronize each >> >> >> access >> >> >> to >> > the >> > > dbghelp.dll anyway - this is done >> one layer below in >> > WindowsDbgHelper. The >> > > static methods of the shared >> Decoder class now directly >> >> >> access >> >> >> the >> > static >> > > methods in the new SymbolEngine >> class, see >> > decoder_windows.cpp. >> > > >> > > - The layer wrapping the Symbol API >> lives in the new >> > symbolengine.cpp/hpp >> > > files. The coding takes care of >> properly initializing >> >> >> (once) the >> >> >> symbol >> > API >> > > and of assembling the pdb search >> path. >> > > >> > > - Pdb search path construction is >> changed: where before >> >> >> we >> >> >> just >> > added jdk >> > > and jvm bin directories, we now just >> add all directories >> >> >> of all >> >> >> loaded >> > DLLs >> > > (which, of course, include the jdk >> and jvm bin >> >> >> directories). That >> >> >> way >> > we >> > > have a high chance of catching pdb >> files of third party >> >> >> libraries, >> >> >> as >> > long >> > > as they follow the convention of >> putting the pdb files >> >> >> beside >> >> >> the >> > dlls. >> > > This means it is easier to analyse >> crashes where third >> >> >> party >> >> >> DLLs are >> > > involved. >> > > >> > > - On Windows, we now have source >> file and line number in >> >> >> the >> >> >> > callstack. >> > > >> > > - There is a new parameter, >> diagnostic and windows-only, >> > > called "InitializeDbgHelpEarly". That >> parameter is by >> >> >> default >> >> >> off. 
If >> > on, >> > it causes the symbol engine to be >> initialized early, >> >> >> which >> >> >> increases >> > the >> > chance of good callstacks later on >> (because the >> >> >> initialization >> >> >> does >> > not >> > have to run in an error situation). >> > > >> > > - Added tests: gtests and a jtreg test >> which tests the >> >> >> callstack >> >> >> > printing. >> > > All tests windows only. There is no >> technical reason for >> >> >> making >> >> >> > them >> > > windows only, but I wanted to keep >> disturbances to other >> > platforms to a >> > > minimum and these kind of tests >> can be shaky. >> > > >> > > Thanks a lot for reviewing this! >> > > >> > > Kind Regards, Thomas >> > >> > >> >> >> >> >> >> >> From goetz.lindenmaier at sap.com Fri Sep 22 15:21:47 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 22 Sep 2017 15:21:47 +0000 Subject: RFR(m): 8185712: [windows] Improve native symbol decoder In-Reply-To: <9169d08a-f699-f65b-1152-bf1235efee18@oracle.com> References: <1400c761fcc34c37aa1e374790bb7d39@sap.com> <85c13adbd5564c88b3d4cb70b0523180@sap.com> <161eca29-6060-7896-4bf0-d3b334466e4b@oracle.com> <9169d08a-f699-f65b-1152-bf1235efee18@oracle.com> Message-ID: <3a235712c9ca496185319cce5e85ba80@sap.com> Hi Coleen, that's great, thanks. ... I saw the other mail about jprt :) Daniel, Calvin thanks for working on this! Best regards, Goetz > -----Original Message----- > From: coleen.phillimore at oracle.com [mailto:coleen.phillimore at oracle.com] > Sent: Freitag, 22. September 2017 17:14 > To: Lindenmaier, Goetz > Cc: hotspot-runtime-dev at openjdk.java.net; Thomas Stüfe > > Subject: Re: RFR(m): 8185712: [windows] Improve native symbol decoder > > > I would be happy to once the jdk10/hs repo is open. There are still > problems with testing it. > > Thanks, > Coleen > > On 9/22/17 8:56 AM, Lindenmaier, Goetz wrote: > > Hi Coleen, > > > > I updated Thomas' webrev to the new repo structure.
> > I also ran it through our testing again. > > Would you mind sponsoring it? > > http://cr.openjdk.java.net/~goetz/wr17/8185712-windows-improve-native-symbol-resolver/webrev.05/ > > > > Thanks, > > Goetz. > > > > original webrev: > > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.04/webrev/ > > > > > >> -----Original Message----- > >> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com] > >> Sent: Donnerstag, 7. September 2017 10:17 > >> To: Lindenmaier, Goetz > >> Cc: hotspot-runtime-dev at openjdk.java.net; Coleen Phillmore > >> > >> Subject: Re: RFR(m): 8185712: [windows] Improve native symbol decoder > >> > >> Hi Goetz, > >> > >> as I am gone for vacation the next four weeks, could you please prepare > the > >> webrev rebased to the new repo once it is open and give it to Coleen? > >> > >> Thank you! > >> > >> (Last valid version was > >> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.04/webrev/index.html ) > >> > >> On Wed, Sep 6, 2017 at 3:40 PM, >> > wrote: > >> > >> > >> > >> I will sponsor this for you, but remind me. > >> Thanks, > >> Coleen > >> > >> > >> > >> On 9/6/17 9:16 AM, Thomas Stüfe wrote: > >> > >> > >> Great, thank you! > >> > >> On Wed, Sep 6, 2017 at 3:11 PM, Lindenmaier, Goetz < > >> goetz.lindenmaier at sap.com > >> > wrote: > >> > >> > >> > >> HI Thomas, > >> > >> thanks for removing all that useless code. Looks > >> perfect now :) > >> > >> Best regards, > >> Goetz. > >> > >> > >> > >> -----Original Message----- > >> From: Thomas Stüfe > >> [mailto:thomas.stuefe at gmail.com ] > >> Sent: Mittwoch, 6.
September 2017 14:38 > >> To: Lindenmaier, Goetz > >> > > >> Cc: hotspot-runtime-dev at openjdk.java.net > >> ; Ioi Lam > >> >; > >> Zhengyu Gu >> > > >> Subject: Re: RFR(m): 8185712: [windows] > >> Improve native symbol decoder > >> > >> Hi Goetz, > >> > >> On Wed, Sep 6, 2017 at 10:18 AM, > >> Lindenmaier, Goetz > >> >> > >> > > > >> wrote: > >> > >> > >> Hi Thomas, > >> > >> I had a look at the new webrev you sent > >> after Zhengyu's comments. > >> I appreciate the new tests. Looks good. > >> > >> I still think removal of > >> Decoder::can_decode_C_frame_in_vm() > >> should > >> go into this change, because windows was > >> the only platform to use > >> this. > >> If you insist put it in a change of its own, > >> but to me it seems > >> > >> > >> odd to > >> > >> > >> leave > >> this in the code in your change. > >> > >> Best regards, > >> Goetz. > >> > >> > >> > >> Okay, you convinced me. I removed both > >> Decoder::can_decode_C_frame_in_vm() and > >> Decoder::shutdown() as you > >> suggested in your earlier review. > >> > >> New Webrev: > >> > >> > >> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows- > >> improve- windows- > >> improve-> > >> native-symbol- > >> resolver/webrev.04/webrev/index.html > >> > >> > >> Delta: > >> > >> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows- > >> improve- windows- > >> improve-> > >> native-symbol-resolver/webrev.03-to- > >> 04/webrev/index.html > >> > >> > >> Note to other reviewers: This new webrev > >> just removes dead code, it > >> > >> > >> should > >> > >> > >> not have any function change over > >> webrev.03. > >> > >> I did build on Linux x64, Aix, MacOS and > >> Windows (32/64bit) and ran > >> > >> > >> gtests on > >> > >> > >> these platforms. Will run jtreg tests tonight. > >> > >> Thanks, Thomas > >> > >> > >> > >> > >> > -----Original Message----- > >> > From: Thomas St?fe > >> [mailto:thomas.stuefe at gmail.com > >> >> > ] > >> > Sent: Dienstag, 5. 
September 2017 15:06 > >> > To: Lindenmaier, Goetz > >> > >> >> > > > >> > Cc: hotspot-runtime- > >> dev at openjdk.java.net > >> > >> runtime-dev at openjdk.java.net > >> > ; Ioi Lam >> > >> >> > > > >> > Subject: Re: RFR(m): 8185712: [windows] > >> Improve native symbol > >> decoder > >> > > >> > Hi Goetz, > >> > > >> > thank you for your review! > >> > > >> > New Webrev: > >> > > >> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows- > >> > >> improve- > >> >> > >> improve-> > >> > native-symbol-resolver/webrev.02 > >> > > >> > > >> > Delta to last: > >> > > >> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows- > >> > >> improve- > >> >> > >> improve-> > >> > native-symbol-resolver/webrev.01-to- > >> 02/webrev/ > >> > > >> > > >> > The only change is that I removed the - > >> XX:InitializeDbgHelpEarly > >> switch to > >> > avoid having to file a CSR. > >> > > >> > Please find further comments inline: > >> > > >> > > >> > On Mon, Sep 4, 2017 at 5:08 PM, > >> Lindenmaier, Goetz > >> > >> > >> >> > > >> > >> >> > > > > >> > >> > wrote: > >> > > >> > > >> > Hi Thomas, > >> > > >> > I had a look at your change. Great > >> somebody finally fixes > >> > the windows symbol printing, thanks > >> a lot! > >> > > >> > The code looks good, I'm just not sure > >> whether you > >> > need new files symbolengine.c|hpp. > >> Isn't that > >> > just what should go to > >> decoder_windows.h|cpp and > >> > class Decoder? > >> > You would also get rid of the > >> redirections in > >> decoder_windows.cpp. > >> > > >> > > >> > > >> > > >> > As we discussed, I see your point, but > >> would prefer to leave the > >> change for > >> > the moment as it is. 
> >> > > >> > A similar change to this one - doing away > >> with the > >> > >> > >> AbstractDecoder > >> > >> > >> object > >> > instantiation layer - will be coming for > >> AIX, where it does not > >> > >> > >> make > >> > >> > >> much > >> > sense either, and I propose to do a > >> separate cleanup or > >> simplification change > >> > once that is done, merging > >> decoder_windows.cpp and > >> > symbolengine.cpp/hpp. Unless I hear > >> more objections from other > >> reviewers, > >> > I'd prefer to do this in a later patch. > >> > > >> > > >> > > >> > In shutdown() you comment > >> > // There is no reason ever to shut > >> down the decoder. > >> > ... I think you can remove that > >> function altogether, i.e. > >> > >> > >> also > >> > >> > >> > from the shared code, I don't see > >> where it is ever called. > >> > > >> > > >> > > >> > > >> > Totally agree... > >> > > >> > > >> > Also, I think, you can just delete > >> > > >> Decoder::can_decode_C_frame_in_vm() > >> > from the code. The only place where > >> it is used, in > >> > >> > >> frame.cpp, > >> > >> > >> > calls dll_address_to_duntion_name(). > >> This returns useful > >> information > >> > also in the case of the NullDecoder, > >> which now is the only > >> > >> > >> one to > >> > >> > >> > return false in that function. > >> > > >> > > >> > > >> > totally agree also here, but would also > >> prefer both issues in a > >> separate > >> > change. In fact, Ioi opened a bug for this > >> a while ago: > >> > > >> https://bugs.openjdk.java.net/browse/JDK-8144855 > >> > >> >> 8144855 > - and I > >> would like > >> > >> > >> to > >> > >> > >> fix > >> > it under that bug. Reason is, in this > >> change, I'd like to avoid > >> > >> > >> changing > >> > >> > >> shared > >> > sources as much as possible and keep > >> this change windows only. > >> > > >> > > >> > > >> > Globals_windows.hpp needs > >> Copyright adaption, please. 
> >> > This is not introduced by your change, > >> but maybe > >> > you can also fix the copyright in > >> decoder.hpp, which > >> > says " 1997, 2015, 2017" ... should only > >> name two > >> > years ... > >> > > >> > > >> > > >> > > >> > Not needed anymore: since I removed > >> the - > >> XX:InitializeDbgHelpEarly switch, > >> > globals_windows.hpp is reverted to its > >> original state. Do you > >> > >> > >> still > >> > >> > >> want me to > >> > fix the date? > >> > > >> > Thanks for the review work! > >> > > >> > ..Thomas > >> > > >> > > >> > Best regards, > >> > Goetz. > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > -----Original Message----- > >> > > From: hotspot-runtime-dev > >> [mailto:hotspot-runtime-dev- > >> >> > > >> > >> > >> >> > > > >> > > bounces at openjdk.java.net > >> > >> >> > > >> > >> >> > > ] > >> > On Behalf Of Thomas St?fe > >> > > Sent: Mittwoch, 30. August 2017 > >> 14:34 > >> > > To: hotspot-runtime- > >> dev at openjdk.java.net > >> >> > >> > >> hotspot- > >> > >> > >> runtime-dev at openjdk.java.net > >> > >> > > >> > runtime-dev at openjdk.java.net > >> >> dev at openjdk.java.net > >> > > > >> > > Subject: RFR(m): 8185712: > >> [windows] Improve native symbol > >> > decoder > >> > > > >> > > Hi all, > >> > > > >> > > May I please have reviews for the > >> following change. 
> >> > > > >> > > Issue: > >> https://bugs.openjdk.java.net/browse/JDK-8185712 > >> > >> >> 8185712 > > >> > > >> >> > >> >> 8185712 > > > >> > > Webrev: > >> > > > >> http://cr.openjdk.java.net/~stuefe/webrevs/8185712- > >> > >> windows- > >> >> > >> > improve- > >> >> > >> windows- > >> >> > >> > improve-> > >> > > native-symbol- > >> resolver/webrev.01/webrev/ > >> > > > >> > > (This is the followup to: > >> > > >> https://bugs.openjdk.java.net/browse/JDK-8186349 > >> > >> >> 8186349 > > >> > > >> >> > >> >> 8186349 > > ) > >> > >> > > > >> > > ------------- > >> > > > >> > > Basically, this is a reimplementation > >> of the layer > >> > >> > >> around the > >> > >> > >> > Windows > >> > > Symbol API (the API used to resolve > >> debug symbols). The > >> > >> > >> old > >> > >> > >> > > implementation > >> > > had a number of errors and > >> shortcomings which together > >> caused > >> > the > >> > > Windows > >> > > native symbol resolution (and > >> hence callstacks in error > >> > >> > >> logs) to > >> > >> > >> be a > >> > bit > >> > > of a lottery. The aim of this > >> reimplementation is to > >> > >> > >> make the > >> > >> > >> code > >> > more > >> > > robust and easier to maintain. > >> > > > >> > > The problems with the existing > >> implementation are listed > >> > >> > >> in > >> > >> > >> detail > >> > in the > >> > > bug description. > >> > > > >> > > The new implementation: > >> > > > >> > > - uses the new centralized > >> WindowsDbgHelper class, which > >> wraps > >> > the > >> > > dbghelp.dll loading, introduced with > >> JDK-8186349 > >> > > > >> > > - Completely bypasses the "create > >> two instances of > >> > AbstractDecoder class > >> > > and synchronize access to them" > >> scheme in decoder.cpp. It > >> does > >> > not make > >> > > sense for windows, where we have > >> to synchronize each > >> > >> > >> access > >> > >> > >> to > >> > the > >> > > dbghelp.dll anyway - this is done > >> one layer below in > >> > WindowsDbgHelper. 
The > >> > > static methods of the shared > >> Decoder class now directly > >> > >> > >> access > >> > >> > >> the > >> > static > >> > > methods in the new SymbolEngine > >> class, see > >> > decoder_windows.cpp. > >> > > > >> > > - The layer wrapping the Symbol API > >> lives in the new > >> > symbolengine.cpp/hpp > >> > > files. The coding takes care of > >> properly initializing > >> > >> > >> (once) the > >> > >> > >> symbol > >> > API > >> > > and of assembling the pdb search > >> path. > >> > > > >> > > - Pdb search path construction is > >> changed: where before > >> > >> > >> we > >> > >> > >> just > >> > added jdk > >> > > and jvm bin directories, we now just > >> add all directories > >> > >> > >> of all > >> > >> > >> loaded > >> > DLLs > >> > > (which, of course, include the jdk > >> and jvm bin > >> > >> > >> directories). That > >> > >> > >> way > >> > we > >> > > have a high chance of catching pdb > >> files of third party > >> > >> > >> libraries, > >> > >> > >> as > >> > long > >> > > as they follow the convention of > >> putting the pdb files > >> > >> > >> beside > >> > >> > >> the > >> > dlls. > >> > > This means it is easier to analyse > >> crashes where third > >> > >> > >> party > >> > >> > >> DLLs are > >> > > involved. > >> > > > >> > > - On Windows, we now have source > >> file and line number in > >> > >> > >> the > >> > >> > >> > callstack. > >> > > > >> > > - There is a new parameter, > >> diagnostic and windows-only, > >> > > called "InitializeDbgHelpEarly". That > >> parameter is by > >> > >> > >> default > >> > >> > >> off. If > >> > on, > >> > > it causes the symbol engine to be > >> initialized early, > >> > >> > >> which > >> > >> > >> increases > >> > the > >> > > chance of good callstacks later on > >> (because the > >> > >> > >> initialization > >> > >> > >> does > >> > not > >> > > have to run in an error situation). 
> >> > > > >> > > - Added tests: gtests and a jtreg test > >> which tests the > >> > >> > >> callstack > >> > >> > >> > printing. > >> > > All tests windows only. There is no > >> technical reason for > >> > >> > >> making > >> > >> > >> > them > >> > > windows only, but I wanted to keep > >> disturbances to other > >> > platforms to a > >> > > minimum and these kind of tests > >> can be shaky. > >> > > > >> > > Thanks a lot for reviewing this! > >> > > > >> > > Kind Regards, Thomas > >> > > >> > > >> > >> > >> > >> > >> > >> > >> From mandy.chung at oracle.com Fri Sep 22 22:18:20 2017 From: mandy.chung at oracle.com (mandy chung) Date: Fri, 22 Sep 2017 15:18:20 -0700 Subject: Review Request JDK-8164512: Replace ClassLoader use of finalizer with phantom reference to unload native library Message-ID: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> This patch proposes to replace the ClassLoader use of finalizer with phantom reference, specifically Cleaner, for unloading native libraries. It registers the class loader for cleanup only if it's not a builtin class loader, which will never be unloaded. The spec for JNI_OnUnload [1] specifies that this function is called in an unknown context whereas the hotspot implementation uses the class loader being unloaded as the context, which I consider a bug. It should not load a class defined to that class loader. The proposed spec change is for JNI_FindClass, if called from JNI_OnUnload, to use the system class loader (consistent with the no caller context case). Webrev: http://cr.openjdk.java.net/~mchung/jdk10/webrevs/8164512/webrev.00/ CSR: https://bugs.openjdk.java.net/browse/JDK-8187882 Please see the spec change and behavioral change in this CSR. A native library may fail to be reloaded if it is loaded immediately or soon after a class loader is GC'ed but the cleaner thread has not yet had the chance to handle the unloading of the native library.
Likewise, there is no guarantee regarding the timing of finalization and a native library may fail to reload. It's believed that the compatibility risk should be low. I'm looking into adding a native jtreg test that will be added in the next revision. Mandy [1] http://docs.oracle.com/javase/9/docs/specs/jni/invocation.html#jni_onunload From harold.seigel at oracle.com Mon Sep 25 15:21:38 2017 From: harold.seigel at oracle.com (harold seigel) Date: Mon, 25 Sep 2017 11:21:38 -0400 Subject: RFR 8186092: Unnecessary loader constraints produced when there are multiple defaults Message-ID: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com> Hi, Please review this JDK-10 change to fix JDK-8186092. The change prevents the checking of loader constraints during vtable and itable creation if the selected method is an overpass method. Overpass methods are created by the JVM to throw exceptions and so should not be subjected to loader constraint checking. Additionally, this change improves the LinkageError exception error text when a loader constraint violation occurs during vtable and itable creation. The fix includes four new tests, one test each to check that loader constraint checking is not done for overpass methods during vtable and itable creation, and one test each to test the new vtable and itable loader constraint error messages. Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8186092/webrev/ JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186092 The change was tested with the JCK Lang and VM tests, the JTreg hotspot, java/io, java/lang, java/util, and other tests, the co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests.
Thanks, Harold From george.triantafillou at oracle.com Mon Sep 25 18:21:39 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Mon, 25 Sep 2017 14:21:39 -0400 Subject: RFR(S): 8181592: [TESTBUG] Docker test utils and docker jdk basic test In-Reply-To: <6b95b720-a2cc-39e1-c0c1-6885b106ac16@oracle.com> References: <6b95b720-a2cc-39e1-c0c1-6885b106ac16@oracle.com> Message-ID: Hi Misha, test/lib/jdk/test/lib/containers/docker/DockerTestUtils.java: 197 * Convenicence method - by defaul retains stdout of child process Should be: * Convenience method - by default retains stdout of child process 215 * @param pb process to executed specified as ProcessBuilder Should be: * @param pb process to be executed specified as ProcessBuilder -George On 9/21/2017 8:58 PM, mikhailo wrote: > Please review this initial drop of Docker test utils and a sanity > test. This change lays ground > for further test development and test utils improvement in this area. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8181592 > Webrev: http://cr.openjdk.java.net/~mseledtsov/8181592.00/ > Testing: > - run this test on machine with Docker enabled - works > - run this test on Linux-x64 with no Docker engine or Docker > disabled - test skipped (as expected) > - run this test on automated system - in progress > > > Thank you, > Misha > From brent.christian at oracle.com Mon Sep 25 23:02:49 2017 From: brent.christian at oracle.com (Brent Christian) Date: Mon, 25 Sep 2017 16:02:49 -0700 Subject: Review Request JDK-8164512: Replace ClassLoader use of finalizer with phantom reference to unload native library In-Reply-To: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> References: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> Message-ID: <23a23d63-0056-126f-4736-4a61d75b16b4@oracle.com> Hi, Mandy The changes look alright to me.
One thing that I noticed: ClassLoader.NativeLibrary.register() writes to 'loadedLibraryNames', 'nativeLibraries', and 'systemNativeLibraries' without first synchronizing on them. There is not a bug here per se, as register() is called from inside the needed synchronized blocks in loadLibrary0(). But perhaps it's worth a comment to call out that this is what is happening? Thanks, -Brent On 9/22/17 3:18 PM, mandy chung wrote: > This patch proposes to replace the ClassLoader use of finalizer with > phantom reference, specifically Cleaner, for unloading native > libraries. It registers the class loader for cleanup only if it's not a > builtin class loader which will never be unloaded. > > The spec for JNI_OnUnload [1] specifies that this function is called in > an unknown context whereas the hotspot implementation uses the class > loader being unloaded as the context, which I consider a bug. It > should not load a class defined to that class loader. The proposed spec > change for JNI_FindClass if called from JNI_OnUnload to use system class > loader (consistent with no caller context case). > > Webrev: > http://cr.openjdk.java.net/~mchung/jdk10/webrevs/8164512/webrev.00/ > > CSR: > https://bugs.openjdk.java.net/browse/JDK-8187882 > > Please see the spec change an behavioral change in this CSR. A native > library may fail to be reloaded if it is loaded immediately or soon > after a class loader is GC'ed but the cleaner thread doesn't yet have > the chance to handle the unloading of the native library. Likewise, > there is no guarantee regarding the timing of finalization and a native > library may fail to reload. It's believed that the compatibility risk > should be low. > > I'm looking into adding a native jtreg test that will be added in the > next revision.
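For readers unfamiliar with the mechanism discussed in the quoted proposal, the following is a minimal sketch of registering a cleanup action with the public java.lang.ref.Cleaner API instead of relying on a finalizer. It is not the code from the webrev: the patch under review uses the JDK-internal CleanerFactory and an Unloader class, whereas the names CleanerSketch, UnloadAction, and registerIfUnloadable below are hypothetical, and the "unload" is only simulated with a counter.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

public class CleanerSketch {
    /**
     * Hypothetical cleanup action. It holds only the state needed for the
     * cleanup and no reference to the owner object; otherwise the owner
     * could never become phantom reachable.
     */
    public static final class UnloadAction implements Runnable {
        private final String libraryName;
        private final AtomicInteger unloadCount;

        UnloadAction(String libraryName, AtomicInteger unloadCount) {
            this.libraryName = libraryName;
            this.unloadCount = unloadCount;
        }

        @Override
        public void run() {
            // A real implementation would unload the native library here;
            // this sketch only counts invocations.
            unloadCount.incrementAndGet();
            System.out.println("unloading " + libraryName);
        }
    }

    /**
     * Registers the owner for cleanup unless it is marked as never unloaded
     * (an assumption standing in for the builtin class loader check).
     * Returns null when no registration is made.
     */
    public static Cleaner.Cleanable registerIfUnloadable(Cleaner cleaner,
                                                         Object owner,
                                                         boolean neverUnloaded,
                                                         String libraryName,
                                                         AtomicInteger unloadCount) {
        if (neverUnloaded) {
            return null;
        }
        return cleaner.register(owner, new UnloadAction(libraryName, unloadCount));
    }

    public static void main(String[] args) {
        Cleaner cleaner = Cleaner.create();
        AtomicInteger count = new AtomicInteger();
        Object owner = new Object();

        Cleaner.Cleanable cleanable =
                registerIfUnloadable(cleaner, owner, false, "libdemo", count);

        // Normally the action runs on the cleaner thread at some point after
        // the owner becomes phantom reachable. Calling clean() runs it
        // immediately on the current thread, and at most once.
        cleanable.clean();
        cleanable.clean();
        System.out.println("unload count: " + count.get());
    }
}
```

The timing caveat from the quoted message applies to the non-explicit path: without the clean() call, nothing guarantees when the cleaner thread will run the action after the owner is collected.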
> > Mandy > [1] > http://docs.oracle.com/javase/9/docs/specs/jni/invocation.html#jni_onunload From mikhailo.seledtsov at oracle.com Tue Sep 26 00:16:29 2017 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Mon, 25 Sep 2017 17:16:29 -0700 Subject: RFR(S): 8181592: [TESTBUG] Docker test utils and docker jdk basic test In-Reply-To: References: <6b95b720-a2cc-39e1-c0c1-6885b106ac16@oracle.com> Message-ID: <59C99C5D.6040103@oracle.com> Hi George, Thank you for review. I have fixed the typos. Misha On 9/25/17, 11:21 AM, George Triantafillou wrote: > Hi Misha, > > test/lib/jdk/test/lib/containers/docker/DockerTestUtils.java: > > 197 * Convenicence method - by defaul retains stdout of child > process > > Should be: > * Convenience method - by default retains stdout of child process > > 215 * @param pb process to executed specified as ProcessBuilder > > Should be: > * @param pb process to be executed specified as ProcessBuilder > > -George > > On 9/21/2017 8:58 PM, mikhailo wrote: >> Please review this initial drop of Docker test utils and a sanity >> test. This change lays ground >> for further test development and test utils improvement in this area. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8181592 >> Webrev: http://cr.openjdk.java.net/~mseledtsov/8181592.00/ >> Testing: >> - run this test on machine with Docker enabled - works >> - run this test on Linux-x64 with no Docker engine or Docker >> disabled - test skipped (as expected) >> - run this test on automated system - in progress >> >> >> Thank you, >> Misha >> > From david.holmes at oracle.com Tue Sep 26 03:28:34 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 26 Sep 2017 13:28:34 +1000 Subject: RFR(M): 8187045: [linux] Not all libraries in the VM are linked with -z,noexecstack In-Reply-To: References: <0559eb3655cc42bb8b6cb37fb4370da8@sap.com> Message-ID: <66f399c1-66b2-bb9a-2eb1-103a804d9d17@oracle.com> Hi Goetz, I'll sponsor this for you. 
David On 22/09/2017 11:00 PM, Lindenmaier, Goetz wrote: > Hi, > > I updated my webrev to the directory structure: > http://cr.openjdk.java.net/~goetz/wr17/8187045-execstackLink/webrev.02/ > I also ran it through our tests again. > > Could someone please sponsor this change? > > Thanks, > Goetz. > > >> -----Original Message----- >> From: Lindenmaier, Goetz >> Sent: Dienstag, 5. September 2017 10:05 >> To: David Holmes ; hotspot-runtime- >> dev at openjdk.java.net; build-dev >> Subject: RE: RFR(M): 8187045: [linux] Not all libraries in the VM are linked >> with -z,noexecstack >> >> Hi David, >> >> thanks for looking at my change! >>> Hi Goetz, >>> >>> On 1/09/2017 11:05 PM, Lindenmaier, Goetz wrote: >>>> Hi, >>>> >>>> I found that not all libraries are linked with -z,noexecstack. >>>> This lead to errors with our linuxppc64 build. The linker omitted >>>> the flag altogether, which is interpreted as a lib with execstack. >>>> >>>> This change contains a small test that scans all libraries in the tested VM >>>> to have the noexecstack flag set. It utilizes the elf parser in the VM for >> this. >>>> Further -z,noexecstack is now passed to all libraries. >>>> >>>> Please review this change. I please need a sponsor. >>>> http://cr.openjdk.java.net/~goetz/wr17/8187045- >>> execstackLink/webrev.01/ >>> >>> So IIUC presently we only set noexecstack for gcc on linux when building >>> libjvm - via the JVM_LDFLAGS settings. >> Yes. >> >>> With this change we also set it for building JDK libraries via the >>> LDFLAGS_JDKLIB setting. But this seems to be unconditional, not limited >>> to gcc and linux ?? >> LDFLAGS_NO_EXEC_STACK="-Wl,-z,noexecstack" is only assigned on linux, >> on other platforms its empty. >> >>> In addition we want to build libjsig with noexecstack, and we do that by >>> exposing LDFLAGS_NO_EXEC_STACK in spec.gmk, and using it in >>> CompileLibjsig.gmk. 
I don't have an issue with the use of noexecstack >>> but I think it could just have been hard-wired for linux just as the >>> bulk of the flags set in that file are. Granted you copied what is done >>> for LDFLAGS_HASH_STYLE - but in that case I'm assuming it is important >>> that the same hash style be used throughout. Anyway minor stylistic nit >>> which may be moot soon as once we have the consolidated repo I think >>> libjsig could be handled the same as others libs? >> I had hoped to find a location where flags that should be used in all linking >> steps are assembled. Noexecstack should really be set in any lib we build. >> But I didn't find that, so I implemented it as with the HASH_STYLE. I don't >> really like it this way because if a new lib is added it might be forgotten >> to add the noexecstack. >> But I assume after the repo consolidation the build will be reshaped, >> so now is not the right time to seek for optimal setups. >> >> Best regards, >> Goetz. >> >>> > >>> http://cr.openjdk.java.net/~goetz/wr17/8187045- >> execstackLink/webrev.01- >>> hs/ >>> >>> Test changes look okay to me. >>> >>> Thanks, >>> David >>> >>>> Best regards, >>>> Goetz. >>>> From kim.barrett at oracle.com Tue Sep 26 06:37:20 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 26 Sep 2017 02:37:20 -0400 Subject: Review Request JDK-8164512: Replace ClassLoader use of finalizer with phantom reference to unload native library In-Reply-To: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> References: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> Message-ID: > On Sep 22, 2017, at 6:18 PM, mandy chung wrote: > > This patch proposes to replace the ClassLoader use of finalizer with phantom reference, specifically Cleaner, for unloading native libraries. It registers the class loader for cleanup only if it's not a builtin class loader which will never be unloaded. 
> > The spec for JNI_OnUnload [1] specifies that this function is called in an unknown context whereas the hotspot implementation uses the class loader being unloaded as the context, which I consider a bug. It should not load a class defined to that class loader. The proposed spec change for JNI_FindClass if called from JNI_OnUnload to use system class loader (consistent with no caller context case). > > Webrev: > http://cr.openjdk.java.net/~mchung/jdk10/webrevs/8164512/webrev.00/ > > CSR: > https://bugs.openjdk.java.net/browse/JDK-8187882 > > Please see the spec change an behavioral change in this CSR. A native library may fail to be reloaded if it is loaded immediately or soon after a class loader is GC'ed but the cleaner thread doesn't yet have the chance to handle the unloading of the native library. Likewise, there is no guarantee regarding the timing of finalization and a native library may fail to reload. It's believed that the compatibility risk should be low. > > I'm looking into adding a native jtreg test that will be added in the next revision. > > Mandy > [1] http://docs.oracle.com/javase/9/docs/specs/jni/invocation.html#jni_onunload Thanks for dealing with this. ============================================================================== src/java.base/share/native/libjava/ClassLoader.c 415 Java_java_lang_ClassLoader_00024NativeLibrary_00024Unloader_unload 416 (JNIEnv *env, jobject this, jstring name, jboolean isBuiltin, jlong address) With this change, the "this" argument is no longer used. Because of this, the native function could be a static member function of the new Unloader class, or could (I think) still be a (now static) member function of NativeLibrary. The latter would not require a name change, only a signature change. But I don't really care which class has the method. ============================================================================== src/java.base/share/classes/java/lang/ClassLoader.java 2394 public void register() { [...] 
2406 // register the class loader for cleanup when unloaded 2407 if (loader != getBuiltinPlatformClassLoader() && 2408 loader != getBuiltinAppClassLoader()) { 2409 CleanerFactory.cleaner() 2410 .register(loader, new Unloader(name, handle, isBuiltin)); 2411 } Can anything before the cleanup registration throw? If so, do we leak because we haven't registered the cleanup yet? And what if the registration itself fails to complete? ============================================================================== src/hotspot/share/prims/jni.cpp 405 // Special handling to make sure JNI_OnLoad are executed in the correct class context. s/are executed/is executed/ ============================================================================== From david.holmes at oracle.com Tue Sep 26 07:30:05 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 26 Sep 2017 17:30:05 +1000 Subject: RFR 8186092: Unnecessary loader constraints produced when there are multiple defaults In-Reply-To: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com> References: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com> Message-ID: Hi Harold, This looks okay to me. A few comments below but only one real query. On 26/09/2017 1:21 AM, harold seigel wrote: > Hi, > > Please review this JDK-10 change to fix JDK-8186092. The change > prevents the checking of loader constraints during vtable and itable > creation if the selected method is an overpass method. Overpass methods > are created by the JVM to throw exceptions and so should not be > subjected to loader constraint checking. Okay. > Additionally, this change improves the LinkageError exception error text > when a loader constraint violation occurs during vtable and itable > creation. Hmmm :) I think I put those in initially. Not sure I 100% agree with the changed terminology, but I'll defer to you as the current expert in this area.
:) > The fix includes four new tests, one test each to check that loader > constraint checking is not done for overpass methods during vtable and > itable creation, and one test each to test the new vtable and itable > loader constraint error messages. *.jasm: can you add a comment indicating why these are jasm files as it is not obvious to me what is special about them. */Test.java: - You can place multiple files on one @compile tag (and still list one file per line). - you don't need to specify java.lang in the name of the exception classes > Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8186092/webrev/ The real query: 1201 if (target == NULL || !target->is_public() || target->is_abstract() || target->is_overpass()) { 1202 // Entry does not resolve. Leave it empty for AbstractMethodError. 1203 if (!(target == NULL) && !target->is_public()) { 1204 // Stuff an IllegalAccessError throwing method in there instead. 1205 itableOffsetEntry::method_entry(_klass, method_table_offset)[m->itable_index()]. 1206 initialize(Universe::throw_illegal_access_error()); 1207 } Not clear why you added the overpass check here? If it is non-public then you're replacing it with an IllegalAccessError instead of whatever the Overpass was going to throw. ?? Thanks, David ----- > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186092 > > The change was tested with the JCK Lang and VM tests, the JTreg hotspot, > java/io, java/lang, java/util, and other tests, the co-located NSK > tests, JPRT, and with RBT tier2 - tier5 tests. 
> > Thanks, Harold > From goetz.lindenmaier at sap.com Tue Sep 26 07:34:44 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 26 Sep 2017 07:34:44 +0000 Subject: RFR(M): 8187045: [linux] Not all libraries in the VM are linked with -z,noexecstack In-Reply-To: <66f399c1-66b2-bb9a-2eb1-103a804d9d17@oracle.com> References: <0559eb3655cc42bb8b6cb37fb4370da8@sap.com> <66f399c1-66b2-bb9a-2eb1-103a804d9d17@oracle.com> Message-ID: <593089e8e3fb4d75a6da598106722ead@sap.com> Hi David, thanks a lot! Best regards, Goetz. > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Dienstag, 26. September 2017 05:29 > To: Lindenmaier, Goetz ; hotspot-runtime- > dev at openjdk.java.net; build-dev > Subject: Re: RFR(M): 8187045: [linux] Not all libraries in the VM are linked > with -z,noexecstack > > Hi Goetz, > > I'll sponsor this for you. > > David > > On 22/09/2017 11:00 PM, Lindenmaier, Goetz wrote: > > Hi, > > > > I updated my webrev to the directory structure: > > http://cr.openjdk.java.net/~goetz/wr17/8187045- > execstackLink/webrev.02/ > > I also ran it through our tests again. > > > > Could someone please sponsor this change? > > > > Thanks, > > Goetz. > > > > > >> -----Original Message----- > >> From: Lindenmaier, Goetz > >> Sent: Dienstag, 5. September 2017 10:05 > >> To: David Holmes ; hotspot-runtime- > >> dev at openjdk.java.net; build-dev > >> Subject: RE: RFR(M): 8187045: [linux] Not all libraries in the VM are linked > >> with -z,noexecstack > >> > >> Hi David, > >> > >> thanks for looking at my change! > >>> Hi Goetz, > >>> > >>> On 1/09/2017 11:05 PM, Lindenmaier, Goetz wrote: > >>>> Hi, > >>>> > >>>> I found that not all libraries are linked with -z,noexecstack. > >>>> This lead to errors with our linuxppc64 build. The linker omitted > >>>> the flag altogether, which is interpreted as a lib with execstack. 
> >>>> > >>>> This change contains a small test that scans all libraries in the tested VM > >>>> to have the noexecstack flag set. It utilizes the elf parser in the VM for > >> this. > >>>> Further -z,noexecstack is now passed to all libraries. > >>>> > >>>> Please review this change. I please need a sponsor. > >>>> http://cr.openjdk.java.net/~goetz/wr17/8187045- > >>> execstackLink/webrev.01/ > >>> > >>> So IIUC presently we only set noexecstack for gcc on linux when building > >>> libjvm - via the JVM_LDFLAGS settings. > >> Yes. > >> > >>> With this change we also set it for building JDK libraries via the > >>> LDFLAGS_JDKLIB setting. But this seems to be unconditional, not limited > >>> to gcc and linux ?? > >> LDFLAGS_NO_EXEC_STACK="-Wl,-z,noexecstack" is only assigned on > linux, > >> on other platforms its empty. > >> > >>> In addition we want to build libjsig with noexecstack, and we do that by > >>> exposing LDFLAGS_NO_EXEC_STACK in spec.gmk, and using it in > >>> CompileLibjsig.gmk. I don't have an issue with the use of noexecstack > >>> but I think it could just have been hard-wired for linux just as the > >>> bulk of the flags set in that file are. Granted you copied what is done > >>> for LDFLAGS_HASH_STYLE - but in that case I'm assuming it is important > >>> that the same hash style be used throughout. Anyway minor stylistic nit > >>> which may be moot soon as once we have the consolidated repo I think > >>> libjsig could be handled the same as others libs? > >> I had hoped to find a location where flags that should be used in all linking > >> steps are assembled. Noexecstack should really be set in any lib we build. > >> But I didn't find that, so I implemented it as with the HASH_STYLE. I don't > >> really like it this way because if a new lib is added it might be forgotten > >> to add the noexecstack. > >> But I assume after the repo consolidation the build will be reshaped, > >> so now is not the right time to seek for optimal setups. 
> >> > >> Best regards, > >> Goetz. > >> > >>> > > >>> http://cr.openjdk.java.net/~goetz/wr17/8187045- > >> execstackLink/webrev.01- > >>> hs/ > >>> > >>> Test changes look okay to me. > >>> > >>> Thanks, > >>> David > >>> > >>>> Best regards, > >>>> Goetz. > >>>> From yasuenag at gmail.com Tue Sep 26 09:36:27 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Tue, 26 Sep 2017 18:36:27 +0900 Subject: RFR: JDK-8087291: InitialBootClassLoaderMetaspaceSize and CompressedClassSpaceSize should be checked consistent from MaxMetaspaceSize In-Reply-To: References: <19df096b-3243-1ac0-3d3a-e955c63c534d@oracle.com> <53fb4559-aeed-da1a-67fe-6cb50fd8e9ce@gmail.com> Message-ID: Hi all, I uploaded webrev for this issue against jdk10/hs. Could you review it? http://cr.openjdk.java.net/~ysuenaga/JDK-8087291/webrev.03/ I cannot access JPRT. So I need a sponsor. Thanks, Yasumasa 2017-09-21 3:11 GMT+09:00 Man Cao : > Thank Yasumasa and Stefan for the responses. > > Good to know that the patch is not blocked due to breaking some internal > invariants/assumptions, but just due to its P5 status. > Is it possible to push it to P4? > > -Man > > On Wed, Sep 20, 2017 at 5:16 AM, Yasumasa Suenaga > wrote: >> >> Hi, >> >> (CC'ed hotspot-runtime-dev) >> >>> I think the reason is that this bug is a P5. The compressed class space >>> belongs to the runtime code, so you might get more traction for this on the >>> hotspot-runtime-dev list. >> >> >> I will send review request against jdk10/master or jdk10/hs after repos >> are opened. >> >> >> Thanks, >> >> Yasumasa >> >> >> >> On 2017/09/20 20:53, Stefan Karlsson wrote: >>> >>> Hi Man, >>> >>> On 2017-09-13 20:55, Man Cao wrote: >>>> >>>> Hi Yasumasa, Stefan, >>>> >>>> Do you have any thoughts on why this patch has been pending for 2+ >>>> years? This patch could really save us from some annoying issues since we >>>> are automatically monitoring hsperfdata counters. >>> >>> >>> I think the reason is that this bug is a P5. 
The compressed class space >>> belongs to the runtime code, so you might get more traction for this on the >>> hotspot-runtime-dev list. >>> >>> StefanK >>> >>>> >>>> -Man >>>> >>>> On Mon, Aug 21, 2017 at 3:46 PM, Man Cao >>> > wrote: >>>> >>>> Hi all, >>>> >>>> I wonder if there is any recent update on the patch for JDK-8087291. >>>> Is it possible to push this patch into JDK9? Except for its low >>>> priority (P5), >>>> is there any complication that prevents this patch getting approved >>>> (for example, some JVM logic requires CompressedClassSpaceSize to be >>>> 1GB by default)? >>>> >>>> I work in the Java Platform Team at Google. We have encountered >>>> annoying issues that the hsperfdata counter >>>> "sun_gc_metaspace_maxCapacity" reporting >>>> a too large value (about 1GB) even if user sets >>>> -XX:MaxMetaspaceSize=100m, as well as GC log shows the confusing 1GB >>>> memory reserved by metaspace, >>>> regardless of MaxMetaspaceSize value. The root cause for these >>>> issues is that CompressedClassSpaceSize is not automatically capped >>>> by MaxMetaspaceSize >>>> during VM initialization, and this patch seems fix the root cause. >>>> (I'm aware that even after this patch, the reserved size could still >>>> be up to 2*MaxMetaspaceSize, >>>> but it is better than the current situation.) >>>> >>>> Thanks, >>>> Man >>>> >>>> On 6/19/2015 00:34, Yasumasa Suenaga wrote: >>>> >>>> Thank you for your comment! >>>> > Try running a debug JVM with your patch with this command >>>> line. >>>> > >>>> > java -XX:MaxMetaspaceSize=4195328 -version >>>> Sorry, I've fixed it and uploaded webrev: >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8087291/webrev.02/ >>>> >>>> It works on fastdebug VM. >>>> Please review again. >>>> >>>> Thanks, >>>> Yasumasa >>>> >>>> On 2015/06/18 10:45, Jon Masamitsu wrote: >>>> > Yasumasa, >>>> > >>>> > Try running a debug JVM with your patch with this command >>>> line. 
>>>> > >>>> > java -XX:MaxMetaspaceSize=4195328 -version >>>> > >>>> > On a linux system I get this when I build with your patch. >>>> > >>>> >> java -XX:MaxMetaspaceSize=4195328 -version >>>> >> # To suppress the following error report, specify this >>>> argument >>>> >> # after -XX: or in .hotspotrc: >>>> SuppressErrorAt=/metaspace.cpp:2324 >>>> >> # >>>> >> # A fatal error has been detected by the Java Runtime >>>> Environment: >>>> >> # >>>> >> # Internal Error >>>> >> >>>> >>>> (/export/jmasa/java/jdk9-gc-code_review/src/share/vm/memory/metaspace.cpp:2324), >>>> >> pid=19099, tid=0x00007ff4b9b92700 >>>> >> # assert(size > MediumChunk || size > ClassMediumChunk) >>>> failed: Not a >>>> >> humongous chunk >>>> >> # >>>> > >>>> > >>>> > Jon >>>> > >>>> > >>>> > On 6/17/2015 7:54 AM, Yasumasa Suenaga wrote: >>>> >> I want to continue to discuss about CompressedClassSpace and >>>> MaxMetaspace in this (RFR) thread. >>>> >> >>>> >> >>>> >>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2015-June/013873.html >>>> >>>> >>>> >>>> Should I resize CompressedClassSpaceSize than to show >>>> error message? >>>> >>> If you add slightly better heuristics for the setup of the >>>> CompressedClassSpaceSize flag, for example lowering the >>>> CompressedClassSpaceSize when MaxMetaspaceSize is set, then it >>>> might be less likely that you'll hit the OutOfMemoryError when >>>> the system is set up with strict overcommit settings. >>>> >> >>>> >> I've uploaded new webrev: >>>> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8087291/webrev.01/ >>>> >>>> >> >>>> >> This patch checkes MaxMetaspaceSize, >>>> CompressedClassSpaceSize, and >>>> >> InitialBootClassLoaderMetaspaceSize. >>>> >> >>>> >> I add to check CompressedClassSpaceSize in >>>> Arguments::set_use_compressed_klass_ptrs(). >>>> >> If InitialBootClassLoaderMetaspaceSize is greater than >>>> MaxMetaspaceSize, >>>> >> VM will fail with error message. 
>>>> >> >>>> >> InitialBootClassLoaderMetaspaceSize will be set to >>>> MaxMetaspaceSize >>>> >> when UseCompressedClassPointers is not set in >>>> Metaspace::ergo_initialize(). >>>> >> >>>> >> >>>> >> Thanks, >>>> >> >>>> >> Yasumasa >>>> >> >>>> >>>> > From mandy.chung at oracle.com Tue Sep 26 17:27:18 2017 From: mandy.chung at oracle.com (mandy chung) Date: Tue, 26 Sep 2017 10:27:18 -0700 Subject: Review Request JDK-8164512: Replace ClassLoader use of finalizer with phantom reference to unload native library In-Reply-To: References: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> Message-ID: <9b5b908b-e54c-67c4-13a4-250d2087241f@oracle.com> On 9/25/17 11:37 PM, Kim Barrett wrote: > Thanks for dealing with this. I'm all for eliminating the finalizers in the JDK.? Looking forward to having a finalizer-free JDK. > ============================================================================== > src/java.base/share/native/libjava/ClassLoader.c > 415 Java_java_lang_ClassLoader_00024NativeLibrary_00024Unloader_unload > 416 (JNIEnv *env, jobject this, jstring name, jboolean isBuiltin, jlong address) > > With this change, the "this" argument is no longer used. > > Because of this, the native function could be a static member function > of the new Unloader class, or could (I think) still be a (now static) > member function of NativeLibrary. The latter would not require a name > change, only a signature change. But I don't really care which class > has the method. Yes it can be a static method. > ============================================================================== > src/java.base/share/classes/java/lang/ClassLoader.java > 2394 public void register() { > [...] 
> 2406 // register the class loader for cleanup when unloaded > 2407 if (loader != getBuiltinPlatformClassLoader() && > 2408 loader != getBuiltinAppClassLoader()) { > 2409 CleanerFactory.cleaner() > 2410 .register(loader, new Unloader(name, handle, isBuiltin)); > 2411 } > > Can anything before the cleanup registration throw? Not within the register method. The builtin loaders are created early during startup. I shall make sure that System::loadLibrary is not called during init phase 1. > If so, do we leak > because we haven't registered the cleanup yet? And what if the > registration itself fails to complete? If Cleaner::register fails, it should throw an exception. Otherwise, the registration should complete. > ============================================================================== > src/hotspot/share/prims/jni.cpp > 405 // Special handling to make sure JNI_OnLoad are executed in the correct class context. > > s/are executed/is executed/ > Will fix it. I'm considering separating the JNI_FindClass change to target 18.3 and providing a flag to restore the current behavior, so that it may help existing code to identify its dependency on the current behavior and give time to migrate. Then target the finalizer-to-Cleaner change to 18.9. It's unknown to us how many existing native libraries are impacted by this change (calling FindClass from JNI_OnUnload to load classes visible to the class loader being unloaded). I suspect it's rare. If FindClass is called when the native library is being unloaded and fails to find the class due to this change, it is not hard to find out, but the code might crash if it does not handle the class-not-found case properly. Any opinion? 
Mandy From leonid.mesnik at oracle.com Tue Sep 26 18:19:12 2017 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Tue, 26 Sep 2017 11:19:12 -0700 Subject: RFR(S): 8181592: [TESTBUG] Docker test utils and docker jdk basic test In-Reply-To: <6b95b720-a2cc-39e1-c0c1-6885b106ac16@oracle.com> References: <6b95b720-a2cc-39e1-c0c1-6885b106ac16@oracle.com> Message-ID: <5F386D9A-CA05-4732-8F68-F493DD2E8E99@oracle.com> Misha http://cr.openjdk.java.net/~mseledtsov/8181592.00/test/hotspot/jtreg/runtime/containers/docker/DockerBasicTest.java.html Copyright is incorrect, you need to update it for GPL. "Hotspot" is the Oracle VM name only, so the test might fail for OpenJDK. I think you need to fix this check. The requires checks only that the test is executed on 64-bit Linux. Does it make sense to introduce a more Docker-specific check? http://cr.openjdk.java.net/~mseledtsov/8181592.00/test/hotspot/jtreg/runtime/containers/docker/Dockerfile-BasicTest.html Could you please explain why oraclelinux 7.0 is used as a base image for the test? http://cr.openjdk.java.net/~mseledtsov/8181592.00/test/lib/jdk/test/lib/containers/docker/DockerTestUtils.java.html The content looks fine. I don't see anything to clean up Docker images on the system. Could you please explain how the tests are going to clean up images? Leonid > On Sep 21, 2017, at 5:58 PM, mikhailo wrote: > > Please review this initial drop of Docker test utils and a sanity test. This change lays ground > for further test development and test utils improvement in this area. 
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8181592 > Webrev: http://cr.openjdk.java.net/~mseledtsov/8181592.00/ > Testing: > - run this test on machine with Docker enabled - works > - run this test on Linux-x64 with no Docker engine or Docker disabled - test skipped (as expected) > - run this test on automated system - in progress > > > Thank you, > Misha > From harold.seigel at oracle.com Tue Sep 26 19:13:31 2017 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 26 Sep 2017 15:13:31 -0400 Subject: RFR 8186092: Unnecessary loader constraints produced when there are multiple defaults In-Reply-To: References: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com> Message-ID: <86af3584-89ff-1819-1d20-736dad01a42c@oracle.com> Hi David, Thanks for looking at this change! Please see updated webrev at: http://cr.openjdk.java.net/~hseigel/bug_8186092.2/webrev/index.html and also see comments embedded below. Thanks, Harold On 9/26/2017 3:30 AM, David Holmes wrote: > Hi Harold, > > This looks okay to me. A few comments below but only one real query. > > On 26/09/2017 1:21 AM, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 change to fix JDK-8186092. The change >> prevents the checking of loader constraints during vtable and itable >> creation if the selected method is an overpass method. Overpass >> methods are created by the JVM to throw exceptions and so should not >> be subjected to loader constraint checking. > > Okay. > >> Additionally, this change improves the LinkageError exception error >> text when a loader constraint violation occurs during vtable and >> itable creation. > > Hmmm :) I think I put those in initially. Not sure I 100% agree with > the changed terminology, but I'll defer to you as the current expert > in this area. :) I'm hoping better experts also review the changed messages. 
> >> The fix includes four new tests, one test each to check that loader >> constraint checking is not done for overpass methods during vtable >> and itable creation, and one test each to test the new vtable and >> itable loader constraint error messages. > > *.jasm: can you add a comment indicating why these are jasm files as > it is not obvious to me what is special about them. Thanks for pointing this out. I converted the two Task.jasm files to Task.java files and added a comment to the remaining .jasm file, C.jasm. > > */Test.java: > - You can place multiple files on one @compile tag (and still list > one file per line). > - you don't need to specify java.lang in the name of the exception > classes Done. > >> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8186092/webrev/ > > The real query: > > 1201 if (target == NULL || !target->is_public() || > target->is_abstract() || target->is_overpass()) { > 1202 // Entry does not resolve. Leave it empty for > AbstractMethodError. > 1203 if (!(target == NULL) && !target->is_public()) { > 1204 // Stuff an IllegalAccessError throwing method in there > instead. > 1205 itableOffsetEntry::method_entry(_klass, > method_table_offset)[m->itable_index()]. > 1206 initialize(Universe::throw_illegal_access_error()); > 1207 } > > Not clear why you added the overpass check here? If it is non-public > then you're replacing it with an IllegalAccessError instead of > whatever the Overpass was going to throw. ?? Currently, all overpass methods are public methods. So, they would not get replaced with IllegalAccessError. However, in case non-public overpass methods exist in the future, I added "&& !target->is_overpass()" to line 1203. Alternatively, I considered adding an "assert(!target->is_overpass() || target->is_public(), "Non-public overpass method");" between lines 1201 and 1202 but didn't think that this code should be concerned about whether or not overpass methods are public. 
I also thought about adding "&& !target->is_overpass()" to line 1211 but thought it better that all checks on 'target', that prevent loader constraints checking, be done at the same place. > > Thanks, > David > ----- > >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186092 >> >> The change was tested with the JCK Lang and VM tests, the JTreg >> hotspot, java/io, java/lang, java/util, and other tests, the >> co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. >> >> Thanks, Harold >> From kim.barrett at oracle.com Tue Sep 26 21:20:19 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 26 Sep 2017 17:20:19 -0400 Subject: Review Request JDK-8164512: Replace ClassLoader use of finalizer with phantom reference to unload native library In-Reply-To: <9b5b908b-e54c-67c4-13a4-250d2087241f@oracle.com> References: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> <9b5b908b-e54c-67c4-13a4-250d2087241f@oracle.com> Message-ID: <69E59AD1-DA51-425F-BD63-574933243F90@oracle.com> > On Sep 26, 2017, at 1:27 PM, mandy chung wrote: > On 9/25/17 11:37 PM, Kim Barrett wrote: >> src/java.base/share/classes/java/lang/ClassLoader.java >> 2394 public void register() { >> [...] >> 2406 // register the class loader for cleanup when unloaded >> 2407 if (loader != getBuiltinPlatformClassLoader() && >> 2408 loader != getBuiltinAppClassLoader()) { >> 2409 CleanerFactory.cleaner() >> 2410 .register(loader, new Unloader(name, handle, isBuiltin)); >> 2411 } >> >> Can anything before the cleanup registration throw? > > Not within the register method. I think there are some opportunities for OOME, but I think no worse than before. And the result would be a loaded library without the unload registration, which seems like it might perhaps be annoying but probably not fatal. 
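The registration pattern under discussion can be sketched with the public java.lang.ref.Cleaner API. This is only an illustration, not the ClassLoader code from the webrev: the patch uses the JDK-internal CleanerFactory.cleaner(), and the class name UnloadRegistrationSketch, the counter, the "libdemo" name and the handle value 42 are all assumptions made up for the example. The point it tries to show is that the cleanup state object holds no reference to the tracked object, and that nothing is registered (so nothing is ever unloaded) if an exception is thrown before the register call.

```java
import java.lang.ref.Cleaner;
import java.lang.ref.Reference;
import java.util.concurrent.atomic.AtomicInteger;

final class UnloadRegistrationSketch {
    // Counts how often the hypothetical "unload" action has run.
    static final AtomicInteger UNLOADS = new AtomicInteger();

    // Cleanup state. It deliberately holds no reference to the tracked
    // object; otherwise that object could never become phantom reachable.
    static final class Unloader implements Runnable {
        private final String name;   // hypothetical library name
        private final long handle;   // hypothetical native handle

        Unloader(String name, long handle) {
            this.name = name;
            this.handle = handle;
        }

        @Override
        public void run() {
            // A real implementation would call into native code here.
            UNLOADS.incrementAndGet();
        }
    }

    // Public Cleaner used here as a stand-in for the internal CleanerFactory.
    private static final Cleaner CLEANER = Cleaner.create();

    // Registers the unload action for 'owner'. If anything throws before
    // this call completes, no cleanup action exists for the library.
    static Cleaner.Cleanable register(Object owner, String name, long handle) {
        return CLEANER.register(owner, new Unloader(name, handle));
    }

    // Runs the scenario and returns how many unload actions were performed.
    static int demo() {
        UNLOADS.set(0);
        Object owner = new Object();
        Cleaner.Cleanable cleanable = register(owner, "libdemo", 42L);
        cleanable.clean();   // runs the action explicitly
        cleanable.clean();   // a second call does nothing
        // Keep 'owner' reachable so the cleaner thread cannot race with us.
        Reference.reachabilityFence(owner);
        return UNLOADS.get();
    }

    public static void main(String[] args) {
        System.out.println("unload actions run: " + demo());
    }
}
```

In normal use the action runs on the cleaner thread some time after the owner becomes phantom reachable; the explicit clean() calls are only there to make the sketch deterministic.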
> I'm considering separating the JNI_FindClass change to target 18.3 and providing a flag to restore the current behavior, so that it may help existing code to identify its dependency on the current behavior and give time to migrate. Then target the finalizer-to-Cleaner change to 18.9. > > It's unknown to us how many existing native libraries are impacted by this change (calling FindClass from JNI_OnUnload to load classes visible to the class loader being unloaded). I suspect it's rare. If FindClass is called when the native library is being unloaded and fails to find the class due to this change, it is not hard to find out, but the code might crash if it does not handle the class-not-found case properly. > > Any opinion? I'm in favor of removing this finalizer sooner rather than later, but you probably could have guessed that. From ioi.lam at oracle.com Tue Sep 26 22:03:08 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 26 Sep 2017 15:03:08 -0700 Subject: RFR(XXS) 8187979 Clean up info printing at CDS dump time Message-ID: <5b8b5e0f-66c2-fe7e-e820-40532878e8e4@oracle.com> A small clean up to remove obsolete info and improve indentation * https://bugs.openjdk.java.net/browse/JDK-8187979 * http://cr.openjdk.java.net/~iklam/jdk10/8187979-dump-info-cleanup.v01/ Webrev doesn't show the lines that have only whitespace changes, but you can see the full diff here: * http://cr.openjdk.java.net/~iklam/jdk10/8187979-dump-info-cleanup.v01/open.patch Thanks - Ioi From david.holmes at oracle.com Wed Sep 27 00:25:38 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 27 Sep 2017 10:25:38 +1000 Subject: RFR 8186092: Unnecessary loader constraints produced when there are multiple defaults In-Reply-To: <86af3584-89ff-1819-1d20-736dad01a42c@oracle.com> References: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com> <86af3584-89ff-1819-1d20-736dad01a42c@oracle.com> Message-ID: Hi Harold, On 27/09/2017 5:13 AM, harold seigel wrote: > Hi David, > > Thanks for looking at 
this change! Please see updated webrev at: > > http://cr.openjdk.java.net/~hseigel/bug_8186092.2/webrev/index.html Test changes seem fine. > and also see comments embedded below. Follow up below. > Thanks, Harold > > > On 9/26/2017 3:30 AM, David Holmes wrote: >> Hi Harold, >> >> This looks okay to me. A few comments below but only one real query. >> >> On 26/09/2017 1:21 AM, harold seigel wrote: >>> Hi, >>> >>> Please review this JDK-10 change to fix JDK-8186092. The change >>> prevents the checking of loader constraints during vtable and itable >>> creation if the selected method is an overpass method. Overpass >>> methods are created by the JVM to throw exceptions and so should not >>> be subjected to loader constraint checking. >> >> Okay. >> >>> Additionally, this change improves the LinkageError exception error >>> text when a loader constraint violation occurs during vtable and >>> itable creation. >> >> Hmmm :) I think I put those in initially. Not sure I 100% agree with >> the changed terminology, but I'll defer to you as the current expert >> in this area. :) > I'm hoping better experts also review the changed messages. >> >>> The fix includes four new tests, one test each to check that loader >>> constraint checking is not done for overpass methods during vtable >>> and itable creation, and one test each to test the new vtable and >>> itable loader constraint error messages. >> >> *.jasm: can you add a comment indicating why these are jasm files as >> it is not obvious to me what is special about them. > Thanks for pointing this out. I converted the two Task.jasm files to > Task.java files and added a comment to the remaining .jasm file, C.jasm. >> >> */Test.java: >> - You can place multiple files on one @compile tag (and still list >> one file per line). >> - you don't need to specify java.lang in the name of the exception >> classes > Done. 
>> >>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8186092/webrev/ >> >> The real query: >> >> 1201 if (target == NULL || !target->is_public() || >> target->is_abstract() || target->is_overpass()) { >> 1202 // Entry does not resolve. Leave it empty for >> AbstractMethodError. >> 1203 if (!(target == NULL) && !target->is_public()) { >> 1204 // Stuff an IllegalAccessError throwing method in there >> instead. >> 1205 itableOffsetEntry::method_entry(_klass, >> method_table_offset)[m->itable_index()]. >> 1206 initialize(Universe::throw_illegal_access_error()); >> 1207 } >> >> Not clear why you added the overpass check here? If it is non-public >> then you're replacing it with an IllegalAccessError instead of >> whatever the Overpass was going to throw. ?? > Currently, all overpass methods are public methods. So, they would not > get replaced with IllegalAccessError. However, in case non-public > overpass methods exist in the future, I added "&& > !target->is_overpass()" to line 1203. > > Alternatively, I considered adding an "assert(!target->is_overpass() || > target->is_public(), "Non-public overpass method");" between lines 1201 > and 1202 but didn't think that this code should be concerned about > whether or not overpass methods are public. I also thought about adding > "&& !target->is_overpass()" to line 1211 but thought it better that all > checks on 'target', that prevent loader constraints checking, be done at > the same place. Okay I see what you are trying to do now. We want overpass methods to follow the "if" path at 1201, but for them it should currently be a no-op. I'd be inclined to add in the assertion - the code is already concerned about not processing non-public overpasses with your proposed change to 1203. The assertion would ensure that anyone introducing a non-public overpass has it quickly drawn to their attention that doing so needs additional consideration. 
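The decision logic being debated can be modelled in a few lines. What follows is a toy Java model, not the HotSpot C++ code: the class ItableEntrySketch, the Target holder and the Entry enum are hypothetical names invented here, and the model assumes the behaviour described in the thread (entries for null, non-public, abstract or overpass targets skip loader constraint checking; only non-public, non-overpass targets get the IllegalAccessError stub; a non-public overpass trips the suggested assertion).

```java
final class ItableEntrySketch {
    // Possible outcomes for one itable slot in this toy model.
    enum Entry {
        EMPTY,                 // left empty, no loader constraint check
        ILLEGAL_ACCESS_STUB,   // method that throws IllegalAccessError
        RESOLVED               // constraints checked, target installed
    }

    // Hypothetical stand-in for the flags queried on the selected method.
    static final class Target {
        final boolean isPublic;
        final boolean isAbstract;
        final boolean isOverpass;

        Target(boolean isPublic, boolean isAbstract, boolean isOverpass) {
            this.isPublic = isPublic;
            this.isAbstract = isAbstract;
            this.isOverpass = isOverpass;
        }
    }

    static Entry select(Target target) {
        if (target == null || !target.isPublic
                || target.isAbstract || target.isOverpass) {
            // Models the suggested assertion: a non-public overpass is
            // unexpected and should be flagged immediately.
            if (target != null && target.isOverpass && !target.isPublic) {
                throw new AssertionError("Non-public overpass method");
            }
            // Models the updated condition at line 1203.
            if (target != null && !target.isPublic && !target.isOverpass) {
                return Entry.ILLEGAL_ACCESS_STUB;
            }
            return Entry.EMPTY;
        }
        return Entry.RESOLVED;
    }

    public static void main(String[] args) {
        System.out.println(select(new Target(true, false, true)));
        System.out.println(select(new Target(false, false, false)));
        System.out.println(select(new Target(true, false, false)));
    }
}
```

With the assertion in place, the extra overpass test in the second condition can never be the deciding factor, which is roughly the trade-off the two positions above are weighing.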
Thanks, David ---- >> >> Thanks, >> David >> ----- >> >>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186092 >>> >>> The change was tested with the JCK Lang and VM tests, the JTreg >>> hotspot, java/io, java/lang, java/util, and other tests, the >>> co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. >>> >>> Thanks, Harold >>> > From david.holmes at oracle.com Wed Sep 27 00:34:02 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 27 Sep 2017 10:34:02 +1000 Subject: Review Request JDK-8164512: Replace ClassLoader use of finalizer with phantom reference to unload native library In-Reply-To: <9b5b908b-e54c-67c4-13a4-250d2087241f@oracle.com> References: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> <9b5b908b-e54c-67c4-13a4-250d2087241f@oracle.com> Message-ID: Hi Mandy, On 27/09/2017 3:27 AM, mandy chung wrote: > > > On 9/25/17 11:37 PM, Kim Barrett wrote: >> Thanks for dealing with this. > > I'm all for eliminating the finalizers in the JDK. Looking forward to > having a finalizer-free JDK. >> ============================================================================== >> >> src/java.base/share/native/libjava/ClassLoader.c >> 415 Java_java_lang_ClassLoader_00024NativeLibrary_00024Unloader_unload >> 416 (JNIEnv *env, jobject this, jstring name, jboolean isBuiltin, >> jlong address) >> >> With this change, the "this" argument is no longer used. >> >> Because of this, the native function could be a static member function >> of the new Unloader class, or could (I think) still be a (now static) >> member function of NativeLibrary. The latter would not require a name >> change, only a signature change. But I don't really care which class >> has the method. > Yes it can be a static method. >> ============================================================================== >> >> src/java.base/share/classes/java/lang/ClassLoader.java >> 2394 public void register() { >> [...] 
>> 2406 
// register the class loader for cleanup when >> unloaded >> 2407 if (loader != getBuiltinPlatformClassLoader() && >> 2408 loader != getBuiltinAppClassLoader()) { >> 2409 CleanerFactory.cleaner() >> 2410 .register(loader, new Unloader(name, >> handle, isBuiltin)); >> 2411 } >> >> Can anything before the cleanup registration throw? > > Not within the register method. The builtin loaders are created early > during startup. I shall make sure that System::loadLibrary is not > called during init phase 1. >> If so, do we leak >> because we haven't registered the cleanup yet? And what if the >> registration itself fails to complete? > > If Cleaner::register fails, it should throw an exception. Otherwise, the > registration should complete. >> ============================================================================== >> >> src/hotspot/share/prims/jni.cpp >> 405 // Special handling to make sure JNI_OnLoad are executed in >> the correct class context. >> >> s/are executed/is executed/ >> > Will fix it. > > I'm considering separating the JNI_FindClass change to target 18.3 and > providing a flag to restore the current behavior, so that it may help > existing code to identify its dependency on the current behavior and > give time to migrate. Then target the finalizer-to-Cleaner change to 18.9. > > It's unknown to us how many existing native libraries are impacted by > this change (calling FindClass from JNI_OnUnload to load classes visible > to the class loader being unloaded). I suspect it's rare. If FindClass is > called when the native library is being unloaded and fails to find the > class due to this change, it is not hard to find out, but the code might > crash if it does not handle the class-not-found case properly. > > Any opinion? Yes :) I agree with being conservative here. We don't know how this may be being used. 
But I'm still not completely clear how the FindClass change is tied to the switch to Cleaner ?? Thanks, David > Mandy From jiangli.zhou at oracle.com Wed Sep 27 01:02:12 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 26 Sep 2017 18:02:12 -0700 Subject: RFR(XXS) 8187979 Clean up info printing at CDS dump time In-Reply-To: <5b8b5e0f-66c2-fe7e-e820-40532878e8e4@oracle.com> References: <5b8b5e0f-66c2-fe7e-e820-40532878e8e4@oracle.com> Message-ID: <3CDBD952-0FF7-4412-824E-C78DDFF59D0F@oracle.com> Hi Ioi, Could you please send an updated dump output? Thanks, Jiangli > On Sep 26, 2017, at 3:03 PM, Ioi Lam wrote: > > A small clean up to remove obsolete info and improve indentation > > * https://bugs.openjdk.java.net/browse/JDK-8187979 > * http://cr.openjdk.java.net/~iklam/jdk10/8187979-dump-info-cleanup.v01/ > > Webrev doesn't show the lines that have only whitespace changes, but you can see the full diff here: > > * http://cr.openjdk.java.net/~iklam/jdk10/8187979-dump-info-cleanup.v01/open.patch > > Thanks > > - Ioi > From mandy.chung at oracle.com Wed Sep 27 01:32:31 2017 From: mandy.chung at oracle.com (mandy chung) Date: Tue, 26 Sep 2017 18:32:31 -0700 Subject: Review Request JDK-8164512: Replace ClassLoader use of finalizer with phantom reference to unload native library In-Reply-To: References: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> <9b5b908b-e54c-67c4-13a4-250d2087241f@oracle.com> Message-ID: <42cd522c-4a1f-f231-3323-6bb4a0183a1e@oracle.com> On 9/26/17 5:34 PM, David Holmes wrote: > Hi Mandy, > > On 27/09/2017 3:27 AM, mandy chung wrote: >> >> >> On 9/25/17 11:37 PM, Kim Barrett wrote: >>> Thanks for dealing with this. >> >> I'm all for eliminating the finalizers in the JDK. Looking forward >> to having a finalizer-free JDK. >>> ============================================================================== >>> >>> src/java.base/share/native/libjava/ClassLoader.c >>> 
415 >>> Java_java_lang_ClassLoader_00024NativeLibrary_00024Unloader_unload >>> ? 416 (JNIEnv *env, jobject this, jstring name, jboolean isBuiltin, >>> jlong address) >>> >>> With this change, the "this" argument is no longer used. >>> >>> Because of this, the native function could be a static member function >>> of the new Unloader class, or could (I think) still be a (now static) >>> member function of NativeLibrary.? The latter would not require a name >>> change, only a signature change.? But I don't really care which class >>> has the method. >> Yes it can be a static method. >>> ============================================================================== >>> >>> src/java.base/share/classes/java/lang/ClassLoader.java >>> 2394???????? public void register() { >>> ????????????????? [...] >>> 2406???????????????? // register the class loader for cleanup when >>> unloaded >>> 2407???????????????? if (loader != getBuiltinPlatformClassLoader() && >>> 2408???????????????????? loader != getBuiltinAppClassLoader()) { >>> 2409???????????????????? CleanerFactory.cleaner() >>> 2410???????????????????????? .register(loader, new Unloader(name, >>> handle, isBuiltin)); >>> 2411???????????????? } >>> >>> Can anything before the cleanup registration throw? >> >> No within the register method.? The builtin loaders are created early >> during startup.? I shall make sure that System::loadLibrary is not >> called during init phase 1. >>> If so, do we leak >>> because we haven't registered the cleanup yet?? And what if the >>> registration itself fails to complete? >> >> If Cleaner::register fails, it should throw an exception. Otherwise, >> the registration should complete. >>> ============================================================================== >>> >>> src/hotspot/share/prims/jni.cpp >>> ? 405???? // Special handling to make sure JNI_OnLoad are executed >>> in the correct class context. >>> >>> s/are executed/is executed/ >>> >> Will fix it. 
>>
>> I'm considering to separate the JNI_FindClass change to target 18.3
>> and provide a flag to restore the current behavior so that it may
>> help existing code to identify its dependency on the current behavior
>> and give time to migrate. Then target the finalizer to Cleaner
>> change in 18.9.
>>
>> It's unknown to us how many existing native libraries are impacted by
>> this change (calling FindClass from JNI_OnUnload to load classes
>> visible the class loader being unloaded). I suspect it's rare. If
>> FindClass is called when the native library is being unloaded and
>> fail to find the class due to this change, it is not hard to find out
>> but the code might crash if it does not handle the class not found
>> case properly.
>>
>> Any opinion?
>
> Yes :) I agree with being conservative here. We don't know how this
> may be being used. But I'm still not completely clear how the
> FindClass change is tied to the switch to Cleaner?

It is not tied with the Cleaner change. Instead, the FindClass bug
blocks the finalizer to Cleaner change.

FindClass bug is uncovered when I implemented the change from finalizer
to Cleaner (or phantom reference). There is a test calling FindClass to
look for a class defined by the class loader being unloaded, say L. L is
not Gc'ed and so FindClass successfully finds the class (which resurrect
the class loader which was marked finalizable).

Is that clearer?
Mandy

From david.holmes at oracle.com Wed Sep 27 02:06:34 2017
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 27 Sep 2017 12:06:34 +1000
Subject: Review Request JDK-8164512: Replace ClassLoader use of finalizer with phantom reference to unload native library
In-Reply-To: <42cd522c-4a1f-f231-3323-6bb4a0183a1e@oracle.com>
References: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> <9b5b908b-e54c-67c4-13a4-250d2087241f@oracle.com> <42cd522c-4a1f-f231-3323-6bb4a0183a1e@oracle.com>
Message-ID: <3a0fe286-f38a-f7f1-4a23-15c43bb42a80@oracle.com>

On 27/09/2017 11:32 AM, mandy chung wrote:
> On 9/26/17 5:34 PM, David Holmes wrote:
>> Hi Mandy,
>>
>> On 27/09/2017 3:27 AM, mandy chung wrote:
>>>
>>>
>>> On 9/25/17 11:37 PM, Kim Barrett wrote:
>>>> Thanks for dealing with this.
>>>
>>> I'm all for eliminating the finalizers in the JDK. Looking forward
>>> to having a finalizer-free JDK.
>>>> ==============================================================================
>>>>
>>>> src/java.base/share/native/libjava/ClassLoader.c
>>>>   415 Java_java_lang_ClassLoader_00024NativeLibrary_00024Unloader_unload
>>>>   416 (JNIEnv *env, jobject this, jstring name, jboolean isBuiltin, jlong address)
>>>>
>>>> With this change, the "this" argument is no longer used.
>>>>
>>>> Because of this, the native function could be a static member function
>>>> of the new Unloader class, or could (I think) still be a (now static)
>>>> member function of NativeLibrary. The latter would not require a name
>>>> change, only a signature change. But I don't really care which class
>>>> has the method.
>>> Yes it can be a static method.
>>>> ==============================================================================
>>>>
>>>> src/java.base/share/classes/java/lang/ClassLoader.java
>>>> 2394         public void register() {
>>>>                  [...]
>>>> 2406                 // register the class loader for cleanup when unloaded
>>>> 2407                 if (loader != getBuiltinPlatformClassLoader() &&
>>>> 2408                     loader != getBuiltinAppClassLoader()) {
>>>> 2409                     CleanerFactory.cleaner()
>>>> 2410                         .register(loader, new Unloader(name, handle, isBuiltin));
>>>> 2411                 }
>>>>
>>>> Can anything before the cleanup registration throw?
>>>
>>> No within the register method. The builtin loaders are created early
>>> during startup. I shall make sure that System::loadLibrary is not
>>> called during init phase 1.
>>>> If so, do we leak
>>>> because we haven't registered the cleanup yet? And what if the
>>>> registration itself fails to complete?
>>>
>>> If Cleaner::register fails, it should throw an exception. Otherwise,
>>> the registration should complete.
>>>> ==============================================================================
>>>>
>>>> src/hotspot/share/prims/jni.cpp
>>>>   405     // Special handling to make sure JNI_OnLoad are executed in the correct class context.
>>>>
>>>> s/are executed/is executed/
>>>>
>>> Will fix it.
>>>
>>> I'm considering to separate the JNI_FindClass change to target 18.3
>>> and provide a flag to restore the current behavior so that it may
>>> help existing code to identify its dependency on the current behavior
>>> and give time to migrate. Then target the finalizer to Cleaner
>>> change in 18.9.
>>>
>>> It's unknown to us how many existing native libraries are impacted by
>>> this change (calling FindClass from JNI_OnUnload to load classes
>>> visible the class loader being unloaded). I suspect it's rare. If
>>> FindClass is called when the native library is being unloaded and
>>> fail to find the class due to this change, it is not hard to find out
>>> but the code might crash if it does not handle the class not found
>>> case properly.
>>>
>>> Any opinion?
>>
>> Yes :) I agree with being conservative here. We don't know how this
>> may be being used. But I'm still not completely clear how the
>> FindClass change is tied to the switch to Cleaner?
> It is not tied with the Cleaner change. Instead, the FindClass bug
> blocks the finalizer to Cleaner change.
>
> FindClass bug is uncovered when I implemented the change from finalizer
> to Cleaner (or phantom reference). There is a test calling FindClass
> to look for a class defined by the class loader being unloaded, say L. L
> is not Gc'ed and so FindClass successfully finds the class (which
> resurrect the class loader which was marked finalizable).
>
> Is that clearer?

So the issue is only that this test breaks? And you want to change the
FindClass spec to make it clear the test is what needs to be changed?

David

> Mandy

From mandy.chung at oracle.com Wed Sep 27 02:11:56 2017
From: mandy.chung at oracle.com (mandy chung)
Date: Tue, 26 Sep 2017 19:11:56 -0700
Subject: Review Request JDK-8164512: Replace ClassLoader use of finalizer with phantom reference to unload native library
In-Reply-To: <3a0fe286-f38a-f7f1-4a23-15c43bb42a80@oracle.com>
References: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> <9b5b908b-e54c-67c4-13a4-250d2087241f@oracle.com> <42cd522c-4a1f-f231-3323-6bb4a0183a1e@oracle.com> <3a0fe286-f38a-f7f1-4a23-15c43bb42a80@oracle.com>
Message-ID:

On 9/26/17 7:06 PM, David Holmes wrote:
>
>> It is not tied with the Cleaner change. Instead, the FindClass bug
>> blocks the finalizer to Cleaner change.
>>
>> FindClass bug is uncovered when I implemented the change from
>> finalizer to Cleaner (or phantom reference). There is a test
>> calling FindClass to look for a class defined by the class loader
>> being unloaded, say L. L is not Gc'ed and so FindClass successfully
>> finds the class (which resurrect the class loader which was marked
>> finalizable).
>>
>> Is that clearer?
>
> So the issue is only that this test breaks?

No.
The test reveals a bug in JNI_FindClass that uses a class loader being
finalized as the context when NativeLibrary is the caller.

> And you want to change the FindClass spec to make it clear the test is
> what needs to be changed?

No. It is a bug in the hotspot implementation. The JNI spec says that
the context of JNI_OnUnload being called is unknown. The hotspot
implementation of FindClass uses the class loader associated with that
native library as the context when invoked from JNI_OnUnload which is
wrong.

I will file a separate JBS issue to separate this JNI bug.

Mandy

From david.holmes at oracle.com Wed Sep 27 02:35:18 2017
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 27 Sep 2017 12:35:18 +1000
Subject: Review Request JDK-8164512: Replace ClassLoader use of finalizer with phantom reference to unload native library
In-Reply-To:
References: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> <9b5b908b-e54c-67c4-13a4-250d2087241f@oracle.com> <42cd522c-4a1f-f231-3323-6bb4a0183a1e@oracle.com> <3a0fe286-f38a-f7f1-4a23-15c43bb42a80@oracle.com>
Message-ID: <3d52390c-d468-1f74-6e1e-4a9bc960b681@oracle.com>

On 27/09/2017 12:11 PM, mandy chung wrote:
> On 9/26/17 7:06 PM, David Holmes wrote:
>>
>>> It is not tied with the Cleaner change. Instead, the FindClass bug
>>> blocks the finalizer to Cleaner change.
>>>
>>> FindClass bug is uncovered when I implemented the change from
>>> finalizer to Cleaner (or phantom reference). There is a test
>>> calling FindClass to look for a class defined by the class loader
>>> being unloaded, say L. L is not Gc'ed and so FindClass successfully
>>> finds the class (which resurrect the class loader which was marked
>>> finalizable).
>>>
>>> Is that clearer?
>>
>> So the issue is only that this test breaks?
>
> No. The test reveals a bug in JNI_FindClass that uses a class loader
> being finalized as the context when NativeLibrary is the caller.
>> And you want to change the FindClass spec to make it clear the test is
>> what needs to be changed?
> No. It is a bug in the hotspot implementation. The JNI spec says
> that the context of JNI_OnUnload being called is unknown. The hotspot
> implementation of FindClass uses the class loader associated with that
> native library as the context when invoked from JNI_OnUnload which is
> wrong.

I'm not sure I agree it is wrong. As I've said elsewhere there's a good
chance that if you are trying to load classes via FindClass as part of a
unload hook (which implies you are using custom classloaders), then it
may be only the current loader or a parent (still custom) can load that
class. But we're on the fringe of realistic expectations here as the
context is specified as being "unknown".

That said given the spec says "unknown" the behaviour of the VM could
change and still be in spec.

I presume that when using a cleaner the current classloader that would
be used by FindClass is the system loader? Hence the observed behaviour
of FindClass "changes" if you switch to the cleaner from the finalizer -
and can't be reverted to the old behaviour by using a command-line flag.
Hence if we want to be able to revert we have to do that in a
FindClass-only change first. Then drop-in the cleaner update and remove
the flag.

> I will file a separate JBS issue to separate this JNI bug.

Okay. I see this as a RFE not a bug per-se: change from "unknown
context" to a specific well known context.
Thanks, David > Mandy From yasuenag at gmail.com Wed Sep 27 03:05:35 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 27 Sep 2017 12:05:35 +0900 Subject: RFR: JDK-8087291: InitialBootClassLoaderMetaspaceSize and CompressedClassSpaceSize should be checked consistent from MaxMetaspaceSize In-Reply-To: References: <19df096b-3243-1ac0-3d3a-e955c63c534d@oracle.com> <53fb4559-aeed-da1a-67fe-6cb50fd8e9ce@gmail.com> Message-ID: Hi all, I added a testcase for this issue in new webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8087291/webrev.04/ Thanks, Yasumasa 2017-09-26 18:36 GMT+09:00 Yasumasa Suenaga : > Hi all, > > I uploaded webrev for this issue against jdk10/hs. > Could you review it? > > http://cr.openjdk.java.net/~ysuenaga/JDK-8087291/webrev.03/ > > > I cannot access JPRT. So I need a sponsor. > > > Thanks, > > Yasumasa > > > > 2017-09-21 3:11 GMT+09:00 Man Cao : >> Thank Yasumasa and Stefan for the responses. >> >> Good to know that the patch is not blocked due to breaking some internal >> invariants/assumptions, but just due to its P5 status. >> Is it possible to push it to P4? >> >> -Man >> >> On Wed, Sep 20, 2017 at 5:16 AM, Yasumasa Suenaga >> wrote: >>> >>> Hi, >>> >>> (CC'ed hotspot-runtime-dev) >>> >>>> I think the reason is that this bug is a P5. The compressed class space >>>> belongs to the runtime code, so you might get more traction for this on the >>>> hotspot-runtime-dev list. >>> >>> >>> I will send review request against jdk10/master or jdk10/hs after repos >>> are opened. >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> >>> On 2017/09/20 20:53, Stefan Karlsson wrote: >>>> >>>> Hi Man, >>>> >>>> On 2017-09-13 20:55, Man Cao wrote: >>>>> >>>>> Hi Yasumasa, Stefan, >>>>> >>>>> Do you have any thoughts on why this patch has been pending for 2+ >>>>> years? This patch could really save us from some annoying issues since we >>>>> are automatically monitoring hsperfdata counters. >>>> >>>> >>>> I think the reason is that this bug is a P5. 
The compressed class space >>>> belongs to the runtime code, so you might get more traction for this on the >>>> hotspot-runtime-dev list. >>>> >>>> StefanK >>>> >>>>> >>>>> -Man >>>>> >>>>> On Mon, Aug 21, 2017 at 3:46 PM, Man Cao >>>> > wrote: >>>>> >>>>> Hi all, >>>>> >>>>> I wonder if there is any recent update on the patch for JDK-8087291. >>>>> Is it possible to push this patch into JDK9? Except for its low >>>>> priority (P5), >>>>> is there any complication that prevents this patch getting approved >>>>> (for example, some JVM logic requires CompressedClassSpaceSize to be >>>>> 1GB by default)? >>>>> >>>>> I work in the Java Platform Team at Google. We have encountered >>>>> annoying issues that the hsperfdata counter >>>>> "sun_gc_metaspace_maxCapacity" reporting >>>>> a too large value (about 1GB) even if user sets >>>>> -XX:MaxMetaspaceSize=100m, as well as GC log shows the confusing 1GB >>>>> memory reserved by metaspace, >>>>> regardless of MaxMetaspaceSize value. The root cause for these >>>>> issues is that CompressedClassSpaceSize is not automatically capped >>>>> by MaxMetaspaceSize >>>>> during VM initialization, and this patch seems fix the root cause. >>>>> (I'm aware that even after this patch, the reserved size could still >>>>> be up to 2*MaxMetaspaceSize, >>>>> but it is better than the current situation.) >>>>> >>>>> Thanks, >>>>> Man >>>>> >>>>> On 6/19/2015 00:34, Yasumasa Suenaga wrote: >>>>> >>>>> Thank you for your comment! >>>>> > Try running a debug JVM with your patch with this command >>>>> line. >>>>> > >>>>> > java -XX:MaxMetaspaceSize=4195328 -version >>>>> Sorry, I've fixed it and uploaded webrev: >>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8087291/webrev.02/ >>>>> >>>>> It works on fastdebug VM. >>>>> Please review again. >>>>> >>>>> Thanks, >>>>> Yasumasa >>>>> >>>>> On 2015/06/18 10:45, Jon Masamitsu wrote: >>>>> > Yasumasa, >>>>> > >>>>> > Try running a debug JVM with your patch with this command >>>>> line. 
>>>>> > >>>>> > java -XX:MaxMetaspaceSize=4195328 -version >>>>> > >>>>> > On a linux system I get this when I build with your patch. >>>>> > >>>>> >> java -XX:MaxMetaspaceSize=4195328 -version >>>>> >> # To suppress the following error report, specify this >>>>> argument >>>>> >> # after -XX: or in .hotspotrc: >>>>> SuppressErrorAt=/metaspace.cpp:2324 >>>>> >> # >>>>> >> # A fatal error has been detected by the Java Runtime >>>>> Environment: >>>>> >> # >>>>> >> # Internal Error >>>>> >> >>>>> >>>>> (/export/jmasa/java/jdk9-gc-code_review/src/share/vm/memory/metaspace.cpp:2324), >>>>> >> pid=19099, tid=0x00007ff4b9b92700 >>>>> >> # assert(size > MediumChunk || size > ClassMediumChunk) >>>>> failed: Not a >>>>> >> humongous chunk >>>>> >> # >>>>> > >>>>> > >>>>> > Jon >>>>> > >>>>> > >>>>> > On 6/17/2015 7:54 AM, Yasumasa Suenaga wrote: >>>>> >> I want to continue to discuss about CompressedClassSpace and >>>>> MaxMetaspace in this (RFR) thread. >>>>> >> >>>>> >> >>>>> >>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2015-June/013873.html >>>>> >>>>> >>>>> >>>> Should I resize CompressedClassSpaceSize than to show >>>>> error message? >>>>> >>> If you add slightly better heuristics for the setup of the >>>>> CompressedClassSpaceSize flag, for example lowering the >>>>> CompressedClassSpaceSize when MaxMetaspaceSize is set, then it >>>>> might be less likely that you'll hit the OutOfMemoryError when >>>>> the system is set up with strict overcommit settings. >>>>> >> >>>>> >> I've uploaded new webrev: >>>>> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8087291/webrev.01/ >>>>> >>>>> >> >>>>> >> This patch checkes MaxMetaspaceSize, >>>>> CompressedClassSpaceSize, and >>>>> >> InitialBootClassLoaderMetaspaceSize. >>>>> >> >>>>> >> I add to check CompressedClassSpaceSize in >>>>> Arguments::set_use_compressed_klass_ptrs(). >>>>> >> If InitialBootClassLoaderMetaspaceSize is greater than >>>>> MaxMetaspaceSize, >>>>> >> VM will fail with error message. 
>>>>> >> >>>>> >> InitialBootClassLoaderMetaspaceSize will be set to >>>>> MaxMetaspaceSize >>>>> >> when UseCompressedClassPointers is not set in >>>>> Metaspace::ergo_initialize(). >>>>> >> >>>>> >> >>>>> >> Thanks, >>>>> >> >>>>> >> Yasumasa >>>>> >> >>>>> >>>>> >> From mandy.chung at oracle.com Wed Sep 27 03:36:02 2017 From: mandy.chung at oracle.com (mandy chung) Date: Tue, 26 Sep 2017 20:36:02 -0700 Subject: Review Request JDK-8164512: Replace ClassLoader use of finalizer with phantom reference to unload native library In-Reply-To: <3d52390c-d468-1f74-6e1e-4a9bc960b681@oracle.com> References: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> <9b5b908b-e54c-67c4-13a4-250d2087241f@oracle.com> <42cd522c-4a1f-f231-3323-6bb4a0183a1e@oracle.com> <3a0fe286-f38a-f7f1-4a23-15c43bb42a80@oracle.com> <3d52390c-d468-1f74-6e1e-4a9bc960b681@oracle.com> Message-ID: <1e71b112-cdd5-ba0d-d519-5d54e5ace549@oracle.com> On 9/26/17 7:35 PM, David Holmes wrote: > On 27/09/2017 12:11 PM, mandy chung wrote: >> On 9/26/17 7:06 PM, David Holmes wrote: >>> >>>> It is not tied with the Cleaner change. Instead, the FindClass bug >>>> blocks the finalizer to Cleaner change. >>>> >>>> FindClass bug is uncovered when I implemented the change from >>>> finalizer to Cleaner (or phantom reference).?? There is a test >>>> calling FindClass to look for a class defined by the class loader >>>> being unloaded, say L. L is not Gc'ed and so FindClass successfully >>>> finds the class (which resurrect the class loader which was marked >>>> finalizable). >>>> >>>> Is that clearer? >>> >>> So the issue is only that this test breaks?? >> >> No.? The test reveals a bug in JNI_FindClass that uses a class loader >> being finalized as the context when NativeLibrary is the caller. >>> And you want to change the FindClass spec to make it clear the test >>> is what needs to be changed? >> No.?? It is a bug in the hotspot implementation. ? 
The JNI spec says
>> that the context of JNI_OnUnload being called is unknown. The hotspot
>> implementation of FindClass uses the class loader associated with
>> that native library as the context when invoked from JNI_OnUnload
>> which is wrong.
>
> I'm not sure I agree it is wrong. As I've said elsewhere there's a
> good chance that if you are trying to load classes via FindClass as
> part of a unload hook (which implies you are using custom
> classloaders), then it may be only the current loader or a parent
> (still custom) can load that class. But we're on the fringe of
> realistic expectations here as the context is specified as being
> "unknown".
>

For a native unload hook to access some class defined by this class
loader, definitely it should not write to any fields since the class and
class loader are not strongly reachable. Reading the current state
stored in the class can be done by writing to the native fields.

I'd like to know what other use cases that FindClass must resurrect a
class defined by this class loader or find a class defined by its
ancestor if you have any in mind that the existing code can't be
replaced due to the proposed change.

> That said given the spec says "unknown" the behaviour of the VM could
> change and still be in spec.
>
> I presume that when using a cleaner the current classloader that would
> be used by FindClass is the system loader? Hence the observed
> behaviour of FindClass "changes" if you switch to the cleaner from the
> finalizer - and can't be reverted to the old behaviour by using a
> command-line flag. Hence if we want to be able to revert we have to do
> that in a FindClass-only change first. Then drop-in the cleaner update
> and remove the flag.
>
>> I will file a separate JBS issue to separate this JNI bug.
>
> Okay. I see this as a RFE not a bug per-se: change from "unknown
> context" to a specific well known context.
This case is arguable whether it's considered as a RFE or a bug because
the current spec of JNI_OnUnload and JNI_FindClass are not aligned. I
lean toward a bug. The bottom line: do you agree with this proposed JNI
spec change?

Mandy

From ioi.lam at oracle.com Wed Sep 27 03:44:22 2017
From: ioi.lam at oracle.com (Ioi Lam)
Date: Tue, 26 Sep 2017 20:44:22 -0700
Subject: RFR(XXS) 8187979 Clean up info printing at CDS dump time
In-Reply-To: <3CDBD952-0FF7-4412-824E-C78DDFF59D0F@oracle.com>
References: <5b8b5e0f-66c2-fe7e-e820-40532878e8e4@oracle.com> <3CDBD952-0FF7-4412-824E-C78DDFF59D0F@oracle.com>
Message-ID:

Hi Jiangli,

The "xx space:" column is lined up and the "Unknown" row is removed.

*** Before ***

mc space:     21736 [  0.1% of total] out of     24576 bytes [ 88.4% used] at 0x0000000800000000
rw space:   4229896 [ 23.4% of total] out of   4231168 bytes [100.0% used] at 0x0000000800006000
ro space:   7227952 [ 40.0% of total] out of   7229440 bytes [100.0% used] at 0x000000080040f000
md space:      6064 [  0.0% of total] out of      8192 bytes [ 74.0% used] at 0x0000000800af4000
od space:   6404560 [ 35.4% of total] out of   6406144 bytes [100.0% used] at 0x0000000800af6000
st0 space:    102400 [  0.6% of total] out of    102400 bytes [100% used] at 0x00000007bfc00000
oa0 space:     65536 [  0.4% of total] out of     65536 bytes [100% used] at 0x00000007bf800000
total   :  18058144 [100.0% of total] out of  18067456 bytes [ 99.9% used]
[3.470s][info ][cds             ] Detailed metadata info (excluding od/st regions; rw stats include md/mc regions):
                        ro_cnt   ro_bytes     % |   rw_cnt   rw_bytes     % |  all_cnt  all_bytes     %
--------------------+---------------------------+---------------------------+--------------------------
Unknown             :        0          0   0.0 |        0          0   0.0 |        0          0   0.0
Class               :        0          0   0.0 |     1237     783584  18.4 |     1237     783584   6.8
Symbol              :    36210    1415496  19.6 |        0          0   0.0 |    36210    1415496  12.3

*** After ***

mc  space:     21736 [  0.1% of total] out of     24576 bytes [ 88.4% used] at 0x0000000800000000
rw  space:   4229896 [ 23.4% of total] out of   4231168 bytes [100.0% used] at 0x0000000800006000
ro  space:   7227952 [ 40.0% of total] out of   7229440 bytes [100.0% used] at 0x000000080040f000
md  space:      6064 [  0.0% of total] out of      8192 bytes [ 74.0% used] at 0x0000000800af4000
od  space:   6404560 [ 35.4% of total] out of   6406144 bytes [100.0% used] at 0x0000000800af6000
st0 space:    102400 [  0.6% of total] out of    102400 bytes [100% used] at 0x00000007bfc00000
oa0 space:     65536 [  0.4% of total] out of     65536 bytes [100% used] at 0x00000007bf800000
total    :  18058144 [100.0% of total] out of  18067456 bytes [ 99.9% used]
[3.692s][info ][cds             ] Detailed metadata info (excluding od/st regions; rw stats include md/mc regions):
                        ro_cnt   ro_bytes     % |   rw_cnt   rw_bytes     % |  all_cnt  all_bytes     %
--------------------+---------------------------+---------------------------+--------------------------
Class               :        0          0   0.0 |     1237     783584  18.4 |     1237     783584   6.8
Symbol              :    36210    1415496  19.6 |        0          0   0.0 |    36210    1415496  12.3

Thanks
- Ioi

On 9/26/17 6:02 PM, Jiangli Zhou wrote:
> Hi Ioi,
>
> Could you please send an updated dump output?
>
> Thanks,
> Jiangli
>
>> On Sep 26, 2017, at 3:03 PM, Ioi Lam wrote:
>>
>> A small clean up to removed obsolete info and improve indentation
>>
>> * https://bugs.openjdk.java.net/browse/JDK-8187979
>> * http://cr.openjdk.java.net/~iklam/jdk10/8187979-dump-info-cleanup.v01/
>>
>> Webrev doesn't show the lines that has only changes in blank spaces, but you can see the full diff here:
>>
>> * http://cr.openjdk.java.net/~iklam/jdk10/8187979-dump-info-cleanup.v01/open.patch
>>
>> Thanks
>>
>> - Ioi
>>

From david.holmes at oracle.com Wed Sep 27 04:36:47 2017
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 27 Sep 2017 14:36:47 +1000
Subject: Review Request JDK-8164512: Replace ClassLoader use of finalizer with phantom reference to unload native library
In-Reply-To: <1e71b112-cdd5-ba0d-d519-5d54e5ace549@oracle.com>
References: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> <9b5b908b-e54c-67c4-13a4-250d2087241f@oracle.com> <42cd522c-4a1f-f231-3323-6bb4a0183a1e@oracle.com> <3a0fe286-f38a-f7f1-4a23-15c43bb42a80@oracle.com> <3d52390c-d468-1f74-6e1e-4a9bc960b681@oracle.com> <1e71b112-cdd5-ba0d-d519-5d54e5ace549@oracle.com>
Message-ID: <0c6802f2-678b-977d-e827-ef03c7f376a2@oracle.com>

On 27/09/2017 1:36 PM, mandy chung wrote:
> On 9/26/17 7:35 PM, David Holmes wrote:
>> On 27/09/2017 12:11 PM, mandy chung wrote:
>>> On 9/26/17 7:06 PM, David Holmes wrote:
>>>>
>>>>> It is not tied with the Cleaner change. Instead, the FindClass bug
>>>>> blocks the finalizer to Cleaner change.
>>>>>
>>>>> FindClass bug is uncovered when I implemented the change from
>>>>> finalizer to Cleaner (or phantom reference). There is a test
>>>>> calling FindClass to look for a class defined by the class loader
>>>>> being unloaded, say L. L is not Gc'ed and so FindClass successfully
>>>>> finds the class (which resurrect the class loader which was marked
>>>>> finalizable).
>>>>>
>>>>> Is that clearer?
>>>>
>>>> So the issue is only that this test breaks?
>>>
>>> No. The test reveals a bug in JNI_FindClass that uses a class loader
>>> being finalized as the context when NativeLibrary is the caller.
>>>> And you want to change the FindClass spec to make it clear the test
>>>> is what needs to be changed?
>>> No. It is a bug in the hotspot implementation. The JNI spec says
>>> that the context of JNI_OnUnload being called is unknown. The hotspot
>>> implementation of FindClass uses the class loader associated with
>>> that native library as the context when invoked from JNI_OnUnload
>>> which is wrong.
>>
>> I'm not sure I agree it is wrong. As I've said elsewhere there's a
>> good chance that if you are trying to load classes via FindClass as
>> part of a unload hook (which implies you are using custom
>> classloaders), then it may be only the current loader or a parent
>> (still custom) can load that class. But we're on the fringe of
>> realistic expectations here as the context is specified as being
>> "unknown".
>>
> For a native unload hook to access some class defined by this class
> loader, definitely it should not write to any fields since the class and
> class loader are not strongly reachable. Reading the current state
> stored in the class can be done by writing to the native fields.

Yes that is a good point - but as the spec says due to the unknown
context the hook has to be very careful about what it tries to do. I
agree it is doubtful that anyone can, or should, be relying on the
direct use of the classloader that has become unreachable, but ...

> I'd like to know what other use cases that FindClass must resurrect a
> class defined by this class loader or find a class defined by its
> ancestor if you have any in mind that the existing code can't be
> replaced due to the proposed change.

...
I can easily imagine a subsystem that runs under a custom loader and
which then instantiates further execution contexts (per connection for
example) each with their own classloader and which can be reclaimed
after the request is complete. I can then easily imagine that they use
an unload hook to record statistics about native library use, and that
the statistics classes are in the top-level custom loader, and not
locatable from the system loader. While the spec makes no guarantees
this will work it only says programmers "should be conservative in
their use of VM services" which strongly suggests to me a "try it and
see if it works" approach. In the current code while loading from the
loader being reclaimed is highly dubious, delegating through that loader
seems fairly reasonable to me.

>> That said given the spec says "unknown" the behaviour of the VM could
>> change and still be in spec.
>>
>> I presume that when using a cleaner the current classloader that would
>> be used by FindClass is the system loader? Hence the observed
>> behaviour of FindClass "changes" if you switch to the cleaner from the
>> finalizer - and can't be reverted to the old behaviour by using a
>> command-line flag. Hence if we want to be able to revert we have to do
>> that in a FindClass-only change first. Then drop-in the cleaner update
>> and remove the flag.
>>
>>> I will file a separate JBS issue to separate this JNI bug.
>>
>> Okay. I see this as a RFE not a bug per-se: change from "unknown
>> context" to a specific well known context.
> This case is arguable whether it's considered as a RFE or a bug because
> the current spec of JNI_OnUnload and JNI_FindClass are not aligned. I
> lean toward a bug. The bottom line: do you agree with this proposed
> JNI spec change?

I don't think the spec _has_ to change because I disagree that there is
a misalignment between JNI_OnUnload and JNI_FindClass.
FindClass clearly states it uses the current loader or else the system
loader if there is no notion of a current loader. OnUnload says it runs
in an unknown context, so you don't know what the current loader may be,
or even if there is one. But regardless a call to FindClass from
OnUnload should use the current loader if it exists, or else the system
loader. The fact it may be dubious to use the current loader when it is
itself in the process of being unloaded does not impinge on the veracity
of the spec in my opinion.

So you can change to using a Cleaner instead of a finalizer and while it
will behave differently, that change in behaviour does not violate the
spec in any way - again in my opinion.

Now if you want to pave the way for a future switch to Cleaner by
changing the spec for JNI_OnUnload such that it must be executed in a
context where (equivalently) there either is no current loader or the
current loader is the system loader, then I do not oppose that. But the
only purpose that serves is to allow a migration path to the new
behaviour - and then forever locks us in.

Note however I would not want to see the implementation of FindClass
having to special case this - I would hope it just happens naturally if
the Cleaner thread reports the current class loader as the system
loader. Does it?
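
For readers following along, the finalizer-to-Cleaner switch debated in
this thread can be sketched with the public java.lang.ref.Cleaner API.
This is only an illustration under stated assumptions: the webrev uses
the JDK-internal CleanerFactory and a native unload call, whereas
UnloaderSketch, its UNLOADED list and the register helper below are
hypothetical names made up for this example.

```java
import java.lang.ref.Cleaner;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for the pattern in the webrev; not the JDK-internal code.
final class UnloaderSketch {
    // Stands in for the native unload call: records what was "unloaded".
    static final List<String> UNLOADED = new CopyOnWriteArrayList<>();

    // The cleanup action keeps only the data it needs (name and handle).
    // It must not reference the tracked loader, otherwise the loader would
    // stay reachable and the action would never be triggered by GC.
    static final class Unloader implements Runnable {
        private final String name;
        private final long handle;

        Unloader(String name, long handle) {
            this.name = name;
            this.handle = handle;
        }

        @Override
        public void run() {
            UNLOADED.add(name + "@" + handle);
        }
    }

    // Registers the action to run once 'loader' becomes phantom reachable.
    static Cleaner.Cleanable register(Cleaner cleaner, Object loader,
                                      String name, long handle) {
        return cleaner.register(loader, new Unloader(name, handle));
    }

    public static void main(String[] args) {
        Cleaner cleaner = Cleaner.create();
        Object loader = new Object(); // placeholder for a custom class loader
        Cleaner.Cleanable cleanable = register(cleaner, loader, "libdemo", 42L);
        cleanable.clean(); // runs the action now, in the calling thread
        cleanable.clean(); // no effect: the action runs at most once
        System.out.println(UNLOADED);
    }
}
```

Calling clean() explicitly only keeps the sketch deterministic. In the
change under review the action would instead run on the cleaner's own
thread after the loader becomes phantom reachable, which is why it
matters what class loader FindClass sees from that thread.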
Thanks, David > > Mandy From david.holmes at oracle.com Wed Sep 27 12:49:10 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 27 Sep 2017 22:49:10 +1000 Subject: Review Request JDK-8164512: Replace ClassLoader use of finalizer with phantom reference to unload native library In-Reply-To: <0c6802f2-678b-977d-e827-ef03c7f376a2@oracle.com> References: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> <9b5b908b-e54c-67c4-13a4-250d2087241f@oracle.com> <42cd522c-4a1f-f231-3323-6bb4a0183a1e@oracle.com> <3a0fe286-f38a-f7f1-4a23-15c43bb42a80@oracle.com> <3d52390c-d468-1f74-6e1e-4a9bc960b681@oracle.com> <1e71b112-cdd5-ba0d-d519-5d54e5ace549@oracle.com> <0c6802f2-678b-977d-e827-ef03c7f376a2@oracle.com> Message-ID: <22d101c6-5255-b9b7-606c-6f5519a5c597@oracle.com> Corrections ... On 27/09/2017 2:36 PM, David Holmes wrote: > On 27/09/2017 1:36 PM, mandy chung wrote: >> On 9/26/17 7:35 PM, David Holmes wrote: >>> On 27/09/2017 12:11 PM, mandy chung wrote: >>>> On 9/26/17 7:06 PM, David Holmes wrote: >>>>> >>>>>> It is not tied with the Cleaner change. Instead, the FindClass bug >>>>>> blocks the finalizer to Cleaner change. >>>>>> >>>>>> FindClass bug is uncovered when I implemented the change from >>>>>> finalizer to Cleaner (or phantom reference).?? There is a test >>>>>> calling FindClass to look for a class defined by the class loader >>>>>> being unloaded, say L. L is not Gc'ed and so FindClass >>>>>> successfully finds the class (which resurrect the class loader >>>>>> which was marked finalizable). >>>>>> >>>>>> Is that clearer? >>>>> >>>>> So the issue is only that this test breaks?? >>>> >>>> No.? The test reveals a bug in JNI_FindClass that uses a class >>>> loader being finalized as the context when NativeLibrary is the caller. >>>>> And you want to change the FindClass spec to make it clear the test >>>>> is what needs to be changed? >>>> No.?? It is a bug in the hotspot implementation. ? 
The JNI spec says >>>> that the context of JNI_OnUnload being called is unknown. The >>>> hotspot implementation of FindClass uses the class loader associated >>>> with that native library as the context when invoked from >>>> JNI_OnUnload which is wrong. >>> >>> I'm not sure I agree it is wrong. As I've said elsewhere there's a >>> good chance that if you are trying to load classes via FindClass as >>> part of a unload hook (which implies you are using custom >>> classloaders), then it may be only the current loader or a parent >>> (still custom) can load that class. But we're on the fringe of >>> realistic expectations here as the context is specified as being >>> "unknown". >>> >> For a native unload hook to access some class defined by this class >> loader, definitely it should not write to any fields since the class >> and class loader are not strongly reachable.?? Reading the current >> state stored in the class can be done by writing to the native fields. > > Yes that is a good point - but as the spec says due to the unknown > context the hook has to be very careful about what it tries to do. I > agree it is doubtful that anyone can, or should, be relying on the > direct use of the classloader that has become unreachable, but ... > >> I'd like to know what other use cases that FindClass must ressurrect a >> class defined by this class loader or find a class defined by its >> ancestor if you have any in mind that the existing code can't be >> replaced due to the proposed change. > > ... I can easily imagine a subsystem that runs under a custom loader and > which then instantiates further execution contexts (per connection for > example) each with their own classloader and which can be reclaimed > after the request is complete. I can then easily imagine that they use > an unload hook to record statistics about native library use, and that > the statistics classes are in the top-level custom loader, and not > locatable from the system loader. 
> > While the spec makes no guarantees this will work it only says > programmers "should be conservative in their use of VM services" which > strongly suggests to me a "try it and see if it works" approach. In the > current code while loading from the loader being reclaimed is highly > dubious, delegating through that loader seems fairly reasonable to me. > >>> That said given the spec says "unknown" the behaviour of the VM could >>> change and still be in spec. >>> >>> I presume that when using a cleaner the current classloader that >>> would be used by FindClass is the system loader? Hence the observed >>> behaviour of FindClass "changes" if you switch to the cleaner from >>> the finalizer - and can't be reverted to the old behaviour by using a >>> command-line flag. Hence if we want to be able to revert we have to >>> do that in a FindClass-only change first. Then drop-in the cleaner >>> update and remove the flag. >>> >>>> I will file a separate JBS issue to separate this JNI bug. >>> >>> Okay. I see this as a RFE not a bug per-se: change from "unknown >>> context" to a specific well known context. >> This case is arguable whether it's considered as a RFE or a bug >> because the current spec of JNI_OnUnload and JNI_FindClass are not >> aligned.? I lean toward a bug.??? The bottom line:? do you agree with >> this proposed JNI spec change? > > I don't think the spec _has_ to change because I disagree that there is > a misalignment between JNI_OnUnload and JNI_FindClass. FindClass clearly > states it uses the current loader or else the system loader if there is That is not accurate - sorry. FindClass doesn't actually address the possibility of being called via these load/unload hooks. See more below. > no notion of a current loader. OnUnload says it runs in an unknown > context, so you don't know what the current loader may be, or even if > there is one. 
But regardless a call to FindClass from OnUnload should
> use the current loader if it exists, or else the system loader. The fact
> it may be dubious to use the current loader when it is itself in the
> process of being unloaded does not impinge on the veracity of the spec
> in my opinion.
>
> So you can change to using a Cleaner instead of a finalizer and while it
> will behave differently, that change in behaviour does not violate the
> spec in any way - again in my opinion.
>
> Now if you want to pave the way for a future switch to Cleaner by
> changing the spec for JNI_OnUnload such that it must be executed in a
> context where (equivalently) there either is no current loader or the
> current loader is the system loader, then I do not oppose that. But the
> only purpose that serves is to allow a migration path to the new
> behaviour - and then forever locks us in.
>
> Note however I would not want to see the implementation of FindClass
> having to special case this - I would hope it just happens naturally if
> the Cleaner thread reports the current class loader as the system
> loader. Does it?

I missed the fact that we already special case this for JNI_OnLoad and
JNI_OnUnload. I would have thought that in the OnLoad case we would find
the classloader of the class loading the native library without any need
to resort to the NativeLibrary support code in ClassLoader. I guess that
this:

    // Find calling class
    Klass* k = thread->security_get_caller_class(0);

does not find the "caller" that I would have expected, but instead finds
java.lang.System because we're executing System.loadLibrary - and hence
finds the boot loader not the actual loader required.

But the fact we jump through all these hoops is in itself questionable
because the specification for JNI_FindClass does not indicate this will
happen. It only accounts for two cases:

1. A JNI call from a declared native method - which uses the loader of
   the class that defines the method
2. A JNI call "through the Invocation Interface" which I interpret as
   being a JNI call from C code, from an attached thread, with no Java
   frames on the stack. In which case the system loader is used.

A call from JNI_OnLoad (or OnUnload) does not, to me, fit either of these
cases; nor does JNI_OnLoad say anything about the context in which it
executes. So it seems we have presumed that this case should mean "use
the loader of the class which loaded the native library". A very
reasonable approach, but not one defined by the specification as far as I
can see. But given this, it is not unreasonable to also use the same
interpretation for JNI_OnUnload.

So there is a gap in the specification regarding the execution context
of the library lifecycle function hooks - other than onUnload being an
"unknown context".

David
-----

>
> Thanks,
> David
>
>>
>> Mandy

From harold.seigel at oracle.com  Wed Sep 27 13:38:29 2017
From: harold.seigel at oracle.com (harold seigel)
Date: Wed, 27 Sep 2017 09:38:29 -0400
Subject: RFR 8186092: Unnecessary loader constraints produced when there are multiple defaults
In-Reply-To:
References: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com>
 <86af3584-89ff-1819-1d20-736dad01a42c@oracle.com>
Message-ID:

Hi David,

Please review this updated webrev at:

http://cr.openjdk.java.net/~hseigel/bug_8186092.3/webrev/index.html

The only change from the previous webrev involves adding the assert() at
lines 1202 and 1203 of klassVtable.cpp.

Thanks, Harold

On 9/26/2017 8:25 PM, David Holmes wrote:
> Hi Harold,
>
> On 27/09/2017 5:13 AM, harold seigel wrote:
>> Hi David,
>>
>> Thanks for looking at this change! Please see updated webrev at:
>>
>> http://cr.openjdk.java.net/~hseigel/bug_8186092.2/webrev/index.html
>
> Test changes seem fine.
>
>> and also see comments embedded below.
>
> Follow up below.
>
>> Thanks, Harold
>>
>>
>> On 9/26/2017 3:30 AM, David Holmes wrote:
>>> Hi Harold,
>>>
>>> This looks okay to me. A few comments below but only one real query.
>>> >>> On 26/09/2017 1:21 AM, harold seigel wrote: >>>> Hi, >>>> >>>> Please review this JDK-10 change to fix JDK-8186092.? The change >>>> prevents the checking of loader constraints during vtable and >>>> itable creation if the selected method is an overpass method. >>>> Overpass methods are created by the JVM to throw exceptions and so >>>> should not be subjected to loader constraint checking. >>> >>> Okay. >>> >>>> Additionally, this change improves the LinkageError exception error >>>> text when a loader constraint violation occurs during vtable and >>>> itable creation. >>> >>> Hmmm :) I think I put those in initially. Not sure I 100% agree with >>> the changed terminology, but I'll defer to you as the current expert >>> in this area. :) >> I'm hoping better experts also review the changed messages. >>> >>>> The fix includes four new tests, one test each to check that loader >>>> constraint checking is not done for overpass methods during vtable >>>> and itable creation, and one test each to test the new vtable and >>>> itable loader constraint error messages. >>> >>> *.jasm: can you add a comment indicating why these are jasm files as >>> it is not obvious to me what is special about them. >> Thanks for pointing this out.? I converted the two Task.jasm files to >> Task.java file and added a comment to the remaining .jasm file, C.jasm. >>> >>> */Test.java: >>> ?- You can place multiple files on one @compile tag (and still list >>> one file per line). >>> - you don't need to specify java.lang in the name of the exception >>> classes >> Done. >>> >>>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8186092/webrev/ >>> >>> The real query: >>> >>> 1201???? if (target == NULL || !target->is_public() || >>> target->is_abstract() || target->is_overpass()) { >>> 1202?????? // Entry does not resolve. Leave it empty for >>> AbstractMethodError. >>> 1203?????? if (!(target == NULL) && !target->is_public()) { >>> 1204???????? 
// Stuff an IllegalAccessError throwing method in there >>> instead. >>> 1205???????? itableOffsetEntry::method_entry(_klass, >>> method_table_offset)[m->itable_index()]. >>> 1206 initialize(Universe::throw_illegal_access_error()); >>> 1207?????? } >>> >>> Not clear why you added the overpass check here? If it is non-public >>> then you're replacing it with an IllegalAccessError instead of >>> whatever the Overpass was going to throw. ?? >> Currently, all overpass methods are public methods. So, they would >> not get replaced with IllegalAccessError.? However, in case >> non-public overpass methods exist in the future, I added "&& >> !target->is_overpass()" to line 1203. >> >> Alternatively, I considered adding an "assert(!target->is_overpass() >> || target->is_public(), "Non-public overpass method");" between lines >> 1201 and 1202 but didn't think that this code should be concerned >> about whether or not overpass methods are public.? I also thought >> about adding "&& !target->is_overpass()" to line 1211 but thought it >> better that all checks on 'target', that prevent loader constraints >> checking, be done at the same place. > > Okay I see what you are trying to do now. We want overpass methods to > follow the "if" path at 1201, but for them it should currently be a > no-op. I'd be inclined to add in the assertion - the code is already > concerned about not processing non-public overpasses with your > proposed change to 1203. The assertion would ensure that anyone > introducing a non-public overpass has it quickly drawn to their > attention that doing so needs additional consideration. > > Thanks, > David > ---- > >>> >>> Thanks, >>> David >>> ----- >>> >>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186092 >>>> >>>> The change was tested with the JCK Lang and VM tests, the JTreg >>>> hotspot, java/io, java/lang, java/util, and other tests, the >>>> co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. 
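
The itable decision debated in the quoted review (the condition at line
1201, the IllegalAccessError case at 1203, and the suggested assertion)
can be modelled as a small standalone function. This is a sketch under
assumptions, not the klassVtable.cpp code: Target, ItableAction and
classify are hypothetical names, and the three booleans stand in for the
is_public(), is_abstract() and is_overpass() queries shown in the quote.

```cpp
#include <cassert>

// Hypothetical flags standing in for the Method* queries quoted above.
struct Target {
    bool is_public;
    bool is_abstract;
    bool is_overpass;
};

// What the model decides to do with one itable entry.
enum class ItableAction { LeaveEmpty, ThrowIllegalAccess, CheckConstraints };

ItableAction classify(const Target* target) {
    if (target == nullptr || !target->is_public ||
        target->is_abstract || target->is_overpass) {
        // The assertion suggested in the review: a non-public overpass
        // would need additional consideration, so flag it early.
        assert(target == nullptr || !target->is_overpass || target->is_public);
        if (target != nullptr && !target->is_public) {
            // Non-public target: install the IllegalAccessError thrower.
            return ItableAction::ThrowIllegalAccess;
        }
        // Unresolved, abstract, or overpass: leave the entry empty and
        // skip loader constraint checking.
        return ItableAction::LeaveEmpty;
    }
    // Ordinary public concrete method: constraints are checked as before.
    return ItableAction::CheckConstraints;
}
```

In this model an overpass target takes the first branch and is a no-op
there, which matches the intent described in the review: overpass methods
are never subjected to loader constraint checking.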
>>>>
>>>> Thanks, Harold
>>>>
>>

From coleen.phillimore at oracle.com  Wed Sep 27 14:20:05 2017
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 27 Sep 2017 10:20:05 -0400
Subject: RFR(S) 8186770: NMT: Report metadata information in NMT summary
In-Reply-To:
References: <83e0586b-aa58-084a-fdcf-428cf55669fe@redhat.com>
 <2e3ab1a3-03a5-9aa8-47f8-1224a00e9d0f@redhat.com>
 <72c3f342-654d-2fb2-58ac-959ad7f0de37@redhat.com>
 <4b0549a0-a4ec-0fce-a39e-d8c60e5d665d@redhat.com>
 <6f01f8a3-8911-d5fc-9208-7dcac5d1874b@redhat.com>
 <1e5afb73-8cb3-35aa-dad3-5fc7f8b25a43@redhat.com>
Message-ID: <43624339-b16c-bee6-b170-1812f0e19c33@oracle.com>

This code seems good. I can sponsor it for you.

Coleen

On 9/5/17 1:43 PM, Zhengyu Gu wrote:
> Hi Andrew,
>
> Thanks for the review and suggestions. The webrev is updated according
> to the discussions.
>
> Webrev: http://cr.openjdk.java.net/~zgu/8186770/webrev.01/index.html
>
>
> The sample outputs:
>
> Summary:
>
> -                     Class (reserved=1074360KB, committed=28856KB)
>                             (classes #4028)
>                             (malloc=1208KB #16218)
>                             (mmap: reserved=1073152KB, committed=27648KB)
>                             (  Metadata:   )
>                             (    reserved=24576KB, committed=24320KB)
>                             (    used=20914KB)
>                             (    free=3295KB)
>                             (    waste=111KB =0.46%)
>                             (  Class space:)
>                             (    reserved=1048576KB, committed=3328KB)
>                             (    used=2649KB)
>                             (    free=679KB)
>                             (    waste=0KB =0.00%)
>
>
> Summary diff:
>
> -                     Class (reserved=1076455KB +2129KB, committed=29415KB +849KB)
>                             (classes #4037 +13)
>                             (malloc=1255KB +81KB #17477 +2214)
>                             (mmap: reserved=1075200KB +2048KB, committed=28160KB +768KB)
>                             (  Metadata:   )
>                             (    reserved=26624KB +2048KB, committed=24832KB +768KB)
>                             (    used=21368KB +718KB)
>                             (    free=3336KB -21KB)
>                             (    waste=128KB =0.52% +71KB)
>                             (  Class space:)
>                             (    reserved=1048576KB, committed=3328KB)
>                             (    used=2654KB +7KB)
>                             (    free=674KB -7KB)
>                             (    waste=0KB =0.00%)
>
> Thanks,
>
> -Zhengyu
>
>
> On 09/05/2017 11:18 AM, Andrew Dinn wrote:
>> On 29/08/17 17:31, Zhengyu Gu wrote:
>>> Okay, I see what you mean. But in this case, capacity = committed.
>>
>> Well, it does not always seem to be exactly the same. If you add up all
>> the pieces to derive the capacity then it sometimes seems to fall short
>> of committed. I looked deeper into this and found that sometimes the
>> difference is down to rounding up/down. However, there also seems
>> occasionally to be more space unaccounted for that cannot be explained
>> by rounding errors.
>>
>> I looked into your suggestion that this might be accounted for by 'dark
>> matter' i.e. tail ends of a chunk left unused when the last block is
>> carved out and the chunk retired because the tail is too small to insert
>> into the block dictionary. However, from my reading of the code I think
>> that any such 'dark matter' will still show up in the waste space
>> count.
>>
>> Rather than hold up this current change I'd prefer to see it pushed and
>> address the arithmetic problem in a follow-up issue. Even with an
>> occasional small disparity in the reported figures I think it is really
>> helpful to have this detailed info available as part of the NMT output.
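
The sample figures quoted earlier in this message can be cross-checked
with a few lines of arithmetic: in both samples, used + free + waste adds
up to committed, and the printed waste percentage appears to be relative
to committed. The sketch below is only a reader's check, not NMT code;
MetaspaceFigures, adds_up and waste_percent are hypothetical names, and
the "percentage of committed" reading is an assumption inferred from the
two samples.

```cpp
// Figures in KB for one metaspace region, as printed in the sample output.
struct MetaspaceFigures {
    long committed;
    long used;
    long free;
    long waste;
};

// True when used + free + waste account for all committed memory.
bool adds_up(const MetaspaceFigures& f) {
    return f.used + f.free + f.waste == f.committed;
}

// Waste as a percentage of committed (assumed basis, see lead-in).
double waste_percent(const MetaspaceFigures& f) {
    return f.committed == 0 ? 0.0 : 100.0 * f.waste / f.committed;
}
```

For the Metadata rows of the first sample, 20914 + 3295 + 111 = 24320,
and 111 / 24320 is roughly 0.46%, which agrees with the printed value.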
>> >>> I wonder if it is cleaner that just reports free, used and waste, e.g. >>> >>> ????????????????????????? ( Metadata:??????????????????????????? ) >>> ????????????????????????? (??? reserved=22528KB, committed=21504KB) >>> ????????????????????????? (??? used=20654KB) >>> ????????????????????????? (??? free=786KBKB) >>> ????????????????????????? (??? waste=64KB =0.30%) >>> >>> where free = (capacity - used) + free_chunks + available >>> ?????? waste = committed - capacity - free_chunks - available >>> ?????? total = committed >> Yes, I agree that it's ok to leave the available figure implicit -- it >> is easily computed from the committed total by subtracting used and >> waste (that's only correct modulo the occasional small disparity between >> capacity and committed but the difference is small enough not to be >> significant). So, I'm happy with this version. >> >> regards, >> >> >> Andrew Dinn >> ----------- >> Senior Principal Software Engineer >> Red Hat UK Ltd >> Registered in England and Wales under Company Registration No. 03798903 >> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander >> From coleen.phillimore at oracle.com Wed Sep 27 14:26:14 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 27 Sep 2017 10:26:14 -0400 Subject: RFR 8186092: Unnecessary loader constraints produced when there are multiple defaults In-Reply-To: References: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com> <86af3584-89ff-1819-1d20-736dad01a42c@oracle.com> Message-ID: <4b109fe5-d6d1-9fb1-7aec-8208d21d06f2@oracle.com> Harold, Since you changed both instances of the mostly duplicated error message, would this be a good time to make it a function? 
thanks, Coleen On 9/27/17 9:38 AM, harold seigel wrote: > Hi David, > > Please review this updated webrev at: > > http://cr.openjdk.java.net/~hseigel/bug_8186092.3/webrev/index.html > > The only change from the previous webrev involves adding the assert() > at lines 1202 and 1203 of klassVtable.cpp. > > Thanks, Harold > > On 9/26/2017 8:25 PM, David Holmes wrote: >> Hi Harold, >> >> On 27/09/2017 5:13 AM, harold seigel wrote: >>> Hi David, >>> >>> Thanks for looking at this change!? Please see updated webrev at: >>> >>> http://cr.openjdk.java.net/~hseigel/bug_8186092.2/webrev/index.html >> >> Test changes seem fine. >> >>> and also see comments embedded below. >> >> Follow up below. >> >>> Thanks, Harold >>> >>> >>> On 9/26/2017 3:30 AM, David Holmes wrote: >>>> Hi Harold, >>>> >>>> This looks okay to me. A few comments below but only one real query. >>>> >>>> On 26/09/2017 1:21 AM, harold seigel wrote: >>>>> Hi, >>>>> >>>>> Please review this JDK-10 change to fix JDK-8186092.? The change >>>>> prevents the checking of loader constraints during vtable and >>>>> itable creation if the selected method is an overpass method. >>>>> Overpass methods are created by the JVM to throw exceptions and so >>>>> should not be subjected to loader constraint checking. >>>> >>>> Okay. >>>> >>>>> Additionally, this change improves the LinkageError exception >>>>> error text when a loader constraint violation occurs during vtable >>>>> and itable creation. >>>> >>>> Hmmm :) I think I put those in initially. Not sure I 100% agree >>>> with the changed terminology, but I'll defer to you as the current >>>> expert in this area. :) >>> I'm hoping better experts also review the changed messages. >>>> >>>>> The fix includes four new tests, one test each to check that >>>>> loader constraint checking is not done for overpass methods during >>>>> vtable and itable creation, and one test each to test the new >>>>> vtable and itable loader constraint error messages. 
>>>> >>>> *.jasm: can you add a comment indicating why these are jasm files >>>> as it is not obvious to me what is special about them. >>> Thanks for pointing this out.? I converted the two Task.jasm files >>> to Task.java file and added a comment to the remaining .jasm file, >>> C.jasm. >>>> >>>> */Test.java: >>>> ?- You can place multiple files on one @compile tag (and still list >>>> one file per line). >>>> - you don't need to specify java.lang in the name of the exception >>>> classes >>> Done. >>>> >>>>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8186092/webrev/ >>>> >>>> The real query: >>>> >>>> 1201???? if (target == NULL || !target->is_public() || >>>> target->is_abstract() || target->is_overpass()) { >>>> 1202?????? // Entry does not resolve. Leave it empty for >>>> AbstractMethodError. >>>> 1203?????? if (!(target == NULL) && !target->is_public()) { >>>> 1204???????? // Stuff an IllegalAccessError throwing method in >>>> there instead. >>>> 1205???????? itableOffsetEntry::method_entry(_klass, >>>> method_table_offset)[m->itable_index()]. >>>> 1206 initialize(Universe::throw_illegal_access_error()); >>>> 1207?????? } >>>> >>>> Not clear why you added the overpass check here? If it is >>>> non-public then you're replacing it with an IllegalAccessError >>>> instead of whatever the Overpass was going to throw. ?? >>> Currently, all overpass methods are public methods. So, they would >>> not get replaced with IllegalAccessError.? However, in case >>> non-public overpass methods exist in the future, I added "&& >>> !target->is_overpass()" to line 1203. >>> >>> Alternatively, I considered adding an "assert(!target->is_overpass() >>> || target->is_public(), "Non-public overpass method");" between >>> lines 1201 and 1202 but didn't think that this code should be >>> concerned about whether or not overpass methods are public.? 
I also >>> thought about adding "&& !target->is_overpass()" to line 1211 but >>> thought it better that all checks on 'target', that prevent loader >>> constraints checking, be done at the same place. >> >> Okay I see what you are trying to do now. We want overpass methods to >> follow the "if" path at 1201, but for them it should currently be a >> no-op. I'd be inclined to add in the assertion - the code is already >> concerned about not processing non-public overpasses with your >> proposed change to 1203. The assertion would ensure that anyone >> introducing a non-public overpass has it quickly drawn to their >> attention that doing so needs additional consideration. >> >> Thanks, >> David >> ---- >> >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186092 >>>>> >>>>> The change was tested with the JCK Lang and VM tests, the JTreg >>>>> hotspot, java/io, java/lang, java/util, and other tests, the >>>>> co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. >>>>> >>>>> Thanks, Harold >>>>> >>> > From calvin.cheung at oracle.com Wed Sep 27 15:50:33 2017 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Wed, 27 Sep 2017 08:50:33 -0700 Subject: RFR(XXS) 8187979 Clean up info printing at CDS dump time In-Reply-To: References: <5b8b5e0f-66c2-fe7e-e820-40532878e8e4@oracle.com> <3CDBD952-0FF7-4412-824E-C78DDFF59D0F@oracle.com> Message-ID: <59CBC8C9.5070900@oracle.com> Hi Ioi, The new output format looks good. thanks, Calvin On 9/26/17, 8:44 PM, Ioi Lam wrote: > Hi Jiangli, > > The "xx space:" column is lined up and the "Unknown" row is removed. 
> > > *** Before *** > > > mc space: 21736 [ 0.1% of total] out of 24576 bytes [ 88.4% > used] at 0x0000000800000000 > rw space: 4229896 [ 23.4% of total] out of 4231168 bytes [100.0% > used] at 0x0000000800006000 > ro space: 7227952 [ 40.0% of total] out of 7229440 bytes [100.0% > used] at 0x000000080040f000 > md space: 6064 [ 0.0% of total] out of 8192 bytes [ 74.0% > used] at 0x0000000800af4000 > od space: 6404560 [ 35.4% of total] out of 6406144 bytes [100.0% > used] at 0x0000000800af6000 > st0 space: 102400 [ 0.6% of total] out of 102400 bytes [100% > used] at 0x00000007bfc00000 > oa0 space: 65536 [ 0.4% of total] out of 65536 bytes [100% > used] at 0x00000007bf800000 > total : 18058144 [100.0% of total] out of 18067456 bytes [ 99.9% > used] > [3.470s][info ][cds ] Detailed metadata info (excluding > od/st regions; rw stats include md/mc regions): > ro_cnt ro_bytes % | rw_cnt > rw_bytes % | all_cnt all_bytes % > --------------------+---------------------------+---------------------------+-------------------------- > > Unknown : 0 0 0.0 | 0 0 0.0 > | 0 0 0.0 > Class : 0 0 0.0 | 1237 783584 > 18.4 | 1237 783584 6.8 > Symbol : 36210 1415496 19.6 | 0 0 0.0 > | 36210 1415496 12.3 > > > *** After *** > > > mc space: 21736 [ 0.1% of total] out of 24576 bytes [ 88.4% > used] at 0x0000000800000000 > rw space: 4229896 [ 23.4% of total] out of 4231168 bytes [100.0% > used] at 0x0000000800006000 > ro space: 7227952 [ 40.0% of total] out of 7229440 bytes [100.0% > used] at 0x000000080040f000 > md space: 6064 [ 0.0% of total] out of 8192 bytes [ 74.0% > used] at 0x0000000800af4000 > od space: 6404560 [ 35.4% of total] out of 6406144 bytes [100.0% > used] at 0x0000000800af6000 > st0 space: 102400 [ 0.6% of total] out of 102400 bytes [100% > used] at 0x00000007bfc00000 > oa0 space: 65536 [ 0.4% of total] out of 65536 bytes [100% > used] at 0x00000007bf800000 > total : 18058144 [100.0% of total] out of 18067456 bytes [ 99.9% > used] > [3.692s][info ][cds ] Detailed metadata 
info (excluding > od/st regions; rw stats include md/mc regions): > ro_cnt ro_bytes % | rw_cnt > rw_bytes % | all_cnt all_bytes % > --------------------+---------------------------+---------------------------+-------------------------- > > Class : 0 0 0.0 | 1237 783584 > 18.4 | 1237 783584 6.8 > Symbol : 36210 1415496 19.6 | 0 0 0.0 > | 36210 1415496 12.3 > > > Thanks > - Ioi > > > On 9/26/17 6:02 PM, Jiangli Zhou wrote: >> Hi Ioi, >> >> Could you please send an updated dump output? >> >> Thanks, >> Jiangli >> >>> On Sep 26, 2017, at 3:03 PM, Ioi Lam wrote: >>> >>> A small clean up to removed obsolete info and improve indentation >>> >>> * https://bugs.openjdk.java.net/browse/JDK-8187979 >>> * >>> http://cr.openjdk.java.net/~iklam/jdk10/8187979-dump-info-cleanup.v01/ >>> >>> Webrev doesn't show the lines that has only changes in blank spaces, >>> but you can see the full diff here: >>> >>> * >>> http://cr.openjdk.java.net/~iklam/jdk10/8187979-dump-info-cleanup.v01/open.patch >>> >>> Thanks >>> >>> - Ioi >>> > From jiangli.zhou at Oracle.COM Wed Sep 27 17:00:11 2017 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Wed, 27 Sep 2017 10:00:11 -0700 Subject: RFR(XXS) 8187979 Clean up info printing at CDS dump time In-Reply-To: References: <5b8b5e0f-66c2-fe7e-e820-40532878e8e4@oracle.com> <3CDBD952-0FF7-4412-824E-C78DDFF59D0F@oracle.com> Message-ID: <331F0B6F-CF1F-45E1-BB45-B2411DF180EA@oracle.com> Hi Ioi, Thanks. The changes look good. Since you are in the area, could you please also fix ?[100% used ? " in the dump output for the st* and oa* spaces. od space: 6478224 [ 35.4% of total] out of 6479872 bytes [100.0% used] at 0x0000000800b12000 st0 space: 122880 [ 0.7% of total] out of 122880 bytes [100% used] at 0x00000007ffc00000 Thanks, Jiangli > On Sep 26, 2017, at 8:44 PM, Ioi Lam wrote: > > Hi Jiangli, > > The "xx space:" column is lined up and the "Unknown" row is removed. 
> > > *** Before *** > > > mc space: 21736 [ 0.1% of total] out of 24576 bytes [ 88.4% used] at 0x0000000800000000 > rw space: 4229896 [ 23.4% of total] out of 4231168 bytes [100.0% used] at 0x0000000800006000 > ro space: 7227952 [ 40.0% of total] out of 7229440 bytes [100.0% used] at 0x000000080040f000 > md space: 6064 [ 0.0% of total] out of 8192 bytes [ 74.0% used] at 0x0000000800af4000 > od space: 6404560 [ 35.4% of total] out of 6406144 bytes [100.0% used] at 0x0000000800af6000 > st0 space: 102400 [ 0.6% of total] out of 102400 bytes [100% used] at 0x00000007bfc00000 > oa0 space: 65536 [ 0.4% of total] out of 65536 bytes [100% used] at 0x00000007bf800000 > total : 18058144 [100.0% of total] out of 18067456 bytes [ 99.9% used] > [3.470s][info ][cds ] Detailed metadata info (excluding od/st regions; rw stats include md/mc regions): > ro_cnt ro_bytes % | rw_cnt rw_bytes % | all_cnt all_bytes % > --------------------+---------------------------+---------------------------+-------------------------- > Unknown : 0 0 0.0 | 0 0 0.0 | 0 0 0.0 > Class : 0 0 0.0 | 1237 783584 18.4 | 1237 783584 6.8 > Symbol : 36210 1415496 19.6 | 0 0 0.0 | 36210 1415496 12.3 > > > *** After *** > > > mc space: 21736 [ 0.1% of total] out of 24576 bytes [ 88.4% used] at 0x0000000800000000 > rw space: 4229896 [ 23.4% of total] out of 4231168 bytes [100.0% used] at 0x0000000800006000 > ro space: 7227952 [ 40.0% of total] out of 7229440 bytes [100.0% used] at 0x000000080040f000 > md space: 6064 [ 0.0% of total] out of 8192 bytes [ 74.0% used] at 0x0000000800af4000 > od space: 6404560 [ 35.4% of total] out of 6406144 bytes [100.0% used] at 0x0000000800af6000 > st0 space: 102400 [ 0.6% of total] out of 102400 bytes [100% used] at 0x00000007bfc00000 > oa0 space: 65536 [ 0.4% of total] out of 65536 bytes [100% used] at 0x00000007bf800000 > total : 18058144 [100.0% of total] out of 18067456 bytes [ 99.9% used] > [3.692s][info ][cds ] Detailed metadata info (excluding od/st regions; rw stats 
include md/mc regions): > ro_cnt ro_bytes % | rw_cnt rw_bytes % | all_cnt all_bytes % > --------------------+---------------------------+---------------------------+-------------------------- > Class : 0 0 0.0 | 1237 783584 18.4 | 1237 783584 6.8 > Symbol : 36210 1415496 19.6 | 0 0 0.0 | 36210 1415496 12.3 > > > Thanks > - Ioi > > > On 9/26/17 6:02 PM, Jiangli Zhou wrote: >> Hi Ioi, >> >> Could you please send an updated dump output? >> >> Thanks, >> Jiangli >> >>> On Sep 26, 2017, at 3:03 PM, Ioi Lam wrote: >>> >>> A small clean up to removed obsolete info and improve indentation >>> >>> * https://bugs.openjdk.java.net/browse/JDK-8187979 >>> * http://cr.openjdk.java.net/~iklam/jdk10/8187979-dump-info-cleanup.v01/ >>> >>> Webrev doesn't show the lines that has only changes in blank spaces, but you can see the full diff here: >>> >>> * http://cr.openjdk.java.net/~iklam/jdk10/8187979-dump-info-cleanup.v01/open.patch >>> >>> Thanks >>> >>> - Ioi >>> > From mikhailo.seledtsov at oracle.com Wed Sep 27 18:00:15 2017 From: mikhailo.seledtsov at oracle.com (mikhailo) Date: Wed, 27 Sep 2017 11:00:15 -0700 Subject: RFR(S): 8181592: [TESTBUG] Docker test utils and docker jdk basic test In-Reply-To: <5F386D9A-CA05-4732-8F68-F493DD2E8E99@oracle.com> References: <6b95b720-a2cc-39e1-c0c1-6885b106ac16@oracle.com> <5F386D9A-CA05-4732-8F68-F493DD2E8E99@oracle.com> Message-ID: Leonid, Thank you for review and constructive feedback. See my comment in line. On 09/26/2017 11:19 AM, Leonid Mesnik wrote: > Misha > > http://cr.openjdk.java.net/~mseledtsov/8181592.00/test/hotspot/jtreg/runtime/containers/docker/DockerBasicTest.java.html > > Copyright is incorrect, need to updated it for GPL. Fixed > > The Hotspot is Oracle VM name only so test might fail for OpenJDK. I > think you need to fix this check. I see. I fixed this by using Platform.vmName which should be correct in all cases. I double-checked with OpenJDK also. 
> > The requires check only ensures that the test is executed on 64-bit > Linux. Does it make sense to introduce a more Docker-specific check? I agree this is a better way. I will do some prototyping; if such a check is feasible and efficient in @requires then I will add it. > > > http://cr.openjdk.java.net/~mseledtsov/8181592.00/test/hotspot/jtreg/runtime/containers/docker/Dockerfile-BasicTest.html > > Could you please explain why oraclelinux 7.0 is used as a base image > for the test? I have upgraded to Oracle Linux 7.2. If we have a specific requirement I will change to that. If we have requirements in the future to support multiple OSes, I can add Dockerfile generation. For this basic sanity test I think this should suffice. > > http://cr.openjdk.java.net/~mseledtsov/8181592.00/test/lib/jdk/test/lib/containers/docker/DockerTestUtils.java.html > > The content looks fine. > > I don't see anything to clean up docker images on the system. Could > you please explain how tests are going to clean up images? To clean up containers I will add "--rm" to the 'docker run' command. This should ensure that container data is removed after the container stops. As for the image - I use the same image name. The image will stay in the local registry unless manually removed. I should probably do 'docker rmi' at the end of the test to clean this up. Once I implement these changes I will send the updated webrev. Thank you, Misha > > Leonid > > >> On Sep 21, 2017, at 5:58 PM, mikhailo wrote: >> >> Please review this initial drop of Docker test utils and a sanity >> test. This change lays ground >> for further test development and test utils improvement in this area. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8181592 >> Webrev: http://cr.openjdk.java.net/~mseledtsov/8181592.00/ >> >> Testing: >> - run this test on machine with Docker enabled - works >> 
- run this test on Linux-x64 with no Docker engine or Docker >> disabled - test skipped (as expected) >> - run this test on automated system - in progress >> >> >> Thank you, >> Misha >> >
From ioi.lam at oracle.com Wed Sep 27 17:58:06 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 27 Sep 2017 10:58:06 -0700 Subject: RFR(XXS) 8187979 Clean up info printing at CDS dump time In-Reply-To: <331F0B6F-CF1F-45E1-BB45-B2411DF180EA@oracle.com> References: <5b8b5e0f-66c2-fe7e-e820-40532878e8e4@oracle.com> <3CDBD952-0FF7-4412-824E-C78DDFF59D0F@oracle.com> <331F0B6F-CF1F-45E1-BB45-B2411DF180EA@oracle.com> Message-ID: <4b049cf3-d32b-b510-e610-c80d4eaa520a@oracle.com>

Hi Jiangli,

Thanks for noticing this. I have added the following to my patch for metaspaceShared.cpp:

-      tty->print_cr("%s%d space: " SIZE_FORMAT_W(9) " [ %4.1f%% of total] out of " SIZE_FORMAT_W(9) " bytes [100%% used] at " INTPTR_FORMAT,
+      tty->print_cr("%s%d space: " SIZE_FORMAT_W(9) " [ %4.1f%% of total] out of " SIZE_FORMAT_W(9) " bytes [100.0%% used] at " INTPTR_FORMAT,

mc  space:     21736 [  0.1% of total] out of     24576 bytes [ 88.4% used] at 0x0000000800000000
rw  space:   4229896 [ 23.4% of total] out of   4231168 bytes [100.0% used] at 0x0000000800006000
ro  space:   7227952 [ 40.0% of total] out of   7229440 bytes [100.0% used] at 0x000000080040f000
md  space:      6064 [  0.0% of total] out of      8192 bytes [ 74.0% used] at 0x0000000800af4000
od  space:   6404560 [ 35.4% of total] out of   6406144 bytes [100.0% used] at 0x0000000800af6000
st0 space:    102400 [  0.6% of total] out of    102400 bytes [100.0% used] at 0x00000007bfc00000
oa0 space:     65536 [  0.4% of total] out of     65536 bytes [100.0% used] at 0x00000007bf800000
total    :  18058144 [100.0% of total] out of  18067456 bytes [ 99.9% used]

Thanks
- Ioi

On 9/27/17 10:00 AM, Jiangli Zhou wrote: > Hi Ioi, > > Thanks. The changes look good. > > Since you are in the area, could you please also fix "[100% used 
" > in the dump output for the st* and oa* spaces. > > od? space: ? 6478224 [ 35.4% of total] out of 6479872 bytes [100.0% > used] at 0x0000000800b12000 > st0 space:? ? 122880 [? 0.7% of total] out of 122880 bytes [100% used] > at 0x00000007ffc00000 > > Thanks, > Jiangli > >> On Sep 26, 2017, at 8:44 PM, Ioi Lam > > wrote: >> >> Hi Jiangli, >> >> The "xx space:" column is lined up and the "Unknown" row is removed. >> >> >> *** Before *** >> >> >> mc space:???? 21736 [? 0.1% of total] out of???? 24576 bytes [ 88.4% >> used] at 0x0000000800000000 >> rw space:?? 4229896 [ 23.4% of total] out of?? 4231168 bytes [100.0% >> used] at 0x0000000800006000 >> ro space:?? 7227952 [ 40.0% of total] out of?? 7229440 bytes [100.0% >> used] at 0x000000080040f000 >> md space:????? 6064 [? 0.0% of total] out of????? 8192 bytes [ 74.0% >> used] at 0x0000000800af4000 >> od space:?? 6404560 [ 35.4% of total] out of?? 6406144 bytes [100.0% >> used] at 0x0000000800af6000 >> st0 space:??? 102400 [? 0.6% of total] out of??? 102400 bytes [100% >> used] at 0x00000007bfc00000 >> oa0 space:???? 65536 [? 0.4% of total] out of???? 65536 bytes [100% >> used] at 0x00000007bf800000 >> total?? :? 18058144 [100.0% of total] out of? 18067456 bytes [ 99.9% >> used] >> [3.470s][info ][cds???????????? ] Detailed metadata info (excluding >> od/st regions; rw stats include md/mc regions): >> ??????????????????????? ro_cnt?? ro_bytes???? % |?? rw_cnt >> rw_bytes???? % |? all_cnt? all_bytes???? % >> --------------------+---------------------------+---------------------------+-------------------------- >> Unknown???????????? :??????? 0????????? 0?? 0.0 | 0????????? 0?? 0.0 >> |??????? 0????????? 0?? 0.0 >> Class?????????????? :??????? 0????????? 0?? 0.0 |???? 1237 783584? >> 18.4 |???? 1237???? 783584?? 6.8 >> Symbol????????????? :??? 36210??? 1415496? 19.6 | 0????????? 0?? 0.0 >> |??? 36210??? 1415496? 12.3 >> >> >> *** After *** >> >> >> mc? space:???? 21736 [? 0.1% of total] out of???? 
24576 bytes [ 88.4% >> used] at 0x0000000800000000 >> rw? space:?? 4229896 [ 23.4% of total] out of?? 4231168 bytes [100.0% >> used] at 0x0000000800006000 >> ro? space:?? 7227952 [ 40.0% of total] out of?? 7229440 bytes [100.0% >> used] at 0x000000080040f000 >> md? space:????? 6064 [? 0.0% of total] out of????? 8192 bytes [ 74.0% >> used] at 0x0000000800af4000 >> od? space:?? 6404560 [ 35.4% of total] out of?? 6406144 bytes [100.0% >> used] at 0x0000000800af6000 >> st0 space:??? 102400 [? 0.6% of total] out of??? 102400 bytes [100% >> used] at 0x00000007bfc00000 >> oa0 space:???? 65536 [? 0.4% of total] out of???? 65536 bytes [100% >> used] at 0x00000007bf800000 >> total??? :? 18058144 [100.0% of total] out of? 18067456 bytes [ 99.9% >> used] >> [3.692s][info ][cds???????????? ] Detailed metadata info (excluding >> od/st regions; rw stats include md/mc regions): >> ??????????????????????? ro_cnt?? ro_bytes???? % |?? rw_cnt >> rw_bytes???? % |? all_cnt? all_bytes???? % >> --------------------+---------------------------+---------------------------+-------------------------- >> Class?????????????? :??????? 0????????? 0?? 0.0 |???? 1237 783584? >> 18.4 |???? 1237???? 783584?? 6.8 >> Symbol????????????? :??? 36210??? 1415496? 19.6 | 0????????? 0?? 0.0 >> |??? 36210??? 1415496? 12.3 >> >> >> Thanks >> - Ioi >> >> >> On 9/26/17 6:02 PM, Jiangli Zhou wrote: >>> Hi Ioi, >>> >>> Could you please send an updated dump output? 
>>> >>> Thanks, >>> Jiangli >>> >>>> On Sep 26, 2017, at 3:03 PM, Ioi Lam wrote: >>>> >>>> A small cleanup to remove obsolete info and improve indentation >>>> >>>> * https://bugs.openjdk.java.net/browse/JDK-8187979 >>>> * http://cr.openjdk.java.net/~iklam/jdk10/8187979-dump-info-cleanup.v01/ >>>> >>>> >>>> Webrev doesn't show the lines that have only whitespace >>>> changes, but you can see the full diff here: >>>> >>>> * http://cr.openjdk.java.net/~iklam/jdk10/8187979-dump-info-cleanup.v01/open.patch >>>> >>>> >>>> Thanks >>>> >>>> - Ioi >>>> >> > From jiangli.zhou at oracle.com Wed Sep 27 17:59:29 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Wed, 27 Sep 2017 10:59:29 -0700 Subject: RFR(XXS) 8187979 Clean up info printing at CDS dump time In-Reply-To: <4b049cf3-d32b-b510-e610-c80d4eaa520a@oracle.com> References: <5b8b5e0f-66c2-fe7e-e820-40532878e8e4@oracle.com> <3CDBD952-0FF7-4412-824E-C78DDFF59D0F@oracle.com> <331F0B6F-CF1F-45E1-BB45-B2411DF180EA@oracle.com> <4b049cf3-d32b-b510-e610-c80d4eaa520a@oracle.com> Message-ID: <5132308F-6B96-4BF9-88B4-D19633F20E83@oracle.com> Looks good. Thanks for fixing it. Jiangli > On Sep 27, 2017, at 10:58 AM, Ioi Lam wrote: > > Hi Jiangli, > > Thanks for noticing this. 
I have added the following to my patch for metaspaceShared.cpp: > > - tty->print_cr("%s%d space: " SIZE_FORMAT_W(9) " [ %4.1f%% of total] out of " SIZE_FORMAT_W(9) " bytes [100%% used] at " INTPTR_FORMAT, > + tty->print_cr("%s%d space: " SIZE_FORMAT_W(9) " [ %4.1f%% of total] out of " SIZE_FORMAT_W(9) " bytes [100.0%% used] at " INTPTR_FORMAT, > > > mc space: 21736 [ 0.1% of total] out of 24576 bytes [ 88.4% used] at 0x0000000800000000 > rw space: 4229896 [ 23.4% of total] out of 4231168 bytes [100.0% used] at 0x0000000800006000 > ro space: 7227952 [ 40.0% of total] out of 7229440 bytes [100.0% used] at 0x000000080040f000 > md space: 6064 [ 0.0% of total] out of 8192 bytes [ 74.0% used] at 0x0000000800af4000 > od space: 6404560 [ 35.4% of total] out of 6406144 bytes [100.0% used] at 0x0000000800af6000 > st0 space: 102400 [ 0.6% of total] out of 102400 bytes [100.0% used] at 0x00000007bfc00000 > oa0 space: 65536 [ 0.4% of total] out of 65536 bytes [100.0% used] at 0x00000007bf800000 > total : 18058144 [100.0% of total] out of 18067456 bytes [ 99.9% used] > > > Thanks > > - Ioi > > On 9/27/17 10:00 AM, Jiangli Zhou wrote: >> Hi Ioi, >> >> Thanks. The changes look good. >> >> Since you are in the area, could you please also fix ?[100% used ? " in the dump output for the st* and oa* spaces. >> >> od space: 6478224 [ 35.4% of total] out of 6479872 bytes [100.0% used] at 0x0000000800b12000 >> st0 space: 122880 [ 0.7% of total] out of 122880 bytes [100% used] at 0x00000007ffc00000 >> >> Thanks, >> Jiangli >> >>> On Sep 26, 2017, at 8:44 PM, Ioi Lam > wrote: >>> >>> Hi Jiangli, >>> >>> The "xx space:" column is lined up and the "Unknown" row is removed. 
>>> >>> >>> *** Before *** >>> >>> >>> mc space: 21736 [ 0.1% of total] out of 24576 bytes [ 88.4% used] at 0x0000000800000000 >>> rw space: 4229896 [ 23.4% of total] out of 4231168 bytes [100.0% used] at 0x0000000800006000 >>> ro space: 7227952 [ 40.0% of total] out of 7229440 bytes [100.0% used] at 0x000000080040f000 >>> md space: 6064 [ 0.0% of total] out of 8192 bytes [ 74.0% used] at 0x0000000800af4000 >>> od space: 6404560 [ 35.4% of total] out of 6406144 bytes [100.0% used] at 0x0000000800af6000 >>> st0 space: 102400 [ 0.6% of total] out of 102400 bytes [100% used] at 0x00000007bfc00000 >>> oa0 space: 65536 [ 0.4% of total] out of 65536 bytes [100% used] at 0x00000007bf800000 >>> total : 18058144 [100.0% of total] out of 18067456 bytes [ 99.9% used] >>> [3.470s][info ][cds ] Detailed metadata info (excluding od/st regions; rw stats include md/mc regions): >>> ro_cnt ro_bytes % | rw_cnt rw_bytes % | all_cnt all_bytes % >>> --------------------+---------------------------+---------------------------+-------------------------- >>> Unknown : 0 0 0.0 | 0 0 0.0 | 0 0 0.0 >>> Class : 0 0 0.0 | 1237 783584 18.4 | 1237 783584 6.8 >>> Symbol : 36210 1415496 19.6 | 0 0 0.0 | 36210 1415496 12.3 >>> >>> >>> *** After *** >>> >>> >>> mc space: 21736 [ 0.1% of total] out of 24576 bytes [ 88.4% used] at 0x0000000800000000 >>> rw space: 4229896 [ 23.4% of total] out of 4231168 bytes [100.0% used] at 0x0000000800006000 >>> ro space: 7227952 [ 40.0% of total] out of 7229440 bytes [100.0% used] at 0x000000080040f000 >>> md space: 6064 [ 0.0% of total] out of 8192 bytes [ 74.0% used] at 0x0000000800af4000 >>> od space: 6404560 [ 35.4% of total] out of 6406144 bytes [100.0% used] at 0x0000000800af6000 >>> st0 space: 102400 [ 0.6% of total] out of 102400 bytes [100% used] at 0x00000007bfc00000 >>> oa0 space: 65536 [ 0.4% of total] out of 65536 bytes [100% used] at 0x00000007bf800000 >>> total : 18058144 [100.0% of total] out of 18067456 bytes [ 99.9% used] >>> [3.692s][info ][cds 
] Detailed metadata info (excluding od/st regions; rw stats include md/mc regions): >>> ro_cnt ro_bytes % | rw_cnt rw_bytes % | all_cnt all_bytes % >>> --------------------+---------------------------+---------------------------+-------------------------- >>> Class : 0 0 0.0 | 1237 783584 18.4 | 1237 783584 6.8 >>> Symbol : 36210 1415496 19.6 | 0 0 0.0 | 36210 1415496 12.3 >>> >>> >>> Thanks >>> - Ioi >>> >>> >>> On 9/26/17 6:02 PM, Jiangli Zhou wrote: >>>> Hi Ioi, >>>> >>>> Could you please send an updated dump output? >>>> >>>> Thanks, >>>> Jiangli >>>> >>>>> On Sep 26, 2017, at 3:03 PM, Ioi Lam > wrote: >>>>> >>>>> A small clean up to removed obsolete info and improve indentation >>>>> >>>>> * https://bugs.openjdk.java.net/browse/JDK-8187979 >>>>> * http://cr.openjdk.java.net/~iklam/jdk10/8187979-dump-info-cleanup.v01/ >>>>> >>>>> Webrev doesn't show the lines that has only changes in blank spaces, but you can see the full diff here: >>>>> >>>>> * http://cr.openjdk.java.net/~iklam/jdk10/8187979-dump-info-cleanup.v01/open.patch >>>>> >>>>> Thanks >>>>> >>>>> - Ioi >>>>> >>> >> > From mandy.chung at oracle.com Wed Sep 27 18:58:22 2017 From: mandy.chung at oracle.com (mandy chung) Date: Wed, 27 Sep 2017 11:58:22 -0700 Subject: Review Request JDK-8164512: Replace ClassLoader use of finalizer with phantom reference to unload native library In-Reply-To: <22d101c6-5255-b9b7-606c-6f5519a5c597@oracle.com> References: <1643128c-8714-4d6d-253a-b7413a5eb8ef@oracle.com> <9b5b908b-e54c-67c4-13a4-250d2087241f@oracle.com> <42cd522c-4a1f-f231-3323-6bb4a0183a1e@oracle.com> <3a0fe286-f38a-f7f1-4a23-15c43bb42a80@oracle.com> <3d52390c-d468-1f74-6e1e-4a9bc960b681@oracle.com> <1e71b112-cdd5-ba0d-d519-5d54e5ace549@oracle.com> <0c6802f2-678b-977d-e827-ef03c7f376a2@oracle.com> <22d101c6-5255-b9b7-606c-6f5519a5c597@oracle.com> Message-ID: On 9/27/17 5:49 AM, David Holmes wrote: > > I missed the fact that we already special case this for JNI_OnLoad and > JNI_OnUnload. 
Yes, this is buried in the JNI_FindClass method. > I would have thought that in the OnLoad case we would find the > classloader of the class loading the native library without any need > to resort to the NativeLibrary support code in ClassLoader. I guess > that this: > >   // Find calling class >   Klass* k = thread->security_get_caller_class(0); > > does not find the "caller" that I would have expected, but instead > finds java.lang.System because we're executing System.loadLibrary - > and hence finds the boot loader not the actual loader required. > In the current implementation (without this change), the NativeLibrary.load and NativeLibrary.unload native methods are the callers of JNI_OnLoad and JNI_OnUnload respectively. > But the fact we jump through all these hoops is in itself questionable > because the specification for JNI_FindClass does not indicate this > will happen. It only accounts for two cases: > > 1. A JNI call from a declared native method - which uses the loader of > the class that defines the method > > 2. A JNI call "through the Invocation Interface" which I interpret as > being a JNI call from C code, from an attached thread, with no Java > frames on the stack. In which case the system loader is used. > > A call from JNI_OnLoad (or OnUnload) does not, to me, fit either of > these cases; nor does JNI_OnLoad say anything about the context in > which it executes. So it seems we have presumed that this case should > mean "use the loader of the class which loaded the native library". A > very reasonable approach, but not one defined by the specification as > far as I can see. That's the whole point of this discussion and the spec needs clarification. > But given this, it is not unreasonable to also use the same > interpretation for JNI_OnUnload. > That might be how the current implementation ended up this way. 
> So there is a gap in the specification regarding the execution context > of the library lifecycle function hooks - other than onUnload being an > "unknown context". This suggests that we should clarify the JNI_OnLoad spec to specify the context. FYI, I filed a separate issue: https://bugs.openjdk.java.net/browse/JDK-8188052 Mandy From david.holmes at oracle.com Wed Sep 27 21:42:58 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 28 Sep 2017 07:42:58 +1000 Subject: RFR 8186092: Unnecessary loader constraints produced when there are multiple defaults In-Reply-To: References: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com> <86af3584-89ff-1819-1d20-736dad01a42c@oracle.com> Message-ID: <0302f5f6-f783-7079-2d05-fc96b958d3e4@oracle.com> On 27/09/2017 11:38 PM, harold seigel wrote: > Hi David, > > Please review this updated webrev at: > > http://cr.openjdk.java.net/~hseigel/bug_8186092.3/webrev/index.html > > The only change from the previous webrev involves adding the assert() at > lines 1202 and 1203 of klassVtable.cpp. Looks good. Thanks, David > Thanks, Harold > > On 9/26/2017 8:25 PM, David Holmes wrote: >> Hi Harold, >> >> On 27/09/2017 5:13 AM, harold seigel wrote: >>> Hi David, >>> >>> Thanks for looking at this change! Please see updated webrev at: >>> >>> http://cr.openjdk.java.net/~hseigel/bug_8186092.2/webrev/index.html >> >> Test changes seem fine. >> >>> and also see comments embedded below. >> >> Follow up below. >> >>> Thanks, Harold >>> >>> >>> On 9/26/2017 3:30 AM, David Holmes wrote: >>>> Hi Harold, >>>> >>>> This looks okay to me. A few comments below but only one real query. >>>> >>>> On 26/09/2017 1:21 AM, harold seigel wrote: >>>>> Hi, >>>>> >>>>> Please review this JDK-10 change to fix JDK-8186092. The change >>>>> prevents the checking of loader constraints during vtable and >>>>> itable creation if the selected method is an overpass method. 
>>>>> Overpass methods are created by the JVM to throw exceptions and so >>>>> should not be subjected to loader constraint checking. >>>> >>>> Okay. >>>> >>>>> Additionally, this change improves the LinkageError exception error >>>>> text when a loader constraint violation occurs during vtable and >>>>> itable creation. >>>> >>>> Hmmm :) I think I put those in initially. Not sure I 100% agree with >>>> the changed terminology, but I'll defer to you as the current expert >>>> in this area. :) >>> I'm hoping better experts also review the changed messages. >>>> >>>>> The fix includes four new tests, one test each to check that loader >>>>> constraint checking is not done for overpass methods during vtable >>>>> and itable creation, and one test each to test the new vtable and >>>>> itable loader constraint error messages. >>>> >>>> *.jasm: can you add a comment indicating why these are jasm files as >>>> it is not obvious to me what is special about them. >>> Thanks for pointing this out.? I converted the two Task.jasm files to >>> Task.java file and added a comment to the remaining .jasm file, C.jasm. >>>> >>>> */Test.java: >>>> ?- You can place multiple files on one @compile tag (and still list >>>> one file per line). >>>> - you don't need to specify java.lang in the name of the exception >>>> classes >>> Done. >>>> >>>>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8186092/webrev/ >>>> >>>> The real query: >>>> >>>> 1201???? if (target == NULL || !target->is_public() || >>>> target->is_abstract() || target->is_overpass()) { >>>> 1202?????? // Entry does not resolve. Leave it empty for >>>> AbstractMethodError. >>>> 1203?????? if (!(target == NULL) && !target->is_public()) { >>>> 1204???????? // Stuff an IllegalAccessError throwing method in there >>>> instead. >>>> 1205???????? itableOffsetEntry::method_entry(_klass, >>>> method_table_offset)[m->itable_index()]. >>>> 1206 initialize(Universe::throw_illegal_access_error()); >>>> 1207?????? 
} >>>> >>>> Not clear why you added the overpass check here? If it is non-public >>>> then you're replacing it with an IllegalAccessError instead of >>>> whatever the Overpass was going to throw. ?? >>> Currently, all overpass methods are public methods. So, they would >>> not get replaced with IllegalAccessError.? However, in case >>> non-public overpass methods exist in the future, I added "&& >>> !target->is_overpass()" to line 1203. >>> >>> Alternatively, I considered adding an "assert(!target->is_overpass() >>> || target->is_public(), "Non-public overpass method");" between lines >>> 1201 and 1202 but didn't think that this code should be concerned >>> about whether or not overpass methods are public.? I also thought >>> about adding "&& !target->is_overpass()" to line 1211 but thought it >>> better that all checks on 'target', that prevent loader constraints >>> checking, be done at the same place. >> >> Okay I see what you are trying to do now. We want overpass methods to >> follow the "if" path at 1201, but for them it should currently be a >> no-op. I'd be inclined to add in the assertion - the code is already >> concerned about not processing non-public overpasses with your >> proposed change to 1203. The assertion would ensure that anyone >> introducing a non-public overpass has it quickly drawn to their >> attention that doing so needs additional consideration. >> >> Thanks, >> David >> ---- >> >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186092 >>>>> >>>>> The change was tested with the JCK Lang and VM tests, the JTreg >>>>> hotspot, java/io, java/lang, java/util, and other tests, the >>>>> co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. 
>>>>> >>>>> Thanks, Harold >>>>> >>> > From david.holmes at oracle.com Wed Sep 27 21:43:59 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 28 Sep 2017 07:43:59 +1000 Subject: RFR 8186092: Unnecessary loader constraints produced when there are multiple defaults In-Reply-To: <4b109fe5-d6d1-9fb1-7aec-8208d21d06f2@oracle.com> References: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com> <86af3584-89ff-1819-1d20-736dad01a42c@oracle.com> <4b109fe5-d6d1-9fb1-7aec-8208d21d06f2@oracle.com> Message-ID: On 28/09/2017 12:26 AM, coleen.phillimore at oracle.com wrote: > > Harold, > > Since you changed both instances of the mostly duplicated error message, > would this be a good time to make it a function? There are similarities in the error messages but they are not the same. David > thanks, > Coleen > > On 9/27/17 9:38 AM, harold seigel wrote: >> Hi David, >> >> Please review this updated webrev at: >> >> http://cr.openjdk.java.net/~hseigel/bug_8186092.3/webrev/index.html >> >> The only change from the previous webrev involves adding the assert() >> at lines 1202 and 1203 of klassVtable.cpp. >> >> Thanks, Harold >> >> On 9/26/2017 8:25 PM, David Holmes wrote: >>> Hi Harold, >>> >>> On 27/09/2017 5:13 AM, harold seigel wrote: >>>> Hi David, >>>> >>>> Thanks for looking at this change!? Please see updated webrev at: >>>> >>>> http://cr.openjdk.java.net/~hseigel/bug_8186092.2/webrev/index.html >>> >>> Test changes seem fine. >>> >>>> and also see comments embedded below. >>> >>> Follow up below. >>> >>>> Thanks, Harold >>>> >>>> >>>> On 9/26/2017 3:30 AM, David Holmes wrote: >>>>> Hi Harold, >>>>> >>>>> This looks okay to me. A few comments below but only one real query. >>>>> >>>>> On 26/09/2017 1:21 AM, harold seigel wrote: >>>>>> Hi, >>>>>> >>>>>> Please review this JDK-10 change to fix JDK-8186092.? The change >>>>>> prevents the checking of loader constraints during vtable and >>>>>> itable creation if the selected method is an overpass method. 
>>>>>> Overpass methods are created by the JVM to throw exceptions and so >>>>>> should not be subjected to loader constraint checking. >>>>> >>>>> Okay. >>>>> >>>>>> Additionally, this change improves the LinkageError exception >>>>>> error text when a loader constraint violation occurs during vtable >>>>>> and itable creation. >>>>> >>>>> Hmmm :) I think I put those in initially. Not sure I 100% agree >>>>> with the changed terminology, but I'll defer to you as the current >>>>> expert in this area. :) >>>> I'm hoping better experts also review the changed messages. >>>>> >>>>>> The fix includes four new tests, one test each to check that >>>>>> loader constraint checking is not done for overpass methods during >>>>>> vtable and itable creation, and one test each to test the new >>>>>> vtable and itable loader constraint error messages. >>>>> >>>>> *.jasm: can you add a comment indicating why these are jasm files >>>>> as it is not obvious to me what is special about them. >>>> Thanks for pointing this out.? I converted the two Task.jasm files >>>> to Task.java file and added a comment to the remaining .jasm file, >>>> C.jasm. >>>>> >>>>> */Test.java: >>>>> ?- You can place multiple files on one @compile tag (and still list >>>>> one file per line). >>>>> - you don't need to specify java.lang in the name of the exception >>>>> classes >>>> Done. >>>>> >>>>>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8186092/webrev/ >>>>> >>>>> The real query: >>>>> >>>>> 1201???? if (target == NULL || !target->is_public() || >>>>> target->is_abstract() || target->is_overpass()) { >>>>> 1202?????? // Entry does not resolve. Leave it empty for >>>>> AbstractMethodError. >>>>> 1203?????? if (!(target == NULL) && !target->is_public()) { >>>>> 1204???????? // Stuff an IllegalAccessError throwing method in >>>>> there instead. >>>>> 1205???????? itableOffsetEntry::method_entry(_klass, >>>>> method_table_offset)[m->itable_index()]. 
>>>>> 1206 initialize(Universe::throw_illegal_access_error()); >>>>> 1207?????? } >>>>> >>>>> Not clear why you added the overpass check here? If it is >>>>> non-public then you're replacing it with an IllegalAccessError >>>>> instead of whatever the Overpass was going to throw. ?? >>>> Currently, all overpass methods are public methods. So, they would >>>> not get replaced with IllegalAccessError.? However, in case >>>> non-public overpass methods exist in the future, I added "&& >>>> !target->is_overpass()" to line 1203. >>>> >>>> Alternatively, I considered adding an "assert(!target->is_overpass() >>>> || target->is_public(), "Non-public overpass method");" between >>>> lines 1201 and 1202 but didn't think that this code should be >>>> concerned about whether or not overpass methods are public.? I also >>>> thought about adding "&& !target->is_overpass()" to line 1211 but >>>> thought it better that all checks on 'target', that prevent loader >>>> constraints checking, be done at the same place. >>> >>> Okay I see what you are trying to do now. We want overpass methods to >>> follow the "if" path at 1201, but for them it should currently be a >>> no-op. I'd be inclined to add in the assertion - the code is already >>> concerned about not processing non-public overpasses with your >>> proposed change to 1203. The assertion would ensure that anyone >>> introducing a non-public overpass has it quickly drawn to their >>> attention that doing so needs additional consideration. >>> >>> Thanks, >>> David >>> ---- >>> >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186092 >>>>>> >>>>>> The change was tested with the JCK Lang and VM tests, the JTreg >>>>>> hotspot, java/io, java/lang, java/util, and other tests, the >>>>>> co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. 
>>>>>> >>>>>> Thanks, Harold >>>>>> >>>> >> > From zgu at redhat.com Thu Sep 28 12:29:36 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 28 Sep 2017 08:29:36 -0400 Subject: RFR(S) 8186770: NMT: Report metadata information in NMT summary In-Reply-To: <43624339-b16c-bee6-b170-1812f0e19c33@oracle.com> References: <83e0586b-aa58-084a-fdcf-428cf55669fe@redhat.com> <2e3ab1a3-03a5-9aa8-47f8-1224a00e9d0f@redhat.com> <72c3f342-654d-2fb2-58ac-959ad7f0de37@redhat.com> <4b0549a0-a4ec-0fce-a39e-d8c60e5d665d@redhat.com> <6f01f8a3-8911-d5fc-9208-7dcac5d1874b@redhat.com> <1e5afb73-8cb3-35aa-dad3-5fc7f8b25a43@redhat.com> <43624339-b16c-bee6-b170-1812f0e19c33@oracle.com> Message-ID: Thanks, as always, Coleen. -Zhengyu On 09/27/2017 10:20 AM, coleen.phillimore at oracle.com wrote: > This code seems good. I can sponsor it for you. > Coleen > > > On 9/5/17 1:43 PM, Zhengyu Gu wrote: >> Hi Andrew, >> >> Thanks for the review and suggestions. The webrev is updated according >> to the discussions. >> >> Webrev: http://cr.openjdk.java.net/~zgu/8186770/webrev.01/index.html >> >> >> The sample outputs: >> >> Summary: >> >> - Class (reserved=1074360KB, committed=28856KB) >> (classes #4028) >> (malloc=1208KB #16218) >> (mmap: reserved=1073152KB, committed=27648KB) >> ( Metadata: ) >> ( reserved=24576KB, committed=24320KB) >> ( used=20914KB) >> ( free=3295KB) >> ( waste=111KB =0.46%) >> ( Class space:) >> ( reserved=1048576KB, committed=3328KB) >> ( used=2649KB) >> ( free=679KB) >> ( waste=0KB =0.00%) >> >> >> Summary diff: >> >> - Class (reserved=1076455KB +2129KB, >> committed=29415KB +849KB) >> (classes #4037 +13) >> (malloc=1255KB +81KB #17477 +2214) >> (mmap: reserved=1075200KB +2048KB, >> committed=28160KB +768KB) >> ( Metadata: ) >> ( reserved=26624KB +2048KB, >> committed=24832KB +768KB) >> ( used=21368KB +718KB) >> ( free=3336KB -21KB) >> ( waste=128KB =0.52% +71KB) >> ( Class space:) >> ( reserved=1048576KB, committed=3328KB) >> ( used=2654KB +7KB) >> ( free=674KB -7KB) >> ( 
waste=0KB =0.00%) >> >> Thanks, >> >> -Zhengyu >> >> >> On 09/05/2017 11:18 AM, Andrew Dinn wrote: >>> On 29/08/17 17:31, Zhengyu Gu wrote: >>>> Okay, I see what you mean. But in this case, capacity = committed. >>> >>> Well, it does not always seem to be exactly the same. If you add up all >>> the pieces to derive the capacity then it sometimes seems to fall short >>> of committed. I looked deeper into this and found that sometimes the >>> difference is down to rounding up/down. However, there also seems >>> occasionally to be more space unaccounted for that cannot be explained >>> by rounding errors. >>> >>> I looked into your suggestion that this might be accounted for by 'dark >>> matter' i.e. tail ends of a chunk left unused when the last block is >>> carved out and the chunk retired because the tail is too small to insert >>> into the block dictionary. However, from my reading of the code I think >>> that any such 'dark matter' will still to show up in the waste space >>> count. >>> >>> Rather than hold up this current change I'd prefer to see it pushed and >>> address the arithmetic problem in a follow-up issue. Even with an >>> occasional small disparity in the reported figures I think it is really >>> helpful to have this detailed info available as part of the NMT output. >>> >>>> I wonder if it is cleaner that just reports free, used and waste, e.g. >>>> >>>> ( Metadata: ) >>>> ( reserved=22528KB, committed=21504KB) >>>> ( used=20654KB) >>>> ( free=786KBKB) >>>> ( waste=64KB =0.30%) >>>> >>>> where free = (capacity - used) + free_chunks + available >>>> waste = committed - capacity - free_chunks - available >>>> total = committed >>> Yes, I agree that it's ok to leave the available figure implicit -- it >>> is easily computed from the committed total by subtracting used and >>> waste (that's only correct modulo the occasional small disparity between >>> capacity and committed but the difference is small enough not to be >>> significant). 
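The free/waste arithmetic quoted above can be checked with a small sketch. This is not the NMT code from the webrev; the struct and function names are hypothetical, and computing the waste percentage relative to committed is an inference from the sample output (111KB of 24320KB shown as 0.46%).

```cpp
#include <cstdint>

// Hypothetical container for the counters named in the quoted formulas (all in KB).
struct MetaspaceCounters {
  std::int64_t committed;
  std::int64_t capacity;
  std::int64_t used;
  std::int64_t free_chunks;
  std::int64_t available;
};

// The three figures the proposed summary reports.
struct MetaspaceReport {
  std::int64_t used;
  std::int64_t free;
  std::int64_t waste;
};

// free  = (capacity - used) + free_chunks + available
// waste = committed - capacity - free_chunks - available
// By construction, used + free + waste == committed.
inline MetaspaceReport summarize(const MetaspaceCounters& c) {
  MetaspaceReport r;
  r.used = c.used;
  r.free = (c.capacity - c.used) + c.free_chunks + c.available;
  r.waste = c.committed - c.capacity - c.free_chunks - c.available;
  return r;
}

// Waste as a percentage of committed (assumed denominator, see lead-in).
inline double waste_percent(const MetaspaceCounters& c) {
  if (c.committed == 0) return 0.0;
  return 100.0 * static_cast<double>(summarize(c).waste) /
         static_cast<double>(c.committed);
}
```

Because the capacity, free_chunks and available terms cancel, the three reported figures always add up to committed, which matches the sample Metadata block (20914KB + 3295KB + 111KB = 24320KB). The individual capacity, free_chunks and available values are not printed in the thread, so any concrete split used to exercise the sketch is made up.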
So, I'm happy with this version. >>> >>> regards, >>> >>> >>> Andrew Dinn >>> ----------- >>> Senior Principal Software Engineer >>> Red Hat UK Ltd >>> Registered in England and Wales under Company Registration No. 03798903 >>> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander >>> > -------------- next part -------------- A non-text attachment was scrubbed... Name: 8186770.patch Type: text/x-patch Size: 15010 bytes Desc: not available URL: From harold.seigel at oracle.com Thu Sep 28 12:50:27 2017 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 28 Sep 2017 08:50:27 -0400 Subject: RFR 8186092: Unnecessary loader constraints produced when there are multiple defaults In-Reply-To: <0302f5f6-f783-7079-2d05-fc96b958d3e4@oracle.com> References: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com> <86af3584-89ff-1819-1d20-736dad01a42c@oracle.com> <0302f5f6-f783-7079-2d05-fc96b958d3e4@oracle.com> Message-ID: Thanks! Harold On 9/27/2017 5:42 PM, David Holmes wrote: > On 27/09/2017 11:38 PM, harold seigel wrote: >> Hi David, >> >> Please review this updated webrev at: >> >> http://cr.openjdk.java.net/~hseigel/bug_8186092.3/webrev/index.html >> >> The only change from the previous webrev involves adding the assert() >> at lines 1202 and 1203 of klassVtable.cpp. > > Looks good. > > Thanks, > David > >> Thanks, Harold >> >> On 9/26/2017 8:25 PM, David Holmes wrote: >>> Hi Harold, >>> >>> On 27/09/2017 5:13 AM, harold seigel wrote: >>>> Hi David, >>>> >>>> Thanks for looking at this change!? Please see updated webrev at: >>>> >>>> http://cr.openjdk.java.net/~hseigel/bug_8186092.2/webrev/index.html >>> >>> Test changes seem fine. >>> >>>> and also see comments embedded below. >>> >>> Follow up below. >>> >>>> Thanks, Harold >>>> >>>> >>>> On 9/26/2017 3:30 AM, David Holmes wrote: >>>>> Hi Harold, >>>>> >>>>> This looks okay to me. A few comments below but only one real query. 
>>>>> >>>>> On 26/09/2017 1:21 AM, harold seigel wrote: >>>>>> Hi, >>>>>> >>>>>> Please review this JDK-10 change to fix JDK-8186092. The change >>>>>> prevents the checking of loader constraints during vtable and >>>>>> itable creation if the selected method is an overpass method. >>>>>> Overpass methods are created by the JVM to throw exceptions and >>>>>> so should not be subjected to loader constraint checking. >>>>> >>>>> Okay. >>>>> >>>>>> Additionally, this change improves the LinkageError exception >>>>>> error text when a loader constraint violation occurs during >>>>>> vtable and itable creation. >>>>> >>>>> Hmmm :) I think I put those in initially. Not sure I 100% agree >>>>> with the changed terminology, but I'll defer to you as the current >>>>> expert in this area. :) >>>> I'm hoping better experts also review the changed messages. >>>>> >>>>>> The fix includes four new tests, one test each to check that >>>>>> loader constraint checking is not done for overpass methods >>>>>> during vtable and itable creation, and one test each to test the >>>>>> new vtable and itable loader constraint error messages. >>>>> >>>>> *.jasm: can you add a comment indicating why these are jasm files >>>>> as it is not obvious to me what is special about them. >>>> Thanks for pointing this out. I converted the two Task.jasm files >>>> to Task.java files and added a comment to the remaining .jasm file, >>>> C.jasm. >>>>> >>>>> */Test.java: >>>>> - You can place multiple files on one @compile tag (and still >>>>> list one file per line). >>>>> - you don't need to specify java.lang in the name of the exception >>>>> classes >>>> Done. >>>>> >>>>>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8186092/webrev/ >>>>> >>>>> The real query: >>>>> >>>>> 1201 if (target == NULL || !target->is_public() || >>>>> target->is_abstract() || target->is_overpass()) { >>>>> 1202 // Entry does not resolve. Leave it empty for >>>>> AbstractMethodError. >>>>> 1203
if (!(target == NULL) && !target->is_public()) { >>>>> 1204 // Stuff an IllegalAccessError throwing method in >>>>> there instead. >>>>> 1205 itableOffsetEntry::method_entry(_klass, >>>>> method_table_offset)[m->itable_index()]. >>>>> 1206 initialize(Universe::throw_illegal_access_error()); >>>>> 1207 } >>>>> >>>>> Not clear why you added the overpass check here? If it is >>>>> non-public then you're replacing it with an IllegalAccessError >>>>> instead of whatever the Overpass was going to throw. ?? >>>> Currently, all overpass methods are public methods. So, they would >>>> not get replaced with IllegalAccessError. However, in case >>>> non-public overpass methods exist in the future, I added "&& >>>> !target->is_overpass()" to line 1203. >>>> >>>> Alternatively, I considered adding an >>>> "assert(!target->is_overpass() || target->is_public(), "Non-public >>>> overpass method");" between lines 1201 and 1202 but didn't think >>>> that this code should be concerned about whether or not overpass >>>> methods are public. I also thought about adding "&& >>>> !target->is_overpass()" to line 1211 but thought it better that all >>>> checks on 'target', that prevent loader constraints checking, be >>>> done at the same place. >>> >>> Okay I see what you are trying to do now. We want overpass methods >>> to follow the "if" path at 1201, but for them it should currently be >>> a no-op. I'd be inclined to add in the assertion - the code is >>> already concerned about not processing non-public overpasses with >>> your proposed change to 1203. The assertion would ensure that anyone >>> introducing a non-public overpass has it quickly drawn to their >>> attention that doing so needs additional consideration.
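The decision logic under discussion can be modelled outside HotSpot. What follows is a simplified standalone sketch, not the klassVtable.cpp code: MethodModel, ItableAction and classify_itable_target are hypothetical names, the three boolean fields stand in for is_public(), is_abstract() and is_overpass(), and the "check loader constraints" outcome for the else path is an assumption based on the description of the change rather than on the quoted snippet.

```cpp
#include <cassert>

// Hypothetical stand-in for the few Method properties used in the thread.
struct MethodModel {
  bool is_public;
  bool is_abstract;
  bool is_overpass;
};

// Possible outcomes for one itable entry in this simplified model.
enum class ItableAction {
  LeaveEmpty,                 // entry does not resolve (AbstractMethodError)
  InstallIllegalAccessError,  // non-public target
  CheckLoaderConstraints      // entry resolves; constraints are checked
};

inline ItableAction classify_itable_target(const MethodModel* target) {
  if (target == nullptr || !target->is_public ||
      target->is_abstract || target->is_overpass) {
    // The assertion suggested in the review: an overpass that is not
    // public would need additional consideration.
    assert(target == nullptr || !target->is_overpass || target->is_public);
    if (target != nullptr && !target->is_public) {
      return ItableAction::InstallIllegalAccessError;
    }
    // Public overpass methods end up here, so no loader constraint
    // checking is done for them.
    return ItableAction::LeaveEmpty;
  }
  return ItableAction::CheckLoaderConstraints;
}
```

In this model a public overpass takes the "if" path and skips the constraint check, while the assertion would fire as soon as someone introduced a non-public overpass.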
>>> >>> Thanks, >>> David >>> ---- >>> >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186092 >>>>>> >>>>>> The change was tested with the JCK Lang and VM tests, the JTreg >>>>>> hotspot, java/io, java/lang, java/util, and other tests, the >>>>>> co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. >>>>>> >>>>>> Thanks, Harold >>>>>> >>>> >> From karen.kinnear at oracle.com Thu Sep 28 14:59:02 2017 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Thu, 28 Sep 2017 10:59:02 -0400 Subject: RFR 8186092: Unnecessary loader constraints produced when there are multiple defaults In-Reply-To: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com> References: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com> Message-ID: Many thanks Harold. Looks good! Couple of minor comments: 1. klassVtable.cpp line 484 - "They are not resolved methods". Since the target_method is intended to be a "selected method" this comment is a bit confusing. Perhaps just take out that sentence. To answer David - an overpass is always created public. I like the improved assertion. So I code reviewed the original exception messages, and missed the problem too. These look corrected. Perhaps you could send the error test results from running the tests in the email so it is easier to see that the error messages are correct. To make this a wee bit clearer: "when resolving method" line 1224 - should be "when selecting method" and line 501 - "when resolving overriding" should be "when selecting overriding" That also changes the error message in the tests - at least in vtableLdrConstraint/Test.java vtableAME/Test.java line 46: inheriting - extra "t" Thank you so much for covering so many test cases. thanks, Karen > On Sep 25, 2017, at 11:21 AM, harold seigel wrote: > > Hi, > > Please review this JDK-10 change to fix JDK-8186092.
The change prevents the checking of loader constraints during vtable and itable creation if the selected method is an overpass method. Overpass methods are created by the JVM to throw exceptions and so should not be subjected to loader constraint checking. > > Additionally, this change improves the LinkageError exception error text when a loader constraint violation occurs during vtable and itable creation. > > The fix includes four new tests, one test each to check that loader constraint checking is not done for overpass methods during vtable and itable creation, and one test each to test the new vtable and itable loader constraint error messages. > > Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8186092/webrev/ > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186092 > > The change was tested with the JCK Lang and VM tests, the JTreg hotspot, java/io, java/lang, java/util, and other tests, the co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. > > Thanks, Harold > From harold.seigel at oracle.com Thu Sep 28 15:30:27 2017 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 28 Sep 2017 11:30:27 -0400 Subject: RFR 8186092: Unnecessary loader constraints produced when there are multiple defaults In-Reply-To: References: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com> Message-ID: <56a5e150-9749-312a-d27c-d80bb365e8bb@oracle.com> Hi Karen, Thanks for looking at this! Please see embedded comments and updated webrev at: http://cr.openjdk.java.net/~hseigel/bug_8186092.4/webrev/ Thanks! Harold On 9/28/2017 10:59 AM, Karen Kinnear wrote: > Many thanks Harold. Looks good! > > Couple of minor comments: > 1. klassVtable.cpp > > line 484 - "They are not resolved methods". Since the target_method is intended to be a "selected method" this comment is a bit > confusing. Perhaps just take out that sentence. I removed the sentence. > > To answer David - an overpass is always created public. I like the improved assertion.
> > So I code reviewed the original exception messages, and missed the problem too. These look corrected. > Perhaps you could send the error test results from running the tests in the email so it is easier to see that the error messages are correct. This is the full message from test vtableLdrConstraint (without your suggestions): loader constraint violation for class Task: when resolving overriding method "Task.m()LFoo;" the class loader (instance of PreemptingClassLoader) of the selected method's type Task, and the class loader (instance of jdk/internal/loader/ClassLoaders$AppClassLoader) for its super type J have different Class objects for the type Foo used in the signature This is the full message from test itableLdrConstraint (without your suggestions): loader constraint violation in interface itable initialization for class C: when resolving method "I.m()LFoo;" the class loader (instance of PreemptingClassLoader) for super interface I, and the class loader (instance of jdk/internal/loader/ClassLoaders$AppClassLoader) of the selected method's type, J have different Class objects for the type Foo used in the signature > > > To make this a wee bit clearer: > "when resolving method" line 1224 - should be "when selecting method" > and > line 501 - "when resolving overriding" should be "when selecting overriding" Done.
The new messages look like this: vtableLdrConstraint test: loader constraint violation for class Task: when selecting overriding method "Task.m()LFoo;" the class loader (instance of PreemptingClassLoader) of the selected method's type Task, and the class loader (instance of jdk/internal/loader/ClassLoaders$AppClassLoader) for its super type J have different Class objects for the type Foo used in the signature itableLdrConstraint test: loader constraint violation in interface itable initialization for class C: when selecting method "I.m()LFoo;" the class loader (instance of PreemptingClassLoader) for super interface I, and the class loader (instance of jdk/internal/loader/ClassLoaders$AppClassLoader) of the selected method's type, J have different Class objects for the type Foo used in the signature > > That also changes the error message in the tests - at least in vtableLdrConstraint/Test.java > vtableAME/Test.java line 46: inheriting - extra "t" Fixed. > > Thank you so much for covering so many test cases. > > thanks, > Karen > > >> On Sep 25, 2017, at 11:21 AM, harold seigel wrote: >> >> Hi, >> >> Please review this JDK-10 change to fix JDK-8186092. The change prevents the checking of loader constraints during vtable and itable creation if the selected method is an overpass method. Overpass methods are created by the JVM to throw exceptions and so should not be subjected to loader constraint checking. >> >> Additionally, this change improves the LinkageError exception error text when a loader constraint violation occurs during vtable and itable creation. >> >> The fix includes four new tests, one test each to check that loader constraint checking is not done for overpass methods during vtable and itable creation, and one test each to test the new vtable and itable loader constraint error messages.
>> >> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8186092/webrev/ >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186092 >> >> The change was tested with the JCK Lang and VM tests, the JTreg hotspot, java/io, java/lang, java/util, and other tests, the co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. >> >> Thanks, Harold >> From karen.kinnear at oracle.com Thu Sep 28 15:48:23 2017 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Thu, 28 Sep 2017 11:48:23 -0400 Subject: RFR 8186092: Unnecessary loader constraints produced when there are multiple defaults In-Reply-To: <56a5e150-9749-312a-d27c-d80bb365e8bb@oracle.com> References: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com> <56a5e150-9749-312a-d27c-d80bb365e8bb@oracle.com> Message-ID: Ship it! Many thanks, Karen > On Sep 28, 2017, at 11:30 AM, harold seigel wrote: > > Hi Karen, > Thanks for looking at this! Please see embedded comments and updated webrev at: > http://cr.openjdk.java.net/~hseigel/bug_8186092.4/webrev/ > Thanks! Harold > > On 9/28/2017 10:59 AM, Karen Kinnear wrote: >> Many thanks Harold. Looks good! >> >> Couple of minor comments: >> 1. klassVtable.cpp >> >> line 484 - "They are not resolved methods". Since the target_method is intended to be a "selected method" this comment is a bit >> confusing. Perhaps just take out that sentence. > I removed the sentence. >> >> To answer David - an overpass is always created public. I like the improved assertion. >> >> So I code reviewed the original exception messages, and missed the problem too. These look corrected. >> Perhaps you could send the error test results from running the tests in the email so it is easier to see that the error messages are correct.
> This is the full message from test vtableLdrConstraint (without your suggestions): > loader constraint violation for class Task: when resolving overriding method "Task.m()LFoo;" the class loader (instance of PreemptingClassLoader) of the selected method's type Task, and the class loader (instance of jdk/internal/loader/ClassLoaders$AppClassLoader) for its super type J have different Class objects for the type Foo used in the signature > This is the full message from test itableLdrConstraint (without your suggestions): > loader constraint violation in interface itable initialization for class C: when resolving method "I.m()LFoo;" the class loader (instance of PreemptingClassLoader) for super interface I, and the class loader (instance of jdk/internal/loader/ClassLoaders$AppClassLoader) of the selected method's type, J have different Class objects for the type Foo used in the signature >> >> >> To make this a wee bit clearer: >> "when resolving method" line 1224 - should be "when selecting method" >> and >> line 501 - "when resolving overriding" should be "when selecting overriding" > Done.
The new messages look like this: > > vtableLdrConstraint test: > loader constraint violation for class Task: when selecting overriding method "Task.m()LFoo;" the class loader (instance of PreemptingClassLoader) of the selected method's type Task, and the class loader (instance of jdk/internal/loader/ClassLoaders$AppClassLoader) for its super type J have different Class objects for the type Foo used in the signature > > itableLdrConstraint test: > loader constraint violation in interface itable initialization for class C: when selecting method "I.m()LFoo;" the class loader (instance of PreemptingClassLoader) for super interface I, and the class loader (instance of jdk/internal/loader/ClassLoaders$AppClassLoader) of the selected method's type, J have different Class objects for the type Foo used in the signature >> >> That also changes the error message in the tests - at least in vtableLdrConstraint/Test.java >> vtableAME/Test.java line 46: inheriting - extra "t" > Fixed. >> >> Thank you so much for covering so many test cases. >> >> thanks, >> Karen >> >> >>> On Sep 25, 2017, at 11:21 AM, harold seigel wrote: >>> >>> Hi, >>> >>> Please review this JDK-10 change to fix JDK-8186092. The change prevents the checking of loader constraints during vtable and itable creation if the selected method is an overpass method. Overpass methods are created by the JVM to throw exceptions and so should not be subjected to loader constraint checking. >>> >>> Additionally, this change improves the LinkageError exception error text when a loader constraint violation occurs during vtable and itable creation. >>> >>> The fix includes four new tests, one test each to check that loader constraint checking is not done for overpass methods during vtable and itable creation, and one test each to test the new vtable and itable loader constraint error messages.
>>> >>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8186092/webrev/ >>> >>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186092 >>> >>> The change was tested with the JCK Lang and VM tests, the JTreg hotspot, java/io, java/lang, java/util, and other tests, the co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. >>> >>> Thanks, Harold >>> > From harold.seigel at oracle.com Thu Sep 28 17:05:49 2017 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 28 Sep 2017 13:05:49 -0400 Subject: RFR 8186092: Unnecessary loader constraints produced when there are multiple defaults In-Reply-To: <4b109fe5-d6d1-9fb1-7aec-8208d21d06f2@oracle.com> References: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com> <86af3584-89ff-1819-1d20-736dad01a42c@oracle.com> <4b109fe5-d6d1-9fb1-7aec-8208d21d06f2@oracle.com> Message-ID: Hi Coleen, I looked into moving the error message code into a function but there wasn't enough commonality to make it worthwhile. The new function would require several parameters, making it rather clunky. Thanks, Harold On 9/27/2017 10:26 AM, coleen.phillimore at oracle.com wrote: > > Harold, > > Since you changed both instances of the mostly duplicated error > message, would this be a good time to make it a function? > > thanks, > Coleen > > On 9/27/17 9:38 AM, harold seigel wrote: >> Hi David, >> >> Please review this updated webrev at: >> >> http://cr.openjdk.java.net/~hseigel/bug_8186092.3/webrev/index.html >> >> The only change from the previous webrev involves adding the assert() >> at lines 1202 and 1203 of klassVtable.cpp. >> >> Thanks, Harold >> >> On 9/26/2017 8:25 PM, David Holmes wrote: >>> Hi Harold, >>> >>> On 27/09/2017 5:13 AM, harold seigel wrote: >>>> Hi David, >>>> >>>> Thanks for looking at this change! Please see updated webrev at: >>>> >>>> http://cr.openjdk.java.net/~hseigel/bug_8186092.2/webrev/index.html >>> >>> Test changes seem fine. >>> >>>> and also see comments embedded below. >>> >>> Follow up below.
>>> >>>> Thanks, Harold >>>> >>>> >>>> On 9/26/2017 3:30 AM, David Holmes wrote: >>>>> Hi Harold, >>>>> >>>>> This looks okay to me. A few comments below but only one real query. >>>>> >>>>> On 26/09/2017 1:21 AM, harold seigel wrote: >>>>>> Hi, >>>>>> >>>>>> Please review this JDK-10 change to fix JDK-8186092. The change >>>>>> prevents the checking of loader constraints during vtable and >>>>>> itable creation if the selected method is an overpass method. >>>>>> Overpass methods are created by the JVM to throw exceptions and >>>>>> so should not be subjected to loader constraint checking. >>>>> >>>>> Okay. >>>>> >>>>>> Additionally, this change improves the LinkageError exception >>>>>> error text when a loader constraint violation occurs during >>>>>> vtable and itable creation. >>>>> >>>>> Hmmm :) I think I put those in initially. Not sure I 100% agree >>>>> with the changed terminology, but I'll defer to you as the current >>>>> expert in this area. :) >>>> I'm hoping better experts also review the changed messages. >>>>> >>>>>> The fix includes four new tests, one test each to check that >>>>>> loader constraint checking is not done for overpass methods >>>>>> during vtable and itable creation, and one test each to test the >>>>>> new vtable and itable loader constraint error messages. >>>>> >>>>> *.jasm: can you add a comment indicating why these are jasm files >>>>> as it is not obvious to me what is special about them. >>>> Thanks for pointing this out. I converted the two Task.jasm files >>>> to Task.java files and added a comment to the remaining .jasm file, >>>> C.jasm. >>>>> >>>>> */Test.java: >>>>> - You can place multiple files on one @compile tag (and still >>>>> list one file per line). >>>>> - you don't need to specify java.lang in the name of the exception >>>>> classes >>>> Done. >>>>> >>>>>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8186092/webrev/ >>>>> >>>>> The real query: >>>>> >>>>> 1201
if (target == NULL || !target->is_public() || >>>>> target->is_abstract() || target->is_overpass()) { >>>>> 1202 // Entry does not resolve. Leave it empty for >>>>> AbstractMethodError. >>>>> 1203 if (!(target == NULL) && !target->is_public()) { >>>>> 1204 // Stuff an IllegalAccessError throwing method in >>>>> there instead. >>>>> 1205 itableOffsetEntry::method_entry(_klass, >>>>> method_table_offset)[m->itable_index()]. >>>>> 1206 initialize(Universe::throw_illegal_access_error()); >>>>> 1207 } >>>>> >>>>> Not clear why you added the overpass check here? If it is >>>>> non-public then you're replacing it with an IllegalAccessError >>>>> instead of whatever the Overpass was going to throw. ?? >>>> Currently, all overpass methods are public methods. So, they would >>>> not get replaced with IllegalAccessError. However, in case >>>> non-public overpass methods exist in the future, I added "&& >>>> !target->is_overpass()" to line 1203. >>>> >>>> Alternatively, I considered adding an >>>> "assert(!target->is_overpass() || target->is_public(), "Non-public >>>> overpass method");" between lines 1201 and 1202 but didn't think >>>> that this code should be concerned about whether or not overpass >>>> methods are public. I also thought about adding "&& >>>> !target->is_overpass()" to line 1211 but thought it better that all >>>> checks on 'target', that prevent loader constraints checking, be >>>> done at the same place. >>> >>> Okay I see what you are trying to do now. We want overpass methods >>> to follow the "if" path at 1201, but for them it should currently be >>> a no-op. I'd be inclined to add in the assertion - the code is >>> already concerned about not processing non-public overpasses with >>> your proposed change to 1203. The assertion would ensure that anyone >>> introducing a non-public overpass has it quickly drawn to their >>> attention that doing so needs additional consideration.
>>> >>> Thanks, >>> David >>> ---- >>> >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186092 >>>>>> >>>>>> The change was tested with the JCK Lang and VM tests, the JTreg >>>>>> hotspot, java/io, java/lang, java/util, and other tests, the >>>>>> co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. >>>>>> >>>>>> Thanks, Harold >>>>>> >>>> >> > From coleen.phillimore at oracle.com Thu Sep 28 18:52:04 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 28 Sep 2017 14:52:04 -0400 Subject: RFR 8186092: Unnecessary loader constraints produced when there are multiple defaults In-Reply-To: References: <1d26ea41-e62c-5572-0e70-adc1eda37a85@oracle.com> <86af3584-89ff-1819-1d20-736dad01a42c@oracle.com> <4b109fe5-d6d1-9fb1-7aec-8208d21d06f2@oracle.com> Message-ID: <7d749a75-333c-6963-0f71-5908c5adbd70@oracle.com> Ok, like Karen said, ship it! Coleen On 9/28/17 1:05 PM, harold seigel wrote: > Hi Coleen, > > I looked into moving the error message code into a function but there > wasn't enough commonality to make it worthwhile. The new function > would require several parameters, making it rather clunky. > > Thanks, Harold > > > On 9/27/2017 10:26 AM, coleen.phillimore at oracle.com wrote: >> >> Harold, >> >> Since you changed both instances of the mostly duplicated error >> message, would this be a good time to make it a function? >> >> thanks, >> Coleen >> >> On 9/27/17 9:38 AM, harold seigel wrote: >>> Hi David, >>> >>> Please review this updated webrev at: >>> >>> http://cr.openjdk.java.net/~hseigel/bug_8186092.3/webrev/index.html >>> >>> The only change from the previous webrev involves adding the >>> assert() at lines 1202 and 1203 of klassVtable.cpp. >>> >>> Thanks, Harold >>> >>> On 9/26/2017 8:25 PM, David Holmes wrote: >>>> Hi Harold, >>>> >>>> On 27/09/2017 5:13 AM, harold seigel wrote: >>>>> Hi David, >>>>> >>>>> Thanks for looking at this change!
Please see updated webrev at: >>>>> >>>>> http://cr.openjdk.java.net/~hseigel/bug_8186092.2/webrev/index.html >>>> >>>> Test changes seem fine. >>>> >>>>> and also see comments embedded below. >>>> >>>> Follow up below. >>>> >>>>> Thanks, Harold >>>>> >>>>> >>>>> On 9/26/2017 3:30 AM, David Holmes wrote: >>>>>> Hi Harold, >>>>>> >>>>>> This looks okay to me. A few comments below but only one real query. >>>>>> >>>>>> On 26/09/2017 1:21 AM, harold seigel wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Please review this JDK-10 change to fix JDK-8186092. The change >>>>>>> prevents the checking of loader constraints during vtable and >>>>>>> itable creation if the selected method is an overpass method. >>>>>>> Overpass methods are created by the JVM to throw exceptions and >>>>>>> so should not be subjected to loader constraint checking. >>>>>> >>>>>> Okay. >>>>>> >>>>>>> Additionally, this change improves the LinkageError exception >>>>>>> error text when a loader constraint violation occurs during >>>>>>> vtable and itable creation. >>>>>> >>>>>> Hmmm :) I think I put those in initially. Not sure I 100% agree >>>>>> with the changed terminology, but I'll defer to you as the >>>>>> current expert in this area. :) >>>>> I'm hoping better experts also review the changed messages. >>>>>> >>>>>>> The fix includes four new tests, one test each to check that >>>>>>> loader constraint checking is not done for overpass methods >>>>>>> during vtable and itable creation, and one test each to test the >>>>>>> new vtable and itable loader constraint error messages. >>>>>> >>>>>> *.jasm: can you add a comment indicating why these are jasm files >>>>>> as it is not obvious to me what is special about them. >>>>> Thanks for pointing this out. I converted the two Task.jasm files >>>>> to Task.java files and added a comment to the remaining .jasm file, >>>>> C.jasm. >>>>>> >>>>>> */Test.java: >>>>>> - You can place multiple files on one @compile tag (and still >>>>>> list one file per line).
>>>>>> - you don't need to specify java.lang in the name of the >>>>>> exception classes >>>>> Done. >>>>>> >>>>>>> Open webrev: >>>>>>> http://cr.openjdk.java.net/~hseigel/bug_8186092/webrev/ >>>>>> >>>>>> The real query: >>>>>> >>>>>> 1201 if (target == NULL || !target->is_public() || >>>>>> target->is_abstract() || target->is_overpass()) { >>>>>> 1202 // Entry does not resolve. Leave it empty for >>>>>> AbstractMethodError. >>>>>> 1203 if (!(target == NULL) && !target->is_public()) { >>>>>> 1204 // Stuff an IllegalAccessError throwing method in >>>>>> there instead. >>>>>> 1205 itableOffsetEntry::method_entry(_klass, >>>>>> method_table_offset)[m->itable_index()]. >>>>>> 1206 initialize(Universe::throw_illegal_access_error()); >>>>>> 1207 } >>>>>> >>>>>> Not clear why you added the overpass check here? If it is >>>>>> non-public then you're replacing it with an IllegalAccessError >>>>>> instead of whatever the Overpass was going to throw. ?? >>>>> Currently, all overpass methods are public methods. So, they would >>>>> not get replaced with IllegalAccessError. However, in case >>>>> non-public overpass methods exist in the future, I added "&& >>>>> !target->is_overpass()" to line 1203. >>>>> >>>>> Alternatively, I considered adding an >>>>> "assert(!target->is_overpass() || target->is_public(), "Non-public >>>>> overpass method");" between lines 1201 and 1202 but didn't think >>>>> that this code should be concerned about whether or not overpass >>>>> methods are public. I also thought about adding "&& >>>>> !target->is_overpass()" to line 1211 but thought it better that >>>>> all checks on 'target', that prevent loader constraints checking, >>>>> be done at the same place. >>>> >>>> Okay I see what you are trying to do now. We want overpass methods >>>> to follow the "if" path at 1201, but for them it should currently >>>> be a no-op.
I'd be inclined to add in the assertion - the code is >>>> already concerned about not processing non-public overpasses with >>>> your proposed change to 1203. The assertion would ensure that >>>> anyone introducing a non-public overpass has it quickly drawn to >>>> their attention that doing so needs additional consideration. >>>> >>>> Thanks, >>>> David >>>> ---- >>>> >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> ----- >>>>>> >>>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186092 >>>>>>> >>>>>>> The change was tested with the JCK Lang and VM tests, the JTreg >>>>>>> hotspot, java/io, java/lang, java/util, and other tests, the >>>>>>> co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. >>>>>>> >>>>>>> Thanks, Harold >>>>>>> >>>>> >>> >> > From mikhailo.seledtsov at oracle.com Thu Sep 28 22:11:56 2017 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Thu, 28 Sep 2017 15:11:56 -0700 Subject: RFR(S): 8181592: [TESTBUG] Docker test utils and docker jdk basic test In-Reply-To: References: <6b95b720-a2cc-39e1-c0c1-6885b106ac16@oracle.com> <5F386D9A-CA05-4732-8F68-F493DD2E8E99@oracle.com> Message-ID: <59CD73AC.3080207@oracle.com> Here is the updated webrev: http://cr.openjdk.java.net/~mseledtsov/8181592.02/ Leonid, George - thank you for your comments. In addition to addressing your feedback, I also did: - implemented @requires docker.support - added dockerRunJava() method and associated data structure DockerRunOptions, for running Java tests inside docker environment, and to account for java opts, test java opts, docker opts, classes and class params - added a simple HelloWorld test case that runs HelloWorld inside a container - ran jtreg with extra Java options, make sure they are added correctly to the docker run command - added docker image cleanup after testing is done Please review. Misha On 9/27/17, 11:00 AM, mikhailo wrote: > Leonid, > > Thank you for review and constructive feedback. See my comment in line. 
> > > On 09/26/2017 11:19 AM, Leonid Mesnik wrote: >> Misha >> >> http://cr.openjdk.java.net/~mseledtsov/8181592.00/test/hotspot/jtreg/runtime/containers/docker/DockerBasicTest.java.html >> >> >> Copyright is incorrect, it needs to be updated for GPL. > Fixed >> >> Hotspot is the Oracle VM name only, so the test might fail for OpenJDK. I >> think you need to fix this check. > I see. I fixed this by using Platform.vmName which should be correct > in all cases. I double-checked with OpenJDK also. >> >> The requires check only ensures that the test is executed on 64-bit >> Linux. Does it make sense to introduce a more docker-specific check? > I agree this is a better way. I will do some prototyping; if such a > check is feasible and efficient in @requires then I will add it. >> >> >> http://cr.openjdk.java.net/~mseledtsov/8181592.00/test/hotspot/jtreg/runtime/containers/docker/Dockerfile-BasicTest.html >> >> >> Could you please explain why oraclelinux 7.0 is used as a base image >> for the test. > I have upgraded to Oracle Linux 7.2. If we have a specific requirement I > will change it to that. If we have requirements in the future to > support multiple OSes, I can add Dockerfile generation. > For these basic sanity tests I think this should suffice. >> >> http://cr.openjdk.java.net/~mseledtsov/8181592.00/test/lib/jdk/test/lib/containers/docker/DockerTestUtils.java.html >> >> >> The content looks fine. >> >> I don't see anything to clean up docker images on the system. Could >> you please explain how the tests are going to clean up images. > To clean up containers I will add "--rm" to the 'docker run' command. > This should ensure that container data is removed after the container stops. > As for the image - I use the same image name. The image will stay in > the local registry unless manually removed. I should probably do > 'docker rmi' at the end of the test to clean this up. > > > Once I implement these changes I will send the updated webrev.
> > Thank you, > Misha >> >> Leonid >> >> >>> On Sep 21, 2017, at 5:58 PM, mikhailo >> > wrote: >>> >>> Please review this initial drop of Docker test utils and a sanity >>> test. This change lays ground >>> for further test development and test utils improvement in this area. >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8181592 >>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8181592.00/ >>> >>> Testing: >>> - run this test on machine with Docker enabled - works >>> - run this test on Linux-x64 with no Docker engine or Docker >>> disabled - test skipped (as expected) >>> - run this test on automated system - in progress >>> >>> >>> Thank you, >>> Misha >>> >> > From rkennke at redhat.com Fri Sep 29 10:47:11 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 29 Sep 2017 12:47:11 +0200 Subject: RFR: 8179387: Factor out CMS specific code from GenCollectedHeap into its own subclass In-Reply-To: <51d7f145-27e3-5bfb-f1da-decd6e19858c@oracle.com> References: <3d8b55a2-a787-3051-b351-ab9b0a24f5e0@redhat.com> <47e22e86-7d7c-606f-1936-346229f39ca2@oracle.com> <9a846161-c8ac-dedf-5952-f457d546fd9a@redhat.com> <4d5e6af8-d975-7803-64c5-7295e0d56154@redhat.com> <13358626-e399-e352-1711-587416621aac@redhat.com> <27af0ad2-fe78-3536-2143-996dd42583ab@oracle.com> <4bc53aaa-b98a-8a61-73bf-d30ac3f402b8@redhat.com> <666af7f2-27e9-48c6-91e4-eaefa5289e18@redhat.com> <3ec8a6a3-5a4b-a910-f6ec-ed1c0dad4cad@oracle.com> <5417889c-5289-37cd-eb31-a2b55f70e85e@redhat.com> <088d467c-8038-60bc-1eab-b34061ad20d9@redhat.com> <7abbeec1-b353-c6e9-9827-e70f49269d50@oracle.com> <6e3b0ca0-68c9-f5b6-1cb7-ddca71df2c0e@redhat.com> <51d7f145-27e3-5bfb-f1da-decd6e19858c@oracle.com> Message-ID: <51036840-16f4-c8eb-082e-76e0a5b1534f@redhat.com> Am 19.07.2017 um 14:29 schrieb coleen.phillimore at oracle.com: > > > On 7/19/17 6:42 AM, Roman Kennke wrote: >> Am 17.07.2017 um 16:42 schrieb coleen.phillimore at oracle.com: >>> >>> On 7/17/17 8:07 AM, Roman Kennke wrote: >>>> (I included 
hotspot-runtime-dev and serviceability-dev to review >>>> vmStructs.cpp changes. See below) >>>> >>>> Hi Erik, >>>> >>>>>> Ok, added those and some more that I found. Not sure why we'd need >>>>>> #include "gc/cms/concurrentMarkSweepGeneration.hpp" ? Left that out >>>>>> for now. >>>>> Because you are accessing CMSCollector in: >>>>> >>>>> NOT_PRODUCT( >>>>> virtual size_t skip_header_HeapWords() { return >>>>> CMSCollector::skip_header_HeapWords(); } >>>>> ) >>>>> >>>>> and CMSCollector is declared in concurrentMarkSweepGeneration.hpp. An >>>>> alternative would of course be to just declare >>>>> skip_header_HeapWords() >>>>> in cmsHeap.hpp and define skip_header_HeapWords in cmsHeap.cpp, then >>>>> you only need to include concurrentMarkSweepGeneration.hpp in >>>>> cmsHeap.cpp. >>>> Ah ok, I've missed that one. Added it now. >>>> >>>>>>> IMO, I would just make the three functions above private. I know >>>>>>> they >>>>>>> are protected in GenCollectedHeap, but it should be fine to have >>>>>>> them >>>>>>> private in CMSHeap. Having them protected signals, at least to me, >>>>>>> that this class could be considered as a base class (protected >>>>>>> to me >>>>>>> reads "this can be accessed by classes inheriting from this class"), >>>>>>> and we don't want any class to inherit from CMSHeap. >>>>>> How can they be called from the superclass if they are private in >>>>>> the >>>>>> subclass? Would that work in C++? >>>>>> >>>>>> protected (to me) means visibility between super and subclasses. If >>>>>> I'd >>>>>> want to signal that I intend that to be overridden, I'd say >>>>>> 'virtual'. >>>>> It is perfectly fine to have private virtual methods in C++ (see for >>>>> example >>>>> https://stackoverflow.com/questions/2170688/private-virtual-method-in-c). >>>>> >>>>> >>>>> A virtual function only needs to be protected if a "child class" >>>>> needs >>>>> to access the function in the "parent class". 
For both gc_prologue >>>>> and >>>>> gc_epilogue, this is the case, which is why they have to be >>>>> 'protected' in GenCollectedHeap. But no class is going to derive >>>>> from >>>>> CMSHeap, so they can be private in CMSHeap. >>>> Cool. Learned something new :-) It actually makes sense. >>>> >>>> I've moved all 3 methods into the private block in CMSHeap. I left >>>> them >>>> virtual (because of missing override), and I also left them >>>> protected >>>> in GenCollectedHeap (prologue/epilogue because we need to, >>>> skip_header_HeapWords() to not confuse readers.) >>>>>>> This is for the serviceability agent. You will have to poke >>>>>>> around in >>>>>>> hotspot/src/jdk.hotspot.agent and see how GenCollectedHeap is used. >>>>>>> Unfortunately I'm not that familiar with the agent, perhaps someone >>>>>>> else can chime in here? >>>>>> Considering that the remaining references to GenCollectedHeap in >>>>>> vmStructs.cpp don't look related to CMSHeap, I'd argue that >>>>>> what I >>>>>> did is all that's needed for now. Do you agree? >>>>> Honestly, I don't know, that is why I asked if someone else with more >>>>> knowledge in this area can comment. Have you tried building and using >>>>> the SA agent with your change? You can also ask around on >>>>> hotspot-rt-dev and/or serviceability-dev. >>>> I haven't tried building SA. I poked around >>>> hotspot/src/jdk.hotspot.agent and I think it should be ok. Can >>>> somebody >>>> who knows about it confirm this? >>>> >>>> Differential webrev: >>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.07.diff/ >>>> >>>> Full webrev: >>>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.07/ >>>> >>> http://cr.openjdk.java.net/~rkennke/8179387/webrev.07/src/share/vm/runtime/vmStructs.cpp.udiff.html >>> >>> >>> >>> I'm not sure why you added this because it's not in the agent java >>> files. SA builds as a part of the whole build and there are some >>> basic tests in hotspot/test/serviceability/sa. 
If those run, you're >>> probably fine. The SA has some copied code for CMS but it appears to >>> be minimal and the team is working now on deciding what functionality >>> to provide, so I suggest not adding code that might not be used. >> It's just a constant/enum, and I added it because all the other possible >> values of that enum are also there. I can remove it though: >> >> Differential: >> http://cr.openjdk.java.net/~rkennke/8179387/webrev.08.diff/ >> > > Thanks, this is good. I don't know enough about the rest of the > change to be a reviewer, but I think you have your reviews. > Coleen > >> Full: >> http://cr.openjdk.java.net/~rkennke/8179387/webrev.08/ >> Ping? I believe this never actually went in. Is there anything missing? Do you want me to rebase it onto the single-repo and post another webrev? Roman
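
For readers following the private-virtual discussion quoted in this thread, here is a minimal, self-contained sketch of the C++ rule in question. It is not HotSpot code: BaseHeap, LeafHeap, collect() and demo() are hypothetical names made up for illustration, and only gc_prologue/gc_epilogue borrow their names from the thread. It shows that a base class can still dispatch to an override that the subclass declares private, and that the base version has to be protected only if the subclass wants to call it.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a base heap class; NOT HotSpot's GenCollectedHeap.
class BaseHeap {
public:
  virtual ~BaseHeap() {}

  // Public entry point that calls the virtual hooks.
  std::string collect() {
    return gc_prologue() + "collect;" + gc_epilogue();
  }

protected:
  // Protected so that a subclass may call the base version explicitly.
  virtual std::string gc_prologue() { return "base-prologue;"; }
  virtual std::string gc_epilogue() { return "base-epilogue;"; }
};

// Hypothetical leaf class (plays the role CMSHeap has in the discussion).
// Nothing is meant to derive from it, so its overrides are private.
class LeafHeap : public BaseHeap {
private:
  // 'virtual' is repeated without 'override' so this also compiles pre-C++11.
  virtual std::string gc_prologue() { return "leaf-prologue;"; }

  // Calling BaseHeap::gc_epilogue() works only because it is protected there.
  virtual std::string gc_epilogue() {
    return BaseHeap::gc_epilogue() + "leaf-epilogue;";
  }
};

// Runs a collection through a base-class reference and returns the call trace.
std::string demo() {
  LeafHeap heap;
  BaseHeap& base = heap;
  std::string trace = base.collect();
  assert(!trace.empty());
  return trace;
}
```

Calling demo() from a main function returns the trace of hooks that ran. Access control is checked at the call site against the static type (BaseHeap), while dispatch picks the dynamic type's override, which is why the private overrides are still reached.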