From david.holmes at oracle.com Mon Aug 3 05:51:58 2015 From: david.holmes at oracle.com (David Holmes) Date: Mon, 3 Aug 2015 15:51:58 +1000 Subject: RFR (S) 8080298: Clean up os::...::supports_variable_stack_size() Message-ID: <55BF017E.3070801@oracle.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8080298 webrev: http://cr.openjdk.java.net/~dholmes/8080298/webrev/ To support the old LinuxThreads implementation we had to distinguish between threading libraries with fixed-stack-size threads (LinuxThreads), and variable-stack-sized-threads (NPTL). As LinuxThreads support was removed years ago and all the code related to it has now been removed, we have a situation where supports_variable_stack_size() is always true and so the function and its use can be removed. While this notion was only ever relevant to Linux it also got copied across to the BSD and AIX ports. Thanks, David From kim.barrett at oracle.com Mon Aug 3 06:15:51 2015 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 3 Aug 2015 02:15:51 -0400 Subject: RFR (S) 8080298: Clean up os::...::supports_variable_stack_size() In-Reply-To: <55BF017E.3070801@oracle.com> References: <55BF017E.3070801@oracle.com> Message-ID: On Aug 3, 2015, at 1:51 AM, David Holmes wrote: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8080298 > > webrev: http://cr.openjdk.java.net/~dholmes/8080298/webrev/ > > To support the old LinuxThreads implementation we had to distinguish between threading libraries with fixed-stack-size threads (LinuxThreads), and variable-stack-sized-threads (NPTL). As LinuxThreads support was removed years ago and all the code related to it has now been removed, we have a situation where supports_variable_stack_size() is always true and so the function and its use can be removed. > > While this notion was only ever relevant to Linux it also got copied across to the BSD and AIX ports. Looks good. 
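For readers less familiar with the distinction drawn above, here is a minimal sketch of what "variable stack size" means in practice with NPTL: a per-thread stack size can be requested through the standard pthread attribute calls. This is an illustration only and is not code from the webrev; create_thread_with_stack and noop are hypothetical names.

```cpp
#include <pthread.h>
#include <cstddef>

// Hypothetical thread body for the illustration: does nothing.
static void* noop(void*) { return nullptr; }

// Hypothetical helper, not HotSpot code. Requests a stack size through the
// thread attributes, starts and joins one thread with those attributes, and
// returns the stack size recorded in the attribute object (0 on any failure).
size_t create_thread_with_stack(size_t requested_bytes) {
  pthread_attr_t attr;
  if (pthread_attr_init(&attr) != 0) return 0;

  size_t actual = 0;
  pthread_t tid;
  if (pthread_attr_setstacksize(&attr, requested_bytes) == 0 &&
      pthread_attr_getstacksize(&attr, &actual) == 0 &&
      pthread_create(&tid, &attr, noop, nullptr) == 0) {
    pthread_join(tid, nullptr);
  } else {
    actual = 0;  // the size was rejected or the thread could not be created
  }
  pthread_attr_destroy(&attr);
  return actual;
}
```

Calling create_thread_with_stack(1024 * 1024) should hand back the requested 1 MB on a threading library that honours per-thread sizes, which is the only case left once LinuxThreads support is gone.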
From david.holmes at oracle.com Mon Aug 3 06:40:12 2015 From: david.holmes at oracle.com (David Holmes) Date: Mon, 3 Aug 2015 16:40:12 +1000 Subject: RFR (S) 8080298: Clean up os::...::supports_variable_stack_size() In-Reply-To: References: <55BF017E.3070801@oracle.com> Message-ID: <55BF0CCC.6090905@oracle.com> Thanks Kim! David On 3/08/2015 4:15 PM, Kim Barrett wrote: > On Aug 3, 2015, at 1:51 AM, David Holmes wrote: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8080298 >> >> webrev: http://cr.openjdk.java.net/~dholmes/8080298/webrev/ >> >> To support the old LinuxThreads implementation we had to distinguish between threading libraries with fixed-stack-size threads (LinuxThreads), and variable-stack-sized-threads (NPTL). As LinuxThreads support was removed years ago and all the code related to it has now been removed, we have a situation where supports_variable_stack_size() is always true and so the function and its use can be removed. >> >> While this notion was only ever relevant to Linux it also got copied across to the BSD and AIX ports. > > Looks good. > From volker.simonis at gmail.com Mon Aug 3 08:58:22 2015 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 3 Aug 2015 10:58:22 +0200 Subject: RFR (S) 8080298: Clean up os::...::supports_variable_stack_size() In-Reply-To: <55BF017E.3070801@oracle.com> References: <55BF017E.3070801@oracle.com> Message-ID: Hi David, the change looks good. Thanks for fixing our platforms as well. Regards, Volker On Mon, Aug 3, 2015 at 7:51 AM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8080298 > > webrev: http://cr.openjdk.java.net/~dholmes/8080298/webrev/ > > To support the old LinuxThreads implementation we had to distinguish > between threading libraries with fixed-stack-size threads (LinuxThreads), > and variable-stack-sized-threads (NPTL). 
As LinuxThreads support was > removed years ago and all the code related to it has now been removed, we > have a situation where supports_variable_stack_size() is always true and so > the function and its use can be removed. > > While this notion was only ever relevant to Linux it also got copied > across to the BSD and AIX ports. > > Thanks, > David > From david.holmes at oracle.com Mon Aug 3 09:09:18 2015 From: david.holmes at oracle.com (David Holmes) Date: Mon, 3 Aug 2015 19:09:18 +1000 Subject: RFR (S) 8080298: Clean up os::...::supports_variable_stack_size() In-Reply-To: References: <55BF017E.3070801@oracle.com> Message-ID: <55BF2FBE.7020607@oracle.com> Thanks Volker! David On 3/08/2015 6:58 PM, Volker Simonis wrote: > Hi David, > > the change looks good. > Thanks for fixing our platforms as well. > > Regards, > Volker > > > On Mon, Aug 3, 2015 at 7:51 AM, David Holmes > wrote: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8080298 > > webrev: http://cr.openjdk.java.net/~dholmes/8080298/webrev/ > > To support the old LinuxThreads implementation we had to distinguish > between threading libraries with fixed-stack-size threads > (LinuxThreads), and variable-stack-sized-threads (NPTL). As > LinuxThreads support was removed years ago and all the code related > to it has now been removed, we have a situation where > supports_variable_stack_size() is always true and so the function > and its use can be removed. > > While this notion was only ever relevant to Linux it also got copied > across to the BSD and AIX ports. > > Thanks, > David > > From thomas.stuefe at gmail.com Mon Aug 3 13:20:57 2015 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Mon, 3 Aug 2015 15:20:57 +0200 Subject: RFR (S) 8080298: Clean up os::...::supports_variable_stack_size() In-Reply-To: <55BF017E.3070801@oracle.com> References: <55BF017E.3070801@oracle.com> Message-ID: Hi David, looks good, thank you for doing this. 
Kind Regards, Thomas On Mon, Aug 3, 2015 at 7:51 AM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8080298 > > webrev: http://cr.openjdk.java.net/~dholmes/8080298/webrev/ > > To support the old LinuxThreads implementation we had to distinguish > between threading libraries with fixed-stack-size threads (LinuxThreads), > and variable-stack-sized-threads (NPTL). As LinuxThreads support was > removed years ago and all the code related to it has now been removed, we > have a situation where supports_variable_stack_size() is always true and so > the function and its use can be removed. > > While this notion was only ever relevant to Linux it also got copied > across to the BSD and AIX ports. > > Thanks, > David > From thomas.stuefe at gmail.com Mon Aug 3 15:38:18 2015 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Mon, 3 Aug 2015 17:38:18 +0200 Subject: RFR: 8130212: Thread::current() might access freed memory on Solaris In-Reply-To: <55B868A3.2080602@oracle.com> References: <55B868A3.2080602@oracle.com> Message-ID: Hi David, we added compiler-level TLS (__thread) for Linux a while ago to do exactly what you are doing, but then had to remove it again. Bugs in the glibc caused memory leaks - basically it looked like glibc was not cleaning TLS related structures. Leak was small but it added up over time. I know you implemented this for Solaris. Just thought I give you a warning, maybe this is something to keep in mind. Thanks, Thomas On Wed, Jul 29, 2015 at 7:46 AM, David Holmes wrote: > Summary: replace complex custom code for maintaining ThreadLocalStorage > with compiler supported thread-local variables (Solaris only) > > This is a non-public bug so let me explain with some background, the bug, > and then the fix - which involves lots of complex-code deletion and > addition of some very simple code. 
:) > > webrev: http://cr.openjdk.java.net/~dholmes/8130212/webrev/ > > In various parts of the runtime and in compiler generated code we need to > get a reference to the (VM-level) Thread* of the currently executing > thread. This is what Thread::current() returns. For performance reasons we > also have a fast-path on 64-bit where the Thread* is stashed away in a > register (g7 on sparc, r15 on x64). > > So Thread::current() is actually a slow-path mechanism and it delegates to > ThreadLocalStorage::thread(). > > On some systems ThreadLocalStorage::thread utilizes a caching mechanism to > try and speed up access to the current thread. Otherwise it calls into yet > another "slow" path which uses the available platform > thread-specific-storage APIs. > > Compiled code also has a slow-path get_thread() method which uses assembly > code to invoke the same platform thread-specific-storage APIs (in some > cases - on sparc it simply calls ThreadLocalStorage::thread()). > > On Solaris 64-bit (which is all we support today) there is a simple > 1-level thread cache which is an array of Thread*. If a thread doesn't find > itself in the slot for the hash of its id it inserts itself there. As a > thread terminates it clears out its ThreadLocalStorage values including any > cached reference. > > The bug is that we have potential for a read-after-free error due to this > code: > > 46 uintptr_t raw = pd_raw_thread_id(); > 47 int ix = pd_cache_index(raw); // hashes id > 48 Thread* candidate = ThreadLocalStorage::_get_thread_cache[ix]; > 49 if (candidate->self_raw_id() == raw) { > 50 // hit > 51 return candidate; > 52 } else { > 53 return ThreadLocalStorage::get_thread_via_cache_slowly(raw, ix); > 54 } > > The problem is that the value read as candidate could be a thread that > (after line 48) terminated and was freed. But line #49 then reads the raw > id of that thread, which is then a read-after-free - which is a "Bad Thing > (TM)". 
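To make the window described above easier to see, here is a simplified, single-threaded model of a one-level cache lookup. It is only a sketch: the Thread struct, kCacheSlots, cache_insert and cached_lookup are hypothetical stand-ins and not the HotSpot sources, and the null check is an assumption added so the model is safe to run. The actual race cannot be demonstrated safely, so it is marked with comments instead.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the VM-level Thread; only the raw id matters here.
struct Thread {
  uintptr_t raw_id;
  uintptr_t self_raw_id() const { return raw_id; }
};

const size_t kCacheSlots = 64;          // assumed cache size, for illustration
static Thread* g_cache[kCacheSlots];    // zero-initialised: all slots empty

// Hashes a raw thread id to a cache slot.
size_t cache_index(uintptr_t raw) { return raw % kCacheSlots; }

// A thread publishes itself in the slot for the hash of its id.
void cache_insert(Thread* t) { g_cache[cache_index(t->raw_id)] = t; }

// Returns the cached Thread* on a hit, or nullptr where the real code would
// fall back to the slow path.
Thread* cached_lookup(uintptr_t raw) {
  Thread* candidate = g_cache[cache_index(raw)];  // step 1: read the slot
  // Window: in a multi-threaded VM the thread that owns this slot can
  // terminate and be freed right here, after the read above.
  if (candidate != nullptr &&
      candidate->self_raw_id() == raw) {          // step 2: dereference it
    return candidate;                             // hit
  }
  return nullptr;                                 // miss
}
```

The hazard is that step 2 dereferences a pointer obtained in step 1 with nothing keeping the pointee alive in between, which is why no local tweak to the lookup closes the gap.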
> > There's no simple fix for the caching code - you would need a completely > different approach (or synchronization that would nullify the whole point > of the cache). > > Now all this ThreadLocalStorage code is pretty old and was put in place to > deal with inadequacies of the system-provided thread-specific-storage API. > In fact on Solaris we even by-pass the public API > (thr_getspecific/thr_setspecific) when we can and implement our own version > using lower-level APIs available in the T1/T2 threading libraries! > > In mid-2015 things have changed considerably and we have reliable and > performant support for thread-local variables at the C++ language level. So > the way to maintain the current thread is simply using: > > // Declaration of thread-local variable > static __thread Thread * _thr_current; > > inline Thread* ThreadLocalStorage::thread() { > return _thr_current; > } > > inline void ThreadLocalStorage::set_thread(Thread* thread) { > _thr_current = thread; > } > > And all the complex ThreadLocalStorage code, with caching etc., vanishes! > > For my next trick I plan to try and remove the ThreadLocalStorage class > completely by using language-based thread-locals on all platforms. But for > now this is just Solaris and so we still need the ThreadLocalStorage API. > However a lot of that API is not needed any more on Solaris so I have > excluded it from there in the shared code (ifndef SOLARIS). But to avoid > changing other shared-code callsites of ThreadLocalStorage I've kept part > of the API with trivial implementations on Solaris. > > Testing: JPRT > All hotspot regression tests > > I'm happy to run more tests but the nice thing about such low-level code > is that if it is broken, it is always broken :) Every use of > Thread::current or MacroAssembler::get_thread now hits this code. > > Performance: I've run a basic set of benchmarks that is readily available > to me on our performance testing system. 
The best way to describe the > result is neutral. There are some slight wins, and some slight losses, with > most showing no statistical difference. And even the "wins" and "losses" > are within the natural variations of the benchmarks. So a lot of complex > code has been replaced by simple code and we haven't lost any observable > performance - which seems like a win to me. > > Also product mode x64 libjvm.so has shrunk by 921KB - which is a little > surprising but very nice. > > Thanks, > David > From coleen.phillimore at oracle.com Mon Aug 3 17:23:13 2015 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 03 Aug 2015 13:23:13 -0400 Subject: RFR (S) 8080298: Clean up os::...::supports_variable_stack_size() In-Reply-To: <55BF017E.3070801@oracle.com> References: <55BF017E.3070801@oracle.com> Message-ID: <55BFA381.30707@oracle.com> Looks good! Coleen On 8/3/15 1:51 AM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8080298 > > webrev: http://cr.openjdk.java.net/~dholmes/8080298/webrev/ > > To support the old LinuxThreads implementation we had to distinguish > between threading libraries with fixed-stack-size threads > (LinuxThreads), and variable-stack-sized-threads (NPTL). As > LinuxThreads support was removed years ago and all the code related to > it has now been removed, we have a situation where > supports_variable_stack_size() is always true and so the function and > its use can be removed. > > While this notion was only ever relevant to Linux it also got copied > across to the BSD and AIX ports. 
> > Thanks, > David From yumin.qi at oracle.com Mon Aug 3 18:25:12 2015 From: yumin.qi at oracle.com (Yumin Qi) Date: Mon, 3 Aug 2015 11:25:12 -0700 Subject: RFR:8075253: Multiversion JAR feature: AppCDS does not support MV-JARs Message-ID: <55BFB208.7000705@oracle.com> Hi, Can I have your codereview on: bug: https://bugs.openjdk.java.net/browse/JDK-8075253 webrev: http://cr.openjdk.java.net/~minqi/8075253/webrev02/ Summary: Multiple versioned jars will be supported by JEP: JEP 238: Multiple-Release jar files. This change adds support for MVJ's in the CDS shared-archive dumping process. Currently JEP 238 has not been integrated, hotspot changes will be pushed after JEP 238 integration. Thanks Yumin From yumin.qi at oracle.com Mon Aug 3 18:37:38 2015 From: yumin.qi at oracle.com (Yumin Qi) Date: Mon, 3 Aug 2015 11:37:38 -0700 Subject: RFR:8075253: Multiversion JAR feature: AppCDS does not support MV-JARs In-Reply-To: <55BFB208.7000705@oracle.com> References: <55BFB208.7000705@oracle.com> Message-ID: <55BFB4F2.4010007@oracle.com> Tests: JPRT, jtreg, runtime quick test list (in testing ...). Thanks Yumin On 8/3/2015 11:25 AM, Yumin Qi wrote: > Hi, Can I have your codereview on: > > bug: https://bugs.openjdk.java.net/browse/JDK-8075253 > webrev: http://cr.openjdk.java.net/~minqi/8075253/webrev02/ > > Summary: Multiple versioned jars will be supported by JEP: JEP 238: > Multiple-Release jar files. This change adds support for MVJ's in the > CDS shared-archive dumping process. > > Currently JEP 238 has not been integrated, hotspot changes will be > pushed after JEP 238 integration. 
> > Thanks > Yumin From david.holmes at oracle.com Mon Aug 3 20:22:28 2015 From: david.holmes at oracle.com (David Holmes) Date: Tue, 4 Aug 2015 06:22:28 +1000 Subject: RFR: 8130212: Thread::current() might access freed memory on Solaris In-Reply-To: References: <55B868A3.2080602@oracle.com> Message-ID: <55BFCD84.2040102@oracle.com> Hi Thomas, On 4/08/2015 1:38 AM, Thomas Stüfe wrote: > Hi David, > > we added compiler-level TLS (__thread) for Linux a while ago to do > exactly what you are doing, but then had to remove it again. Bugs in the > glibc caused memory leaks - basically it looked like glibc was not > cleaning TLS related structures. Leak was small but it added up over time. > > I know you implemented this for Solaris. Just thought I give you a > warning, maybe this is something to keep in mind. Thanks for the heads-up! Linux et al are next on the list. I'll put together a simple thread creation test and see if the memory use changes over time. David > Thanks, Thomas > > > On Wed, Jul 29, 2015 at 7:46 AM, David Holmes > wrote: > > Summary: replace complex custom code for maintaining > ThreadLocalStorage with compiler supported thread-local variables > (Solaris only) > > This is a non-public bug so let me explain with some background, the > bug, and then the fix - which involves lots of complex-code deletion > and addition of some very simple code. :) > > webrev: http://cr.openjdk.java.net/~dholmes/8130212/webrev/ > > In various parts of the runtime and in compiler generated code we > need to get a reference to the (VM-level) Thread* of the currently > executing thread. This is what Thread::current() returns. For > performance reasons we also have a fast-path on 64-bit where the > Thread* is stashed away in a register (g7 on sparc, r15 on x64). > > So Thread::current() is actually a slow-path mechanism and it > delegates to ThreadLocalStorage::thread(). 
> > On some systems ThreadLocalStorage::thread utilizes a caching > mechanism to try and speed up access to the current thread. > Otherwise it calls into yet another "slow" path which uses the > available platform thread-specific-storage APIs. > > Compiled code also has a slow-path get_thread() method which uses > assembly code to invoke the same platform thread-specific-storage > APIs (in some cases - on sparc it simply calls > ThreadLocalStorage::thread()). > > On Solaris 64-bit (which is all we support today) there is a simple > 1-level thread cache which is an array of Thread*. If a thread > doesn't find itself in the slot for the hash of its id it inserts > itself there. As a thread terminates it clears out its > ThreadLocalStorage values including any cached reference. > > The bug is that we have potential for a read-after-free error due to > this code: > > 46 uintptr_t raw = pd_raw_thread_id(); > 47 int ix = pd_cache_index(raw); // hashes id > 48 Thread* candidate = ThreadLocalStorage::_get_thread_cache[ix]; > 49 if (candidate->self_raw_id() == raw) { > 50 // hit > 51 return candidate; > 52 } else { > 53 return > ThreadLocalStorage::get_thread_via_cache_slowly(raw, ix); > 54 } > > The problem is that the value read as candidate could be a thread > that (after line 48) terminated and was freed. But line #49 then > reads the raw id of that thread, which is then a read-after-free - > which is a "Bad Thing (TM)". > > There's no simple fix for the caching code - you would need a > completely different approach (or synchronization that would nullify > the whole point of the cache). > > Now all this ThreadLocalStorage code is pretty old and was put in > place to deal with inadequacies of the system provided > thread-specific-storage API. In fact on Solaris we even by-pass the > public API (thr_getspecific/thr_setspecific) when we can and > implement our own version using lower-level APIs available in the > T1/T2 threading libraries! 
> > In mid-2015 things have changed considerably and we have reliable > and performant support for thread-local variables at the C+ > language-level. So the way to maintain the current thread is simply > using: > > // Declaration of thread-local variable > static __thread Thread * _thr_current > > inline Thread* ThreadLocalStorage::thread() { > return _thr_current; > } > > inline void ThreadLocalStorage::set_thread(Thread* thread) { > _thr_current = thread; > } > > And all the complex ThreadLocalStorage code with caching etc all > vanishes! > > For my next trick I plan to try and remove the ThreadLocalStorage > class completely by using language-based thread-locals on all > platforms. But for now this is just Solaris and so we still need the > ThreadLocalStorage API. However a lot of that API is not needed any > more on Solaris so I have excluded it from there in the shared code > (ifndef SOLARIS). But to avoid changing other shared-code callsites > of ThreadLocalStorage I've kept part of the API with trivial > implementations on Solaris. > > Testing: JPRT > All hotspot regression tests > > I'm happy to run more tests but the nice thing about such low-level > code is that if it is broken, it is always broken :) Every use of > Thread::current or MacroAssembler::get_thread now hits this code. > > Performance: I've run a basic set of benchmarks that is readily > available to me on our performance testing system. The best way to > describe the result is neutral. There are some slight wins, and some > slight losses, with most showing no statistical difference. And even > the "wins" and "losses" are within the natural variations of the > benchmarks. So a lot of complex code has been replaced by simple > code and we haven't lost any observable performance - which seems > like a win to me. > > Also product mode x64 libjvm.so has shrunk by 921KB - which is a > little surprising but very nice. 
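As a rough, portable illustration of the replacement described above, the sketch below keeps the current-thread pointer in a language-level thread-local variable. It makes a few assumptions: it uses C++11 thread_local rather than the compiler-specific __thread keyword in the quoted code, and the Thread class, current_thread, set_current_thread and worker_sees_own_thread are hypothetical names, not the HotSpot API.

```cpp
#include <thread>

// Hypothetical stand-in for the VM-level Thread class.
class Thread {
 public:
  explicit Thread(int id) : _id(id) {}
  int id() const { return _id; }
 private:
  int _id;
};

// One pointer per OS thread; the compiler and runtime supply the storage.
static thread_local Thread* _thr_current = nullptr;

inline Thread* current_thread() { return _thr_current; }
inline void set_current_thread(Thread* thread) { _thr_current = thread; }

// Starts a fresh OS thread, registers t as its current Thread, and reports
// whether the lookup on that thread returns t.
bool worker_sees_own_thread(Thread* t) {
  bool ok = false;
  std::thread worker([&ok, t] {
    set_current_thread(t);
    ok = (current_thread() == t);
  });
  worker.join();  // join orders the write to ok before the read below
  return ok;
}
```

Each OS thread sees only its own value, so there is no shared cache slot that another thread could leave pointing at freed memory; the thread that never called set_current_thread still reads nullptr.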
> > Thanks, > David > > From david.holmes at oracle.com Mon Aug 3 20:47:46 2015 From: david.holmes at oracle.com (David Holmes) Date: Tue, 4 Aug 2015 06:47:46 +1000 Subject: RFR (S) 8080298: Clean up os::...::supports_variable_stack_size() In-Reply-To: <55BFA381.30707@oracle.com> References: <55BF017E.3070801@oracle.com> <55BFA381.30707@oracle.com> Message-ID: <55BFD372.8070104@oracle.com> Thanks Coleen! David On 4/08/2015 3:23 AM, Coleen Phillimore wrote: > > Looks good! > Coleen > > On 8/3/15 1:51 AM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8080298 >> >> webrev: http://cr.openjdk.java.net/~dholmes/8080298/webrev/ >> >> To support the old LinuxThreads implementation we had to distinguish >> between threading libraries with fixed-stack-size threads >> (LinuxThreads), and variable-stack-sized-threads (NPTL). As >> LinuxThreads support was removed years ago and all the code related to >> it has now been removed, we have a situation where >> supports_variable_stack_size() is always true and so the function and >> its use can be removed. >> >> While this notion was only ever relevant to Linux it also got copied >> across to the BSD and AIX ports. >> >> Thanks, >> David > From staffan.larsen at oracle.com Tue Aug 4 09:12:54 2015 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 4 Aug 2015 11:12:54 +0200 Subject: RFR[ 9u-dev] JDK-8075773 - jps running as root fails after the fix of JDK-8050807 In-Reply-To: <55BB1AC1.4040909@oracle.com> References: <55B6653A.3070007@oracle.com> <55B66BD6.6080307@oracle.com> <55B6A604.4010206@oracle.com> <55BB1AC1.4040909@oracle.com> Message-ID: <7F24DE1F-5944-472A-AD58-A8B31A5A4438@oracle.com> Looks good! /Staffan > On 31 jul 2015, at 08:50, cheleswer sahu wrote: > > Hi, > Thanks Dmitry and Jerry for your review comments. I have fixed the spacing and indentation issue. 
> > Update web review link: http://cr.openjdk.java.net/~poonam/8075773/webrev.01/ > > Regards, > Cheleswer > > On 7/28/2015 3:13 AM, Gerald Thornbrugh wrote: >> Hi Cheleswer, >> >> Other than the issues Dmitry mentioned below your changes look good. >> >> I am also not a "Reviewer". >> >> Thanks! >> >> Jerry >>> Cheleswer, >>> >>> src/os/linux/vm/perfMemory_linux.cpp >>> >>> 220 space missing after // >>> 222 space missing after != >>> >>> src/os/solaris/vm/perfMemory_solaris.cpp >>> >>> 222 extra space before // (wrong indent) >>> >>> Otherwise looks good. (not a Reviewer) >>> >>> -Dmitry >>> >>> >>> On 2015-07-27 20:07, cheleswer sahu wrote: >>>> Hi, >>>> >>>> Please review the code changes for >>>> "https://bugs.openjdk.java.net/browse/JDK-8075773" . >>>> Web review Link: http://cr.openjdk.java.net/~poonam/8075773/webrev.00/ >>>> >>>> Bug brief: This bug was introduced after the fix of JDK-8050807. JPS >>>> reads the process information from >>>> "/tmp/hsperfdata_$username_$ProcessID". In order to ensure the file is >>>> secure to open and read, it tries to match the UID with the effective >>>> user ID of that file. When JPS is run as the root user this check fails. >>>> >>>> Fix: If JPS is running as the root user, then the check that matches the UID >>>> against the effective user ID is skipped. >>>> >>>> I have tested this fix; it works fine and I found no security issues. >>>> >>>> >>>> Regards, >>>> Cheleswer >>> >> > From dmitry.dmitriev at oracle.com Tue Aug 4 09:14:16 2015 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Tue, 4 Aug 2015 12:14:16 +0300 Subject: RFR (XS): 8132892: Memory must be freed after calling Arguments::set_sysclasspath function Message-ID: <55C08268.8090900@oracle.com> Hello, Please review this small fix for a small memory leak. I also need a sponsor who can push this fix. The Arguments::set_sysclasspath function calls the set_value method of the SystemProperty class, which copies the passed value. 
In several code paths memory is allocated for a string, which is then passed to Arguments::set_sysclasspath. The allocated string should therefore be freed after calling Arguments::set_sysclasspath. Webrev: http://cr.openjdk.java.net/~ddmitriev/8132892/webrev.00/ JBS: https://bugs.openjdk.java.net/browse/JDK-8132892 Tested: JPRT(hotspot test set), hotspot all, vm.quick Thanks, Dmitry From thomas.stuefe at gmail.com Tue Aug 4 09:32:11 2015 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 4 Aug 2015 11:32:11 +0200 Subject: RFR: 8130212: Thread::current() might access freed memory on Solaris In-Reply-To: <55BFCD84.2040102@oracle.com> References: <55B868A3.2080602@oracle.com> <55BFCD84.2040102@oracle.com> Message-ID: Hi David, On Mon, Aug 3, 2015 at 10:22 PM, David Holmes wrote: > Hi Thomas, > > On 4/08/2015 1:38 AM, Thomas Stüfe wrote: > >> Hi David, >> >> we added compiler-level TLS (__thread) for Linux a while ago to do >> exactly what you are doing, but then had to remove it again. Bugs in the >> glibc caused memory leaks - basically it looked like glibc was not >> cleaning TLS related structures. Leak was small but it added up over time. >> >> I know you implemented this for Solaris. Just thought I give you a >> warning, maybe this is something to keep in mind. >> > > Thanks for the heads-up! Linux et al are next on the list. I'll put > together a simple thread creation test and see if the memory use changes > over time. > > Sounds good. Unfortunately this was a gcc/glibc bug and therefore you may not see the bug on every Linux system. 
I think this may have been the bug: https://sourceware.org/bugzilla/show_bug.cgi?id=12650 Kind Regards, Thomas David > > Thanks, Thomas >> >> >> On Wed, Jul 29, 2015 at 7:46 AM, David Holmes > > wrote: >> >> Summary: replace complex custom code for maintaining >> ThreadLocalStorage with compiler supported thread-local variables >> (Solaris only) >> >> This is a non-public bug so let me explain with some background, the >> bug, and then the fix - which involves lots of complex-code deletion >> and addition of some very simple code. :) >> >> webrev: http://cr.openjdk.java.net/~dholmes/8130212/webrev/ >> >> In various parts of the runtime and in compiler generated code we >> need to get a reference to the (VM-level) Thread* of the currently >> executing thread. This is what Thread::current() returns. For >> performance reasons we also have a fast-path on 64-bit where the >> Thread* is stashed away in a register (g7 on sparc, r15 on x64). >> >> So Thread::current() is actually a slow-path mechanism and it >> delegates to ThreadLocalStorage::thread(). >> >> On some systems ThreadLocalStorage::thread utilizes a caching >> mechanism to try and speed up access to the current thread. >> Otherwise it calls into yet another "slow" path which uses the >> available platform thread-specific-storage APIs. >> >> Compiled code also has a slow-path get_thread() method which uses >> assembly code to invoke the same platform thread-specific-storage >> APIs (in some cases - on sparc it simply calls >> ThreadLocalStorage::thread()). >> >> On Solaris 64-bit (which is all we support today) there is a simple >> 1-level thread cache which is an array of Thread*. If a thread >> doesn't find itself in the slot for the hash of its id it inserts >> itself there. As a thread terminates it clears out its >> ThreadLocalStorage values including any cached reference. 
>> >> The bug is that we have potential for a read-after-free error due to >> this code: >> >> 46 uintptr_t raw = pd_raw_thread_id(); >> 47 int ix = pd_cache_index(raw); // hashes id >> 48 Thread* candidate = ThreadLocalStorage::_get_thread_cache[ix]; >> 49 if (candidate->self_raw_id() == raw) { >> 50 // hit >> 51 return candidate; >> 52 } else { >> 53 return >> ThreadLocalStorage::get_thread_via_cache_slowly(raw, ix); >> 54 } >> >> The problem is that the value read as candidate could be a thread >> that (after line 48) terminated and was freed. But line #49 then >> reads the raw id of that thread, which is then a read-after-free - >> which is a "Bad Thing (TM)". >> >> There's no simple fix for the caching code - you would need a >> completely different approach (or synchronization that would nullify >> the whole point of the cache). >> >> Now all this ThreadLocalStorage code is pretty old and was put in >> place to deal with inadequacies of the system provided >> thread-specific-storage API. In fact on Solaris we even by-pass the >> public API (thr_getspecific/thr_setspecific) when we can and >> implement our own version using lower-level APIs available in the >> T1/T2 threading libraries! >> >> In mid-2015 things have changed considerably and we have reliable >> and performant support for thread-local variables at the C+ >> language-level. So the way to maintain the current thread is simply >> using: >> >> // Declaration of thread-local variable >> static __thread Thread * _thr_current >> >> inline Thread* ThreadLocalStorage::thread() { >> return _thr_current; >> } >> >> inline void ThreadLocalStorage::set_thread(Thread* thread) { >> _thr_current = thread; >> } >> >> And all the complex ThreadLocalStorage code with caching etc all >> vanishes! >> >> For my next trick I plan to try and remove the ThreadLocalStorage >> class completely by using language-based thread-locals on all >> platforms. 
But for now this is just Solaris and so we still need the >> ThreadLocalStorage API. However a lot of that API is not needed any >> more on Solaris so I have excluded it from there in the shared code >> (ifndef SOLARIS). But to avoid changing other shared-code callsites >> of ThreadLocalStorage I've kept part of the API with trivial >> implementations on Solaris. >> >> Testing: JPRT >> All hotspot regression tests >> >> I'm happy to run more tests but the nice thing about such low-level >> code is that if it is broken, it is always broken :) Every use of >> Thread::current or MacroAssembler::get_thread now hits this code. >> >> Performance: I've run a basic set of benchmarks that is readily >> available to me on our performance testing system. The best way to >> describe the result is neutral. There are some slight wins, and some >> slight losses, with most showing no statistical difference. And even >> the "wins" and "losses" are within the natural variations of the >> benchmarks. So a lot of complex code has been replaced by simple >> code and we haven't lost any observable performance - which seems >> like a win to me. >> >> Also product mode x64 libjvm.so has shrunk by 921KB - which is a >> little surprising but very nice. >> >> Thanks, >> David >> >> >> From bengt.rutisson at oracle.com Tue Aug 4 10:31:06 2015 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 04 Aug 2015 12:31:06 +0200 Subject: RFR (XS): JDK-8132953: imageDecompressor.hpp should not include precompiled.hpp Message-ID: <55C0946A.9050803@oracle.com> Hi all, Could I have a couple of reviews for this cleanup of an include statement? .hpp files should not be including precompiled.hpp. 
http://cr.openjdk.java.net/~brutisso/8132953/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8132953 Thanks, Bengt From david.holmes at oracle.com Tue Aug 4 10:45:52 2015 From: david.holmes at oracle.com (David Holmes) Date: Tue, 4 Aug 2015 20:45:52 +1000 Subject: RFR: 8130212: Thread::current() might access freed memory on Solaris In-Reply-To: References: <55B868A3.2080602@oracle.com> <55BFCD84.2040102@oracle.com> Message-ID: <55C097E0.9050908@oracle.com> On 4/08/2015 7:32 PM, Thomas Stüfe wrote: > Hi David, > > On Mon, Aug 3, 2015 at 10:22 PM, David Holmes > wrote: > > Hi Thomas, > > On 4/08/2015 1:38 AM, Thomas Stüfe wrote: > > Hi David, > > we added compiler-level TLS (__thread) for Linux a while ago to do > exactly what you are doing, but then had to remove it again. > Bugs in the > glibc caused memory leaks - basically it looked like glibc was not > cleaning TLS related structures. Leak was small but it added up > over time. > > I know you implemented this for Solaris. Just thought I give you a > warning, maybe this is something to keep in mind. > > > Thanks for the heads-up! Linux et al are next on the list. I'll put > together a simple thread creation test and see if the memory use > changes over time. > > > Sounds good. Unfortunately this was a gcc/glibc bug and therefore you > may not see the bug on every Linux system. > > I think this may have been the bug: > > https://sourceware.org/bugzilla/show_bug.cgi?id=12650 Many thanks! That explains why my simple test showed no issues - I need to implement in the VM and see what happens. 
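For anyone wanting to try something similar, a "simple thread creation test" of the kind mentioned above might look roughly like the sketch below. This is a guess at the shape of such a test, not the one actually run: churn_threads, max_rss and tls_slot are hypothetical names, and getrusage is used as an assumed, coarse way to watch memory use. As noted in the thread, a small standalone program like this may well show no growth even on an affected system.

```cpp
#include <sys/resource.h>
#include <thread>

// Thread-local slot touched by every short-lived thread.
static thread_local int tls_slot = 0;

// Peak resident set size of this process as reported by getrusage
// (the unit is platform dependent), or -1 if the call fails.
long max_rss() {
  struct rusage ru;
  if (getrusage(RUSAGE_SELF, &ru) != 0) return -1;
  return ru.ru_maxrss;
}

// Creates and joins `count` threads one after another; each writes to and
// reads back its own thread-local slot. Returns how many threads saw the
// value they wrote.
int churn_threads(int count) {
  int completed = 0;
  for (int i = 0; i < count; ++i) {
    std::thread t([&completed, i] {
      tls_slot = i + 1;
      if (tls_slot == i + 1) ++completed;
    });
    t.join();  // only one worker runs at a time, so completed is not raced
  }
  return completed;
}
```

Comparing max_rss() before and after a large churn_threads run gives a rough signal of whether per-thread TLS structures are being reclaimed.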
David > Kind Regards, Thomas > > David > > Thanks, Thomas > > > On Wed, Jul 29, 2015 at 7:46 AM, David Holmes > > >> wrote: > > Summary: replace complex custom code for maintaining > ThreadLocalStorage with compiler supported thread-local > variables > (Solaris only) > > This is a non-public bug so let me explain with some > background, the > bug, and then the fix - which involves lots of complex-code > deletion > and addition of some very simple code. :) > > webrev: http://cr.openjdk.java.net/~dholmes/8130212/webrev/ > > In various parts of the runtime and in compiler generated > code we > need to get a reference to the (VM-level) Thread* of the > currently > executing thread. This is what Thread::current() returns. For > performance reasons we also have a fast-path on 64-bit > where the > Thread* is stashed away in a register (g7 on sparc, r15 on > x64). > > So Thread::current() is actually a slow-path mechanism and it > delegates to ThreadLocalStorage::thread(). > > On some systems ThreadLocalStorage::thread utilizes a caching > mechanism to try and speed up access to the current thread. > Otherwise it calls into yet another "slow" path which uses the > available platform thread-specific-storage APIs. > > Compiled code also has a slow-path get_thread() method > which uses > assembly code to invoke the same platform > thread-specific-storage > APIs (in some cases - on sparc it simply calls > ThreadLocalStorage::thread()). > > On Solaris 64-bit (which is all we support today) there is > a simple > 1-level thread cache which is an array of Thread*. If a thread > doesn't find itself in the slot for the hash of its id it > inserts > itself there. As a thread terminates it clears out its > ThreadLocalStorage values including any cached reference. 
> > The bug is that we have potential for a read-after-free > error due to > this code: > > 46 uintptr_t raw = pd_raw_thread_id(); > 47 int ix = pd_cache_index(raw); // hashes id > 48 Thread* candidate = > ThreadLocalStorage::_get_thread_cache[ix]; > 49 if (candidate->self_raw_id() == raw) { > 50 // hit > 51 return candidate; > 52 } else { > 53 return > ThreadLocalStorage::get_thread_via_cache_slowly(raw, ix); > 54 } > > The problem is that the value read as candidate could be a > thread > that (after line 48) terminated and was freed. But line #49 > then > reads the raw id of that thread, which is then a > read-after-free - > which is a "Bad Thing (TM)". > > There's no simple fix for the caching code - you would need a > completely different approach (or synchronization that > would nullify > the whole point of the cache). > > Now all this ThreadLocalStorage code is pretty old and was > put in > place to deal with inadequacies of the system provided > thread-specific-storage API. In fact on Solaris we even > by-pass the > public API (thr_getspecific/thr_setspecific) when we can and > implement our own version using lower-level APIs available > in the > T1/T2 threading libraries! > > In mid-2015 things have changed considerably and we have > reliable > and performant support for thread-local variables at the C+ > language-level. So the way to maintain the current thread > is simply > using: > > // Declaration of thread-local variable > static __thread Thread * _thr_current > > inline Thread* ThreadLocalStorage::thread() { > return _thr_current; > } > > inline void ThreadLocalStorage::set_thread(Thread* thread) { > _thr_current = thread; > } > > And all the complex ThreadLocalStorage code with caching > etc all > vanishes! > > For my next trick I plan to try and remove the > ThreadLocalStorage > class completely by using language-based thread-locals on all > platforms. But for now this is just Solaris and so we still > need the > ThreadLocalStorage API. 
However a lot of that API is not > needed any > more on Solaris so I have excluded it from there in the > shared code > (ifndef SOLARIS). But to avoid changing other shared-code > callsites > of ThreadLocalStorage I've kept part of the API with trivial > implementations on Solaris. > > Testing: JPRT > All hotspot regression tests > > I'm happy to run more tests but the nice thing about such > low-level > code is that if it is broken, it is always broken :) Every > use of > Thread::current or MacroAssembler::get_thread now hits this > code. > > Performance: I've run a basic set of benchmarks that is readily > available to me on our performance testing system. The best > way to > describe the result is neutral. There are some slight wins, > and some > slight losses, with most showing no statistical difference. > And even > the "wins" and "losses" are within the natural variations > of the > benchmarks. So a lot of complex code has been replaced by > simple > code and we haven't lost any observable performance - which > seems > like a win to me. > > Also product mode x64 libjvm.so has shrunk by 921KB - which > is a > little surprising but very nice. > > Thanks, > David > > > From david.holmes at oracle.com Tue Aug 4 10:50:06 2015 From: david.holmes at oracle.com (David Holmes) Date: Tue, 4 Aug 2015 20:50:06 +1000 Subject: RFR (XS): JDK-8132953: imageDecompressor.hpp should not include precompiled.hpp In-Reply-To: <55C0946A.9050803@oracle.com> References: <55C0946A.9050803@oracle.com> Message-ID: <55C098DE.9010609@oracle.com> On 4/08/2015 8:31 PM, Bengt Rutisson wrote: > > Hi all, > > Could I have a couple of reviews for this cleanup of an include statement? I think one will suffice for such a trivial change. > .hpp files should not be including precompiled.hpp. Not only that but precompiled.hpp must be first in the include list. Which means: ./share/vm/classfile/imageDecompressor.cpp needs fixing. 
Thanks, David ----- > http://cr.openjdk.java.net/~brutisso/8132953/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8132953 > > Thanks, > Bengt > > > From bengt.rutisson at oracle.com Tue Aug 4 10:49:42 2015 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 04 Aug 2015 12:49:42 +0200 Subject: RFR (XS): JDK-8132953: imageDecompressor.hpp should not include precompiled.hpp In-Reply-To: <55C098DE.9010609@oracle.com> References: <55C0946A.9050803@oracle.com> <55C098DE.9010609@oracle.com> Message-ID: <55C098C6.7060102@oracle.com> Hi David, Thanks for looking at this! On 2015-08-04 12:50, David Holmes wrote: > On 4/08/2015 8:31 PM, Bengt Rutisson wrote: >> >> Hi all, >> >> Could I have a couple of reviews for this cleanup of an include >> statement? > > I think one will suffice for such a trivial change. Sounds good to me. :) > >> .hpp files should not be including precompiled.hpp. > > Not only that but precompiled.hpp must be first in the include list. > > Which means: > > ./share/vm/classfile/imageDecompressor.cpp > > needs fixing. Good catch. Here's an updated webrev: http://cr.openjdk.java.net/~brutisso/8132953/webrev.01/ Thanks, Bengt > > Thanks, > David > ----- > >> http://cr.openjdk.java.net/~brutisso/8132953/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8132953 >> >> Thanks, >> Bengt >> >> >> From david.holmes at oracle.com Tue Aug 4 11:03:01 2015 From: david.holmes at oracle.com (David Holmes) Date: Tue, 4 Aug 2015 21:03:01 +1000 Subject: RFR (XS): JDK-8132953: imageDecompressor.hpp should not include precompiled.hpp In-Reply-To: <55C098C6.7060102@oracle.com> References: <55C0946A.9050803@oracle.com> <55C098DE.9010609@oracle.com> <55C098C6.7060102@oracle.com> Message-ID: <55C09BE5.5050208@oracle.com> Ship it! :) Thanks, David On 4/08/2015 8:49 PM, Bengt Rutisson wrote: > > Hi David, > > Thanks for looking at this! 
> > On 2015-08-04 12:50, David Holmes wrote: >> On 4/08/2015 8:31 PM, Bengt Rutisson wrote: >>> >>> Hi all, >>> >>> Could I have a couple of reviews for this cleanup of an include >>> statement? >> >> I think one will suffice for such a trivial change. > > Sounds good to me. :) > >> >>> .hpp files should not be including precompiled.hpp. >> >> Not only that but precompiled.hpp must be first in the include list. >> >> Which means: >> >> ./share/vm/classfile/imageDecompressor.cpp >> >> needs fixing. > > Good catch. > > Here's an updated webrev: > http://cr.openjdk.java.net/~brutisso/8132953/webrev.01/ > > Thanks, > Bengt > >> >> Thanks, >> David >> ----- >> >>> http://cr.openjdk.java.net/~brutisso/8132953/webrev.00/ >>> https://bugs.openjdk.java.net/browse/JDK-8132953 >>> >>> Thanks, >>> Bengt >>> >>> >>> > From bengt.rutisson at oracle.com Tue Aug 4 10:58:19 2015 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 04 Aug 2015 12:58:19 +0200 Subject: RFR (XS): JDK-8132953: imageDecompressor.hpp should not include precompiled.hpp In-Reply-To: <55C09BE5.5050208@oracle.com> References: <55C0946A.9050803@oracle.com> <55C098DE.9010609@oracle.com> <55C098C6.7060102@oracle.com> <55C09BE5.5050208@oracle.com> Message-ID: <55C09ACB.2030301@oracle.com> On 2015-08-04 13:03, David Holmes wrote: > Ship it! :) Thanks, David! Bengt > > Thanks, > David > > On 4/08/2015 8:49 PM, Bengt Rutisson wrote: >> >> Hi David, >> >> Thanks for looking at this! >> >> On 2015-08-04 12:50, David Holmes wrote: >>> On 4/08/2015 8:31 PM, Bengt Rutisson wrote: >>>> >>>> Hi all, >>>> >>>> Could I have a couple of reviews for this cleanup of an include >>>> statement? >>> >>> I think one will suffice for such a trivial change. >> >> Sounds good to me. :) >> >>> >>>> .hpp files should not be including precompiled.hpp. >>> >>> Not only that but precompiled.hpp must be first in the include list. >>> >>> Which means: >>> >>> ./share/vm/classfile/imageDecompressor.cpp >>> >>> needs fixing. 
>> >> Good catch. >> >> Here's an updated webrev: >> http://cr.openjdk.java.net/~brutisso/8132953/webrev.01/ >> >> Thanks, >> Bengt >> >>> >>> Thanks, >>> David >>> ----- >>> >>>> http://cr.openjdk.java.net/~brutisso/8132953/webrev.00/ >>>> https://bugs.openjdk.java.net/browse/JDK-8132953 >>>> >>>> Thanks, >>>> Bengt >>>> >>>> >>>> >> From KAREN.KINNEAR at oracle.com Tue Aug 4 15:16:38 2015 From: KAREN.KINNEAR at oracle.com (Karen Kinnear) Date: Tue, 4 Aug 2015 11:16:38 -0400 Subject: RFR: 8087342: crash in klassItable_initialize_itable_for_interface In-Reply-To: References: <55A6F6F7.6040508@oracle.com> Message-ID: Actually, I didn't include this in the webrev, but I added some text from the email below to the is_miranda description, so the full text before the method is now ... Suggested edits are welcome. thanks, Karen // Check if a method is a miranda method, given a class's methods array, // its default_method table and its super class. // "Miranda" means an abstract non-private method that would not be // overridden for the local class. // A "miranda" method should only include non-private interface // instance methods, i.e. not private methods, not static methods, // not default methods (concrete interface methods), not overpass methods. // If a given class already has a local (including overpass) method, a // default method, or any of its superclasses has the same which would have // overridden an abstract method, then this is not a miranda method. // // Miranda methods are checked multiple times. // Pass 1: during class load/class file parsing: before vtable size calculation: // include superinterface abstract and default methods (non-private instance). // We include potential default methods to give them space in the vtable. // During the first run, the current instanceKlass has not yet been // created, the superclasses and superinterfaces do have instanceKlasses // but may not have vtables, the default_methods list is empty, no overpasses. 
// This is seen by default method creation. // // Pass 2: recalculated during vtable initialization: only include abstract methods. // The goal of pass 2 is to walk through the superinterfaces to see if any of // the superinterface methods (which were all abstract pre-default methods) // need to be added to the vtable. // With the addition of default methods, we have three new challenges: // overpasses, static interface methods and private interface methods. // Static and private interface methods do not get added to the vtable and // are not seen by the method resolution process, so we skip those. // Overpass methods are already in the vtable, so vtable lookup will // find them and we don't need to add a miranda method to the end of // the vtable. So we look for overpass methods and if they are found we // return false. Note that we inherit our superclasses vtable, so // the superclass' search also needs to use find_overpass so that if // one is found we return false. // False means - we don't need a miranda method added to the vtable. // // During the second run, default_methods is set up, so concrete methods from // superinterfaces with matching names/signatures to default_methods are already // in the default_methods list and do not need to be appended to the vtable // as mirandas. Abstract methods may already have been handled via // overpasses - either local or superclass overpasses, which may be // in the vtable already. // // Pass 3: They are also checked by link resolution and selection, // for invocation on a method (not interface method) reference that // resolves to a method with an interface as its method_holder. // Used as part of walking from the bottom of the vtable to find // the vtable index for the miranda method. // // Part of the Miranda Rights in the US mean that if you do not have // an attorney one will be appointed for you. On Jul 31, 2015, at 2:01 PM, Karen Kinnear wrote: > Lois, > > Here is an updated webrev. 
The hotspot code has not changed (except for the fixed comments). I added a test to investigate if > I could have static and instance fields in the same class (obviously with -Xverify:none). I did manage > to have 1 public static, 1 private instance and 1 overpass for an AME from an abstract interface. > Obviously none of this matches the jls, or jvms or passes the verifier - but I did want to make sure > the code did the right thing. > > The test passes in product and fastdebug. I would appreciate a review for the test itself. > Thanks to Harold for jasm example :-) > > updated webrev: http://cr.openjdk.java.net/~acorn/8087342.3/webrev/ > >>> bug: https://bugs.openjdk.java.net/browse/JDK-8087342 > > thanks, > Karen > > On Jul 16, 2015, at 6:03 PM, Karen Kinnear wrote: > >> Lois, >> >> Thank you for the detailed review. I really appreciate it. >> On Jul 15, 2015, at 8:12 PM, Lois Foltan wrote: >> >>> >>> On 7/15/2015 12:40 PM, Karen Kinnear wrote: >>>> Please review for JDK9: >>>> >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8087342 >>>> webrev: http://cr.openjdk.java.net/~acorn/8087342.2/webrev/ >>>> >>>> Crash occurs in product when we should throw IncompatibleClassChangeError. >>> >>> Hi Karen, >>> I think this looks good and I really like how straightforward klassVtable::is_miranda now is. Some minor clarification comments: >>> >>> - src/share/vm/instanceKlass.cpp >>> comments for the find_local_method* were changed to state: >>> +// note that the local methods array can have up to one overpass, one static >>> +// and one instance (private or not) with the same name/signature >>> I think there are two combinations that the code depends on not occurring, they are: >>> 1. all 3 are in existence in the local methods array (one overpass, one static and one instance) >>> 2.
the combination of one static and one instance (private or not) >>> In other words there has to be an overpass to cause more than one method with the same name/signature within the local methods array. And it is either an overpass and a static or an overpass and an instance, but not all 3. Correct me if I am wrong. >> I need to write an additional test to check that. I agree that both the spec and the ClassFileParser if _need_verify is set will >> prevent instance and static overlap. I need to see what happens if you skip verification. I will get back to you with that and update the >> comments to clarify if needed. >>> >>> - src/share/vm/oops/klassVtable.cpp >> Let me see if I can make this clearer - let me know if I can make the comments clearer. I truly appreciate your trying to see >> if this all makes sense and is consistent. It is still too complex. >>> Thank you for adding the improved comments ahead of is_miranda. My read is that overpass methods are not considered miranda methods and I agree with that statement. >> Yes, they are not considered miranda methods because you don't need to add them to the vtable as abstract methods because >> they already are in the vtable from being in the class' LOCAL methods array. >> So pass 1: overpasses do not exist >> pass 2: overpasses are already in the vtable when we calculate mirandas >> pass 3: overpasses in a class have the class as their method_holder, not an interface, so we aren't looking them up here >> >> So - pass 2 is the one that cares about the find_local_method(Klass::find_overpass vs. Klass::skip_overpass). >> >> >>> Yet, Klass::find_overpass is specified in the code. I think the code is correct, but based on the comment I would have thought Klass::skip_overpass should have been specified? >> I also think the code is correct. >> So what pass 2 is doing is walking through the superinterfaces to see if any of the superinterface methods (which all used to be abstract) >> need to be added to the vtable.
>> >> So the question is - what superinterface methods belong in the vtable? >> So the searches in is_miranda are designed to find out if there is a method in the vtable already such that we don't >> need to add the superinterface method - e.g. this was abstract and we have an implementation for it. >> >> With the addition of default methods, we have three new challenges - overpasses, static interface methods and private >> interface methods. >> >> Static and private interface methods do not get added to the vtable and are not seen by the method resolution process. >> So we skip those. >> >> Overpass methods are already in the vtable, so vtable lookup will find them there and we don't need to add a miranda method >> to the end of the vtable. So we look for those explicitly. Note that we inherit our superclasses vtable, so the superclass' search >> also needs to use find_overpass. >> >> Does this make sense? >> >> Is there a way I could make this clearer via comments? >> >>> Much like skip_static and skip_private. So based on your later statement that "Abstract methods may already have been handled via overpasses" it implies that overpass methods, although not miranda methods, can satisfy or stand in for an miranda during pass 2. So they must be found, did I understand that correctly? >> >>> >>> Again, looks good. I don't need to see another review. My comments were merely clarification based. 
>> many thanks, >> Karen >> >>> >>> Thanks, >>> Lois >>> >>> >>> >>> >>> >>> >>> >>> >>>> >>>> testing: >>>> internal tests: Defmeth (updated), SelectionResolution - product and fastdebug >>>> jprt >>>> jck >>>> jtreg/hotspot_all >>>> jtreg/jdk_streams >>>> test,noncolo.testlist, -atk quick >>>> >>>> (jck and macosx testing in progress) >>>> >>>> thanks, >>>> Karen >>>> >>> >> > From lois.foltan at oracle.com Tue Aug 4 17:38:24 2015 From: lois.foltan at oracle.com (Lois Foltan) Date: Tue, 04 Aug 2015 13:38:24 -0400 Subject: RFR: 8087342: crash in klassItable_initialize_itable_for_interface In-Reply-To: References: <55A6F6F7.6040508@oracle.com> Message-ID: <55C0F890.3000700@oracle.com> On 7/31/2015 2:01 PM, Karen Kinnear wrote: > Lois, > > Here is an updated webrev. The hotspot code has not changed (except > for the fixed comments). I added a test to investigate if > I could have static and instance fields in the same class (obviously > with -Xverify:none). I did manage > to have 1 public static, 1 private instance and 1 overpass for an AME > from an abstract interface. > Obviously none of this matches the jls, or jvms or passes the verifier > - but I did want to make sure > the code did the right thing. > > The test passes in product and fastdebug. I would appreciate a review > for the test itself. > Thanks to Harold for jasm example :-) > > updated webrev: http://cr.openjdk.java.net/~acorn/8087342.3/webrev/ > Hi Karen, Test looks good. I applied your patch to a Windows build to see if I could trigger a failure due to a local method array that had a different layout, but it looks good, the test passed with both fastdebug and product builds. Minor comments: TestStaticandInstance.java: line #99 - I think "n" should be "m" line #100 - I think the "public" should be "private" line #105 - "C.n()" should be "C.m()" At least that is the way it seems to be implemented in ASM. 
Thanks, Lois >>> bug: https://bugs.openjdk.java.net/browse/JDK-8087342 > > thanks, > Karen > > On Jul 16, 2015, at 6:03 PM, Karen Kinnear wrote: > >> Lois, >> >> Thank you for the detailed review. I really appreciate it. >> On Jul 15, 2015, at 8:12 PM, Lois Foltan wrote: >> >>> >>> On 7/15/2015 12:40 PM, Karen Kinnear wrote: >>>> Please review for JDK9: >>>> >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8087342 >>>> webrev: http://cr.openjdk.java.net/~acorn/8087342.2/webrev/ >>>> >>>> >>>> Crash occurs in product when we should throw >>>> IncompatibleClassChangeError. >>> >>> Hi Karen, >>> I think this looks good and I really like how straightforward >>> klassVtable::is_miranda now is. Some minor clarification comments: >>> >>> - src/share/vm/instanceKlass.cpp >>> comments for the find_local_method* were changed to state: >>> +// note that the local methods array can have up to one overpass, >>> one static >>> +// and one instance (private or not) with the same name/signature >>> I think there are two combinations that the code depends on not >>> occurring, they are: >>> 1. all 3 are in existence in the local methods array (one >>> overpass, one static and one instance) >>> 2. the combination of one static and one instance (private or not) >>> In other words there has to be an overpass to cause more than one >>> method with the same name/signature within the local methods array. >>> And it is either an overpass and a static or an overpass and an >>> instance, but not all 3. Correct me if I am wrong. >> I need to write an additional test to check that. I agree that both >> the spec and the ClassFileParser if _need_verify is set will >> prevent instance and static overlap. I need to see what happens if >> you skip verification. I will get back to you with that and update the >> comments to clarify if needed. >>> >>> - src/share/vm/oops/klassVtable.cpp >> Let me see if I can make this clearer - let me know if I can make the >> comments clearer.
I truly appreciate your trying to see >> if this all makes sense and is consistent. It is still too complex. >>> Thank you for adding the improved comments ahead of is_miranda. My >>> read is that overpass methods are not considered miranda methods and >>> I agree with that statement. >> Yes, they are not considered miranda methods because you don't need >> to add them to the vtable as abstract methods because >> they already are in the vtable from being in the class' LOCAL methods >> array. >> So pass 1: overpasses do not exist >> pass 2: overpasses are already in the vtable when we calculate mirandas >> pass 3: overpasses in a class have the class as their method_holder, >> not an interface, so we aren't looking them up here >> >> So - pass 2 is the one that cares about the >> find_local_method(Klass::find_overpass vs. Klass::skip_overpass). >> >> >>> Yet, Klass::find_overpass is specified in the code. I think the >>> code is correct, but based on the comment I would have thought >>> Klass::skip_overpass should have been specified? >> I also think the code is correct. >> So what pass 2 is doing is walking through the superinterfaces to see >> if any of the superinterface methods (which all used to be abstract) >> need to be added to the vtable. >> >> So the question is - what superinterface methods belong in the vtable? >> So the searches in is_miranda are designed to find out if there is a >> method in the vtable already such that we don't >> need to add the superinterface method - e.g. this was abstract and we >> have an implementation for it. >> >> With the addition of default methods, we have three new challenges - >> overpasses, static interface methods and private >> interface methods. >> >> Static and private interface methods do not get added to the vtable >> and are not seen by the method resolution process. >> So we skip those.
>> >> Overpass methods are already in the vtable, so vtable lookup will >> find them there and we don't need to add a miranda method >> to the end of the vtable. So we look for those explicitly. Note that >> we inherit our superclasses vtable, so the superclass' search >> also needs to use find_overpass. >> >> Does this make sense? >> >> Is there a way I could make this clearer via comments? >> >>> Much like skip_static and skip_private. So based on your later >>> statement that "Abstract methods may already have been handled via >>> overpasses" it implies that overpass methods, although not miranda >>> methods, can satisfy or stand in for an miranda during pass 2. So >>> they must be found, did I understand that correctly? >> >>> >>> Again, looks good. I don't need to see another review. My comments >>> were merely clarification based. >> many thanks, >> Karen >> >>> >>> Thanks, >>> Lois >>> >>> >>> >>> >>> >>> >>> >>> >>>> >>>> testing: >>>> internal tests: Defmeth (updated), SelectionResolution - product >>>> and fastdebug >>>> jprt >>>> jck >>>> jtreg/hotspot_all >>>> jtreg/jdk_streams >>>> test,noncolo.testlist, -atk quick >>>> >>>> (jck and macosx testing in progress) >>>> >>>> thanks, >>>> Karen >>>> >>> >> > From lois.foltan at oracle.com Tue Aug 4 17:39:08 2015 From: lois.foltan at oracle.com (Lois Foltan) Date: Tue, 04 Aug 2015 13:39:08 -0400 Subject: RFR: 8087342: crash in klassItable_initialize_itable_for_interface In-Reply-To: References: <55A6F6F7.6040508@oracle.com> Message-ID: <55C0F8BC.4090605@oracle.com> On 8/4/2015 11:16 AM, Karen Kinnear wrote: > Actually, I didn't include this in the webrev, but I added some text > from the email below to the is_miranda > description, so the full text before the method is now ... Suggested > edits are welcome. > Thanks Karen, looks better. I have no further comments. 
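To make the pass 2 rule in the is_miranda description concrete, here is a toy C++ model of the decision as the comment text describes it. It is only a sketch under stated assumptions: ToyMethod, ToyKlass, has_covering_local() and toy_is_miranda() are hypothetical names made up for illustration, methods are identified by a single name/signature string, and the real klassVtable::is_miranda in the webrev may differ in detail.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified method record (not a HotSpot type).
struct ToyMethod {
    std::string name_and_sig;
    bool is_static;
    bool is_private;
    bool is_overpass;
    explicit ToyMethod(const std::string& key, bool s = false, bool p = false, bool o = false)
        : name_and_sig(key), is_static(s), is_private(p), is_overpass(o) {}
};

// Hypothetical, simplified class record (not a HotSpot type).
struct ToyKlass {
    std::vector<ToyMethod> local_methods;    // may include overpasses
    std::vector<ToyMethod> default_methods;  // set up before pass 2
    const ToyKlass* super = nullptr;
};

// Looks for a local method that already covers `key`. Overpasses count
// (the "find_overpass" behaviour); static and private methods are skipped.
static bool has_covering_local(const ToyKlass& k, const std::string& key) {
    for (const ToyMethod& m : k.local_methods) {
        if (m.name_and_sig == key && !m.is_static && !m.is_private) {
            return true;
        }
    }
    return false;
}

// Returns true if the interface method still needs a miranda vtable entry.
bool toy_is_miranda(const ToyMethod& iface_method, const ToyKlass& klass) {
    assert(!iface_method.name_and_sig.empty());
    // Static, private and overpass methods are never mirandas.
    if (iface_method.is_static || iface_method.is_private || iface_method.is_overpass) {
        return false;
    }
    const std::string& key = iface_method.name_and_sig;
    // Covered by a local method, including a local overpass.
    if (has_covering_local(klass, key)) return false;
    // Covered by a default method.
    for (const ToyMethod& d : klass.default_methods) {
        if (d.name_and_sig == key) return false;
    }
    // Covered somewhere up the superclass chain; overpasses count there too,
    // because the superclass vtable is inherited.
    for (const ToyKlass* s = klass.super; s != nullptr; s = s->super) {
        if (has_covering_local(*s, key)) return false;
    }
    return true;
}
```

In this model a class with no matching method gets a miranda, a class whose superclass carries a matching overpass does not, and a matching static or private local method is ignored for this purpose, which is the find_overpass versus skip_overpass distinction discussed in the quoted exchange.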
Lois > thanks, > Karen > > // Check if a method is a miranda method, given a class's methods array, > // its default_method table and its super class. > // "Miranda" means an abstract non-private method that would not be > // overridden for the local class. > // A "miranda" method should only include non-private interface > // instance methods, i.e. not private methods, not static methods, > // not default methods (concrete interface methods), not overpass methods. > // If a given class already has a local (including overpass) method, a > // default method, or any of its superclasses has the same which would > have > // overridden an abstract method, then this is not a miranda method. > // > // Miranda methods are checked multiple times. > // Pass 1: during class load/class file parsing: before vtable size > calculation: > // include superinterface abstract and default methods (non-private > instance). > // We include potential default methods to give them space in the vtable. > // During the first run, the current instanceKlass has not yet been > // created, the superclasses and superinterfaces do have instanceKlasses > // but may not have vtables, the default_methods list is empty, no > overpasses. > // This is seen by default method creation. > // > // Pass 2: recalculated during vtable initialization: only include > abstract methods. > // The goal of pass 2 is to walk through the superinterfaces to see if > any of > // the superinterface methods (which were all abstract pre-default > methods) > // need to be added to the vtable. > // With the addition of default methods, we have three new challenges: > // overpasses, static interface methods and private interface methods. > // Static and private interface methods do not get added to the vtable and > // are not seen by the method resolution process, so we skip those. 
> // Overpass methods are already in the vtable, so vtable lookup will > // find them and we don't need to add a miranda method to the end of > // the vtable. So we look for overpass methods and if they are found we > // return false. Note that we inherit our superclasses vtable, so > // the superclass' search also needs to use find_overpass so that if > // one is found we return false. > // False means - we don't need a miranda method added to the vtable. > // > // During the second run, default_methods is set up, so concrete > methods from > // superinterfaces with matching names/signatures to default_methods > are already > // in the default_methods list and do not need to be appended to the > vtable > // as mirandas. Abstract methods may already have been handled via > // overpasses - either local or superclass overpasses, which may be > // in the vtable already. > // > // Pass 3: They are also checked by link resolution and selection, > // for invocation on a method (not interface method) reference that > // resolves to a method with an interface as its method_holder. > // Used as part of walking from the bottom of the vtable to find > // the vtable index for the miranda method. > // > // Part of the Miranda Rights in the US mean that if you do not have > // an attorney one will be appointed for you. > > > > On Jul 31, 2015, at 2:01 PM, Karen Kinnear wrote: > >> Lois, >> >> Here is an updated webrev. The hotspot code has not changed (except >> for the fixed comments). I added a test to investigate if >> I could have static and instance fields in the same class (obviously >> with -Xverify:none). I did manage >> to have 1 public static, 1 private instance and 1 overpass for an AME >> from an abstract interface. >> Obviously none of this matches the jls, or jvms or passes the >> verifier - but I did want to make sure >> the code did the right thing. >> >> The test passes in product and fastdebug. I would appreciate a review >> for the test itself. 
>> Thanks to Harold for jasm example :-) >> >> updated webrev: http://cr.openjdk.java.net/~acorn/8087342.3/webrev/ >> >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8087342 >> >> thanks, >> Karen >> >> On Jul 16, 2015, at 6:03 PM, Karen Kinnear wrote: >> >>> Lois, >>> >>> Thank you for the detailed review. I really appreciate it. >>> On Jul 15, 2015, at 8:12 PM, Lois Foltan wrote: >>> >>>> >>>> On 7/15/2015 12:40 PM, Karen Kinnear wrote: >>>>> Please review for JDK9: >>>>> >>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8087342 >>>>> webrev: http://cr.openjdk.java.net/~acorn/8087342.2/webrev/ >>>>> >>>>> >>>>> Crash occurs in product when we should throw >>>>> IncompatibleClassChangeError. >>>> >>>> Hi Karen, >>>> I think this looks good and I really like how straightforward >>>> klassVtable::is_miranda now is. Some minor clarification comments: >>>> >>>> - src/share/vm/instanceKlass.cpp >>>> comments for the find_local_method* were changed to state: >>>> +// note that the local methods array can have up to one overpass, >>>> one static >>>> +// and one instance (private or not) with the same name/signature >>>> I think there are two combinations that the code depends on not >>>> occurring, they are: >>>> 1. all 3 are in existence in the local methods array (one >>>> overpass, one static and one instance) >>>> 2. the combination of one static and one instance (private or not) >>>> In other words there has to be an overpass to cause more than one >>>> method with the same name/signature within the local methods array. >>>> And it is either an overpass and a static or an overpass and an >>>> instance, but not all 3. Correct me if I am wrong. >>> I need to write an additional test to check that. I agree that both >>> the spec and the ClassFileParser if _need_verify is set will >>> prevent instance and static overlap. I need to see what happens if >>> you skip verification. I will get back to you with that and update the >>> comments to clarify if needed.
>>>> >>>> - src/share/vm/oops/klassVtable.cpp >>> Let me see if I can make this clearer - let me know if I can make >>> the comments clearer. I truly appreciate your trying to see >>> if this all makes sense and is consistent. It is still too complex. >>>> Thank you for adding the improved comments ahead of is_miranda. My >>>> read is that overpass methods are not considered miranda methods >>>> and I agree with that statement. >>> Yes, they are not considered miranda methods because you don't need >>> to add them to the vtable as abstract methods because >>> they already are in the vtable from being in the class' LOCAL >>> methods array. >>> So pass 1: overpasses do not exist >>> pass 2: overpasses are already in the vtable when we calculate mirandas >>> pass 3: overpasses in a class have the class as their method_holder, >>> not an interface, so we aren't looking them up here >>> >>> So - pass 2 is the one that cares about the >>> find_local_method(Klass:find_overpass vs. Klass::skip_overpass). >>> >>> >>>> Yet, Klass::find_overpass is specified in the code. I think the >>>> code is correct, but based on the comment I would have thought >>>> Klass::skip_overpass should have been specified? >>> I also think the code is correct. >>> So what pass 2 is doing is walking through the superinterfaces to >>> see if any of the superinterface methods (which all used to be abstract) >>> need to be added to the vtable. >>> >>> So the question is - what superinterface methods belong in the vtable? >>> So the searches in is_miranda are designed to find out if there is a >>> method in the vtable already such that we don't >>> need to add the superinterface method - e.g. this was abstract and >>> we have an implementation for it. >>> >>> With the addition of default methods, we have three new challenges - >>> overpasses, static interface methods and private >>> interface methods. 
>>> >>> Static and private interface methods do not get added to the vtable >>> and are not seen by the method resolution process. >>> So we skip those. >>> >>> Overpass methods are already in the vtable, so vtable lookup will >>> find them there and we don't need to add a miranda method >>> to the end of the vtable. So we look for those explicitly. Note that >>> we inherit our superclasses vtable, so the superclass' search >>> also needs to use find_overpass. >>> >>> Does this make sense? >>> >>> Is there a way I could make this clearer via comments? >>> >>>> Much like skip_static and skip_private. So based on your later >>>> statement that "Abstract methods may already have been handled via >>>> overpasses" it implies that overpass methods, although not miranda >>>> methods, can satisfy or stand in for an miranda during pass 2. So >>>> they must be found, did I understand that correctly? >>> >>>> >>>> Again, looks good. I don't need to see another review. My >>>> comments were merely clarification based. >>> many thanks, >>> Karen >>> >>>> >>>> Thanks, >>>> Lois >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>>> >>>>> testing: >>>>> internal tests: Defmeth (updated), SelectionResolution - product >>>>> and fastdebug >>>>> jprt >>>>> jck >>>>> jtreg/hotspot_all >>>>> jtreg/jdk_streams >>>>> test,noncolo.testlist, -atk quick >>>>> >>>>> (jck and macosx testing in progress) >>>>> >>>>> thanks, >>>>> Karen >>>>> >>>> >>> >> > From karen.kinnear at oracle.com Tue Aug 4 18:01:23 2015 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Tue, 4 Aug 2015 14:01:23 -0400 Subject: RFR: 8087342: crash in klassItable_initialize_itable_for_interface In-Reply-To: <55C0F890.3000700@oracle.com> References: <55A6F6F7.6040508@oracle.com> <55C0F890.3000700@oracle.com> Message-ID: <22E26B36-E502-4FE1-95C9-C7343BEB04F4@oracle.com> Lois, Many thanks for the review and for testing this on windows. I rewrote the comments on the lines you suggested to be clearer how to create from javac. 
Thank you for the suggestion. I will check this in now. thanks, Karen static int m() { return 1;} // javac with "n()" and patch to "m()" private int m() { return 2;} // javac with public and patch to private } public class D { public static int CallStatic() { int staticret = C.m(); // javac with "C.n" and patch to "C.m" ... On Aug 4, 2015, at 1:38 PM, Lois Foltan wrote: > > On 7/31/2015 2:01 PM, Karen Kinnear wrote: >> Lois, >> >> Here is an updated webrev. The hotspot code has not changed (except for the fixed comments). I added a test to investigate if >> I could have static and instance fields in the same class (obviously with -Xverify:none). I did manage >> to have 1 public static, 1 private instance and 1 overpass for an AME from an abstract interface. >> Obviously none of this matches the jls, or jvms or passes the verifier - but I did want to make sure >> the code did the right thing. >> >> The test passes in product and fastdebug. I would appreciate a review for the test itself. >> Thanks to Harold for jasm example :-) >> >> updated webrev: http://cr.openjdk.java.net/~acorn/8087342.3/webrev/ > > Hi Karen, > Test looks good. I applied your patch to a Windows build to see if I could trigger a failure due to a local method array that had a different layout, but it looks good, the test passed with both fastdebug and product builds. Minor comments: > > TestStaticandInstance.java: > line #99 - I think "n" should be "m" > line #100 - I think the "public" should be "private" > line #105 - "C.n()" should be "C.m()" > > At least that is the way it seems to be implemented in ASM. > > Thanks, > Lois > >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8087342 >> >> thanks, >> Karen >> >> On Jul 16, 2015, at 6:03 PM, Karen Kinnear wrote: >> >>> Lois, >>> >>> Thank you for the detailed review. I really appreciate it. 
>>> On Jul 15, 2015, at 8:12 PM, Lois Foltan wrote: >>> >>>> >>>> On 7/15/2015 12:40 PM, Karen Kinnear wrote: >>>>> Please review for JDK9: >>>>> >>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8087342 >>>>> webrev: http://cr.openjdk.java.net/~acorn/8087342.2/webrev/ >>>>> >>>>> Crash occurs in product when we should through IncompatibleClassChangeError. >>>> >>>> Hi Karen, >>>> I think this looks good and I really like how straight forward klassVtable::is_miranda now is. Some minor clarification comments: >>>> >>>> - src/share/vm/instanceKlass.cpp >>>> comments for the find_local_method* were changed to state: >>>> +// note that the local methods array can have up to one overpass, one static >>>> +// and one instance (private or not) with the same name/signature >>>> I think there are two combinations that the code depends on not occurring, they are: >>>> 1. all 3 are in existence in the local methods array (one overpass, one static and one instance) >>>> 2. the combination of one static and one instance (private or not) >>>> In other words there has to be an overpass to cause more than one method with the same name/signature within the local methods array. And it is either an overpass and a static or an overpass and an instance, but not all 3. Correct me if I am wrong. >>> I need to write an additional test to check that. I agree that both the spec and the ClassFileParser if _need_verify is set will >>> prevent instance and static overlap. I need to see what happens if you skip verification. I will get back to you with that and update the >>> comments to clarify if needed. >>>> >>>> - src/share/vm/oops/klassVtable.cpp >>> Let me see if I can make this clearer - let me know if I can make the comments clearer. I truly appreciate your trying to see >>> if this all makes sense and is consistent. It is still too complex. >>>> Thank you for adding the improved comments ahead of is_miranda. 
My read is that overpass methods are not considered miranda methods and I agree with that statement. >>> Yes, they are not considered miranda methods because you don't need to add them to the vtable as abstract methods because >>> they already are in the vtable from being in the class' LOCAL methods array. >>> So pass 1: overpasses do not exist >>> pass 2: overpasses are already in the vtable when we calculate mirandas >>> pass 3: overpasses in a class have the class as their method_holder, not an interface, so we aren't looking them up here >>> >>> So - pass 2 is the one that cares about the find_local_method(Klass:find_overpass vs. Klass::skip_overpass). >>> >>> >>>> Yet, Klass::find_overpass is specified in the code. I think the code is correct, but based on the comment I would have thought Klass::skip_overpass should have been specified? >>> I also think the code is correct. >>> So what pass 2 is doing is walking through the superinterfaces to see if any of the superinterface methods (which all used to be abstract) >>> need to be added to the vtable. >>> >>> So the question is - what superinterface methods belong in the vtable? >>> So the searches in is_miranda are designed to find out if there is a method in the vtable already such that we don't >>> need to add the superinterface method - e.g. this was abstract and we have an implementation for it. >>> >>> With the addition of default methods, we have three new challenges - overpasses, static interface methods and private >>> interface methods. >>> >>> Static and private interface methods do not get added to the vtable and are not seen by the method resolution process. >>> So we skip those. >>> >>> Overpass methods are already in the vtable, so vtable lookup will find them there and we don't need to add a miranda method >>> to the end of the vtable. So we look for those explicitly. Note that we inherit our superclasses vtable, so the superclass' search >>> also needs to use find_overpass. 
>>> >>> Does this make sense? >>> >>> Is there a way I could make this clearer via comments? >>> >>>> Much like skip_static and skip_private. So based on your later statement that "Abstract methods may already have been handled via overpasses" it implies that overpass methods, although not miranda methods, can satisfy or stand in for an miranda during pass 2. So they must be found, did I understand that correctly? >>> >>>> >>>> Again, looks good. I don't need to see another review. My comments were merely clarification based. >>> many thanks, >>> Karen >>> >>>> >>>> Thanks, >>>> Lois >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>>> >>>>> testing: >>>>> internal tests: Defmeth (updated), SelectionResolution - product and fastdebug >>>>> jprt >>>>> jck >>>>> jtreg/hotspot_all >>>>> jtreg/jdk_streams >>>>> test,noncolo.testlist, -atk quick >>>>> >>>>> (jck and macosx testing in progress) >>>>> >>>>> thanks, >>>>> Karen >>>>> >>>> >>> >> > From hearn at vinumeris.com Tue Aug 4 18:23:24 2015 From: hearn at vinumeris.com (Mike Hearn) Date: Tue, 4 Aug 2015 20:23:24 +0200 Subject: Future plans for AppCDS? Message-ID: Hi there, I'm wondering if there are any plans to extend AppCDS in future to include serializing the code cache to disk? I ask because I have a (not very complicated) desktop JavaFX app. It starts in about three seconds, which isn't terrible, but I'd like it to be faster. Unfortunately I am frequently frustrated in this goal. Some testing shows that one step of the initialisation sequence takes about a second normally. If I put it in a loop to let it fully compile, that drops to more like a quarter of a second. Likewise, my app loads a series of items to display them on the screen. The first one can take a solid 700-800 msec. The rest are more like 10% of that. I tried parallelising some of the things done during startup, but it made no difference. 
My theory is that the compile threads are taking up the spare cores that could be doing startup tasks (it's a bit hard to profile this though as the whole sequence only lasts a few seconds). AppCDS already knows how to serialise lots of HotSpot state to disk for unchanging JARs. If it could store compiled method code as well, then I could probably shave a second or two off app startup. I do understand that this situation is a bit rare for Java developers and that due to extra disk IO etc, loading compiled code from disk might not always be faster. But for devices that have an SSD it probably can be, especially if HotSpot were to do the same trick Microsoft does and lay out compiled methods on disk in the order they will be used. From tom.benson at oracle.com Tue Aug 4 18:25:40 2015 From: tom.benson at oracle.com (Tom Benson) Date: Tue, 04 Aug 2015 14:25:40 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto Message-ID: <55C103A4.1060505@oracle.com> Hi, Please review this change for JDK-8131734, an addition to the Archive Region support in G1 to handle -Xshared:auto behavior. With Xshared:auto, if the shared string region file mapping fails, sharing is disabled and execution continues. To allow this, the archive region space which has already been allocated needs to be freed. A free_archive_regions() entry is added to g1CollectedHeap, to be called if the file mapping performed by CDS fails. Its arguments are the same as those given to alloc_archive_regions (and that would have been given to fill_archive_regions if the mapping had succeeded). The CDS code change will be posted separately by Jiangli Zhou. JBS: https://bugs.openjdk.java.net/browse/JDK-8131734 Webrev: http://cr.openjdk.java.net/~tbenson/8131734/webrev/ Tested: JPRT, and benchmarks run with alloc/free_archive_regions calls forced at init time and heap verification enabled. 
Jiangli also tried this with the failing Xshared:auto test and the corresponding CDS code change to call free_archive_regions. Thanks, Tom From kim.barrett at oracle.com Tue Aug 4 22:06:00 2015 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 4 Aug 2015 18:06:00 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55C103A4.1060505@oracle.com> References: <55C103A4.1060505@oracle.com> Message-ID: On Aug 4, 2015, at 2:25 PM, Tom Benson wrote: > > Hi, > Please review this change for JDK-8131734, an addition to the Archive Region support in G1 to handle -Xshared:auto behavior. With Xshared:auto, if the shared string region file mapping fails, sharing is disabled and execution continues. To allow this, the archive region space which has already been allocated needs to be freed. A free_archive_regions() entry is added to g1CollectedHeap, to be called if the file mapping performed by CDS fails. Its arguments are the same as those given to alloc_archive_regions (and that would have been given to fill_archive_regions if the mapping had succeeded). > > The CDS code change will be posted separately by Jiangli Zhou. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8131734 > Webrev: http://cr.openjdk.java.net/~tbenson/8131734/webrev/ > Tested: JPRT, and benchmarks run with alloc/free_archive_regions calls forced at init time and heap verification enabled. > Jiangli also tried this with the failing Xshared:auto test and the corresponding CDS code change to call free_archive_regions. ------------------------------------------------------------------------------ src/share/vm/gc/g1/g1CollectedHeap.cpp 1134 size_used += ranges[i].word_size() * HeapWordSize; Why not ranges[i].byte_size() ? 
------------------------------------------------------------------------------
src/share/vm/gc/g1/g1CollectedHeap.cpp
1144     if ((prev_last_region != NULL) && (start_region == prev_last_region)) {

I think (prev_last_region != NULL) is unnecessary here.

------------------------------------------------------------------------------
src/share/vm/gc/g1/g1CollectedHeap.cpp
1155     HeapRegion* curr_region = start_region;
1156     while (curr_region != NULL) {
...
1162       if (curr_region != last_region) {
1163         curr_region = _hrm.next_region_in_heap(curr_region);
1164       } else {
1165         curr_region = NULL;
1166       }
1167     }

I think I would have found this easier to read if structured something
like:

  while (true) {
    ...
    if (curr_region == last_region) {
      break;
    } else {
      curr_region = _hrm.next_region_in_heap(curr_region);
      assert(curr_region != NULL, ...);
    }
  }

I'll leave that up to you.

But, is next_region_in_heap really the right stepper function?  It
skips over regions that are not "is_available".  Are the regions we're
dealing with in this function all guaranteed to be available?
Clearly, bad things happen if last_region is not available while using
next_region_in_heap here.

------------------------------------------------------------------------------
src/share/vm/gc/g1/g1CollectedHeap.hpp
 789   // For each of the specified MemRegions, free the containing G1 regions
 790   // which had been allocated by alloc_archive_regions. This should be called
 791   // rather than fill_archive_regions at JVM init time if the archive file
 792   // mapping failed.
 793   void free_archive_regions(MemRegion* range, size_t count);

The implementation presently requires the ranges to be non-overlapping
and sorted in increasing order.  That ought to be said in the
description.

------------------------------------------------------------------------------
src/share/vm/gc/g1/g1MarkSweep.hpp
  61   // Mark or un-mark the regions containing the specified address range as archives.
  62   static void mark_range_archive(MemRegion range, bool is_archive);

With the functionality change, maybe the name should be changed.
Perhaps set_range_archive?

------------------------------------------------------------------------------

From david.holmes at oracle.com Wed Aug 5 00:28:00 2015
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 5 Aug 2015 10:28:00 +1000
Subject: Future plans for AppCDS?
In-Reply-To:
References:
Message-ID: <55C15890.6070600@oracle.com>

Hi Mike,

On 5/08/2015 4:23 AM, Mike Hearn wrote:
> Hi there,
>
> I'm wondering if there are any plans to extend AppCDS in future to include
> serializing the code cache to disk?

There are no existing Projects or JEPs in this area.

Cheers,
David
------

> I ask because I have a (not very complicated) desktop JavaFX app. It starts
> in about three seconds, which isn't terrible, but I'd like it to be faster.
>
> Unfortunately I am frequently frustrated in this goal. Some testing shows
> that one step of the initialisation sequence takes about a second normally.
> If I put it in a loop to let it fully compile, that drops to more like a
> quarter of a second.
>
> Likewise, my app loads a series of items to display them on the screen. The
> first one can take a solid 700-800 msec. The rest are more like 10% of that.
>
> I tried parallelising some of the things done during startup, but it made
> no difference. My theory is that the compile threads are taking up the
> spare cores that could be doing startup tasks (it's a bit hard to profile
> this though as the whole sequence only lasts a few seconds).
>
> AppCDS already knows how to serialise lots of HotSpot state to disk for
> unchanging JARs. If it could store compiled method code as well, then I
> could probably shave a second or two off app startup.
>
> I do understand that this situation is a bit rare for Java developers and
> that due to extra disk IO etc, loading compiled code from disk might not
> always be faster.
But for devices that have an SSD it probably can be, > especially if HotSpot were to do the same trick Microsoft does and lay out > compiled methods on disk in the order they will be used. > From thomas.schatzl at oracle.com Wed Aug 5 13:37:24 2015 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 05 Aug 2015 15:37:24 +0200 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: References: <55C103A4.1060505@oracle.com> Message-ID: <1438781844.2378.60.camel@oracle.com> Hi, On Tue, 2015-08-04 at 18:06 -0400, Kim Barrett wrote: > On Aug 4, 2015, at 2:25 PM, Tom Benson wrote: > > > > Hi, > > Please review this change for JDK-8131734, an addition to the Archive Region support in G1 to handle -Xshared:auto behavior. With Xshared:auto, if the shared string region file mapping fails, sharing is disabled and execution continues. To allow this, the archive region space which has already been allocated needs to be freed. A free_archive_regions() entry is added to g1CollectedHeap, to be called if the file mapping performed by CDS fails. Its arguments are the same as those given to alloc_archive_regions (and that would have been given to fill_archive_regions if the mapping had succeeded). > > > > The CDS code change will be posted separately by Jiangli Zhou. > > > > JBS: https://bugs.openjdk.java.net/browse/JDK-8131734 > > Webrev: http://cr.openjdk.java.net/~tbenson/8131734/webrev/ > > Tested: JPRT, and benchmarks run with alloc/free_archive_regions calls forced at init time and heap verification enabled. > > Jiangli also tried this with the failing Xshared:auto test and the corresponding CDS code change to call free_archive_regions. > > ---------------------------------- [...] > > But, is next_region_in_heap really the right stepper function? It > skips over regions that are not "is_available". Are the regions we're > dealing with in this function all guaranteed to be available? Yes. 
We assume that in the earlier call in allocate_containing_regions() G1 made them available (committed them). I assume that when mmap'ing it, the > Clearly, bad things happen if last_region is not available while using > next_region_in_heap here. One could do regular pointer arithmetic *within* the MemRegion (which is always a guaranteed contiguous range) and then map the address back to the HeapRegion*. - I have some question about this code, particularly about the comment: 1137 HeapRegion* start_region = _hrm.addr_to_region(start_address); 1138 HeapRegion* last_region = _hrm.addr_to_region(last_address); 1139 1140 // Check for ranges that start in the same G1 region in which the previous 1141 // range ended, and adjust the start address so we don't try to free 1142 // the same region again. If the current range is entirely within that 1143 // region, skip it. 1144 if ((prev_last_region != NULL) && (start_region == prev_last_region)) { 1145 start_address = start_region->end(); 1146 if (start_address > last_address) { 1147 continue; 1148 } 1149 start_region = _hrm.addr_to_region(start_address); 1150 } How could the situation mentioned in line 1140 happen? Are the given memory regions not overlapping already, and the start addresses of these MemRegions at the start of these regions? Since last_region is the region containing the last address within the memory range, wouldn't that mean given above preconditions, this could not happen? - I think the method should add a assert_at_safepoint(true) at the top (and possibly all other archival methods if they are not yet through the call chain), or decrease_used() made safe against concurrent modification using the ParGC_Rare_Event_lock. I would prefer just making sure the code is only run at a safepoint. 
Thanks, Thomas From thomas.schatzl at oracle.com Wed Aug 5 13:52:37 2015 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 05 Aug 2015 15:52:37 +0200 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <1438781844.2378.60.camel@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> Message-ID: <1438782757.2378.62.camel@oracle.com> Hi again, [...] > - I have some question about this code, particularly about the comment: > > 1137 HeapRegion* start_region = _hrm.addr_to_region(start_address); > 1138 HeapRegion* last_region = _hrm.addr_to_region(last_address); > 1139 > 1140 // Check for ranges that start in the same G1 region in which the previous > 1141 // range ended, and adjust the start address so we don't try to free > 1142 // the same region again. If the current range is entirely within that > 1143 // region, skip it. > 1144 if ((prev_last_region != NULL) && (start_region == prev_last_region)) { > 1145 start_address = start_region->end(); > 1146 if (start_address > last_address) { > 1147 continue; > 1148 } > 1149 start_region = _hrm.addr_to_region(start_address); > 1150 } > > How could the situation mentioned in line 1140 happen? Are the given > memory regions not overlapping already, and the start addresses of these > MemRegions at the start of these regions? Probably because of using the same memory mapped file created from a VM with different (smaller) heap region size? Thanks, Thomas From tom.benson at oracle.com Wed Aug 5 14:12:41 2015 From: tom.benson at oracle.com (Tom Benson) Date: Wed, 05 Aug 2015 10:12:41 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: References: <55C103A4.1060505@oracle.com> Message-ID: <55C219D9.90500@oracle.com> Hi Kim, Thanks very much for the review. 
On 8/4/2015 6:06 PM, Kim Barrett wrote:
> ------------------------------------------------------------------------------
> src/share/vm/gc/g1/g1CollectedHeap.cpp
> 1134     size_used += ranges[i].word_size() * HeapWordSize;
>
> Why not ranges[i].byte_size() ?

Indeed, why not? I'll change it.

>
> ------------------------------------------------------------------------------
> src/share/vm/gc/g1/g1CollectedHeap.cpp
> 1144     if ((prev_last_region != NULL) && (start_region == prev_last_region)) {
>
> I think (prev_last_region != NULL) is unnecessary here.

I agree. Leftover from the alloc_archive_regions code I started with.

>
> ------------------------------------------------------------------------------
> src/share/vm/gc/g1/g1CollectedHeap.cpp
> 1155     HeapRegion* curr_region = start_region;
> 1156     while (curr_region != NULL) {
> ...
> 1162       if (curr_region != last_region) {
> 1163         curr_region = _hrm.next_region_in_heap(curr_region);
> 1164       } else {
> 1165         curr_region = NULL;
> 1166       }
> 1167     }
>
> I think I would have found this easier to read if structured something
> like:
>
>   while (true) {
>     ...
>     if (curr_region == last_region) {
>       break;
>     } else {
>       curr_region = _hrm.next_region_in_heap(curr_region);
>       assert(curr_region != NULL, ...);
>     }
>   }
>
> I'll leave that up to you.

Both alloc_archive_regions and fill_archive_regions are written this way
(the former for a slightly better reason), so I think I'll leave them all
as-is.

>
> But, is next_region_in_heap really the right stepper function?

I see Thomas talked about this in his reply, but, yes.

> It
> skips over regions that are not "is_available". Are the regions we're
> dealing with in this function all guaranteed to be available?

This is basically part of an error path in mapping shared string space.
We know the alloc_ succeeded, but the file mapping subsequently failed.

> Clearly, bad things happen if last_region is not available while using
> next_region_in_heap here.
> > ------------------------------------------------------------------------------ > src/share/vm/gc/g1/g1CollectedHeap.hpp > 789 // For each of the specified MemRegions, free the containing G1 regions > 790 // which had been allocated by alloc_archive_regions. This should be called > 791 // rather than fill_archive_regions at JVM init time if the archive file > 792 // mapping failed. > 793 void free_archive_regions(MemRegion* range, size_t count); > > The implementation presently requires the ranges to be non-overlapping > and sorted in increasing order. That ought to be said in the > description. OK. I'll add a comment that the args should be the same as passed to alloc_archive_regions. > > ------------------------------------------------------------------------------ > src/share/vm/gc/g1/g1MarkSweep.hpp > 61 // Mark or un-mark the regions containing the specified address range as archives. > 62 static void mark_range_archive(MemRegion range, bool is_archive); > > With the functionality change, maybe the name should be changed. > Perhaps set_range_archive? > > ------------------------------------------------------------------------------ > I considered this, but didn't see much difference between "marking" a range as archive/non-archive and "setting" one. However, I'll change it to "set" to avoid any possible confusion with "marking" in the normal GC sense. Thanks, Tom From tom.benson at oracle.com Wed Aug 5 14:29:23 2015 From: tom.benson at oracle.com (Tom Benson) Date: Wed, 05 Aug 2015 10:29:23 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <1438781844.2378.60.camel@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> Message-ID: <55C21DC3.5090305@oracle.com> Hi Thomas, Thanks for reviewing. On 8/5/2015 9:37 AM, Thomas Schatzl wrote: > Hi, > > On Tue, 2015-08-04 at 18:06 -0400, Kim Barrett wrote: >> But, is next_region_in_heap really the right stepper function? 
It >> skips over regions that are not "is_available". Are the regions we're >> dealing with in this function all guaranteed to be available? > Yes. We assume that in the earlier call in allocate_containing_regions() > G1 made them available (committed them). > > I assume that when mmap'ing it, the missing end of sentence? If you were going to say.... the mapping fails, so we need to free the archive regions that were just allocated..., then I agree. 8^) > >> Clearly, bad things happen if last_region is not available while using >> next_region_in_heap here. > One could do regular pointer arithmetic *within* the MemRegion (which is > always a guaranteed contiguous range) and then map the address back to > the HeapRegion*. I don't think this was a suggestion... was it? > > - I have some question about this code, particularly about the comment: > > 1137 HeapRegion* start_region = _hrm.addr_to_region(start_address); > 1138 HeapRegion* last_region = _hrm.addr_to_region(last_address); > 1139 > 1140 // Check for ranges that start in the same G1 region in which the previous > 1141 // range ended, and adjust the start address so we don't try to free > 1142 // the same region again. If the current range is entirely within that > 1143 // region, skip it. > 1144 if ((prev_last_region != NULL) && (start_region == prev_last_region)) { > 1145 start_address = start_region->end(); > 1146 if (start_address > last_address) { > 1147 continue; > 1148 } > 1149 start_region = _hrm.addr_to_region(start_address); > 1150 } > > How could the situation mentioned in line 1140 happen? Are the given > memory regions not overlapping already, and the start addresses of these > MemRegions at the start of these regions? As you said in a subsequent message, this can happen if the G1 region size at dump time is smaller than region size at restore time. 
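
The shared-region case described here can be illustrated with a small standalone model. This is only a sketch, not the HotSpot code: `Range`, `regions_to_free`, and the plain integer addresses are hypothetical stand-ins for `MemRegion`, `free_archive_regions`, and heap addresses, and a region is identified simply by `address / region_size`. It assumes, as the review notes, that the ranges are sorted by address and do not overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a MemRegion: an inclusive [start, last] address range.
struct Range {
  std::size_t start;
  std::size_t last;
};

// Returns the index of every region covered by the ranges, each exactly once.
// Assumes the ranges are sorted by address and do not overlap.
std::vector<std::size_t> regions_to_free(const std::vector<Range>& ranges,
                                         std::size_t region_size) {
  std::vector<std::size_t> freed;
  bool have_prev = false;
  std::size_t prev_last_region = 0;

  for (const Range& r : ranges) {
    std::size_t start_address = r.start;
    std::size_t start_region = start_address / region_size;
    const std::size_t last_region = r.last / region_size;

    // The previous range ended in the region where this one starts, so that
    // region has already been freed: step past it.
    if (have_prev && start_region == prev_last_region) {
      start_address = (start_region + 1) * region_size;
      if (start_address > r.last) {
        continue;  // this range lies entirely inside the already-freed region
      }
      start_region = start_address / region_size;
    }

    for (std::size_t i = start_region; i <= last_region; ++i) {
      freed.push_back(i);
    }
    prev_last_region = last_region;
    have_prev = true;
  }
  return freed;
}
```

With ranges laid out for 512-byte regions but restored into 1024-byte regions, the first two ranges land in the same region, and the second is skipped rather than freed twice.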
> Since last_region is the region containing the last address within the > memory range, wouldn't that mean given above preconditions, this could > not happen? > > - I think the method should add a assert_at_safepoint(true) at the top > (and possibly all other archival methods if they are not yet through the > call chain), or decrease_used() made safe against concurrent > modification using the ParGC_Rare_Event_lock. > > I would prefer just making sure the code is only run at a safepoint. These restore-time routines (alloc_/free_ archive regions) are called at the beginning of JVM init, not at a safepoint. The dump-time archive alloc routines (such as archive_mem_allocate) do check for safepoint context. Thanks, Tom > > Thanks, > Thomas > > > From tom.benson at oracle.com Wed Aug 5 14:31:36 2015 From: tom.benson at oracle.com (Tom Benson) Date: Wed, 05 Aug 2015 10:31:36 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <1438782757.2378.62.camel@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <1438782757.2378.62.camel@oracle.com> Message-ID: <55C21E48.5060602@oracle.com> Hi again, On 8/5/2015 9:52 AM, Thomas Schatzl wrote: > Hi again, > > [...] >> - I have some question about this code, particularly about the comment: >> >> 1137 HeapRegion* start_region = _hrm.addr_to_region(start_address); >> 1138 HeapRegion* last_region = _hrm.addr_to_region(last_address); >> 1139 >> 1140 // Check for ranges that start in the same G1 region in which the previous >> 1141 // range ended, and adjust the start address so we don't try to free >> 1142 // the same region again. If the current range is entirely within that >> 1143 // region, skip it. 
>> 1144 if ((prev_last_region != NULL) && (start_region == prev_last_region)) { >> 1145 start_address = start_region->end(); >> 1146 if (start_address > last_address) { >> 1147 continue; >> 1148 } >> 1149 start_region = _hrm.addr_to_region(start_address); >> 1150 } >> >> How could the situation mentioned in line 1140 happen? Are the given >> memory regions not overlapping already, and the start addresses of these >> MemRegions at the start of these regions? > Probably because of using the same memory mapped file created from a VM > with different (smaller) heap region size? Right. Tom > > Thanks, > Thomas > > From thomas.schatzl at oracle.com Wed Aug 5 14:38:52 2015 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 05 Aug 2015 16:38:52 +0200 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55C21DC3.5090305@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> Message-ID: <1438785532.2378.73.camel@oracle.com> Hi, On Wed, 2015-08-05 at 10:29 -0400, Tom Benson wrote: > Hi Thomas, > Thanks for reviewing. > > On 8/5/2015 9:37 AM, Thomas Schatzl wrote: > > Hi, > > > > On Tue, 2015-08-04 at 18:06 -0400, Kim Barrett wrote: > >> But, is next_region_in_heap really the right stepper function? It > >> skips over regions that are not "is_available". Are the regions we're > >> dealing with in this function all guaranteed to be available? > > Yes. We assume that in the earlier call in allocate_containing_regions() > > G1 made them available (committed them). > > > > I assume that when mmap'ing it, the > > missing end of sentence? If you were going to say.... the mapping > fails, so we need to free the archive regions that were just > allocated..., then I agree. 8^) committed memory will be automatically released. > > > > >> Clearly, bad things happen if last_region is not available while using > >> next_region_in_heap here. 
> > One could do regular pointer arithmetic *within* the MemRegion (which is > > always a guaranteed contiguous range) and then map the address back to > > the HeapRegion*. > > I don't think this was a suggestion... was it? > An option, maybe Kim likes that more. > > Since last_region is the region containing the last address within the > > memory range, wouldn't that mean given above preconditions, this could > > not happen? > > > > - I think the method should add a assert_at_safepoint(true) at the top > > (and possibly all other archival methods if they are not yet through the > > call chain), or decrease_used() made safe against concurrent > > modification using the ParGC_Rare_Event_lock. > > > > I would prefer just making sure the code is only run at a safepoint. > > These restore-time routines (alloc_/free_ archive regions) are called at > the beginning of JVM init, not at a safepoint. The dump-time archive > alloc routines (such as archive_mem_allocate) do check for safepoint > context. Sigh. Is there a way to check this reliably? As mentioned, at least the decrement_used() call is not guarded against concurrent modification. It is probably easier for now to guard against using these methods at the wrong time (in case somebody wants to try them if some other requirement comes up) instead of trying to make them MT safe. 
Thanks, Thomas From tom.benson at oracle.com Wed Aug 5 17:33:51 2015 From: tom.benson at oracle.com (Tom Benson) Date: Wed, 05 Aug 2015 13:33:51 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <1438785532.2378.73.camel@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> Message-ID: <55C248FF.6010308@oracle.com> Hi, On 8/5/2015 10:38 AM, Thomas Schatzl wrote: > Hi, > > On Wed, 2015-08-05 at 10:29 -0400, Tom Benson wrote: > >> These restore-time routines (alloc_/free_ archive regions) are called at >> the beginning of JVM init, not at a safepoint. The dump-time archive >> alloc routines (such as archive_mem_allocate) do check for safepoint >> context. > Sigh. Is there a way to check this reliably? I notice that cms/compactibleFreeListSpace.cpp contains the assertion: assert(SafepointSynchronize::is_at_safepoint() || !is_init_completed(), "Else races are possible"); ... so perhaps that's a viable option. I'll try it to ensure the CDS use does indeed occur while is_init_completed() is false. Tom > > As mentioned, at least the decrement_used() call is not guarded against > concurrent modification. It is probably easier for now to guard against > using these methods at the wrong time (in case somebody wants to try > them if some other requirement comes up) instead of trying to make them > MT safe. 
> > Thanks, > Thomas > > From jiangli.zhou at oracle.com Wed Aug 5 18:05:18 2015 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Wed, 5 Aug 2015 11:05:18 -0700 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55C248FF.6010308@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> Message-ID: Hi Tom, On Aug 5, 2015, at 10:33 AM, Tom Benson wrote: > Hi, > > On 8/5/2015 10:38 AM, Thomas Schatzl wrote: >> Hi, >> >> On Wed, 2015-08-05 at 10:29 -0400, Tom Benson wrote: >> >>> These restore-time routines (alloc_/free_ archive regions) are called at >>> the beginning of JVM init, not at a safepoint. The dump-time archive >>> alloc routines (such as archive_mem_allocate) do check for safepoint >>> context. >> Sigh. Is there a way to check this reliably? > I notice that cms/compactibleFreeListSpace.cpp contains the assertion: > > assert(SafepointSynchronize::is_at_safepoint() || !is_init_completed(), > "Else races are possible"); > > ... so perhaps that's a viable option. I'll try it to ensure the CDS use does indeed occur while is_init_completed() is false. The mapping and initialization of the CDS string data happens during VM initialization and before 'set_init_completed()' is called. Just verified that in gdb as well. Thanks, Jiangli > Tom > >> >> As mentioned, at least the decrement_used() call is not guarded against >> concurrent modification. It is probably easier for now to guard against >> using these methods at the wrong time (in case somebody wants to try >> them if some other requirement comes up) instead of trying to make them >> MT safe.
>> >> Thanks, >> Thomas >> >> > From tom.benson at oracle.com Wed Aug 5 19:21:45 2015 From: tom.benson at oracle.com (Tom Benson) Date: Wed, 05 Aug 2015 15:21:45 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> Message-ID: <55C26249.1090902@oracle.com> Hi, Updated full and incremental webrevs are at: http://cr.openjdk.java.net/~tbenson/8131734/webrev.01/ http://cr.openjdk.java.net/~tbenson/8131734/webrev.01.vs.00/ On 8/5/2015 2:05 PM, Jiangli Zhou wrote: > Hi Tom, > > On Aug 5, 2015, at 10:33 AM, Tom Benson > wrote: > >> ... so perhaps that's a viable option. I'll try it to ensure the CDS >> use does indeed occur while is_init_completed() is false. > > The mapping and initialization of the CDS string data happens during > VM initialization and before 'set_init_completed()' is called. Just > verified that in gdb as well. > > Thanks, > Jiangli > Yes, I checked as well. I've added this to each of alloc_/fill_/free_archive_regions: assert(!is_init_completed(), "Expect to be called at JVM init time"); Tom From kim.barrett at oracle.com Wed Aug 5 19:27:04 2015 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 5 Aug 2015 15:27:04 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55C26249.1090902@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> Message-ID: On Aug 5, 2015, at 3:21 PM, Tom Benson wrote: > > Hi, > > Updated full and incremental webrevs are at: > http://cr.openjdk.java.net/~tbenson/8131734/webrev.01/ > http://cr.openjdk.java.net/~tbenson/8131734/webrev.01.vs.00/ Looks good.
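[Editorial sketch, not part of the thread or the webrev] The start-address adjustment quoted earlier (lines 1137-1150 of the reviewed file) can be tried out in isolation. The following is a minimal, self-contained simulation, assuming ranges are sorted, non-empty, non-overlapping and given as half-open [start, end) offsets in MB; regions are plain 0-based indices. The names Range and regions_to_free are hypothetical and exist only for this illustration; none of this is HotSpot code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a MemRegion: half-open [start, end), in MB.
using Range = std::pair<std::size_t, std::size_t>;

// Returns the 0-based indices of the regions that would be freed, in order.
// Assumes 'ranges' is sorted by address and that ranges do not overlap.
std::vector<std::size_t> regions_to_free(const std::vector<Range>& ranges,
                                         std::size_t region_size) {
  std::vector<std::size_t> freed;
  bool have_prev = false;            // plays the role of prev_last_region != NULL
  std::size_t prev_last_region = 0;

  for (const Range& r : ranges) {
    assert(r.first < r.second);      // illustration only handles non-empty ranges
    std::size_t start_address = r.first;
    std::size_t last_address = r.second - 1;
    std::size_t start_region = start_address / region_size;
    std::size_t last_region = last_address / region_size;

    // The range starts in the region where the previous range ended:
    // move the start to the end of that region so it is not freed twice.
    if (have_prev && start_region == prev_last_region) {
      start_address = (start_region + 1) * region_size;
      if (start_address > last_address) {
        continue;                    // range lies entirely in an already-freed region
      }
      start_region = start_address / region_size;
    }

    for (std::size_t i = start_region; i <= last_region; ++i) {
      freed.push_back(i);
    }
    prev_last_region = last_region;
    have_prev = true;
  }
  return freed;
}
```

With an 8 MB region size and the two ranges [2, 3) and [6, 20), the sketch frees regions 0, 1 and 2 exactly once each: the second range starts in region 0, so its start is moved to the 8 MB boundary before the walk continues. A second range that lies wholly inside region 0, such as [4, 5), is skipped.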
From tom.benson at oracle.com Wed Aug 5 20:42:03 2015 From: tom.benson at oracle.com (Tom Benson) Date: Wed, 05 Aug 2015 16:42:03 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> Message-ID: <55C2751B.8070400@oracle.com> Thanks, Kim. Tom On 8/5/2015 3:27 PM, Kim Barrett wrote: > On Aug 5, 2015, at 3:21 PM, Tom Benson wrote: >> Hi, >> >> Updated full and incremental webrevs are at: >> http://cr.openjdk.java.net/~tbenson/8131734/webrev.01/ >> http://cr.openjdk.java.net/~tbenson/8131734/webrev.01.vs.00/ > Looks good. > From thomas.schatzl at oracle.com Thu Aug 6 07:48:49 2015 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 06 Aug 2015 09:48:49 +0200 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55C2751B.8070400@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> Message-ID: <1438847329.2009.9.camel@oracle.com> Hi, On Wed, 2015-08-05 at 16:42 -0400, Tom Benson wrote: > Thanks, Kim. > Tom > > On 8/5/2015 3:27 PM, Kim Barrett wrote: > > On Aug 5, 2015, at 3:21 PM, Tom Benson wrote: > >> Hi, > >> > >> Updated full and incremental webrevs are at: > >> http://cr.openjdk.java.net/~tbenson/8131734/webrev.01/ > >> http://cr.openjdk.java.net/~tbenson/8131734/webrev.01.vs.00/ > > Looks good. 
> > Looking again at the code starting from line 1144 with the comment, there may still be an issue here: consider a VM A having 8M regions 1, 2, and 3, each spanning 8M (see figure below; each letter represents 1M of space), and a mapping from a VM B with 2M region size with the following layout that is loaded into VM A and failed. 1 2 3 region# AAAAAAAA AAAAAAAA AAAAAAAA VM A regions (8M region size) BB BB BBBBBBBB BBBB VM B mapping (2M region size) E.g. the mapping is from 2-3M, and another mapping from 6-20M. E.g. the ranges array contains (2, 3), (6, 20) now, after freeing region 1 from the first mapping (2, 3), start_region of the second mapping (=1) equals prev_last_region (=1); now the cursor is advanced to the region of the end of the second mapping (=3), forgetting to free region 2. What am I missing here? Thanks for adding the initialization checks. Thanks, Thomas From tom.benson at oracle.com Thu Aug 6 14:12:01 2015 From: tom.benson at oracle.com (Tom Benson) Date: Thu, 06 Aug 2015 10:12:01 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <1438847329.2009.9.camel@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> Message-ID: <55C36B31.7000908@oracle.com> Hi Thomas, On 8/6/2015 3:48 AM, Thomas Schatzl wrote: > Looking again at the code starting from line 1144 with the comment, > there may still be an issue here: consider a VM A having 8M regions 1, > 2, and 3, each spanning 8M (see figure below; each letter represents 1M > of space), and a mapping from a VM B with 2M region size with the > following layout that is loaded into VM A and failed. 
> > 1 2 3 region# > AAAAAAAA AAAAAAAA AAAAAAAA VM A regions (8M region size) > BB BB BBBBBBBB BBBB VM B mapping (2M region size) > > E.g. the mapping is from 2-3M, and another mapping from 6-20M. E.g. the > ranges array contains > > (2, 3), (6, 20) > > now, after freeing region 1 from the first mapping (2, 3), start_region > of the second mapping (=1) equals prev_last_region (=1); now the cursor > is advanced to the region of the end of the second mapping (=3), > forgetting to free region 2. > > What am I missing here? The cursor would not be advanced to the end of the second mapping, but rather to the end of region A 1. So the current range being freed would be reduced from (6,20) to (8,20). Line 1148: if (start_region == prev_last_region) { start_address = start_region->end(); // Resets start address to end of A1, in your example Tom > > Thanks for adding the initialization checks. > > Thanks, > Thomas > > > From thomas.schatzl at oracle.com Thu Aug 6 14:35:23 2015 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 06 Aug 2015 16:35:23 +0200 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55C36B31.7000908@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> Message-ID: <1438871723.2474.37.camel@oracle.com> Hi Tom, On Thu, 2015-08-06 at 10:12 -0400, Tom Benson wrote: Hi Thomas, > > On 8/6/2015 3:48 AM, Thomas Schatzl wrote: > > Looking again at the code starting from line 1144 with the comment, > > there may still be an issue here: consider a VM A having 8M regions 1, > > 2, and 3, each spanning 8M (see figure below; each letter represents 1M > > of space), and a mapping from a VM B with 2M region size with the > > following layout that is
loaded into VM A and failed. > > > > 1 2 3 region# > > AAAAAAAA AAAAAAAA AAAAAAAA VM A regions (8M region size) > > BB BB BBBBBBBB BBBB VM B mapping (2M region size) > > > > E.g. the mapping is from 2-3M, and another mapping from 6-20M. E.g. the > > ranges array contains > > > > (2, 3), (6, 20) > > > > now, after freeing region 1 from the first mapping (2, 3), start_region > > of the second mapping (=1) equals prev_last_region (=1); now the cursor > > is advanced to the region of the end of the second mapping (=3), > > forgetting to free region 2. > > > > What am I missing here? > The cursor would not be advanced to the end of the second mapping, > but rather to the end of region A 1. So the current range being > freed > would be reduced from (6,20) to (8,20). Line 1148: > > if (start_region == prev_last_region) { > start_address = start_region->end(); // Resets start address > to end of A1, in your example > Okay, looks good. :) Thomas From tom.benson at oracle.com Thu Aug 6 14:49:26 2015 From: tom.benson at oracle.com (Tom Benson) Date: Thu, 06 Aug 2015 10:49:26 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <1438871723.2474.37.camel@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> Message-ID: <55C373F6.9060609@oracle.com> Thanks, Thomas. Tom On 8/6/2015 10:35 AM, Thomas Schatzl wrote: > Hi Tom, > > [.....] > Okay, looks good.
:) > > Thomas > > From dmitry.dmitriev at oracle.com Thu Aug 6 19:58:39 2015 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Thu, 6 Aug 2015 22:58:39 +0300 Subject: RFR (XS): 8132892: Memory must be freed after calling Arguments::set_sysclasspath function In-Reply-To: <55C08268.8090900@oracle.com> References: <55C08268.8090900@oracle.com> Message-ID: <55C3BC6F.2090609@oracle.com> Hello, Can I please get review and sponsor for this fix? Thanks! Dmitry On 04.08.2015 12:14, Dmitry Dmitriev wrote: > Hello, > > Please review this small fix which fix small memory leak. Also, I need > a sponsor for this fix, who can push it. > > Arguments::set_sysclasspath function call set_value method of > SystemProperty class which copy passed value. In several code paths > memory is allocated for string and then this string is passed to > Arguments::set_sysclasspath. Therefore allocated string should be > freed after calling Arguments::set_sysclasspath function. > > Webrev: http://cr.openjdk.java.net/~ddmitriev/8132892/webrev.00/ > > JBS: https://bugs.openjdk.java.net/browse/JDK-8132892 > Tested: JPRT(hotspot test set), hotspot all, vm.quick > > Thanks, > Dmitry From jiangli.zhou at oracle.com Thu Aug 6 21:32:56 2015 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 6 Aug 2015 14:32:56 -0700 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55C373F6.9060609@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> Message-ID: Hi, Here is the runtime part of the bug fix that calls the new free_archive_regions() when shared string mapping fails. 
I also added a jtreg test to test shared strings with -Xshare:auto. http://cr.openjdk.java.net/~jiangli/8131734/webrev.00/ Test: - Tested by explicitly making the shared string mapping fail on linux-x64, -Xshare:auto runs without crash with the fix - Tested with the new SharedStringsRunAuto test - Tested with XX:+PrintNMTStatistics -XX:NativeMemoryTracking=detail Thanks, Jiangli From dmitry.dmitriev at oracle.com Thu Aug 6 21:59:23 2015 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Fri, 7 Aug 2015 00:59:23 +0300 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> Message-ID: <55C3D8BB.9000603@oracle.com> Hello Jiangli, I have few comments/questions. src/share/vm/memory/filemap.cpp module: 1) Should free_archive_regions also called when verify_string_regions() returns false on line 717? 2) The same question about unmap_string_regions(). Region must be freed when unmap_string_regions() is called? 3) Extra space at the end of the line 711. test/runtime/SharedArchiveFile/SharedStringsRunAuto.java 1) Unneeded second creating of OutputAnalyzer on line 61. Also, probably will be better to use same scheme for OutputAnalyzer? On lines 46-49 you not use local variable, but on lines 61-63 use local variable. 59 OutputAnalyzer output = new OutputAnalyzer(pb.start()); 60 61 output = new OutputAnalyzer(pb.start()); 2) Extra space at the end of the lines 25,26,30, 51 Thanks, Dmitry On 07.08.2015 0:32, Jiangli Zhou wrote: > Hi, > > Here is the runtime part of the bug fix that calls the new > free_archive_regions() when shared string mapping fails. 
I also added > a jtreg test to test shared strings with -Xshare:auto. > > http://cr.openjdk.java.net/~jiangli/8131734/webrev.00/ > > > Test: > - Tested by explicitly making the shared string mapping fail on > linux-x64, -Xshare:auto runs without crash with the fix > - Tested with the new SharedStringsRunAuto test > - Tested with XX:+PrintNMTStatistics -XX:NativeMemoryTracking=detail > > Thanks, > Jiangli From jiangli.zhou at oracle.com Thu Aug 6 22:19:33 2015 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 6 Aug 2015 15:19:33 -0700 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55C3D8BB.9000603@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> Message-ID: <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> Hi Dmitry, Thank you for the feedback! You are definitely right about freeing the archived regions in those failure cases. I should have thought it more thoroughly, instead of just fixing the case that I ran into. Will examine all cases. I'll change the test to incorporate your suggestion. Will fix the extra spaces before committing. I have one question, how do you detect the extra spaces from the webrev? :) Thanks, Jiangli On Aug 6, 2015, at 2:59 PM, Dmitry Dmitriev wrote: > Hello Jiangli, > > I have few comments/questions. > > src/share/vm/memory/filemap.cpp module: > 1) Should free_archive_regions also called when verify_string_regions() returns false on line 717? > 2) The same question about unmap_string_regions(). Region must be freed when unmap_string_regions() is called? > 3) Extra space at the end of the line 711.
> > test/runtime/SharedArchiveFile/SharedStringsRunAuto.java > 1) Unneeded second creating of OutputAnalyzer on line 61. Also, probably will be better to use same scheme for OutputAnalyzer? On lines 46-49 you not use local variable, but on lines 61-63 use local variable. > 59 OutputAnalyzer output = new OutputAnalyzer(pb.start()); > 60 > 61 output = new OutputAnalyzer(pb.start()); > 2) Extra space at the end of the lines 25,26,30, 51 > > Thanks, > Dmitry > > On 07.08.2015 0:32, Jiangli Zhou wrote: >> Hi, >> >> Here is the runtime part of the bug fix that calls the new free_archive_regions() when shared string mapping fails. I also added a jtreg test to test shared strings with -Xshare:auto. >> >> http://cr.openjdk.java.net/~jiangli/8131734/webrev.00/ >> >> Test: >> - Tested by explicitly making the shared string mapping fail on linux-x64, -Xshare:auto runs without crash with the fix >> - Tested with the new SharedStringsRunAuto test >> - Tested with XX:+PrintNMTStatistics -XX:NativeMemoryTracking=detail >> >> Thanks, >> Jiangli > From jiangli.zhou at oracle.com Fri Aug 7 00:40:32 2015 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 6 Aug 2015 17:40:32 -0700 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> Message-ID: <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> Hi Dmitry, I've added shared string unmapping to handle the string data verification failure. I also added 'free_archive_regions()'
call to FileMapInfo::unmap_string_regions(). My testing revealed some issues with the new archive region free code when I forced string verification failure. I'm working with Tom on the issues. Thanks, Jiangli On Aug 6, 2015, at 3:19 PM, Jiangli Zhou wrote: > Hi Dmitry, > > Thank you for the feedback! You are definitely right about freeing the archived regions in those failure cases. I should have thought it more thoroughly, instead of just fixing the case that I ran into. Will examine all cases. > > I'll change the test to incorporate your suggestion. Will fix the extra spaces before committing. I have one question, how do you detect the extra spaces from the webrev? :) > > Thanks, > Jiangli > > On Aug 6, 2015, at 2:59 PM, Dmitry Dmitriev wrote: > >> Hello Jiangli, >> >> I have few comments/questions. >> >> src/share/vm/memory/filemap.cpp module: >> 1) Should free_archive_regions also called when verify_string_regions() returns false on line 717? >> 2) The same question about unmap_string_regions(). Region must be freed when unmap_string_regions() is called? >> 3) Extra space at the end of the line 711. >> >> test/runtime/SharedArchiveFile/SharedStringsRunAuto.java >> 1) Unneeded second creating of OutputAnalyzer on line 61. Also, probably will be better to use same scheme for OutputAnalyzer? On lines 46-49 you not use local variable, but on lines 61-63 use local variable. >> 59 OutputAnalyzer output = new OutputAnalyzer(pb.start()); >> 60 >> 61 output = new OutputAnalyzer(pb.start()); >> 2) Extra space at the end of the lines 25,26,30, 51 >> >> Thanks, >> Dmitry >> >> On 07.08.2015 0:32, Jiangli Zhou wrote: >>> Hi, >>> >>> Here is the runtime part of the bug fix that calls the new free_archive_regions() when shared string mapping fails. I also added a jtreg test to test shared strings with -Xshare:auto.
>>> >>> http://cr.openjdk.java.net/~jiangli/8131734/webrev.00/ >>> >>> Test: >>> - Tested by explicitly making the shared string mapping fail on linux-x64, -Xshare:auto runs without crash with the fix >>> - Tested with the new SharedStringsRunAuto test >>> - Tested with XX:+PrintNMTStatistics -XX:NativeMemoryTracking=detail >>> >>> Thanks, >>> Jiangli >> > From david.holmes at oracle.com Fri Aug 7 02:37:07 2015 From: david.holmes at oracle.com (David Holmes) Date: Fri, 7 Aug 2015 12:37:07 +1000 Subject: RFR (XS): 8132892: Memory must be freed after calling Arguments::set_sysclasspath function In-Reply-To: <55C08268.8090900@oracle.com> References: <55C08268.8090900@oracle.com> Message-ID: <55C419D3.1040408@oracle.com> Hi Dmitry, This looks good to me and I can sponsor it for you. Not sure if I can squeeze it under the "trivial" bar so we only need one review. :) I've checked all the assertions about allocations and responsibilities and it all seems correct. I'm wondering if this is a long standing issue or whether set_sysclasspath has changed its behaviour? Thanks, David On 4/08/2015 7:14 PM, Dmitry Dmitriev wrote: > Hello, > > Please review this small fix which fix small memory leak. Also, I need a > sponsor for this fix, who can push it. > > Arguments::set_sysclasspath function call set_value method of > SystemProperty class which copy passed value. In several code paths > memory is allocated for string and then this string is passed to > Arguments::set_sysclasspath. Therefore allocated string should be freed > after calling Arguments::set_sysclasspath function. 
> > Webrev: http://cr.openjdk.java.net/~ddmitriev/8132892/webrev.00/ > > JBS: https://bugs.openjdk.java.net/browse/JDK-8132892 > Tested: JPRT(hotspot test set), hotspot all, vm.quick > > Thanks, > Dmitry From dmitry.dmitriev at oracle.com Fri Aug 7 07:58:33 2015 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Fri, 7 Aug 2015 10:58:33 +0300 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> Message-ID: <55C46529.4030601@oracle.com> Hello Jiangli, Thank you. About extra spaces - I just highlight changed lines in webrev and see that some lines have additional spaces at the end. :) Dmitry On 07.08.2015 3:40, Jiangli Zhou wrote: > Hi Dmitry, > > I've added shared string unmapping to handle the string data > verification failure. I also added 'free_archive_regions()' call > to FileMapInfo::unmap_string_regions(). My testing revealed some > issues with the new archive region free code when I forced string > verification failure. I'm working with Tom on the issues. > > Thanks, > Jiangli > > > On Aug 6, 2015, at 3:19 PM, Jiangli Zhou > wrote: > >> Hi Dmitry, >> >> Thank you for the feedback! You are definitely right about freeing >> the archived regions in those failure cases. I should have thought it >> more thoroughly, instead of just fixing the case that I ran into. >> Will examine all cases. >> >> I'll change the test to incorporate your suggestion.
Will fix the >> extra spaces before committing. I have one question, how do you >> detect the extra spaces from the webrev? :) >> >> Thanks, >> Jiangli >> >> On Aug 6, 2015, at 2:59 PM, Dmitry Dmitriev >> > wrote: >> >>> Hello Jiangli, >>> >>> I have few comments/questions. >>> >>> src/share/vm/memory/filemap.cpp module: >>> 1) Should free_archive_regions also called when >>> verify_string_regions() returns false on line 717? >>> 2) The same question about unmap_string_regions(). Region must be >>> freed when unmap_string_regions() is called? >>> 3) Extra space at the end of the line 711. >>> >>> test/runtime/SharedArchiveFile/SharedStringsRunAuto.java >>> 1) Unneeded second creating of OutputAnalyzer on line 61. Also, >>> probably will be better to use same scheme for OutputAnalyzer? On >>> lines 46-49 you not use local variable, but on lines 61-63 use local >>> variable. >>> 59 OutputAnalyzer output = new OutputAnalyzer(pb.start()); >>> 60 >>> 61 output = new OutputAnalyzer(pb.start()); >>> 2) Extra space at the end of the lines 25,26,30, 51 >>> >>> Thanks, >>> Dmitry >>> >>> On 07.08.2015 0:32, Jiangli Zhou wrote: >>>> Hi, >>>> >>>> Here is the runtime part of the bug fix that calls the new >>>> free_archive_regions() when shared string mapping fails. I also >>>> added a jtreg test to test shared strings with -Xshare:auto. 
>>>> >>>> http://cr.openjdk.java.net/~jiangli/8131734/webrev.00/ >>>> >>>> >>>> Test: >>>> - Tested by explicitly making the shared string mapping fail on >>>> linux-x64, -Xshare:auto runs without crash with the fix >>>> - Tested with the new SharedStringsRunAuto test >>>> - Tested with XX:+PrintNMTStatistics -XX:NativeMemoryTracking=detail >>>> >>>> Thanks, >>>> Jiangli >>> >> > From dmitry.dmitriev at oracle.com Fri Aug 7 08:02:19 2015 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Fri, 7 Aug 2015 11:02:19 +0300 Subject: RFR (XS): 8132892: Memory must be freed after calling Arguments::set_sysclasspath function In-Reply-To: <55C419D3.1040408@oracle.com> References: <55C08268.8090900@oracle.com> <55C419D3.1040408@oracle.com> Message-ID: <55C4660B.5090509@oracle.com> Hello David, Thank you for review and sponsorship. Yes, sure, let's wait for one more review. It seems that it a long standing issue, since set_sysclasspath and SystemProperty class not changed for a while. Dmitry On 07.08.2015 5:37, David Holmes wrote: > Hi Dmitry, > > This looks good to me and I can sponsor it for you. Not sure if I can > squeeze it under the "trivial" bar so we only need one review. :) I've > checked all the assertions about allocations and responsibilities and > it all seems correct. I'm wondering if this is a long standing issue > or whether set_sysclasspath has changed its behaviour? > > Thanks, > David > > On 4/08/2015 7:14 PM, Dmitry Dmitriev wrote: >> Hello, >> >> Please review this small fix which fix small memory leak. Also, I need a >> sponsor for this fix, who can push it. >> >> Arguments::set_sysclasspath function call set_value method of >> SystemProperty class which copy passed value. In several code paths >> memory is allocated for string and then this string is passed to >> Arguments::set_sysclasspath. Therefore allocated string should be freed >> after calling Arguments::set_sysclasspath function. 
>> >> Webrev: http://cr.openjdk.java.net/~ddmitriev/8132892/webrev.00/ >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8132892 >> Tested: JPRT(hotspot test set), hotspot all, vm.quick >> >> Thanks, >> Dmitry From tom.benson at oracle.com Fri Aug 7 13:52:06 2015 From: tom.benson at oracle.com (Tom Benson) Date: Fri, 07 Aug 2015 09:52:06 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> Message-ID: <55C4B806.7050504@oracle.com> Hi, On 8/6/2015 8:40 PM, Jiangli Zhou wrote: > Hi Dmitry, > > I've added shared string unmapping to handle the string data verification failure. I also added 'free_archive_regions()' call to FileMapInfo::unmap_string_regions(). My testing revealed some issues with the new archive region free code when I forced string verification failure. I'm working with Tom on the issues. The free_archive_regions code was intended to handle the failed mmap case, and thus the regions would have been empty/unused. So perhaps not surprising this requires a small enhancement.
Tom From tom.benson at oracle.com Fri Aug 7 14:56:32 2015 From: tom.benson at oracle.com (Tom Benson) Date: Fri, 07 Aug 2015 10:56:32 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55C4B806.7050504@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> Message-ID: <55C4C720.5030903@oracle.com> Hi, On 8/7/2015 9:52 AM, Tom Benson wrote: > Hi, > On 8/6/2015 8:40 PM, Jiangli Zhou wrote: >> Hi Dmitry, >> >> I've added shared string unmapping to handle the string data >> verification failure. I also added 'free_archive_regions()' call to >> FileMapInfo::unmap_string_regions(). My testing revealed some issues >> with the new archive region free code when I forced string >> verification failure. I'm working with Tom on the issues. > The problem is simply that in addition to calling free_archive_regions, FileMapInfo::unmap_string_regions also unmaps the memory.... so there's a segv when GC tries to re-use it. Let's talk directly about the best way to handle it.
Tom From jiangli.zhou at oracle.com Fri Aug 7 15:55:53 2015 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 7 Aug 2015 08:55:53 -0700 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55C46529.4030601@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C46529.4030601@oracle.com> Message-ID: <21CC8572-5BFF-47AD-ADA9-CCE1F7762EDE@oracle.com> On Aug 7, 2015, at 12:58 AM, Dmitry Dmitriev wrote: > Hello Jiangli, > > Thank you. About extra spaces - I just highlight changed lines in webrev and see that some lines have additional spaces at the end. :) I see. :) Thanks, Dmitry. Jiangli > > Dmitry > > On 07.08.2015 3:40, Jiangli Zhou wrote: >> Hi Dmitry, >> >> I've added shared string unmapping to handle the string data verification failure. I also added 'free_archive_regions()' call to FileMapInfo::unmap_string_regions(). My testing revealed some issues with the new archive region free code when I forced string verification failure. I'm working with Tom on the issues. >> >> Thanks, >> Jiangli >> >> >> On Aug 6, 2015, at 3:19 PM, Jiangli Zhou wrote: >> >>> Hi Dmitry, >>> >>> Thank you for the feedback! You are definitely right about freeing the archived regions in those failure cases. I should have thought it more thoroughly, instead of just fixing the case that I ran into. Will exam all cases. >>> >>> I'll change the test to incorporate your suggestion. Will fix the extra spaces before committing.
I have one question, how do you detect the extra spaces from the webrev? :) >>> >>> Thanks, >>> Jiangli >>> >>> On Aug 6, 2015, at 2:59 PM, Dmitry Dmitriev wrote: >>> >>>> Hello Jiangli, >>>> >>>> I have few comments/questions. >>>> >>>> src/share/vm/memory/filemap.cpp module: >>>> 1) Should free_archive_regions also called when verify_string_regions() returns false on line 717? >>>> 2) The same question about unmap_string_regions(). Region must be freed when unmap_string_regions() is called? >>>> 3) Extra space at the end of the line 711. >>>> >>>> test/runtime/SharedArchiveFile/SharedStringsRunAuto.java >>>> 1) Unneeded second creating of OutputAnalyzer on line 61. Also, probably will be better to use same scheme for OutputAnalyzer? On lines 46-49 you not use local variable, but on lines 61-63 use local variable. >>>> 59 OutputAnalyzer output = new OutputAnalyzer(pb.start()); >>>> 60 >>>> 61 output = new OutputAnalyzer(pb.start()); >>>> 2) Extra space at the end of the lines 25,26,30, 51 >>>> >>>> Thanks, >>>> Dmitry >>>> >>>> On 07.08.2015 0:32, Jiangli Zhou wrote: >>>>> Hi, >>>>> >>>>> Here is the runtime part of the bug fix that calls the new free_archive_regions() when shared string mapping fails. I also added a jtreg test to test shared strings with -Xshare:auto. >>>>> >>>>> http://cr.openjdk.java.net/~jiangli/8131734/webrev.00/ >>>>> >>>>> Test: >>>>> - Tested by explicitly making the shared string mapping fail on linux-x64, -Xshare:auto runs without crash with the fix >>>>> - Tested with the new SharedStringsRunAuto test >>>>> - Tested with XX:+PrintNMTStatistics -XX:NativeMemoryTracking=detail >>>>> >>>>> Thanks, >>>>> Jiangli >>>> >>> >> > From daniel.daugherty at oracle.com Fri Aug 7 16:27:43 2015 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 07 Aug 2015 10:27:43 -0600 Subject: RFR (XS): 8132892: Memory must be freed after calling Arguments::set_sysclasspath function In-Reply-To: <55C08268.8090900@oracle.com> References: <55C08268.8090900@oracle.com> Message-ID: <55C4DC7F.9070301@oracle.com> On 8/4/15 3:14 AM, Dmitry Dmitriev wrote: > Hello, > > Please review this small fix which fix small memory leak. Also, I need > a sponsor for this fix, who can push it. > > Arguments::set_sysclasspath function call set_value method of > SystemProperty class which copy passed value. In several code paths > memory is allocated for string and then this string is passed to > Arguments::set_sysclasspath. Therefore allocated string should be > freed after calling Arguments::set_sysclasspath function. > > Webrev: http://cr.openjdk.java.net/~ddmitriev/8132892/webrev.00/ > src/share/vm/runtime/arguments.cpp src/share/vm/runtime/os.cpp L1162: char* os::format_boot_path(const char* format_string, ... Not your bug, but: format_boot_path() is only called from os.cpp so I have to wonder why it is named in the 'os' class instead of being private to os.cpp. Perhaps this is worth a cleanup RFE... L1286: FREE_C_HEAP_ARRAY(char, modules_dir); Please set modules_dir = NULL after the free to prevent accidental post-free use; the other free calls are fine because they are closely followed by a return. Thumbs up. Adding the "modules_dir = NULL" is your choice and I don't need a re-review if you choose to do it. 
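The ownership rule behind this review (the callee copies the string, so the caller still owns and must free its own buffer) can be illustrated with a minimal sketch. This is not the HotSpot code: PropertySketch, set_boot_path and the "/modules" suffix are hypothetical names made up for the illustration, and plain malloc/free stand in for the HotSpot C-heap macros. It also shows the optional "set to NULL after free" step suggested above.

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a property object whose setter copies its argument.
class PropertySketch {
    char* _value = nullptr;
public:
    PropertySketch() = default;
    PropertySketch(const PropertySketch&) = delete;
    PropertySketch& operator=(const PropertySketch&) = delete;
    ~PropertySketch() { std::free(_value); }

    // Copies v; the caller keeps ownership of the buffer it passed in.
    void set_value(const char* v) {
        char* copy = static_cast<char*>(std::malloc(std::strlen(v) + 1));
        if (copy == nullptr) return;
        std::strcpy(copy, v);
        std::free(_value);   // release the previous copy, if any
        _value = copy;
    }
    const char* value() const { return _value; }
};

// Hypothetical caller: builds a path on the heap, hands it to the setter,
// then frees its own buffer because the setter stored a private copy.
bool set_boot_path(PropertySketch& prop, const char* home) {
    const char* suffix = "/modules";   // illustrative only
    const size_t len = std::strlen(home) + std::strlen(suffix) + 1;
    char* modules_dir = static_cast<char*>(std::malloc(len));
    if (modules_dir == nullptr) return false;
    std::snprintf(modules_dir, len, "%s%s", home, suffix);

    prop.set_value(modules_dir);   // callee copies the string
    std::free(modules_dir);        // without this free, the buffer leaks
    modules_dir = nullptr;         // optional: guards against post-free use
    return true;
}
```

Whether the pointer is nulled after the free is a style choice when the variable goes out of scope soon afterwards; it matters more when further code follows in the same function.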
Dan > JBS: https://bugs.openjdk.java.net/browse/JDK-8132892 > Tested: JPRT(hotspot test set), hotspot all, vm.quick > > Thanks, > Dmitry > > From yumin.qi at oracle.com Fri Aug 7 16:48:18 2015 From: yumin.qi at oracle.com (Yumin Qi) Date: Fri, 07 Aug 2015 09:48:18 -0700 Subject: RFR: 8130115: REDO - Reduce Symbol::_identity_hash to 2 bytes In-Reply-To: <55B70538.1070109@oracle.com> References: <55A69A5F.3070504@oracle.com> <55B70538.1070109@oracle.com> Message-ID: <55C4E151.4020805@oracle.com> Ioi, I am trying to add a test case in SA for the testing as you mentioned. The easy part is adding a simple SA Tool (SymbolsInfo.java) to get the Symbol information but encountered a problem as: In the testing java process (1), create (spawn) another java process(2), which will run SA (SymbolsInfo) and attach back to the process (1). It failed due to time out waiting for response from target(1). I am investigating and trying to find a solution. It may have issue for such case. Webrev (Note: in the webrev, WhitBox.java, white_box.cpp, SymbolsInfo.java and IdentityHashForSymbols.java added to previous version webrev01) http://cr.openjdk.java.net/~minqi/8130115/webrev02/ Any one has comments how to solve the problem? Following are the two processes, you can see process 2) has parent as process 1): ($WS, $TEST are for real locations on my host machine) 1) 25939 25807 19 09:32 pts/1 00:00:00 $MYJDK/bin/java -Dtest.src=$WS/hotspot/test/serviceability/sa -Dtest.src.path=$WS/hotspot/test/serviceability/sa:$WS/hotspot/test/testlibrary:$WS/test/lib -Dtest.classes=$TEST/JTwork/classes/serviceability/sa -Dtest.class.path=$TEST/JTwork/classes/serviceability/sa:$TEST/JTwork/classes/testlibrary:$TEST/test/lib -Dtest.vm.opts= -Dtest.tool.vm.opts= -Dtest.compiler.opts= -Dtest.java.opts= -Dtest.jdk=$MYJDK -Dcompile.jdk=$MYJDK -Dtest.timeout.factor=1.0 -Xbootclasspath/a:. 
-XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI com.sun.javatest.regtest.agent.MainWrapper $TEST/JTwork/classes/serviceability/sa/IdentityHashForSymbols.jta 2) 25976 25939 63 09:32 pts/1 00:00:03 $MYJDK/bin/java -cp $TEST/jtreg/lib/javatest.jar:$TEST/jtreg/lib/jtreg.jar:$TEST/JTwork/classes/serviceability/sa:$WS/hotspot/test/serviceability/sa:$TEST/JTwork/classes/testlibrary:$TEST/test/lib sun.jvm.hotspot.tools.SymbolsInfo 25939 Thanks Yumin On 7/27/2015 9:29 PM, Ioi Lam wrote: > Hi Yumin, > > The C code changes look good to me. > > I am a little concerned about the Java code's calculation of > identityHash: > > Java version: > 86 public int identityHash() { > 87 long addr_value = getAddress().asLongValue(); > 88 int addr_bits = (int)(addr_value >> > (VM.getVM().getLogMinObjAlignmentInBytes() + 3)); > 89 int length = (int)getLength(); > 90 int byte0 = getByteAt(0); > 91 int byte1 = getByteAt(1); > 92 int id_hash = (int)(0xffff & idHash.getValue(this.addr)); > 93 return id_hash | > 94 ((addr_bits ^ (length << 8) ^ ((byte0 << 8) | byte1)) > << 16); > 95 } > > C version: > 148 unsigned identity_hash() { > 149 unsigned addr_bits = (unsigned)((uintptr_t)this >> > (LogMinObjAlignmentInBytes + 3)); > 150 return (unsigned)_identity_hash | > 151 ((addr_bits ^ (_length << 8) ^ (( _body[0] << 8) | > _body[1])) << 16); > 152 } > > The main problem is to correctly emulate the C unsigned operations in > the Java code. I've eyeballed the code and it seems correct, but I am > wondering if you have actually tested and verified that the Java > version indeed returns the same value as the C code? 
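For reference, the C-side computation quoted above can be sketched in isolation so that an expected value can be worked out by hand. This is only an illustration under assumptions: SymbolSketch is a hypothetical struct, the address and alignment shift are passed in as parameters, and the first two body bytes are treated as unsigned. If the real field is a signed byte type, sign extension would change the result for bytes of 0x80 and above, which is exactly the kind of difference a cross-check between the Java and C versions should catch.

```cpp
#include <cstdint>

// Hypothetical stand-in for the fields used by the quoted identity_hash().
struct SymbolSketch {
    uint16_t identity_hash_low;  // the stored 2-byte hash
    uint16_t length;             // symbol length
    uint8_t  body[2];            // first two bytes of the symbol data (assumed unsigned here)
};

// addr stands in for the symbol's address, log_min_obj_alignment for
// LogMinObjAlignmentInBytes; both are parameters so the sketch is testable.
inline uint32_t identity_hash_sketch(const SymbolSketch& s,
                                     uintptr_t addr,
                                     unsigned log_min_obj_alignment) {
    const uint32_t addr_bits =
        static_cast<uint32_t>(addr >> (log_min_obj_alignment + 3));
    const uint32_t mixed =
        addr_bits
        ^ (static_cast<uint32_t>(s.length) << 8)
        ^ ((static_cast<uint32_t>(s.body[0]) << 8) | s.body[1]);
    // Low 16 bits: stored hash. High 16 bits: low 16 bits of the mixed value.
    return static_cast<uint32_t>(s.identity_hash_low) | (mixed << 16);
}
```

With a stored hash of 0xBEEF, length 3, body "ab", address 0x1000 and an alignment shift of 3, the mixed value is 0x40 ^ 0x300 ^ 0x6162 = 0x6222, so the sketch returns 0x6222BEEF.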
A unit test case > would be good: > > * Add a new test in hotspot/agent/test > * Get a few Symbols (e.g., call > sun.jvm.hotspot.runtime.VM.getSymbolTable and iterate over the first > 1000 Symbols) > * For each Symbol, call its Symbol.identityHash() method > * Add a new whitebox API to return the C version of the > identity_hash() value > * Check if the C value is the same as the Java value > > Please run the test on all platforms (both 32-bit and 64-bit, and all > OSes). > > Thanks > - Ioi > > > On 7/15/15 10:37 AM, Yumin Qi wrote: >> Hi, >> >> This is redo for bug 8087143, in that push, it caused failure on >> Serviceability Agent failed to get type for "_identity_hash": >> mistakenly used JShortField for it, but in fact it still is >> CIntegerField. In this change, besides of the original change in >> hotspot/src, I add code to calculate identity_hash in hotspot/agent >> based on the changed in hotspot. >> >> Old webrev for 8087143: >> bug: https://bugs.openjdk.java.net/browse/JDK-8087143 >> webrev: http://cr.openjdk.java.net/~minqi/8087143/webrev03/ >> >> Summary: _identity_hash is an integer in Symbol (SymbolBase), it is >> used to compute hash bucket index by modulus division of table size. >> Currently in hotspot, no table size is more than 65535 so we can use >> short instead. For case with table size over 65535 we can use the >> first two bytes of symbol data to be as the upper 16 bits for the >> calculation but rare cases. >> >> New webrev for 8130115: >> bug: https://bugs.openjdk.java.net/browse/JDK-8130115 >> webrev: http://cr.openjdk.java.net/~minqi/8130115/webrev01/ >> >> >> Tests: JPRT, SA manual tests, -atk quick, jtreg hotspot/runtime >> Also internal large application used for hashtable data analysis --- >> the No. of loaded classes is big(over 19K), and tested with different >> bucket sizes including over 65535 to see the new algorithm for >> identity_hash calculation, result shows the consistency before and >> after the fix. 
>> >> Thanks >> Yumin > From mikhailo.seledtsov at oracle.com Fri Aug 7 23:36:08 2015 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Fri, 07 Aug 2015 16:36:08 -0700 Subject: RFR(XS): Quarantine: JDK-8133222 - [TESTBUG] Quarantine runtime/SharedArchiveFile/SharedStrings.java until the fix Message-ID: <55C540E8.7050205@oracle.com> Please review this one line change to quarantine a test that fails in nightly, until the root cause is found and test is fixed. JBS: https://bugs.openjdk.java.net/browse/JDK-8133222 Webrev: http://cr.openjdk.java.net/~mseledtsov/8133222.00/ Testing: Ran test locally (Linux-x64) This should be sufficient since change is trivial and platform-independent. Thank you, Misha From daniel.daugherty at oracle.com Fri Aug 7 23:59:30 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 07 Aug 2015 17:59:30 -0600 Subject: RFR(XS): Quarantine: JDK-8133222 - [TESTBUG] Quarantine runtime/SharedArchiveFile/SharedStrings.java until the fix In-Reply-To: <55C540E8.7050205@oracle.com> References: <55C540E8.7050205@oracle.com> Message-ID: <55C54662.30108@oracle.com> Thumbs up. Trivial review rules apply assuming that this: > Ran test locally (Linux-x64) means that the test actually didn't run. :-) Dan On 8/7/15 5:36 PM, Mikhailo Seledtsov wrote: > Please review this one line change to quarantine a test that fails in > nightly, > until the root cause is found and test is fixed. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8133222 > Webrev: http://cr.openjdk.java.net/~mseledtsov/8133222.00/ > Testing: > Ran test locally (Linux-x64) > This should be sufficient since change is trivial and > platform-independent. 
> > Thank you, > Misha From mikhailo.seledtsov at oracle.com Sat Aug 8 00:09:03 2015 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Fri, 07 Aug 2015 17:09:03 -0700 Subject: RFR(XS): Quarantine: JDK-8133222 - [TESTBUG] Quarantine runtime/SharedArchiveFile/SharedStrings.java until the fix In-Reply-To: <55C54662.30108@oracle.com> References: <55C540E8.7050205@oracle.com> <55C54662.30108@oracle.com> Message-ID: <55C5489F.1090601@oracle.com> Dan, Thank you for review. Misha On 8/7/2015 4:59 PM, Daniel D. Daugherty wrote: > Thumbs up. > > Trivial review rules apply assuming that this: > > > Ran test locally (Linux-x64) > > means that the test actually didn't run. :-) > Right. I launched the test, but it didn't run due to "@ignore" :) > Dan > > > On 8/7/15 5:36 PM, Mikhailo Seledtsov wrote: >> Please review this one line change to quarantine a test that fails in >> nightly, >> until the root cause is found and test is fixed. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8133222 >> Webrev: http://cr.openjdk.java.net/~mseledtsov/8133222.00/ >> Testing: >> Ran test locally (Linux-x64) >> This should be sufficient since change is trivial and >> platform-independent. >> >> Thank you, >> Misha > From mikhailo.seledtsov at oracle.com Sat Aug 8 00:20:53 2015 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Fri, 07 Aug 2015 17:20:53 -0700 Subject: RFR(XS): Quarantine: JDK-8133222 - [TESTBUG] Quarantine runtime/SharedArchiveFile/SharedStrings.java until the fix In-Reply-To: <55C5489F.1090601@oracle.com> References: <55C540E8.7050205@oracle.com> <55C54662.30108@oracle.com> <55C5489F.1090601@oracle.com> Message-ID: <55C54B65.5070601@oracle.com> Since I do not have a committer rights yet, could someone please push this change for me. The hgexport file is attached. Thank you, Misha On 8/7/2015 5:09 PM, Mikhailo Seledtsov wrote: > Dan, > > Thank you for review. > > Misha > > On 8/7/2015 4:59 PM, Daniel D. Daugherty wrote: >> Thumbs up. 
>> >> Trivial review rules apply assuming that this: >> >> > Ran test locally (Linux-x64) >> >> means that the test actually didn't run. :-) >> > Right. I launched the test, but it didn't run due to "@ignore" > :) > >> Dan >> >> >> On 8/7/15 5:36 PM, Mikhailo Seledtsov wrote: >>> Please review this one line change to quarantine a test that fails >>> in nightly, >>> until the root cause is found and test is fixed. >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8133222 >>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8133222.00/ >>> Testing: >>> Ran test locally (Linux-x64) >>> This should be sufficient since change is trivial and >>> platform-independent. >>> >>> Thank you, >>> Misha >> > -------------- next part -------------- # HG changeset patch # User mseledtsov # Date 1438992866 25200 # Fri Aug 07 17:14:26 2015 -0700 # Node ID be32044bf20079d41884fe431a01ba5b9af1eab0 # Parent a3d4ec0c8636bff0ef87de9af4bca1f860a972f3 8133222: [TESTBUG] Quarantine runtime/SharedArchiveFile/SharedStrings.java until the fix Summary: Quarantined using at-ingore tag Reviewed-by: dcubed diff --git a/test/runtime/SharedArchiveFile/SharedStrings.java b/test/runtime/SharedArchiveFile/SharedStrings.java --- a/test/runtime/SharedArchiveFile/SharedStrings.java +++ b/test/runtime/SharedArchiveFile/SharedStrings.java @@ -32,6 +32,7 @@ * @library /testlibrary /../../test/lib * @modules java.base/sun.misc * java.management + * @ignore - 8133180 * @build SharedStringsWb SharedStrings BasicJarBuilder * @run main ClassFileInstaller sun.hotspot.WhiteBox * @run main SharedStrings From daniel.daugherty at oracle.com Sat Aug 8 01:07:56 2015 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 07 Aug 2015 19:07:56 -0600 Subject: RFR(XS): Quarantine: JDK-8133222 - [TESTBUG] Quarantine runtime/SharedArchiveFile/SharedStrings.java until the fix In-Reply-To: <55C54B65.5070601@oracle.com> References: <55C540E8.7050205@oracle.com> <55C54662.30108@oracle.com> <55C5489F.1090601@oracle.com> <55C54B65.5070601@oracle.com> Message-ID: <55C5566C.3070908@oracle.com> I'll proxy the push for you as soon as I can clone a repo. Dan On 8/7/15 6:20 PM, Mikhailo Seledtsov wrote: > Since I do not have a committer rights yet, could someone please push > this change for me. The hgexport file is attached. > > Thank you, > Misha > > On 8/7/2015 5:09 PM, Mikhailo Seledtsov wrote: >> Dan, >> >> Thank you for review. >> >> Misha >> >> On 8/7/2015 4:59 PM, Daniel D. Daugherty wrote: >>> Thumbs up. >>> >>> Trivial review rules apply assuming that this: >>> >>> > Ran test locally (Linux-x64) >>> >>> means that the test actually didn't run. :-) >>> >> Right. I launched the test, but it didn't run due to "@ignore" >> :) >> >>> Dan >>> >>> >>> On 8/7/15 5:36 PM, Mikhailo Seledtsov wrote: >>>> Please review this one line change to quarantine a test that fails >>>> in nightly, >>>> until the root cause is found and test is fixed. >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8133222 >>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8133222.00/ >>>> Testing: >>>> Ran test locally (Linux-x64) >>>> This should be sufficient since change is trivial and >>>> platform-independent. 
>>>> >>>> Thank you, >>>> Misha >>> >> > From dmitry.dmitriev at oracle.com Sat Aug 8 10:40:16 2015 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Sat, 8 Aug 2015 13:40:16 +0300 Subject: RFR (XS): 8132892: Memory must be freed after calling Arguments::set_sysclasspath function In-Reply-To: <55C4DC7F.9070301@oracle.com> References: <55C08268.8090900@oracle.com> <55C4DC7F.9070301@oracle.com> Message-ID: <55C5DC90.6050605@oracle.com> Hello Dan, Thank you for review and comments! I prefer to leave patch as-is, i.e. not to add the "modules_dir = NULL" in this trivial case. Dmitry On 07.08.2015 19:27, Daniel D. Daugherty wrote: > > On 8/4/15 3:14 AM, Dmitry Dmitriev wrote: >> Hello, >> >> Please review this small fix which fix small memory leak. Also, I >> need a sponsor for this fix, who can push it. >> >> Arguments::set_sysclasspath function call set_value method of >> SystemProperty class which copy passed value. In several code paths >> memory is allocated for string and then this string is passed to >> Arguments::set_sysclasspath. Therefore allocated string should be >> freed after calling Arguments::set_sysclasspath function. >> >> Webrev: http://cr.openjdk.java.net/~ddmitriev/8132892/webrev.00/ >> > > src/share/vm/runtime/arguments.cpp > src/share/vm/runtime/os.cpp > L1162: char* os::format_boot_path(const char* format_string, ... > Not your bug, but: > > format_boot_path() is only called from os.cpp so I have > to wonder why it is named in the 'os' class instead of > being private to os.cpp. Perhaps this is worth a cleanup > RFE... > > L1286: FREE_C_HEAP_ARRAY(char, modules_dir); > Please set modules_dir = NULL after the free to prevent > accidental post-free use; the other free calls are fine > because they are closely followed by a return. > > Thumbs up. Adding the "modules_dir = NULL" is your choice > and I don't need a re-review if you choose to do it. 
> > Dan > > >> JBS: https://bugs.openjdk.java.net/browse/JDK-8132892 >> Tested: JPRT(hotspot test set), hotspot all, vm.quick >> >> Thanks, >> Dmitry >> >> > From max.ockner at oracle.com Mon Aug 10 18:31:23 2015 From: max.ockner at oracle.com (Max Ockner) Date: Mon, 10 Aug 2015 14:31:23 -0400 Subject: RFR: 8098791: Remove support for PrintMethodStatistics and PrintClassStatistics Message-ID: <55C8EDFB.3090301@oracle.com> Hello, Please review this small change. Bug: https://bugs.openjdk.java.net/browse/JDK-8098791 Webrev: http://cr.openjdk.java.net/~mockner/8098791/ Summary: The code supporting PrintMethodStatistics and PrintClassStatistics was removed. The options are not useful, and they are not tested. A CCC request has been approved for this change (http://ccc.us.oracle.com/8098791). This should lighten the load for unified logging changes. Tested with jtreg *. Thanks, Max From coleen.phillimore at oracle.com Mon Aug 10 19:38:02 2015 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 10 Aug 2015 15:38:02 -0400 Subject: RFR: 8098791: Remove support for PrintMethodStatistics and PrintClassStatistics In-Reply-To: <55C8EDFB.3090301@oracle.com> References: <55C8EDFB.3090301@oracle.com> Message-ID: <55C8FD9A.6040605@oracle.com> Looks good! Thanks, Coleen On 8/10/15 2:31 PM, Max Ockner wrote: > Hello, > Please review this small change. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8098791 > Webrev: http://cr.openjdk.java.net/~mockner/8098791/ > > Summary: The code supporting PrintMethodStatistics and > PrintClassStatistics was removed. The options are not useful, and they > are not tested. A CCC request has been approved for this change > (http://ccc.us.oracle.com/8098791). This should lighten the load for > unified logging changes. > > Tested with jtreg *. 
> > Thanks, > Max > > From david.holmes at oracle.com Tue Aug 11 06:40:38 2015 From: david.holmes at oracle.com (David Holmes) Date: Tue, 11 Aug 2015 16:40:38 +1000 Subject: (S) RFR: 8029453: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java failed by timeout Message-ID: <55C998E6.6020609@oracle.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8029453 Webrev: http://cr.openjdk.java.net/~dholmes/8029453/webrev/ The code introduced in 6900441 contained a bug in the code path for when WorkAroundNPTLTimedWaitHang was zero, and this was exposed by the change in 8130728 which made the default setting of WorkAroundNPTLTimedWaitHang zero. In PlatformParker on Linux _cur_index tracks which pthread_cond object is currently in use by a waiting thread (one for relative-timed waits, the other for absolute-timed waits) and is set to -1 when the thread is not waiting. In the path now used by default we release the pthread_mutex_t and then pthread_cond_signal the condition variable at _cond[_cur_index]. But as soon as we release the mutex the waiting thread can resume execution (it may have timed-out and so not need the signal) and set _cur_index to -1. The signalling thread then signals _cond[-1] which does not contain a pthread_cond_t object. This can result in the pthread_cond_signal hanging, and potentially other consequences. The fix is simple: save the correct index before unlocking the mutex. The test: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java has been marked as failing intermittently (8133231) due to this and I will revert that as part of this fix, once that change reaches the hs-rt forest. 
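The shape of the fix described above can be sketched as follows. This is a simplified illustration, not the HotSpot source: ParkerSketch, parker_init and unpark_sketch are hypothetical names, and the real code has additional state and assertions. The point it shows is that the index is read while the mutex is still held, so the later signal never uses a value the waiter may have reset to -1.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical, simplified stand-in for the parker state described above.
struct ParkerSketch {
    pthread_mutex_t mutex;
    pthread_cond_t  cond[2];   // one condition variable per kind of timed wait
    int cur_index;             // which cond a waiter is using, -1 if nobody waits
    int counter;               // the permit
};

void parker_init(ParkerSketch* p) {
    int status = pthread_mutex_init(&p->mutex, nullptr);
    assert(status == 0);
    for (int i = 0; i < 2; i++) {
        status = pthread_cond_init(&p->cond[i], nullptr);
        assert(status == 0);
    }
    (void)status;
    p->cur_index = -1;
    p->counter = 0;
}

// Returns the index that was signalled, or -1 if no thread was waiting.
int unpark_sketch(ParkerSketch* p) {
    int status = pthread_mutex_lock(&p->mutex);
    assert(status == 0);
    p->counter = 1;
    // Capture the index while the mutex is held: once the mutex is released
    // the waiter may wake up (for example on timeout) and reset cur_index.
    const int index = p->cur_index;
    status = pthread_mutex_unlock(&p->mutex);
    assert(status == 0);
    if (index != -1) {
        // Signal after unlocking, using the saved index, never cur_index.
        status = pthread_cond_signal(&p->cond[index]);
        assert(status == 0);
    }
    (void)status;
    return index;
}
```

Signalling after the unlock avoids waking the waiter only to have it block on the mutex immediately; the cost is that any state needed for the signal has to be copied out first, which is what the saved index does.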
Thanks, David From bertrand.delsart at oracle.com Tue Aug 11 09:37:01 2015 From: bertrand.delsart at oracle.com (Bertrand Delsart) Date: Tue, 11 Aug 2015 11:37:01 +0200 Subject: (S) RFR: 8029453: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java failed by timeout In-Reply-To: <55C998E6.6020609@oracle.com> References: <55C998E6.6020609@oracle.com> Message-ID: <55C9C23D.7070005@oracle.com> Looks OK but may not be sufficient. To play it even safer, I'd rather read _cur_index only once, for both the WorkAroundNPTLTimedWaitHang and the !WorkAroundNPTLTimedWaitHang case, e.g.: 5779 // thread might be parked [ save the _cur_index here, before testing it ] int index = _cur_index; 5780 if (_cur_index != -1) { => if (index != -1) 5781 // thread is definitely parked 5782 if (WorkAroundNPTLTimedWaitHang) { 5783 status = pthread_cond_signal(&_cond[_cur_index]); => use index instead of re-reading _cur_index 5784 assert(status == 0, "invariant"); 5785 status = pthread_mutex_unlock(_mutex); 5786 assert(status == 0, "invariant"); 5787 } else { 5788 // must capture correct index before unlocking [ 5789 int index = _cur_index; ] // now loaded earlier 5790 status = pthread_mutex_unlock(_mutex); 5791 assert(status == 0, "invariant"); 5792 status = pthread_cond_signal(&_cond[index]); 5793 assert(status == 0, "invariant"); Bertrand. On 11/08/2015 08:40, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8029453 > > Webrev: http://cr.openjdk.java.net/~dholmes/8029453/webrev/ > > The code introduced in 6900441 contained a bug in the code path for when > WorkAroundNPTLTimedWaitHang was zero, and this was exposed by the change > in 8130728 which made the default setting of WorkAroundNPTLTimedWaitHang > zero. > > In PlatformParker on Linux _cur_index tracks which pthread_cond object > is currently in use by a waiting thread (one for relative-timed waits, > the other for absolute-timed waits) and is set to -1 when the thread is > not waiting. 
In the path now used by default we release the > pthread_mutex_t and then pthread_cond_signal the condition variable at > _cond[_cur_index]. But as soon as we release the mutex the waiting > thread can resume execution (it may have timed-out and so not need the > signal) and set _cur_index to -1. The signalling thread then signals > _cond[-1] which does not contain a pthread_cond_t object. This can > result in the pthread_cond_signal hanging, and potentially other > consequences. > > The fix is simple: save the correct index before unlocking the mutex. > > The test: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java > has been marked as failing intermittently (8133231) due to this and I > will revert that as part of this fix, once that change reaches the hs-rt > forest. > > Thanks, > David -- Bertrand Delsart, Grenoble Engineering Center Oracle, 180 av. de l'Europe, ZIRST de Montbonnot 38330 Montbonnot Saint Martin, FRANCE bertrand.delsart at oracle.com Phone : +33 4 76 18 81 23 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ NOTICE: This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From david.holmes at oracle.com Tue Aug 11 10:51:25 2015 From: david.holmes at oracle.com (David Holmes) Date: Tue, 11 Aug 2015 20:51:25 +1000 Subject: (S) RFR: 8029453: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java failed by timeout In-Reply-To: <55C9C23D.7070005@oracle.com> References: <55C998E6.6020609@oracle.com> <55C9C23D.7070005@oracle.com> Message-ID: <55C9D3AD.1050003@oracle.com> On 11/08/2015 7:37 PM, Bertrand Delsart wrote: > Looks OK but may not be sufficient. 
> > To play it even safer, I'd rather read _cur_index only once, for both > the WorkAroundNPTLTimedWaitHang and the !WorkAroundNPTLTimedWaitHang > case, e.g.: I can do that but it isn't necessary for correctness - _cur_index is only modified whilst holding the mutex so it must also only be read whilst holding the mutex, which is now fully covered. Thanks, David ----- > 5779 // thread might be parked > > [ save the _cur_index here, before testing it ] > int index = _cur_index; > > 5780 if (_cur_index != -1) { > => if (index != -1) > > 5781 // thread is definitely parked > 5782 if (WorkAroundNPTLTimedWaitHang) { > 5783 status = pthread_cond_signal(&_cond[_cur_index]); > => use index instead of re-reading _cur_index > > 5784 assert(status == 0, "invariant"); > 5785 status = pthread_mutex_unlock(_mutex); > 5786 assert(status == 0, "invariant"); > 5787 } else { > 5788 // must capture correct index before unlocking > > [ 5789 int index = _cur_index; ] // now loaded earlier > > 5790 status = pthread_mutex_unlock(_mutex); > 5791 assert(status == 0, "invariant"); > 5792 status = pthread_cond_signal(&_cond[index]); > 5793 assert(status == 0, "invariant"); > > > Bertrand. > > On 11/08/2015 08:40, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8029453 >> >> Webrev: http://cr.openjdk.java.net/~dholmes/8029453/webrev/ >> >> The code introduced in 6900441 contained a bug in the code path for when >> WorkAroundNPTLTimedWaitHang was zero, and this was exposed by the change >> in 8130728 which made the default setting of WorkAroundNPTLTimedWaitHang >> zero. >> >> In PlatformParker on Linux _cur_index tracks which pthread_cond object >> is currently in use by a waiting thread (one for relative-timed waits, >> the other for absolute-timed waits) and is set to -1 when the thread is >> not waiting. In the path now used by default we release the >> pthread_mutex_t and then pthread_cond_signal the condition variable at >> _cond[_cur_index]. 
But as soon as we release the mutex the waiting >> thread can resume execution (it may have timed-out and so not need the >> signal) and set _cur_index to -1. The signalling thread then signals >> _cond[-1] which does not contain a pthread_cond_t object. This can >> result in the pthread_cond_signal hanging, and potentially other >> consequences. >> >> The fix is simple: save the correct index before unlocking the mutex. >> >> The test: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java >> has been marked as failing intermittently (8133231) due to this and I >> will revert that as part of this fix, once that change reaches the hs-rt >> forest. >> >> Thanks, >> David > > From bertrand.delsart at oracle.com Tue Aug 11 11:54:09 2015 From: bertrand.delsart at oracle.com (Bertrand Delsart) Date: Tue, 11 Aug 2015 13:54:09 +0200 Subject: (S) RFR: 8029453: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java failed by timeout In-Reply-To: <55C9D3AD.1050003@oracle.com> References: <55C998E6.6020609@oracle.com> <55C9C23D.7070005@oracle.com> <55C9D3AD.1050003@oracle.com> Message-ID: <55C9E261.3040704@oracle.com> On 11/08/2015 12:51, David Holmes wrote: > On 11/08/2015 7:37 PM, Bertrand Delsart wrote: >> Looks OK but may not be sufficient. >> >> To play it even safer, I'd rather read _cur_index only once, for both >> the WorkAroundNPTLTimedWaitHang and the !WorkAroundNPTLTimedWaitHang >> case, e.g.: > > I can do that but it isn't necessary for correctness - _cur_index is > only modified whilst holding the mutex so it must also only be read > whilst holding the mutex, which is now fully covered. OK. Approved as is if you prefer. Your pick. Bertrand. 
> > Thanks, > David > ----- > >> 5779 // thread might be parked >> >> [ save the _cur_index here, before testing it ] >> int index = _cur_index; >> >> 5780 if (_cur_index != -1) { >> => if (index != -1) >> >> 5781 // thread is definitely parked >> 5782 if (WorkAroundNPTLTimedWaitHang) { >> 5783 status = pthread_cond_signal(&_cond[_cur_index]); >> => use index instead of re-reading _cur_index >> >> 5784 assert(status == 0, "invariant"); >> 5785 status = pthread_mutex_unlock(_mutex); >> 5786 assert(status == 0, "invariant"); >> 5787 } else { >> 5788 // must capture correct index before unlocking >> >> [ 5789 int index = _cur_index; ] // now loaded earlier >> >> 5790 status = pthread_mutex_unlock(_mutex); >> 5791 assert(status == 0, "invariant"); >> 5792 status = pthread_cond_signal(&_cond[index]); >> 5793 assert(status == 0, "invariant"); >> >> >> Bertrand. >> >> On 11/08/2015 08:40, David Holmes wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8029453 >>> >>> Webrev: http://cr.openjdk.java.net/~dholmes/8029453/webrev/ >>> >>> The code introduced in 6900441 contained a bug in the code path for when >>> WorkAroundNPTLTimedWaitHang was zero, and this was exposed by the change >>> in 8130728 which made the default setting of WorkAroundNPTLTimedWaitHang >>> zero. >>> >>> In PlatformParker on Linux _cur_index tracks which pthread_cond object >>> is currently in use by a waiting thread (one for relative-timed waits, >>> the other for absolute-timed waits) and is set to -1 when the thread is >>> not waiting. In the path now used by default we release the >>> pthread_mutex_t and then pthread_cond_signal the condition variable at >>> _cond[_cur_index]. But as soon as we release the mutex the waiting >>> thread can resume execution (it may have timed-out and so not need the >>> signal) and set _cur_index to -1. The signalling thread then signals >>> _cond[-1] which does not contain a pthread_cond_t object. 
This can >>> result in the pthread_cond_signal hanging, and potentially other >>> consequences. >>> >>> The fix is simple: save the correct index before unlocking the mutex. >>> >>> The test: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java >>> has been marked as failing intermittently (8133231) due to this and I >>> will revert that as part of this fix, once that change reaches the hs-rt >>> forest. >>> >>> Thanks, >>> David >> >> -- Bertrand Delsart, Grenoble Engineering Center Oracle, 180 av. de l'Europe, ZIRST de Montbonnot 38330 Montbonnot Saint Martin, FRANCE bertrand.delsart at oracle.com Phone : +33 4 76 18 81 23
From tom.benson at oracle.com Tue Aug 11 17:43:12 2015 From: tom.benson at oracle.com (Tom Benson) Date: Tue, 11 Aug 2015 13:43:12 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55C4C720.5030903@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> Message-ID: <55CA3430.3070300@oracle.com> Hi, On 8/7/2015 10:56 AM, Tom Benson wrote: > > The problem is simply that in addition to calling > free_archive_regions, FileMapInfo::unmap_string_regions also unmaps > the memory.... so there's a segv when GC tries to re-use it. Let's > talk directly about the best way to handle it. > Tom After some discussion, I've changed the definition and name of free_archive_regions. Now called dealloc_archive_regions, it uncommits the specified regions, unmapping the memory, rather than adding them to the free list. This means the CDS code will no longer do the unmapping on verification failures. Updated full and incremental webrevs of the GC code are at: http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ Tested with JPRT and running benchmarks with the dealloc_ performed explicitly. Jiangli also tested the original failing cases, and will be posting an updated webrev.
Tom On 8/7/2015 10:56 AM, Tom Benson wrote: > Hi, > > On 8/7/2015 9:52 AM, Tom Benson wrote: >> Hi, >> On 8/6/2015 8:40 PM, Jiangli Zhou wrote: >>> Hi Dmitry, >>> >>> I've added shared string unmapping to handle the string data >>> verification failure. I also added 'free_archive_regions()' call to >>> FileMapInfo::unmap_string_regions(). My testing revealed some issues >>> with the new archive region free code when I forced string >>> verification failure. I'm working with Tom on the issues. >> > > The problem is simply that in addition to calling > free_archive_regions, FileMapInfo::unmap_string_regions also unmaps > the memory.... so there's a segv when GC tries to re-use it. Let's > talk directly about the best way to handle it. > Tom From kim.barrett at oracle.com Tue Aug 11 21:24:01 2015 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 11 Aug 2015 17:24:01 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55CA3430.3070300@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> Message-ID: <9255850A-182F-4741-BE2A-C7BF72057603@oracle.com> On Aug 11, 2015, at 1:43 PM, Tom Benson wrote: > > Hi, > On 8/7/2015 10:56 AM, Tom Benson wrote: >> >> The problem is simply that in addition to calling free_archive_regions, FileMapInfo::unmap_string_regions also unmaps the memory.... so there's a segv when GC tries to re-use it.
Let's talk directly about the best way to handle it. >> Tom > > After some discussion, I've changed the definition and name of free_archive_regions. Now called dealloc_archive_regions, it uncommits the specified regions, unmapping the memory, rather than adding them to the free list. This means the CDS code will no longer do the unmapping on verification failures. > > Updated full and incremental webrevs of the GC code are at: > http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ > http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ > > Tested with JPRT and running benchmarks with the dealloc_ performed explicitly. Jiangli also tested the original failing cases, and will be posting an updated webrev. Can this introduce uncommitted "holes" in committed space? It seems like it might. Otherwise, why introduce shrink_at, rather than just using shrink_by. I'm not sure such holes are presently possible, and I'm not sure they are handled properly everywhere. For example, I think such a hole might confuse HeapRegionManager::shrink_by. I haven't looked carefully for other code that might be confused by uncommitted holes. 
From tom.benson at oracle.com Tue Aug 11 21:40:12 2015 From: tom.benson at oracle.com (Tom Benson) Date: Tue, 11 Aug 2015 17:40:12 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <9255850A-182F-4741-BE2A-C7BF72057603@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <9255850A-182F-4741-BE2A-C7BF72057603@oracle.com> Message-ID: <55CA6BBC.8080605@oracle.com> Hi Kim, On 8/11/2015 5:24 PM, Kim Barrett wrote: > On Aug 11, 2015, at 1:43 PM, Tom Benson wrote: >> Hi, >> On 8/7/2015 10:56 AM, Tom Benson wrote: >>> The problem is simply that in addition to calling free_archive_regions, FileMapInfo::unmap_string_regions also unmaps the memory.... so there's a segv when GC tries to re-use it. Let's talk directly about the best way to handle it. >>> Tom >> After some discussion, I've changed the definition and name of free_archive_regions. Now called dealloc_archive_regions, it uncommits the specified regions, unmapping the memory, rather than adding them to the free list. This means the CDS code will no longer do the unmapping on verification failures. >> >> Updated full and incremental webrevs of the GC code are at: >> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ >> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ >> >> Tested with JPRT and running benchmarks with the dealloc_ performed explicitly. 
Jiangli also tested the original failing cases, and will be posting an updated webrev. > Can this introduce uncommitted "holes" in committed space? It seems > like it might. Otherwise, why introduce shrink_at, rather than just > using shrink_by. I'm not sure such holes are presently possible, and > I'm not sure they are handled properly everywhere. shrink_by looks for free ranges it can uncommit. Here, we want to uncommit the specific region(s) allocated with alloc_archive_regions. There is an expand_at that takes a specific index, which shrink_at is intended to be analogous to. G1 is designed to allow uncommitted holes. Even without this change, there is a big 'hole' in committed space: All the regions between the low end of the heap where mutator allocation is occurring (initially sized by Xms) and these archive regions which are at the highest end of the (Xmx-sized) heap. In a way, as used by CDS, calling dealloc_archive_regions is *removing* that hole. 8^) The entire upper portion of the heap (above Xms) will again be uncommitted, as if alloc_archive_regions had never been called. Tom > > For example, I think such a hole might confuse > HeapRegionManager::shrink_by. I haven't looked carefully for other > code that might be confused by uncommitted holes. 
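The point about committing and uncommitting at a specific index, and about the uncommitted gap between the low end of the heap and archive regions placed at the top, can be illustrated with a toy commit map. This is a sketch under stated assumptions, not G1 code: ToyCommitMap and has_uncommitted_hole are invented for illustration, and the expand_at/shrink_at members here only borrow the names used in the thread; the signatures and behaviour of the real HeapRegionManager functions are not reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy commit map (hypothetical): one flag per heap region, true if committed.
class ToyCommitMap {
 public:
  explicit ToyCommitMap(std::size_t num_regions)
      : _committed(num_regions, false) {}

  // Commit the specific regions [start, start + n).
  void expand_at(std::size_t start, std::size_t n) { set_range(start, n, true); }

  // Uncommit the specific regions [start, start + n).
  void shrink_at(std::size_t start, std::size_t n) { set_range(start, n, false); }

  // True if some uncommitted region lies below a committed one.
  bool has_uncommitted_hole() const {
    bool seen_uncommitted = false;
    for (bool c : _committed) {
      if (!c) {
        seen_uncommitted = true;
      } else if (seen_uncommitted) {
        return true;
      }
    }
    return false;
  }

  std::size_t committed_count() const {
    std::size_t count = 0;
    for (bool c : _committed) {
      if (c) ++count;
    }
    return count;
  }

 private:
  void set_range(std::size_t start, std::size_t n, bool value) {
    assert(start <= _committed.size() && n <= _committed.size() - start);
    for (std::size_t i = 0; i < n; ++i) {
      _committed[start + i] = value;
    }
  }

  std::vector<bool> _committed;
};
```

With a 16-region map, committing regions 0..3 (standing in for the Xms-sized low end) and then regions 14..15 (standing in for archive regions at the top) leaves regions 4..13 as an uncommitted hole; uncommitting 14..15 again at their specific index removes the hole, which is the effect described for dealloc_archive_regions.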
From kim.barrett at oracle.com Wed Aug 12 00:02:11 2015 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 11 Aug 2015 20:02:11 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55CA6BBC.8080605@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <9255850A-182F-4741-BE2A-C7BF72057603@oracle.com> <55CA6BBC.8080605@oracle.com> Message-ID: On Aug 11, 2015, at 5:40 PM, Tom Benson wrote: > > On 8/11/2015 5:24 PM, Kim Barrett wrote: >> On Aug 11, 2015, at 1:43 PM, Tom Benson wrote: >>> Updated full and incremental webrevs of the GC code are at: >>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ >>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ >> Can this introduce uncommitted "holes" in committed space? It seems >> like it might. Otherwise, why introduce shrink_at, rather than just >> using shrink_by. I'm not sure such holes are presently possible, and >> I'm not sure they are handled properly everywhere. > shrink_by looks for free ranges it can uncommit. Here, we want to uncommit the specific region(s) allocated with alloc_archive_regions. There is an expand_at that takes a specific index, which shrink_at is intended to be analogous to. I seem to have earlier convinced myself that shrink_by would be confused by a hole; after staring at it some more, I now think it's ok. Sorry for the noise. 
But I think there might be an unrelated pre-existing performance bug in shrink_by. Line 431 is "cur -= num_last_found;". If the recent scan had to skip over a block of !available || !empty regions to find the regions that were just removed, that skipped block isn't accounted for by that decrement. I think a better iteration step would be "cur = idx_last_found;". [This is part of what confused me about holes.] > G1 is designed to allow uncommitted holes. Even without this change, there is a big 'hole' in committed space: All the regions between the low end of the heap where mutator allocation is occurring (initially sized by Xms) and these archive regions which are at the highest end of the (Xmx-sized) heap. > > In a way, as used by CDS, calling dealloc_archive_regions is *removing* that hole. 8^) The entire upper portion of the heap (above Xms) will again be uncommitted, as if alloc_archive_regions had never been called. I do see code that seems to be intended to cope with such holes. I wonder though, were there ever uncommitted holes in practice before the introduction of this new archive region feature? Previously, the only time a region gets uncommitted is after a full gc. I've not studied G1's full gc enough to even guess whether it could leave such holes. If the answer is no, there could be lurking bugs waiting to be uncovered. I guess we'll deal with those if/when we find them. 
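The iteration-step remark about shrink_by can be made concrete with a small model that counts how many regions the downward scan examines under each stepping rule. This is a sketch modelled on the description in the message, not a copy of heapRegionManager.cpp: ToyRegion, Step, ShrinkResult, find_empty_run_reverse, toy_shrink_by and example_heap are all hypothetical names, and signed indices are used so the model does not have to deal with unsigned wrap-around.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical region state for the model.
struct ToyRegion {
  bool available;  // committed
  bool empty;      // holds no live data
};

// The two iteration steps being compared.
enum class Step { SubtractRunLength, JumpToRunStart };

struct ShrinkResult {
  int removed;  // regions uncommitted
  long probes;  // regions examined while scanning
};

// Scan downward from 'start' for the highest run of available, empty regions.
// Returns the run length (0 if none) and stores its lowest index in run_start.
int find_empty_run_reverse(const std::vector<ToyRegion>& heap, int start,
                           int* run_start, long* probes) {
  int cur = start;
  // Skip regions that cannot be uncommitted.
  while (cur >= 0 && !(heap[cur].available && heap[cur].empty)) {
    ++*probes;
    --cur;
  }
  if (cur < 0) return 0;
  const int run_end = cur;
  // Walk down through the run itself.
  while (cur >= 0 && heap[cur].available && heap[cur].empty) {
    ++*probes;
    --cur;
  }
  *run_start = cur + 1;
  return run_end - cur;
}

// Uncommit up to 'to_remove' regions, scanning from the top of the heap.
ShrinkResult toy_shrink_by(std::vector<ToyRegion> heap, int to_remove, Step step) {
  assert(to_remove >= 0);
  ShrinkResult result = {0, 0};
  int cur = static_cast<int>(heap.size()) - 1;
  while (result.removed < to_remove && cur >= 0) {
    int run_start = 0;
    const int found = find_empty_run_reverse(heap, cur, &run_start, &result.probes);
    if (found == 0) break;
    const int n = std::min(to_remove - result.removed, found);
    // Uncommit the top n regions of the run.
    for (int i = run_start + found - n; i < run_start + found; ++i) {
      heap[i].available = false;
    }
    result.removed += n;
    // SubtractRunLength ignores any block skipped before the run, so the next
    // scan re-examines it; JumpToRunStart resumes where the run began.
    cur = (step == Step::SubtractRunLength) ? cur - found : run_start;
  }
  return result;
}

// Regions 0-1 empty, 2-3 busy, 4-5 empty, 6-9 busy; all committed.
std::vector<ToyRegion> example_heap() {
  std::vector<ToyRegion> heap(10, ToyRegion{true, false});
  heap[0].empty = heap[1].empty = true;
  heap[4].empty = heap[5].empty = true;
  return heap;
}
```

Asking both variants to remove three regions from example_heap() uncommits the same regions, but the JumpToRunStart variant examines fewer regions, because it does not walk the busy block at the top of the heap a second time.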
From kim.barrett at oracle.com Wed Aug 12 00:10:01 2015 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 11 Aug 2015 20:10:01 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <9255850A-182F-4741-BE2A-C7BF72057603@oracle.com> <55CA6BBC.8080605@oracle.com> Message-ID: On Aug 11, 2015, at 8:02 PM, Kim Barrett wrote: > > On Aug 11, 2015, at 5:40 PM, Tom Benson wrote: >> >> On 8/11/2015 5:24 PM, Kim Barrett wrote: >>> On Aug 11, 2015, at 1:43 PM, Tom Benson wrote: >>>> Updated full and incremental webrevs of the GC code are at: >>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ >>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ >>> Can this introduce uncommitted "holes" in committed space? It seems >>> like it might. Otherwise, why introduce shrink_at, rather than just >>> using shrink_by. I'm not sure such holes are presently possible, and >>> I'm not sure they are handled properly everywhere. >> shrink_by looks for free ranges it can uncommit. Here, we want to uncommit the specific region(s) allocated with alloc_archive_regions. There is an expand_at that takes a specific index, which shrink_at is intended to be analogous to. > > I seem to have earlier convinced myself that shrink_by would be > confused by a hole; after staring at it some more, I now think it's > ok. Sorry for the noise. 
> > But I think there might be an unrelated pre-existing performance bug > in shrink_by. Line 431 is "cur -= num_last_found;". If the recent > scan had to skip over a block of !available || !empty regions to find > the regions that were just removed, that skipped block isn't accounted > for by that decrement. I think a better iteration step would be "cur > = idx_last_found;". [This is part of what confused me about holes.] > >> G1 is designed to allow uncommitted holes. Even without this change, there is a big 'hole' in committed space: All the regions between the low end of the heap where mutator allocation is occurring (initially sized by Xms) and these archive regions which are at the highest end of the (Xmx-sized) heap. >> >> In a way, as used by CDS, calling dealloc_archive_regions is *removing* that hole. 8^) The entire upper portion of the heap (above Xms) will again be uncommitted, as if alloc_archive_regions had never been called. > > I do see code that seems to be intended to cope with such holes. > > I wonder though, were there ever uncommitted holes in practice before > the introduction of this new archive region feature? Previously, the > only time a region gets uncommitted is after a full gc. I've not > studied G1's full gc enough to even guess whether it could leave such > holes. If the answer is no, there could be lurking bugs waiting to be > uncovered. I guess we'll deal with those if/when we find them. Oops, forgot the important part of the reply. Change looks good. 
From thomas.schatzl at oracle.com Wed Aug 12 10:37:43 2015 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 12 Aug 2015 12:37:43 +0200 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <9255850A-182F-4741-BE2A-C7BF72057603@oracle.com> <55CA6BBC.8080605@oracle.com> Message-ID: <1439375863.2324.15.camel@oracle.com> Hi, On Tue, 2015-08-11 at 20:02 -0400, Kim Barrett wrote: > On Aug 11, 2015, at 5:40 PM, Tom Benson wrote: > > > > On 8/11/2015 5:24 PM, Kim Barrett wrote: > >> On Aug 11, 2015, at 1:43 PM, Tom Benson wrote: > >>> Updated full and incremental webrevs of the GC code are at: > >>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ > >>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ > >> Can this introduce uncommitted "holes" in committed space? It seems > >> like it might. Otherwise, why introduce shrink_at, rather than just > >> using shrink_by. I'm not sure such holes are presently possible, and > >> I'm not sure they are handled properly everywhere. > > shrink_by looks for free ranges it can uncommit. Here, we want to > > uncommit the specific region(s) allocated with alloc_archive_regions. > > There is an expand_at that takes a specific index, which shrink_at is > > intended to be analogous to. 
> > I seem to have earlier convinced myself that shrink_by would be > confused by a hole; after staring at it some more, I now think it's > ok. Sorry for the noise. > > But I think there might be an unrelated pre-existing performance bug > in shrink_by. Line 431 is "cur -= num_last_found;". If the recent > scan had to skip over a block of !available || !empty regions to find > the regions that were just removed, that skipped block isn't accounted > for by that decrement. I think a better iteration step would be "cur > = idx_last_found;". [This is part of what confused me about holes.] That's true. I filed https://bugs.openjdk.java.net/browse/JDK-8133456. > > G1 is designed to allow uncommitted holes. Even without this > >change, there is a big 'hole' in committed space: All the regions > >between the low end of the heap where mutator allocation is occurring > >(initially sized by Xms) and these archive regions which are at the >>highest end of the (Xmx-sized) heap. > > > > In a way, as used by CDS, calling dealloc_archive_regions is >>*removing* that hole. 8^) The entire upper portion of the heap >>(above Xms) will again be uncommitted, as if alloc_archive_regions had >>never been called. > > I do see code that seems to be intended to cope with such holes. > > I wonder though, were there ever uncommitted holes in practice before > the introduction of this new archive region feature? Previously, the > only time a region gets uncommitted is after a full gc. I've not > studied G1's full gc enough to even guess whether it could leave such > holes. Yes, when there are humongous objects in the heap. > If the answer is no, there could be lurking bugs waiting to be > uncovered. I guess we'll deal with those if/when we find them. There are explicit tests in the jtreg tests (gc/g1/TestShrink*) that test shrinking, and TestShrinkDefragmentedHeap in particular tests this. 
Also there are a few stress tests (ArrayJuggle*) that repeatedly create holes using large objects and do full collections that cause heap shrinking. Thanks, Thomas From thomas.schatzl at oracle.com Wed Aug 12 11:00:59 2015 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 12 Aug 2015 13:00:59 +0200 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55CA3430.3070300@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> Message-ID: <1439377259.2324.27.camel@oracle.com> Hi, On Tue, 2015-08-11 at 13:43 -0400, Tom Benson wrote: > Hi, > On 8/7/2015 10:56 AM, Tom Benson wrote: > > > > The problem is simply that in addition to calling > > free_archive_regions, FileMapInfo::unmap_string_regions also unmaps > > the memory.... so there's a segv when GC tries to re-use it. Let's > > talk directly about the best way to handle it. > > Tom > > After some discussion, I've changed the definition and name of > free_archive_regions. Now called dealloc_archive_regions, it uncommits > the specified regions, unmapping the memory, rather than adding them to > the free list. This means the CDS code will no longer do the unmapping > on verification failures. 
> > Updated full and incremental webrevs of the GC code are at: > http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ > http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ > > Tested with JPRT and running benchmarks with the dealloc_ performed > explicitly. Jiangli also tested the original failing cases, and will be > posting an updated webrev. - is it possible that shrink_by() uses shrink_at()? This would avoid two paths that uncommit regions like expand_by()/expand_at()? - I think the change should call at least HeapRegion::hr_clear() on the region to remove or reset any auxiliary data structures, if not G1CollectedHeap::free_region() (without adding the region to the free list). Since the HeapRegion* is not deallocated by the uncommit, this may cause strange behavior later when the region is reused. Other than that it looks okay. Thanks, Thomas From tom.benson at oracle.com Wed Aug 12 12:31:46 2015 From: tom.benson at oracle.com (Tom Benson) Date: Wed, 12 Aug 2015 08:31:46 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <9255850A-182F-4741-BE2A-C7BF72057603@oracle.com> <55CA6BBC.8080605@oracle.com> Message-ID: <55CB3CB2.2000307@oracle.com> Hi Kim - On 8/11/2015 8:10 PM, Kim Barrett wrote: > On Aug 11, 2015, at 8:02 PM, Kim Barrett wrote: >> ..... > Oops, forgot the important part of the reply. 
> > Change looks good. > Thanks. I see Thomas addressed your other (outside this change) comments. Tom From ioi.lam at oracle.com Wed Aug 12 15:56:32 2015 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 12 Aug 2015 08:56:32 -0700 Subject: RFR: 8098791: Remove support for PrintMethodStatistics and PrintClassStatistics In-Reply-To: <55C8EDFB.3090301@oracle.com> References: <55C8EDFB.3090301@oracle.com> Message-ID: <55CB6CB0.2050604@oracle.com> Looks good to me. Thanks Ioi On 8/10/15 11:31 AM, Max Ockner wrote: > Hello, > Please review this small change. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8098791 > Webrev: http://cr.openjdk.java.net/~mockner/8098791/ > > Summary: The code supporting PrintMethodStatistics and > PrintClassStatistics was removed. The options are not useful, and they > are not tested. A CCC request has been approved for this change > (http://ccc.us.oracle.com/8098791). This should lighten the load for > unified logging changes. > > Tested with jtreg *. > > Thanks, > Max > > From dmitry.dmitriev at oracle.com Wed Aug 12 19:44:46 2015 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Wed, 12 Aug 2015 22:44:46 +0300 Subject: (S) RFR: 8029453: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java failed by timeout In-Reply-To: <55C998E6.6020609@oracle.com> References: <55C998E6.6020609@oracle.com> Message-ID: <55CBA22E.1010500@oracle.com> Hello David, Changes look good, but I'm not a reviewer. By the way, I see one enhancement that can be made in Parker::unpark() that can be implemented if you wish in this bug or later (for example, when code for WorkAroundNPTLTimedWaitHang will be removed). Parker::unpark() has duplicated code (lines 5796-5797 and 5800-5801): pthread_mutex_unlock(_mutex); assert(status == 0, "invariant"); I think it can be removed by combining two if's on lines 5778 and 5780 into one: if ((s < 1) && (_cur_index != -1)) Also return value of pthread_mutex_unlock is not assigned to 'status' in this case.
Thank you, Dmitry On 11.08.2015 9:40, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8029453 > > Webrev: http://cr.openjdk.java.net/~dholmes/8029453/webrev/ > > The code introduced in 6900441 contained a bug in the code path for > when WorkAroundNPTLTimedWaitHang was zero, and this was exposed by the > change in 8130728 which made the default setting of > WorkAroundNPTLTimedWaitHang zero. > > In PlatformParker on Linux _cur_index tracks which pthread_cond object > is currently in use by a waiting thread (one for relative-timed waits, > the other for absolute-timed waits) and is set to -1 when the thread > is not waiting. In the path now used by default we release the > pthread_mutex_t and then pthread_cond_signal the condition variable at > _cond[_cur_index]. But as soon as we release the mutex the waiting > thread can resume execution (it may have timed-out and so not need the > signal) and set _cur_index to -1. The signalling thread then signals > _cond[-1] which does not contain a pthread_cond_t object. This can > result in the pthread_cond_signal hanging, and potentially other > consequences. > > The fix is simple: save the correct index before unlocking the mutex. > > The test: > java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java has > been marked as failing intermittently (8133231) due to this and I will > revert that as part of this fix, once that change reaches the hs-rt > forest. 
> > Thanks, > David From david.holmes at oracle.com Wed Aug 12 22:34:03 2015 From: david.holmes at oracle.com (David Holmes) Date: Thu, 13 Aug 2015 08:34:03 +1000 Subject: (S) RFR: 8029453: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java failed by timeout In-Reply-To: <55C9E261.3040704@oracle.com> References: <55C998E6.6020609@oracle.com> <55C9C23D.7070005@oracle.com> <55C9D3AD.1050003@oracle.com> <55C9E261.3040704@oracle.com> Message-ID: <55CBC9DB.3030109@oracle.com> On 11/08/2015 9:54 PM, Bertrand Delsart wrote: > On 11/08/2015 12:51, David Holmes wrote: >> On 11/08/2015 7:37 PM, Bertrand Delsart wrote: >>> Looks OK but may not be sufficient. >>> >>> To play it even safer, I'd rather read _cur_index only once, for both >>> the WorkAroundNPTLTimedWaitHang and the !WorkAroundNPTLTimedWaitHang >>> case, e.g.: >> >> I can do that but it isn't necessary for correctness - _cur_index is >> only modified whilst holding the mutex so it must also only be read >> whilst holding the mutex, which is now fully covered. > > OK. Approved as is if you prefer. Your pick. Thanks Bertrand. David > Bertrand. 
> >> >> Thanks, >> David >> ----- >> >>> 5779 // thread might be parked >>> >>> [ save the _cur_index here, before testing it ] >>> int index = _cur_index; >>> >>> 5780 if (_cur_index != -1) { >>> => if (index != -1) >>> >>> 5781 // thread is definitely parked >>> 5782 if (WorkAroundNPTLTimedWaitHang) { >>> 5783 status = pthread_cond_signal(&_cond[_cur_index]); >>> => use index instead of re-reading _cur_index >>> >>> 5784 assert(status == 0, "invariant"); >>> 5785 status = pthread_mutex_unlock(_mutex); >>> 5786 assert(status == 0, "invariant"); >>> 5787 } else { >>> 5788 // must capture correct index before unlocking >>> >>> [ 5789 int index = _cur_index; ] // now loaded earlier >>> >>> 5790 status = pthread_mutex_unlock(_mutex); >>> 5791 assert(status == 0, "invariant"); >>> 5792 status = pthread_cond_signal(&_cond[index]); >>> 5793 assert(status == 0, "invariant"); >>> >>> >>> Bertrand. >>> >>> On 11/08/2015 08:40, David Holmes wrote: >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8029453 >>>> >>>> Webrev: http://cr.openjdk.java.net/~dholmes/8029453/webrev/ >>>> >>>> The code introduced in 6900441 contained a bug in the code path for >>>> when >>>> WorkAroundNPTLTimedWaitHang was zero, and this was exposed by the >>>> change >>>> in 8130728 which made the default setting of >>>> WorkAroundNPTLTimedWaitHang >>>> zero. >>>> >>>> In PlatformParker on Linux _cur_index tracks which pthread_cond object >>>> is currently in use by a waiting thread (one for relative-timed waits, >>>> the other for absolute-timed waits) and is set to -1 when the thread is >>>> not waiting. In the path now used by default we release the >>>> pthread_mutex_t and then pthread_cond_signal the condition variable at >>>> _cond[_cur_index]. But as soon as we release the mutex the waiting >>>> thread can resume execution (it may have timed-out and so not need the >>>> signal) and set _cur_index to -1. 
The signalling thread then signals >>>> _cond[-1] which does not contain a pthread_cond_t object. This can >>>> result in the pthread_cond_signal hanging, and potentially other >>>> consequences. >>>> >>>> The fix is simple: save the correct index before unlocking the mutex. >>>> >>>> The test: >>>> java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java >>>> has been marked as failing intermittently (8133231) due to this and I >>>> will revert that as part of this fix, once that change reaches the >>>> hs-rt >>>> forest. >>>> >>>> Thanks, >>>> David >>> >>> > > From david.holmes at oracle.com Wed Aug 12 22:36:31 2015 From: david.holmes at oracle.com (David Holmes) Date: Thu, 13 Aug 2015 08:36:31 +1000 Subject: (S) RFR: 8029453: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java failed by timeout In-Reply-To: <55CBA22E.1010500@oracle.com> References: <55C998E6.6020609@oracle.com> <55CBA22E.1010500@oracle.com> Message-ID: <55CBCA6F.3010801@oracle.com> Hi Dmitry, On 13/08/2015 5:44 AM, Dmitry Dmitriev wrote: > Hello David, > > Changes look good, but I'm not a reviewer. Thanks - still need a Reviewer please! > By the way, I see one enhancement that can be made in Parker::unpark() > that can be implemented if you wish in this bug or later (for example, > when the code for WorkAroundNPTLTimedWaitHang is removed). > Parker::unpark() has duplicated code (lines 5796-5797 and 5800-5801): > pthread_mutex_unlock(_mutex); > assert(status == 0, "invariant"); > > I think it can be removed by combining the two if's on lines 5778 and 5780 > into one: > if ((s < 1) && (_cur_index != -1)) > > Also, the return value of pthread_mutex_unlock is not assigned to 'status' in > this case. I'll flag that for when the workaround code is removed - thanks. I prefer to just fix the current bug in the most minimal way. The asserts should be assert_status as well.
David > Thank you, > Dmitry > > On 11.08.2015 9:40, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8029453 >> >> Webrev: http://cr.openjdk.java.net/~dholmes/8029453/webrev/ >> >> The code introduced in 6900441 contained a bug in the code path for >> when WorkAroundNPTLTimedWaitHang was zero, and this was exposed by the >> change in 8130728 which made the default setting of >> WorkAroundNPTLTimedWaitHang zero. >> >> In PlatformParker on Linux _cur_index tracks which pthread_cond object >> is currently in use by a waiting thread (one for relative-timed waits, >> the other for absolute-timed waits) and is set to -1 when the thread >> is not waiting. In the path now used by default we release the >> pthread_mutex_t and then pthread_cond_signal the condition variable at >> _cond[_cur_index]. But as soon as we release the mutex the waiting >> thread can resume execution (it may have timed-out and so not need the >> signal) and set _cur_index to -1. The signalling thread then signals >> _cond[-1] which does not contain a pthread_cond_t object. This can >> result in the pthread_cond_signal hanging, and potentially other >> consequences. >> >> The fix is simple: save the correct index before unlocking the mutex. >> >> The test: >> java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java has >> been marked as failing intermittently (8133231) due to this and I will >> revert that as part of this fix, once that change reaches the hs-rt >> forest. >> >> Thanks, >> David > From daniel.daugherty at oracle.com Wed Aug 12 22:51:36 2015 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Wed, 12 Aug 2015 16:51:36 -0600 Subject: (S) RFR: 8029453: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java failed by timeout In-Reply-To: <55C998E6.6020609@oracle.com> References: <55C998E6.6020609@oracle.com> Message-ID: <55CBCDF8.8090800@oracle.com> On 8/11/15 12:40 AM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8029453 > > Webrev: http://cr.openjdk.java.net/~dholmes/8029453/webrev/ src/os/linux/vm/os_linux.cpp No comments. Thumbs up! Dan > > The code introduced in 6900441 contained a bug in the code path for > when WorkAroundNPTLTimedWaitHang was zero, and this was exposed by the > change in 8130728 which made the default setting of > WorkAroundNPTLTimedWaitHang zero. > > In PlatformParker on Linux _cur_index tracks which pthread_cond object > is currently in use by a waiting thread (one for relative-timed waits, > the other for absolute-timed waits) and is set to -1 when the thread > is not waiting. In the path now used by default we release the > pthread_mutex_t and then pthread_cond_signal the condition variable at > _cond[_cur_index]. But as soon as we release the mutex the waiting > thread can resume execution (it may have timed-out and so not need the > signal) and set _cur_index to -1. The signalling thread then signals > _cond[-1] which does not contain a pthread_cond_t object. This can > result in the pthread_cond_signal hanging, and potentially other > consequences. > > The fix is simple: save the correct index before unlocking the mutex. > > The test: > java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java has > been marked as failing intermittently (8133231) due to this and I will > revert that as part of this fix, once that change reaches the hs-rt > forest. 
> > Thanks, > David > From david.holmes at oracle.com Wed Aug 12 23:09:06 2015 From: david.holmes at oracle.com (David Holmes) Date: Thu, 13 Aug 2015 09:09:06 +1000 Subject: (S) RFR: 8029453: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java failed by timeout In-Reply-To: <55CBCDF8.8090800@oracle.com> References: <55C998E6.6020609@oracle.com> <55CBCDF8.8090800@oracle.com> Message-ID: <55CBD212.80909@oracle.com> Thanks Dan! David On 13/08/2015 8:51 AM, Daniel D. Daugherty wrote: > On 8/11/15 12:40 AM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8029453 >> >> Webrev: http://cr.openjdk.java.net/~dholmes/8029453/webrev/ > > src/os/linux/vm/os_linux.cpp > No comments. > > Thumbs up! > > Dan > > >> >> The code introduced in 6900441 contained a bug in the code path for >> when WorkAroundNPTLTimedWaitHang was zero, and this was exposed by the >> change in 8130728 which made the default setting of >> WorkAroundNPTLTimedWaitHang zero. >> >> In PlatformParker on Linux _cur_index tracks which pthread_cond object >> is currently in use by a waiting thread (one for relative-timed waits, >> the other for absolute-timed waits) and is set to -1 when the thread >> is not waiting. In the path now used by default we release the >> pthread_mutex_t and then pthread_cond_signal the condition variable at >> _cond[_cur_index]. But as soon as we release the mutex the waiting >> thread can resume execution (it may have timed-out and so not need the >> signal) and set _cur_index to -1. The signalling thread then signals >> _cond[-1] which does not contain a pthread_cond_t object. This can >> result in the pthread_cond_signal hanging, and potentially other >> consequences. >> >> The fix is simple: save the correct index before unlocking the mutex. 
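For readers following the fix described above, here is a minimal sketch of the pattern, assuming a simplified stand-in for PlatformParker built on std::mutex and std::condition_variable rather than the pthread calls in os_linux.cpp. The names ParkerSketch, mtx, cond, cur_index and counter are hypothetical and chosen only for illustration; this is not the HotSpot code.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for PlatformParker; not the HotSpot class.
struct ParkerSketch {
  std::mutex mtx;
  std::condition_variable cond[2];  // [0] relative-timed waits, [1] absolute-timed waits
  int cur_index = -1;               // -1 means "no thread is currently waiting"
  int counter = 0;                  // the permit

  // Returns the index that was signalled, or -1 if no signal was sent.
  int unpark() {
    std::unique_lock<std::mutex> lock(mtx);
    int previous = counter;
    counter = 1;
    // Capture the index while the mutex is still held. Once the mutex is
    // released, the waiter may wake (for example on timeout) and reset
    // cur_index to -1, so re-reading it after unlock would be a race.
    int index = cur_index;
    lock.unlock();

    int signalled = -1;
    if (previous < 1 && index != -1) {
      cond[index].notify_one();     // uses the saved copy, never cond[cur_index]
      signalled = index;
    }
    return signalled;
  }
};
```

The point of the sketch is only the ordering: the read of cur_index happens before the unlock, and the signal uses the saved copy.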
>> >> The test: >> java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java has >> been marked as failing intermittently (8133231) due to this and I will >> revert that as part of this fix, once that change reaches the hs-rt >> forest. >> >> Thanks, >> David >> > From dmitry.dmitriev at oracle.com Thu Aug 13 07:55:52 2015 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Thu, 13 Aug 2015 10:55:52 +0300 Subject: RFR: 8132725: Memory leak in Arguments::add_property function Message-ID: <55CC4D88.2030601@oracle.com> Hello, Please review this fix, which removes a memory leak in the Arguments::add_property function. Also, I need a sponsor for this fix who can push it. The Arguments::add_property function allocates memory for the key and the value. The key and value are then passed to the PropertyList_unique_add function, which uses the SystemProperty class to add or update the property value. The SystemProperty class maintains its own copy of the key and value and thus copies the key and value passed to it. Therefore the key and value must be freed in the add_property function (with the exception of the value in the case of the "java.vendor.url.bug" and "sun.java.command" properties). In this fix I allocate memory only for the key, and only when the passed property contains a value. If the passed property does not contain a value, I do not allocate memory for the key and use the passed property string instead. The value is also extracted from the passed property string instead of being allocated. To accomplish that I changed the declaration of "value" in several functions from "char *" to "const char *", since the value is not modified in these functions (PropertyList_* functions, SystemProperty class methods). Processing of the "java.vendor.url.bug" and "sun.java.command" properties is also corrected. Now, when these properties are redefined, the code checks whether memory was allocated for the special variables of these properties (by checking that they do not contain the default value) and frees it.
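The ownership rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions: g_props and unique_add are hypothetical stand-ins for the property list and PropertyList_unique_add, plain malloc/free replaces HotSpot's own allocation routines, and the special handling of "java.vendor.url.bug" and "sun.java.command" is not modelled.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <map>
#include <string>

// Hypothetical stand-in for the property list. Like SystemProperty as
// described above, it keeps its own copies of key and value.
static std::map<std::string, std::string> g_props;

static void unique_add(const char* key, const char* value) {
  g_props[key] = value;  // std::string copies both arguments
}

// Accepts "key=value" or just "key".
static void add_property_sketch(const char* prop) {
  const char* eq = std::strchr(prop, '=');
  if (eq == nullptr) {
    // No value: use the passed string as the key and allocate nothing.
    unique_add(prop, "");
    return;
  }
  // Only the key needs a temporary NUL-terminated copy.
  std::size_t key_len = static_cast<std::size_t>(eq - prop);
  char* key = static_cast<char*>(std::malloc(key_len + 1));
  if (key == nullptr) return;
  std::memcpy(key, prop, key_len);
  key[key_len] = '\0';

  // The value is a pointer into the original string, so it is passed
  // as const char* and never allocated or freed here.
  unique_add(key, eq + 1);

  // The store made its own copy, so the temporary key is freed here.
  std::free(key);
}
```

Because the callee copies what it is given, the caller owns exactly one temporary allocation and releases it on the same path that created it.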
Webrev: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.00/ JBS: https://bugs.openjdk.java.net/browse/JDK-8132725 Tested: JPRT(hotspot test set), hotspot all, vm.quick Thanks, Dmitry From tom.benson at oracle.com Thu Aug 13 19:32:44 2015 From: tom.benson at oracle.com (Tom Benson) Date: Thu, 13 Aug 2015 15:32:44 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <1439377259.2324.27.camel@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <1439377259.2324.27.camel@oracle.com> Message-ID: <55CCF0DC.2000800@oracle.com> Hi Thomas, On 8/12/2015 7:00 AM, Thomas Schatzl wrote: > Hi, > > On Tue, 2015-08-11 at 13:43 -0400, Tom Benson wrote: >> Hi, >> On 8/7/2015 10:56 AM, Tom Benson wrote: >> After some discussion, I've changed the definition and name of >> free_archive_regions. Now called dealloc_archive_regions, it uncommits >> the specified regions, unmapping the memory, rather than adding them to >> the free list. This means the CDS code will no longer do the unmapping >> on verification failures. >> >> Updated full and incremental webrevs of the GC code are at: >> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ >> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ >> >> Tested with JPRT and running benchmarks with the dealloc_ performed >> explicitly. Jiangli also tested the original failing cases, and will be >> posting an updated webrev. 
> - is it possible that shrink_by() uses shrink_at()? This would avoid two > paths that uncommit regions like expand_by()/expand_at()? OK, I made the change. I didn't do it originally because the asserts I wanted to add for the call from g1CollectedHeap seemed superfluous for the other call, and shrink_at was so small. Now shrink_at takes a region count as well. Updated full and incremental webrevs are at: http://cr.openjdk.java.net/~tbenson/8131734/webrev.03/ http://cr.openjdk.java.net/~tbenson/8131734/webrev.03.vs.02/ > > - I think the change should call at least HeapRegion::hr_clear() on the > region to remove or reset any auxiliary data structures, if not > G1CollectedHeap::free_region() (without adding the region to the free > list). > Since the HeapRegion* is not deallocated by the uncommit, this may cause > strange behavior later when the region is reused. I don't think calling hr_clear should be necessary... If it is, we should be doing it in shrink_by as well, and I don't think we are. I don't see how a HeapRegion can be 'reused' without having gone through the constructor when expand_ asks (indirectly) for 'new HeapRegion', and that does an hr_clear() as well as the rest of init. Or am I missing something there? Thanks, Tom > > Other than that it looks okay. > > Thanks, > Thomas > > From daniel.daugherty at oracle.com Fri Aug 14 20:53:37 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 14 Aug 2015 14:53:37 -0600 Subject: RFR XXS 8133537: clarify position of unlock options in error messages Message-ID: <55CE5551.8050408@oracle.com> Greetings, I have a very small code review request to clarify the wording used when the following options are specified in the wrong place: -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions Even though this is a trivial change on the surface, we will not be following the HotSpot Trivial Change Rules. This means I need two reviewers and one must be a (R)eviewer. 
See the bug link for examples of the new output. 8133537: clarify position of unlock options in error messages https://bugs.openjdk.java.net/browse/JDK-8133537 Webrev URL: http://cr.openjdk.java.net/~dcubed/8133537-webrev/0-jdk9-hs-rt/ Testing: JPRT -testset hotspot is in process Aurora Adhoc Runtime-SVC Nightly testing (will be submitted next) (sanity check to make sure new error message line does not break any tests) Thanks, in advance, for any comments, questions or suggestions. Dan From mikhailo.seledtsov at oracle.com Fri Aug 14 21:22:36 2015 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Fri, 14 Aug 2015 14:22:36 -0700 Subject: RFR(S): JDK-8133180 - [TESTBUG] runtime/SharedArchiveFile/SharedStrings.java failed with WhiteBox.class : no such file or directory Message-ID: <55CE5C1C.7050002@oracle.com> Please review this fix to the CDS test bug. See the comments in the bug for details. JBS: https://bugs.openjdk.java.net/browse/JDK-8133180 Webrev: http://cr.openjdk.java.net/~mseledtsov/8133180.00/ Testing: - ran the reproducer discussed in the bug description rm -Rf JT* test jtreg /media/data3/hg/9/work01/hs-rt/hotspot/test/testlibrary_tests/ctw/JarDirTest.java jtreg /media/data3/hg/9/work01/hs-rt/hotspot/test/runtime/SharedArchiveFile/SharedStrings.java - ran the CDS tests in concurrent mode - running CDS tests via multi-platform build-and-test system (in progress) Thank you, Misha From coleen.phillimore at oracle.com Fri Aug 14 22:43:03 2015 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Fri, 14 Aug 2015 18:43:03 -0400 Subject: RFR XXS 8133537: clarify position of unlock options in error messages In-Reply-To: <55CE5551.8050408@oracle.com> References: <55CE5551.8050408@oracle.com> Message-ID: <55CE6EF6.3090705@oracle.com> This looks good, pending test results (there may be tests with the old error message as you say below). Thanks, Coleen On 8/14/15 4:53 PM, Daniel D. 
Daugherty wrote: > Greetings, > > I have a very small code review request to clarify the wording used > when the following options are specified in the wrong place: > > -XX:+UnlockDiagnosticVMOptions > -XX:+UnlockExperimentalVMOptions > > Even though this is a trivial change on the surface, we will not be > following the HotSpot Trivial Change Rules. This means I need two > reviewers and one must be a (R)eviewer. > > See the bug link for examples of the new output. > > 8133537: clarify position of unlock options in error messages > https://bugs.openjdk.java.net/browse/JDK-8133537 > > Webrev URL: > http://cr.openjdk.java.net/~dcubed/8133537-webrev/0-jdk9-hs-rt/ > > Testing: JPRT -testset hotspot is in process > Aurora Adhoc Runtime-SVC Nightly testing (will be submitted > next) > (sanity check to make sure new error message line > does not break any tests) > > Thanks, in advance, for any comments, questions or suggestions. > > Dan From daniel.daugherty at oracle.com Fri Aug 14 23:01:50 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 14 Aug 2015 17:01:50 -0600 Subject: RFR XXS 8133537: clarify position of unlock options in error messages In-Reply-To: <55CE6EF6.3090705@oracle.com> References: <55CE5551.8050408@oracle.com> <55CE6EF6.3090705@oracle.com> Message-ID: <55CE735E.2010802@oracle.com> Thanks for the fast review! I'm watching the Aurora Adhoc results roll in as I type... Dan On 8/14/15 4:43 PM, Coleen Phillimore wrote: > > This looks good, pending test results (there may be tests with the old > error message as you say below). > Thanks, > Coleen > > On 8/14/15 4:53 PM, Daniel D. Daugherty wrote: >> Greetings, >> >> I have a very small code review request to clarify the wording used >> when the following options are specified in the wrong place: >> >> -XX:+UnlockDiagnosticVMOptions >> -XX:+UnlockExperimentalVMOptions >> >> Even though this is a trivial change on the surface, we will not be >> following the HotSpot Trivial Change Rules. 
This means I need two >> reviewers and one must be a (R)eviewer. >> >> See the bug link for examples of the new output. >> >> 8133537: clarify position of unlock options in error messages >> https://bugs.openjdk.java.net/browse/JDK-8133537 >> >> Webrev URL: >> http://cr.openjdk.java.net/~dcubed/8133537-webrev/0-jdk9-hs-rt/ >> >> Testing: JPRT -testset hotspot is in process >> Aurora Adhoc Runtime-SVC Nightly testing (will be submitted >> next) >> (sanity check to make sure new error message line >> does not break any tests) >> >> Thanks, in advance, for any comments, questions or suggestions. >> >> Dan > From jiangli.zhou at oracle.com Sat Aug 15 00:43:23 2015 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 14 Aug 2015 17:43:23 -0700 Subject: RFR(S): JDK-8133180 - [TESTBUG] runtime/SharedArchiveFile/SharedStrings.java failed with WhiteBox.class : no such file or directory In-Reply-To: <55CE5C1C.7050002@oracle.com> References: <55CE5C1C.7050002@oracle.com> Message-ID: <402AD339-8F17-44CB-9B91-36C27DE6437F@oracle.com> Hi Misha, I have one suggestion. Instead of searching either the "work" or "classes" directory depending on the "classesInWorkDir" argument, how about searching both directories? So the BasicJarBuilder.build() method would use the "classes" directory first, then try the "work" directory if the file cannot be found in "classes". That might be a more robust solution. Thanks, Jiangli On Aug 14, 2015, at 2:22 PM, Mikhailo Seledtsov wrote: > Please review this fix to the CDS test bug. See the comments in the bug for details.
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8133180 > Webrev: http://cr.openjdk.java.net/~mseledtsov/8133180.00/ > Testing: > - ran the reproducer discussed in the bug description > rm -Rf JT* test > jtreg /media/data3/hg/9/work01/hs-rt/hotspot/test/testlibrary_tests/ctw/JarDirTest.java > jtreg /media/data3/hg/9/work01/hs-rt/hotspot/test/runtime/SharedArchiveFile/SharedStrings.java > > - ran the CDS tests in concurrent mode > - running CDS tests via multi-platform build-and-test system > (in progress) > > Thank you, > Misha From david.holmes at oracle.com Mon Aug 17 02:09:51 2015 From: david.holmes at oracle.com (David Holmes) Date: Mon, 17 Aug 2015 12:09:51 +1000 Subject: RFR XXS 8133537: clarify position of unlock options in error messages In-Reply-To: <55CE5551.8050408@oracle.com> References: <55CE5551.8050408@oracle.com> Message-ID: <55D1426F.5010601@oracle.com> Hi Dan, As much as it pains me to do this to you do we really need two lines instead of just changing eg: must be enabled via -XX:+UnlockDiagnosticVMOptions to must be enabled by preceding it with -XX:+UnlockDiagnosticVMOptions ? Thanks, David On 15/08/2015 6:53 AM, Daniel D. Daugherty wrote: > Greetings, > > I have a very small code review request to clarify the wording used > when the following options are specified in the wrong place: > > -XX:+UnlockDiagnosticVMOptions > -XX:+UnlockExperimentalVMOptions > > Even though this is a trivial change on the surface, we will not be > following the HotSpot Trivial Change Rules. This means I need two > reviewers and one must be a (R)eviewer. > > See the bug link for examples of the new output. 
> > 8133537: clarify position of unlock options in error messages > https://bugs.openjdk.java.net/browse/JDK-8133537 > > Webrev URL: http://cr.openjdk.java.net/~dcubed/8133537-webrev/0-jdk9-hs-rt/ > > Testing: JPRT -testset hotspot is in process > Aurora Adhoc Runtime-SVC Nightly testing (will be submitted next) > (sanity check to make sure new error message line > does not break any tests) > > Thanks, in advance, for any comments, questions or suggestions. > > Dan From daniel.daugherty at oracle.com Mon Aug 17 13:08:07 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 17 Aug 2015 07:08:07 -0600 Subject: RFR XXS 8133537: clarify position of unlock options in error messages In-Reply-To: <55D1426F.5010601@oracle.com> References: <55CE5551.8050408@oracle.com> <55D1426F.5010601@oracle.com> Message-ID: <55D1DCB7.1040603@oracle.com> On 8/16/15 8:09 PM, David Holmes wrote: > Hi Dan, > > As much as it pains me to do this to you do we really need two lines > instead of just changing eg: > > must be enabled via -XX:+UnlockDiagnosticVMOptions > > to > > must be enabled by preceding it with -XX:+UnlockDiagnosticVMOptions Thought about doing it this way, but I didn't want to risk running into a test that was looking for the specific existing error message. As it was, I was worried about a 'golden file' style of test, but (so far) my testing hasn't shown that I've run into that particular style of landmine... Also, I think having the separate line makes the requirement stand out more. Can I convince you to move forward with the wording change as it is right now? Dan > > ? > > Thanks, > David > > On 15/08/2015 6:53 AM, Daniel D. 
Daugherty wrote: >> Greetings, >> >> I have a very small code review request to clarify the wording used >> when the following options are specified in the wrong place: >> >> -XX:+UnlockDiagnosticVMOptions >> -XX:+UnlockExperimentalVMOptions >> >> Even though this is a trivial change on the surface, we will not be >> following the HotSpot Trivial Change Rules. This means I need two >> reviewers and one must be a (R)eviewer. >> >> See the bug link for examples of the new output. >> >> 8133537: clarify position of unlock options in error messages >> https://bugs.openjdk.java.net/browse/JDK-8133537 >> >> Webrev URL: >> http://cr.openjdk.java.net/~dcubed/8133537-webrev/0-jdk9-hs-rt/ >> >> Testing: JPRT -testset hotspot is in process >> Aurora Adhoc Runtime-SVC Nightly testing (will be submitted >> next) >> (sanity check to make sure new error message line >> does not break any tests) >> >> Thanks, in advance, for any comments, questions or suggestions. >> >> Dan From david.holmes at oracle.com Mon Aug 17 20:59:03 2015 From: david.holmes at oracle.com (David Holmes) Date: Tue, 18 Aug 2015 06:59:03 +1000 Subject: RFR XXS 8133537: clarify position of unlock options in error messages In-Reply-To: <55D1DCB7.1040603@oracle.com> References: <55CE5551.8050408@oracle.com> <55D1426F.5010601@oracle.com> <55D1DCB7.1040603@oracle.com> Message-ID: <55D24B17.2070808@oracle.com> On 17/08/2015 11:08 PM, Daniel D. Daugherty wrote: > On 8/16/15 8:09 PM, David Holmes wrote: >> Hi Dan, >> >> As much as it pains me to do this to you do we really need two lines >> instead of just changing eg: >> >> must be enabled via -XX:+UnlockDiagnosticVMOptions >> >> to >> >> must be enabled by preceding it with -XX:+UnlockDiagnosticVMOptions > > Thought about doing it this way, but I didn't want to risk > running into a test that was looking for the specific existing > error message. 
As it was, I was worried about a 'golden file' > style of test, but (so far) my testing hasn't shown that > I've run into that particular style of landmine... Yes there is a risk with any change in output. > Also, I think having the separate line makes the requirement > stand out more. Yes but in a detrimental way in my opinion. "Gee if they had to document twice that you put the flag first then there must be a lot of people getting it wrong, so obviously there's a usability issue there." > Can I convince you to move forward with the > wording change as it is right now? Given this whole thing has consumed way too much time anyway - yes. Thanks, David > Dan > > >> >> ? >> >> Thanks, >> David >> >> On 15/08/2015 6:53 AM, Daniel D. Daugherty wrote: >>> Greetings, >>> >>> I have a very small code review request to clarify the wording used >>> when the following options are specified in the wrong place: >>> >>> -XX:+UnlockDiagnosticVMOptions >>> -XX:+UnlockExperimentalVMOptions >>> >>> Even though this is a trivial change on the surface, we will not be >>> following the HotSpot Trivial Change Rules. This means I need two >>> reviewers and one must be a (R)eviewer. >>> >>> See the bug link for examples of the new output. >>> >>> 8133537: clarify position of unlock options in error messages >>> https://bugs.openjdk.java.net/browse/JDK-8133537 >>> >>> Webrev URL: >>> http://cr.openjdk.java.net/~dcubed/8133537-webrev/0-jdk9-hs-rt/ >>> >>> Testing: JPRT -testset hotspot is in process >>> Aurora Adhoc Runtime-SVC Nightly testing (will be submitted >>> next) >>> (sanity check to make sure new error message line >>> does not break any tests) >>> >>> Thanks, in advance, for any comments, questions or suggestions. 
>>> >>> Dan > From mikhailo.seledtsov at oracle.com Tue Aug 18 01:18:58 2015 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Mon, 17 Aug 2015 18:18:58 -0700 Subject: RFR(S): JDK-8133180 - [TESTBUG] runtime/SharedArchiveFile/SharedStrings.java failed with WhiteBox.class : no such file or directory In-Reply-To: <402AD339-8F17-44CB-9B91-36C27DE6437F@oracle.com> References: <55CE5C1C.7050002@oracle.com> <402AD339-8F17-44CB-9B91-36C27DE6437F@oracle.com> Message-ID: <55D28802.4070306@oracle.com> Hi Jiangli, Thank you for your suggestion; it should make testing more robust. I will try your suggestion; if I do not see any undesired side effects I will rerun full testset and re-submit the updated review. Thank you, Misha On 8/14/2015 5:43 PM, Jiangli Zhou wrote: > Hi Misha, > > I have one suggestion. Instead of searching either the "work" or "classes" directory depending on the "classesInWorkDir" argument, how about searching both directories? So the BasicJarBuilder.build() method would use the "classes" directory first, then try the "work" directory if the file cannot be found in "classes". That might be a more robust solution. > > Thanks, > Jiangli > > On Aug 14, 2015, at 2:22 PM, Mikhailo Seledtsov wrote: > >> Please review this fix to the CDS test bug. See the comments in the bug for details. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8133180 >> Webrev: http://cr.openjdk.java.net/~mseledtsov/8133180.00/ >> Testing: >> - ran the reproducer discussed in the bug description >> rm -Rf JT* test >> jtreg /media/data3/hg/9/work01/hs-rt/hotspot/test/testlibrary_tests/ctw/JarDirTest.java >> jtreg /media/data3/hg/9/work01/hs-rt/hotspot/test/runtime/SharedArchiveFile/SharedStrings.java >> >> - ran the CDS tests in concurrent mode >> - running CDS tests via multi-platform build-and-test system >> (in progress) >> >> Thank you, >> Misha From daniel.daugherty at oracle.com Tue Aug 18 16:30:05 2015 From: daniel.daugherty at oracle.com (Daniel D.
Daugherty) Date: Tue, 18 Aug 2015 10:30:05 -0600 Subject: RFR XXS 8133537: clarify position of unlock options in error messages In-Reply-To: <55D24B17.2070808@oracle.com> References: <55CE5551.8050408@oracle.com> <55D1426F.5010601@oracle.com> <55D1DCB7.1040603@oracle.com> <55D24B17.2070808@oracle.com> Message-ID: <55D35D8D.6030704@oracle.com> On 8/17/15 2:59 PM, David Holmes wrote: > On 17/08/2015 11:08 PM, Daniel D. Daugherty wrote: >> On 8/16/15 8:09 PM, David Holmes wrote: >>> Hi Dan, >>> >>> As much as it pains me to do this to you do we really need two lines >>> instead of just changing eg: >>> >>> must be enabled via -XX:+UnlockDiagnosticVMOptions >>> >>> to >>> >>> must be enabled by preceding it with -XX:+UnlockDiagnosticVMOptions >> >> Thought about doing it this way, but I didn't want to risk >> running into a test that was looking for the specific existing >> error message. As it was, I was worried about a 'golden file' >> style of test, but (so far) my testing hasn't shown that >> I've run into that particular style of landmine... > > Yes there is a risk with any change in output. > >> Also, I think having the separate line makes the requirement >> stand out more. > > Yes but in a detrimental way in my opinion. "Gee if they had to > document twice that you put the flag first then there must be a lot of > people getting it wrong, so obviously there's a usability issue there." I can't think of anything to say here that's not been said before so I'm just gonna move on. > >> Can I convince you to move forward with the >> wording change as it is right now? > > Given this whole thing has consumed way too much time anyway - yes. Thanks. I have to crawl through the test results and then I'll get this one out of my hair. Dan > > Thanks, > David > >> Dan >> >> >>> >>> ? >>> >>> Thanks, >>> David >>> >>> On 15/08/2015 6:53 AM, Daniel D. 
Daugherty wrote: >>>> Greetings, >>>> >>>> I have a very small code review request to clarify the wording used >>>> when the following options are specified in the wrong place: >>>> >>>> -XX:+UnlockDiagnosticVMOptions >>>> -XX:+UnlockExperimentalVMOptions >>>> >>>> Even though this is a trivial change on the surface, we will not be >>>> following the HotSpot Trivial Change Rules. This means I need two >>>> reviewers and one must be a (R)eviewer. >>>> >>>> See the bug link for examples of the new output. >>>> >>>> 8133537: clarify position of unlock options in error messages >>>> https://bugs.openjdk.java.net/browse/JDK-8133537 >>>> >>>> Webrev URL: >>>> http://cr.openjdk.java.net/~dcubed/8133537-webrev/0-jdk9-hs-rt/ >>>> >>>> Testing: JPRT -testset hotspot is in process >>>> Aurora Adhoc Runtime-SVC Nightly testing (will be submitted >>>> next) >>>> (sanity check to make sure new error message line >>>> does not break any tests) >>>> >>>> Thanks, in advance, for any comments, questions or suggestions. >>>> >>>> Dan >> From daniel.daugherty at oracle.com Wed Aug 19 18:00:57 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 19 Aug 2015 12:00:57 -0600 Subject: RFR XXS 8133537: clarify position of unlock options in error messages In-Reply-To: <55D35D8D.6030704@oracle.com> References: <55CE5551.8050408@oracle.com> <55D1426F.5010601@oracle.com> <55D1DCB7.1040603@oracle.com> <55D24B17.2070808@oracle.com> <55D35D8D.6030704@oracle.com> Message-ID: <55D4C459.6030407@oracle.com> Finished crawling through the Aurora Adhoc "Runtime-SVC Nightly tests" results and didn't find any regressions due to this change. Will be committing and pushing this change shortly. Thanks again to Coleen and David H for the reviews. Dan On 8/18/15 10:30 AM, Daniel D. Daugherty wrote: > On 8/17/15 2:59 PM, David Holmes wrote: >> On 17/08/2015 11:08 PM, Daniel D. 
Daugherty wrote: >>> On 8/16/15 8:09 PM, David Holmes wrote: >>>> Hi Dan, >>>> >>>> As much as it pains me to do this to you do we really need two lines >>>> instead of just changing eg: >>>> >>>> must be enabled via -XX:+UnlockDiagnosticVMOptions >>>> >>>> to >>>> >>>> must be enabled by preceding it with -XX:+UnlockDiagnosticVMOptions >>> >>> Thought about doing it this way, but I didn't want to risk >>> running into a test that was looking for the specific existing >>> error message. As it was, I was worried about a 'golden file' >>> style of test, but (so far) my testing hasn't shown that >>> I've run into that particular style of landmine... >> >> Yes there is a risk with any change in output. >> >>> Also, I think having the separate line makes the requirement >>> stand out more. >> >> Yes but in a detrimental way in my opinion. "Gee if they had to >> document twice that you put the flag first then there must be a lot >> of people getting it wrong, so obviously there's a usability issue >> there." > > I can't think of anything to say here that's not been said > before so I'm just gonna move on. > > >> >>> Can I convince you to move forward with the >>> wording change as it is right now? >> >> Given this whole thing has consumed way too much time anyway - yes. > > Thanks. I have to crawl through the test results and then I'll > get this one out of my hair. > > Dan > > >> >> Thanks, >> David >> >>> Dan >>> >>> >>>> >>>> ? >>>> >>>> Thanks, >>>> David >>>> >>>> On 15/08/2015 6:53 AM, Daniel D. Daugherty wrote: >>>>> Greetings, >>>>> >>>>> I have a very small code review request to clarify the wording used >>>>> when the following options are specified in the wrong place: >>>>> >>>>> -XX:+UnlockDiagnosticVMOptions >>>>> -XX:+UnlockExperimentalVMOptions >>>>> >>>>> Even though this is a trivial change on the surface, we will not be >>>>> following the HotSpot Trivial Change Rules. This means I need two >>>>> reviewers and one must be a (R)eviewer. 
>>>>> >>>>> See the bug link for examples of the new output. >>>>> >>>>> 8133537: clarify position of unlock options in error messages >>>>> https://bugs.openjdk.java.net/browse/JDK-8133537 >>>>> >>>>> Webrev URL: >>>>> http://cr.openjdk.java.net/~dcubed/8133537-webrev/0-jdk9-hs-rt/ >>>>> >>>>> Testing: JPRT -testset hotspot is in process >>>>> Aurora Adhoc Runtime-SVC Nightly testing (will be submitted >>>>> next) >>>>> (sanity check to make sure new error message line >>>>> does not break any tests) >>>>> >>>>> Thanks, in advance, for any comments, questions or suggestions. >>>>> >>>>> Dan >>> > > > From vladimir.kozlov at oracle.com Wed Aug 19 23:26:40 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 19 Aug 2015 16:26:40 -0700 Subject: RFR(XS) 8133984) print_compressed_class_space() is only defined in 64-bit VM Message-ID: <55D510B0.7030903@oracle.com> http://cr.openjdk.java.net/~kvn/8133984/webrev/ The call to Metaspace::print_compressed_class_space(st) in vmError.cpp is not guarded by #ifdef _LP64. So some C++ compilers complain. I will push it into hs-comp. Thanks, Vladimir From coleen.phillimore at oracle.com Wed Aug 19 23:48:28 2015 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 19 Aug 2015 19:48:28 -0400 Subject: RFR(XS) 8133984) print_compressed_class_space() is only defined in 64-bit VM In-Reply-To: <55D510B0.7030903@oracle.com> References: <55D510B0.7030903@oracle.com> Message-ID: <55D515CC.1070004@oracle.com> Can you do: LP64_ONLY(static void print_compressed_class_space(outputStream* st, const char* requested_addr = 0);) instead so that we verify that it's not called unless under LP64? thanks and sorry, I think I caused this bug. Coleen On 8/19/15 7:26 PM, Vladimir Kozlov wrote: > http://cr.openjdk.java.net/~kvn/8133984/webrev/ > > The call to Metaspace::print_compressed_class_space(st) in vmError.cpp > is not guarded by #ifdef _LP64. So some C++ compilers complain. > > I will push it into hs-comp.
> > Thanks, > Vladimir From vladimir.kozlov at oracle.com Thu Aug 20 00:23:43 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 19 Aug 2015 17:23:43 -0700 Subject: RFR(XS) 8133984) print_compressed_class_space() is only defined in 64-bit VM In-Reply-To: <55D515CC.1070004@oracle.com> References: <55D510B0.7030903@oracle.com> <55D515CC.1070004@oracle.com> Message-ID: <55D51E0F.40607@oracle.com> But then print_compressed_class_space will not be defined in 32 bit VM and we will not be able to compile vmError.cpp. Vladimir On 8/19/15 4:48 PM, Coleen Phillimore wrote: > > Can you do: > > LP64_ONLY(static void print_compressed_class_space(outputStream* st, > const char* requested_addr = 0);) > > instead so that we verify that it's not called unless under LP64? > > thanks and sorry, I think I caused this bug. > > Coleen > > > On 8/19/15 7:26 PM, Vladimir Kozlov wrote: >> http://cr.openjdk.java.net/~kvn/8133984/webrev/ >> >> The call to Metaspace::print_compressed_class_space(st) in vmError.cpp >> is not guarded by #ifdef _LP64. So some C++ compilers complain. >> >> I will push it into hs-comp. >> >> Thanks, >> Vladimir > From coleen.phillimore at oracle.com Thu Aug 20 00:50:34 2015 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 19 Aug 2015 20:50:34 -0400 Subject: RFR(XS) 8133984) print_compressed_class_space() is only defined in 64-bit VM In-Reply-To: <55D51E0F.40607@oracle.com> References: <55D510B0.7030903@oracle.com> <55D515CC.1070004@oracle.com> <55D51E0F.40607@oracle.com> Message-ID: <55D5245A.5060305@oracle.com> Ok. That code should have been under ifdef _LP64. Your change is fine. Coleen On 8/19/15 8:23 PM, Vladimir Kozlov wrote: > But then print_compressed_class_space will not be defined in 32 bit VM > and we will not be able to compile vmError.cpp.
> > Vladimir > > On 8/19/15 4:48 PM, Coleen Phillimore wrote: >> >> Can you do: >> >> LP64_ONLY(static void print_compressed_class_space(outputStream* st, >> const char* requested_addr = 0);) >> >> instead so that we verify that it's not called unless under LP64? >> >> thanks and sorry, I think I caused this bug. >> >> Coleen >> >> >> On 8/19/15 7:26 PM, Vladimir Kozlov wrote: >>> http://cr.openjdk.java.net/~kvn/8133984/webrev/ >>> >>> The call to Metaspace::print_compressed_class_space(st) in vmError.cpp >>> is not guarded by #ifdef _LP64. So some C++ compilers complain. >>> >>> I will push it into hs-comp. >>> >>> Thanks, >>> Vladimir >> From vladimir.kozlov at oracle.com Thu Aug 20 00:56:52 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 19 Aug 2015 17:56:52 -0700 Subject: RFR(XS) 8133984) print_compressed_class_space() is only defined in 64-bit VM In-Reply-To: <55D5245A.5060305@oracle.com> References: <55D510B0.7030903@oracle.com> <55D515CC.1070004@oracle.com> <55D51E0F.40607@oracle.com> <55D5245A.5060305@oracle.com> Message-ID: <55D525D4.8090500@oracle.com> Thanks, Coleen Vladimir On 8/19/15 5:50 PM, Coleen Phillimore wrote: > > Ok. That code should have been under ifdef _LP64. Your change is fine. > Coleen > > > On 8/19/15 8:23 PM, Vladimir Kozlov wrote: >> But then print_compressed_class_space will not be defined in 32 bit VM >> and we will not be able to compile vmError.cpp. >> >> Vladimir >> >> On 8/19/15 4:48 PM, Coleen Phillimore wrote: >>> >>> Can you do: >>> >>> LP64_ONLY(static void print_compressed_class_space(outputStream* st, >>> const char* requested_addr = 0);) >>> >>> instead so that we verify that it's not called unless under LP64? >>> >>> thanks and sorry, I think I caused this bug.
>>> >>> Coleen >>> >>> >>> On 8/19/15 7:26 PM, Vladimir Kozlov wrote: >>>> http://cr.openjdk.java.net/~kvn/8133984/webrev/ >>>> >>>> The call to Metaspace::print_compressed_class_space(st) in vmError.cpp >>>> is not guarded by #ifdef _LP64. So some C++ compilers complain. >>>> >>>> I will push it into hs-comp. >>>> >>>> Thanks, >>>> Vladimir >>> > From ioi.lam at oracle.com Thu Aug 20 03:15:34 2015 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 19 Aug 2015 20:15:34 -0700 Subject: RFR(XS) 8133984) print_compressed_class_space() is only defined in 64-bit VM In-Reply-To: <55D5245A.5060305@oracle.com> References: <55D510B0.7030903@oracle.com> <55D515CC.1070004@oracle.com> <55D51E0F.40607@oracle.com> <55D5245A.5060305@oracle.com> Message-ID: <5F23083C-D391-4677-8B93-61C020FAE3D7@oracle.com> Looks good to me too. Thanks for fixing it. Ioi > On Aug 19, 2015, at 5:50 PM, Coleen Phillimore wrote: > > > Ok. That code should have been under ifdef _LP64. Your change is fine. > Coleen > > >> On 8/19/15 8:23 PM, Vladimir Kozlov wrote: >> But then print_compressed_class_space will not be defined in 32 bit VM and we will not be able to compile vmError.cpp. >> >> Vladimir >> >>> On 8/19/15 4:48 PM, Coleen Phillimore wrote: >>> >>> Can you do: >>> >>> LP64_ONLY(static void print_compressed_class_space(outputStream* st, >>> const char* requested_addr = 0);) >>> >>> instead so that we verify that it's not called unless under LP64? >>> >>> thanks and sorry, I think I caused this bug. >>> >>> Coleen >>> >>> >>>> On 8/19/15 7:26 PM, Vladimir Kozlov wrote: >>>> http://cr.openjdk.java.net/~kvn/8133984/webrev/ >>>> >>>> The call to Metaspace::print_compressed_class_space(st) in vmError.cpp >>>> is not guarded by #ifdef _LP64. So some C++ compilers complain. >>>> >>>> I will push it into hs-comp.
>>>> >>>> Thanks, >>>> Vladimir > From thomas.schatzl at oracle.com Thu Aug 20 14:06:12 2015 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 20 Aug 2015 16:06:12 +0200 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55CCF0DC.2000800@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <1439377259.2324.27.camel@oracle.com> <55CCF0DC.2000800@oracle.com> Message-ID: <1440079572.2347.11.camel@oracle.com> Hi Tom, sorry for the delay... On Thu, 2015-08-13 at 15:32 -0400, Tom Benson wrote: > Hi Thomas, > > On 8/12/2015 7:00 AM, Thomas Schatzl wrote: > > Hi, > > > > On Tue, 2015-08-11 at 13:43 -0400, Tom Benson wrote: > >> Hi, > >> On 8/7/2015 10:56 AM, Tom Benson wrote: > >> After some discussion, I've changed the definition and name of > >> free_archive_regions. Now called dealloc_archive_regions, it uncommits > >> the specified regions, unmapping the memory, rather than adding them to > >> the free list. This means the CDS code will no longer do the unmapping > >> on verification failures. > >> > >> Updated full and incremental webrevs of the GC code are at: > >> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ > >> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ > >> > >> Tested with JPRT and running benchmarks with the dealloc_ performed > >> explicitly. Jiangli also tested the original failing cases, and will be > >> posting an updated webrev. 
> > - is it possible that shrink_by() uses shrink_at()? This would avoid two > > paths that uncommit regions like expand_by()/expand_at()? > > OK, I made the change. I didn't do it originally because the asserts I > wanted to add for the call from g1CollectedHeap seemed superfluous for > the other call, and shrink_at was so small. Now shrink_at takes a > region count as well. > > Updated full and incremental webrevs are at: > http://cr.openjdk.java.net/~tbenson/8131734/webrev.03/ > http://cr.openjdk.java.net/~tbenson/8131734/webrev.03.vs.02/ > > Looks good. > > > > - I think the change should call at least HeapRegion::hr_clear() on the > > region to remove or reset any auxiliary data structures, if not > > G1CollectedHeap::free_region() (without adding the region to the free > > list). > > Since the HeapRegion* is not deallocated by the uncommit, this may cause > > strange behavior later when the region is reused. > > I don't think calling hr_clear should be necessary... If it is, we > should be doing it in shrink_by as well, and I don't think we are. I > don't see how a HeapRegion can be 'reused' without having gone through > the constructor when expand_ asks (indirectly) for 'new HeapRegion', and > that does an hr_clear() as well as the rest of init. Or am I missing > something there? Leave it as is. I thought that a full gc (which is the only case where the heap shrinks at the moment) also clears the remset of these regions at least. It should, I filed JDK-8134048 for looking in this issue. Looks good. 
Thanks, Thomas From tom.benson at oracle.com Thu Aug 20 14:12:24 2015 From: tom.benson at oracle.com (Tom Benson) Date: Thu, 20 Aug 2015 10:12:24 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <1440079572.2347.11.camel@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <1439377259.2324.27.camel@oracle.com> <55CCF0DC.2000800@oracle.com> <1440079572.2347.11.camel@oracle.com> Message-ID: <55D5E048.5070404@oracle.com> Hi Thomas, OK, thanks! Tom On 8/20/2015 10:06 AM, Thomas Schatzl wrote: > Hi Tom, > > sorry for the delay... > > On Thu, 2015-08-13 at 15:32 -0400, Tom Benson wrote: >> Hi Thomas, >> >> On 8/12/2015 7:00 AM, Thomas Schatzl wrote: >>> Hi, >>> >>> On Tue, 2015-08-11 at 13:43 -0400, Tom Benson wrote: >>>> Hi, >>>> On 8/7/2015 10:56 AM, Tom Benson wrote: >>>> After some discussion, I've changed the definition and name of >>>> free_archive_regions. Now called dealloc_archive_regions, it uncommits >>>> the specified regions, unmapping the memory, rather than adding them to >>>> the free list. This means the CDS code will no longer do the unmapping >>>> on verification failures. >>>> >>>> Updated full and incremental webrevs of the GC code are at: >>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ >>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ >>>> >>>> Tested with JPRT and running benchmarks with the dealloc_ performed >>>> explicitly. 
Jiangli also tested the original failing cases, and will be >>>> posting an updated webrev. >>> - is it possible that shrink_by() uses shrink_at()? This would avoid two >>> paths that uncommit regions like expand_by()/expand_at()? >> OK, I made the change. I didn't do it originally because the asserts I >> wanted to add for the call from g1CollectedHeap seemed superfluous for >> the other call, and shrink_at was so small. Now shrink_at takes a >> region count as well. >> > >> Updated full and incremental webrevs are at: >> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03/ >> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03.vs.02/ >> >> > Looks good. > >>> - I think the change should call at least HeapRegion::hr_clear() on the >>> region to remove or reset any auxiliary data structures, if not >>> G1CollectedHeap::free_region() (without adding the region to the free >>> list). >>> Since the HeapRegion* is not deallocated by the uncommit, this may cause >>> strange behavior later when the region is reused. >> I don't think calling hr_clear should be necessary... If it is, we >> should be doing it in shrink_by as well, and I don't think we are. I >> don't see how a HeapRegion can be 'reused' without having gone through >> the constructor when expand_ asks (indirectly) for 'new HeapRegion', and >> that does an hr_clear() as well as the rest of init. Or am I missing >> something there? > Leave it as is. I thought that a full gc (which is the only case where > the heap shrinks at the moment) also clears the remset of these regions > at least. > > It should, I filed JDK-8134048 for looking in this issue. > > Looks good. 
> > Thanks, > Thomas > > From james.laskey at oracle.com Thu Aug 20 15:39:12 2015 From: james.laskey at oracle.com (Jim Laskey (Oracle)) Date: Thu, 20 Aug 2015 12:39:12 -0300 Subject: RFR: Hotspot jimage API Message-ID: <435C5C1C-ACAB-498B-8114-C19B404A69FB@oracle.com> This is a description of changes precipitated from https://bugs.openjdk.java.net/browse/JDK-8087181 https://wiki.se.oracle.com/display/JPG/Hotspot+jimage+API Cheers, -- Jim From james.laskey at oracle.com Thu Aug 20 17:16:23 2015 From: james.laskey at oracle.com (Jim Laskey (Oracle)) Date: Thu, 20 Aug 2015 14:16:23 -0300 Subject: RFR: Hotspot jimage API In-Reply-To: <435C5C1C-ACAB-498B-8114-C19B404A69FB@oracle.com> References: <435C5C1C-ACAB-498B-8114-C19B404A69FB@oracle.com> Message-ID: <1838AD23-5934-4358-8163-553F0CC43407@oracle.com> External link is here http://cr.openjdk.java.net/~jlaskey/jake/HotSpotJImageAPI.pdf > On Aug 20, 2015, at 12:39 PM, Jim Laskey (Oracle) wrote: > > This is a description of changes precipitated from https://bugs.openjdk.java.net/browse/JDK-8087181 > > https://wiki.se.oracle.com/display/JPG/Hotspot+jimage+API > > Cheers, > > --
Jim From jiangli.zhou at oracle.com Thu Aug 20 19:55:21 2015 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 20 Aug 2015 12:55:21 -0700 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55D5E048.5070404@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <1439377259.2324.27.camel@oracle.com> <55CCF0DC.2000800@oracle.com> <1440079572.2347.11.camel@oracle.com> <55D5E048.5070404@oracle.com> Message-ID: <8566C008-5A60-4376-94C4-968E1A475769@oracle.com> Hi Dmitry, Here is the updated runtime webrev that reflects Tom's latest GC changes. http://cr.openjdk.java.net/~jiangli/8131734/webrev.01/ I renamed the FileMapInfo::unmap_string_regions() to FileMapInfo::dealloc_string_regions(), which only deallocates the archived string region from the java heap without unmapping. The unmapping is handled by the GC system as the archived string region is part of the java heap. I also added dealloc_string_regions() call to the case where the string region verification fails. Thanks, Jiangli On Aug 20, 2015, at 7:12 AM, Tom Benson wrote: > Hi Thomas, > OK, thanks! > Tom > > On 8/20/2015 10:06 AM, Thomas Schatzl wrote: >> Hi Tom, >> >> sorry for the delay...
>> >> On Thu, 2015-08-13 at 15:32 -0400, Tom Benson wrote: >>> Hi Thomas, >>> >>> On 8/12/2015 7:00 AM, Thomas Schatzl wrote: >>>> Hi, >>>> >>>> On Tue, 2015-08-11 at 13:43 -0400, Tom Benson wrote: >>>>> Hi, >>>>> On 8/7/2015 10:56 AM, Tom Benson wrote: >>>>> After some discussion, I've changed the definition and name of >>>>> free_archive_regions. Now called dealloc_archive_regions, it uncommits >>>>> the specified regions, unmapping the memory, rather than adding them to >>>>> the free list. This means the CDS code will no longer do the unmapping >>>>> on verification failures. >>>>> >>>>> Updated full and incremental webrevs of the GC code are at: >>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ >>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ >>>>> >>>>> Tested with JPRT and running benchmarks with the dealloc_ performed >>>>> explicitly. Jiangli also tested the original failing cases, and will be >>>>> posting an updated webrev. >>>> - is it possible that shrink_by() uses shrink_at()? This would avoid two >>>> paths that uncommit regions like expand_by()/expand_at()? >>> OK, I made the change. I didn't do it originally because the asserts I >>> wanted to add for the call from g1CollectedHeap seemed superfluous for >>> the other call, and shrink_at was so small. Now shrink_at takes a >>> region count as well. >>> >> >>> Updated full and incremental webrevs are at: >>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03/ >>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03.vs.02/ >>> >>> >> Looks good. >> >>>> - I think the change should call at least HeapRegion::hr_clear() on the >>>> region to remove or reset any auxiliary data structures, if not >>>> G1CollectedHeap::free_region() (without adding the region to the free >>>> list). >>>> Since the HeapRegion* is not deallocated by the uncommit, this may cause >>>> strange behavior later when the region is reused. >>> I don't think calling hr_clear should be necessary... 
If it is, we >>> should be doing it in shrink_by as well, and I don't think we are. I >>> don't see how a HeapRegion can be 'reused' without having gone through >>> the constructor when expand_ asks (indirectly) for 'new HeapRegion', and >>> that does an hr_clear() as well as the rest of init. Or am I missing >>> something there? >> Leave it as is. I thought that a full gc (which is the only case where >> the heap shrinks at the moment) also clears the remset of these regions >> at least. >> >> It should, I filed JDK-8134048 for looking in this issue. >> >> Looks good. >> >> Thanks, >> Thomas >> >> > From james.laskey at oracle.com Fri Aug 21 11:50:50 2015 From: james.laskey at oracle.com (Jim Laskey (Oracle)) Date: Fri, 21 Aug 2015 08:50:50 -0300 Subject: RFR: JDK-8080511 - Refresh of jimage support In-Reply-To: References: Message-ID: <2D754821-84AE-453D-8626-8693F9943CEC@oracle.com> The API is still a work in progress. Stay tuned. > On Aug 21, 2015, at 4:37 AM, deven you wrote: > > Hi Jim, > > I have one question. I see Hotspot already supports decompressing compressed resources, and there is a method newCompressedResource in jdk/src/java.base/share/classes/jdk/internal/jimage/ResourcePool.java > for creating a compressed resource, but I did not find any API that uses this method, nor any compressed resource in bootmodules.jimage. > > What I want to know is: 1. If I want to compress one resource in a certain module, what are the steps? I assume I need to write some code which first gets the plugin and compressed buffer and then passes them to newCompressedResource? If there are compressed zip or jar files in a certain module, how does the relevant code deal with this condition? > 2. Is there any plan for bootmodules.jimage or other jimage files to contain such compressed resources? > > Thanks a lot!
> > 2015-06-18 8:08 GMT+08:00 Jim Laskey (Oracle): > https://bugs.openjdk.java.net/browse/JDK-8080511 > > This is a long-overdue refresh of the jimage support in the JDK9-dev repo. This includes native support for reading jimage files, improved jrt-fs (java runtime file system) support for retrieving modules and packages from the runtime, and improved performance for langtools in the presence of jrt-fs. > > http://cr.openjdk.java.net/~jlaskey/hs-rt-jimage/webrev-top > > http://cr.openjdk.java.net/~jlaskey/hs-rt-jimage/webrev-jdk > > http://cr.openjdk.java.net/~jlaskey/hs-rt-jimage/webrev-hotspot > > http://cr.openjdk.java.net/~jlaskey/hs-rt-jimage/webrev-langtools > > > > Details: > > - jrt-fs provides access, via the nio FileSystem API, to the classes in a .jimage file, organized by module or by package. > - Shared code for jimage support converted to native. Currently residing in hotspot, but will migrate to its own jdk library https://bugs.openjdk.java.net/browse/JDK-8087181 > > - A new archive abstraction for class/resource sources. > - java based implementation layer for jimage reading to allow backport to JDK8 (jrt-fs.jar - IDE support.) > - JNI support for jimage into hotspot. > - White box tests written to exercise native jimage support.
> > From dmitry.dmitriev at oracle.com Fri Aug 21 15:25:27 2015 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Fri, 21 Aug 2015 18:25:27 +0300 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <8566C008-5A60-4376-94C4-968E1A475769@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <1439377259.2324.27.camel@oracle.com> <55CCF0DC.2000800@oracle.com> <1440079572.2347.11.camel@oracle.com> <55D5E048.5070404@oracle.com> <8566C008-5A60-4376-94C4-968E1A475769@oracle.com> Message-ID: <55D742E7.9080208@oracle.com> Hello Jiangli, This looks good to me, but I'm not a reviewer. Also, I have a question for you. Perhaps you can clarify one point for me. num_ranges and string_ranges are modified only in code under "#if INCLUDE_ALL_GCS", so does it make sense to put all uses of these variables under "#if INCLUDE_ALL_GCS" as well? I mean the following functions: FileMapInfo::fixup_string_regions() and the new FileMapInfo::dealloc_string_regions() function. Thank you, Dmitry On 20.08.2015 22:55, Jiangli Zhou wrote: > Hi Dmitry, > > Here is the updated runtime webrev that reflects Tom's latest GC changes. > > http://cr.openjdk.java.net/~jiangli/8131734/webrev.01/ > > I renamed the FileMapInfo::unmap_string_regions() to FileMapInfo::dealloc_string_regions(), which only deallocates the archived string region from the java heap without unmapping.
The unmapping is handled by the GC system as the archived string region is part of the java heap. I also added dealloc_string_regions() call to the case where the string region verification fails. > > Thanks, > Jiangli > > On Aug 20, 2015, at 7:12 AM, Tom Benson wrote: > >> Hi Thomas, >> OK, thanks! >> Tom >> >> On 8/20/2015 10:06 AM, Thomas Schatzl wrote: >>> Hi Tom, >>> >>> sorry for the delay... >>> >>> On Thu, 2015-08-13 at 15:32 -0400, Tom Benson wrote: >>>> Hi Thomas, >>>> >>>> On 8/12/2015 7:00 AM, Thomas Schatzl wrote: >>>>> Hi, >>>>> >>>>> On Tue, 2015-08-11 at 13:43 -0400, Tom Benson wrote: >>>>>> Hi, >>>>>> On 8/7/2015 10:56 AM, Tom Benson wrote: >>>>>> After some discussion, I've changed the definition and name of >>>>>> free_archive_regions. Now called dealloc_archive_regions, it uncommits >>>>>> the specified regions, unmapping the memory, rather than adding them to >>>>>> the free list. This means the CDS code will no longer do the unmapping >>>>>> on verification failures. >>>>>> >>>>>> Updated full and incremental webrevs of the GC code are at: >>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ >>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ >>>>>> >>>>>> Tested with JPRT and running benchmarks with the dealloc_ performed >>>>>> explicitly. Jiangli also tested the original failing cases, and will be >>>>>> posting an updated webrev. >>>>> - is it possible that shrink_by() uses shrink_at()? This would avoid two >>>>> paths that uncommit regions like expand_by()/expand_at()? >>>> OK, I made the change. I didn't do it originally because the asserts I >>>> wanted to add for the call from g1CollectedHeap seemed superfluous for >>>> the other call, and shrink_at was so small. Now shrink_at takes a >>>> region count as well. 
>>>> >>>> Updated full and incremental webrevs are at: >>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03/ >>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03.vs.02/ >>>> >>>> >>> Looks good. >>> >>>>> - I think the change should call at least HeapRegion::hr_clear() on the >>>>> region to remove or reset any auxiliary data structures, if not >>>>> G1CollectedHeap::free_region() (without adding the region to the free >>>>> list). >>>>> Since the HeapRegion* is not deallocated by the uncommit, this may cause >>>>> strange behavior later when the region is reused. >>>> I don't think calling hr_clear should be necessary... If it is, we >>>> should be doing it in shrink_by as well, and I don't think we are. I >>>> don't see how a HeapRegion can be 'reused' without having gone through >>>> the constructor when expand_ asks (indirectly) for 'new HeapRegion', and >>>> that does an hr_clear() as well as the rest of init. Or am I missing >>>> something there? >>> Leave it as is. I thought that a full gc (which is the only case where >>> the heap shrinks at the moment) also clears the remset of these regions >>> at least. >>> >>> It should, I filed JDK-8134048 for looking in this issue. >>> >>> Looks good. >>> >>> Thanks, >>> Thomas >>> >>> From rachel.protacio at oracle.com Fri Aug 21 15:41:22 2015 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Fri, 21 Aug 2015 11:41:22 -0400 Subject: RFR: 8133561: linux thread id should be reported in decimal in the error reports now In-Reply-To: <55D7458A.5070100@oracle.com> References: <55D7458A.5070100@oracle.com> Message-ID: <55D746A2.1070808@oracle.com> Hello, everyone! I've just started with the Hotspot Runtime team - please take a look at this change. ---- Summary: Linux thread id error reports changed back to decimal Bug: https://bugs.openjdk.java.net/browse/JDK-8133561 Webrev: http://cr.openjdk.java.net/~coleenp/8133561/ Testing: I visually verified the result with error logs and Show MessageBoxOnError. 
Passed jtreg hotspot/test/runtime and RBT "quick" tests. Thank you! Rachel From harold.seigel at oracle.com Fri Aug 21 16:37:18 2015 From: harold.seigel at oracle.com (harold seigel) Date: Fri, 21 Aug 2015 12:37:18 -0400 Subject: RFR: 8133561: linux thread id should be reported in decimal in the error reports now In-Reply-To: <55D746A2.1070808@oracle.com> References: <55D7458A.5070100@oracle.com> <55D746A2.1070808@oracle.com> Message-ID: <55D753BE.8080501@oracle.com> Hi Rachel, Your change looks good. Thanks for doing it. Harold On 8/21/2015 11:41 AM, Rachel Protacio wrote: > Hello, everyone! I've just started with the Hotspot Runtime team - > please take a look at this change. > ---- > Summary: Linux thread id error reports changed back to decimal > > Bug: https://bugs.openjdk.java.net/browse/JDK-8133561 > Webrev: http://cr.openjdk.java.net/~coleenp/8133561/ > > Testing: I visually verified the result with error logs and Show > MessageBoxOnError. Passed jtreg hotspot/test/runtime and RBT "quick" > tests. > > Thank you! > Rachel From christian.tornqvist at oracle.com Fri Aug 21 18:12:09 2015 From: christian.tornqvist at oracle.com (Christian Tornqvist) Date: Fri, 21 Aug 2015 14:12:09 -0400 Subject: RFR: 8133561: linux thread id should be reported in decimal in the error reports now In-Reply-To: <55D746A2.1070808@oracle.com> References: <55D7458A.5070100@oracle.com> <55D746A2.1070808@oracle.com> Message-ID: <0EF14457-C513-4A20-A13C-9AD2CC08325E@oracle.com> Hi Rachel, This looks good, thanks for fixing this. Thanks, Christian > On Aug 21, 2015, at 11:41 AM, Rachel Protacio wrote: > > Hello, everyone! I've just started with the Hotspot Runtime team - please take a look at this change. > ---- > Summary: Linux thread id error reports changed back to decimal > > Bug: https://bugs.openjdk.java.net/browse/JDK-8133561 > Webrev: http://cr.openjdk.java.net/~coleenp/8133561/ > > Testing: I visually verified the result with error logs and Show MessageBoxOnError. 
Passed jtreg hotspot/test/runtime and RBT "quick" tests. > > Thank you! > Rachel From rachel.protacio at oracle.com Fri Aug 21 19:10:29 2015 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Fri, 21 Aug 2015 15:10:29 -0400 Subject: RFR: 8133561: linux thread id should be reported in decimal in the error reports now In-Reply-To: <55D753BE.8080501@oracle.com> References: <55D7458A.5070100@oracle.com> <55D746A2.1070808@oracle.com> <55D753BE.8080501@oracle.com> Message-ID: <55D777A5.7060204@oracle.com> Thanks, Harold! On 8/21/2015 12:37 PM, harold seigel wrote: > Hi Rachel, > > Your change looks good. Thanks for doing it. > > Harold > > On 8/21/2015 11:41 AM, Rachel Protacio wrote: >> Hello, everyone! I've just started with the Hotspot Runtime team - >> please take a look at this change. >> ---- >> Summary: Linux thread id error reports changed back to decimal >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8133561 >> Webrev: http://cr.openjdk.java.net/~coleenp/8133561/ >> >> Testing: I visually verified the result with error logs and Show >> MessageBoxOnError. Passed jtreg hotspot/test/runtime and RBT "quick" >> tests. >> >> Thank you! >> Rachel > From rachel.protacio at oracle.com Fri Aug 21 19:11:56 2015 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Fri, 21 Aug 2015 15:11:56 -0400 Subject: RFR: 8133561: linux thread id should be reported in decimal in the error reports now In-Reply-To: <0EF14457-C513-4A20-A13C-9AD2CC08325E@oracle.com> References: <55D7458A.5070100@oracle.com> <55D746A2.1070808@oracle.com> <0EF14457-C513-4A20-A13C-9AD2CC08325E@oracle.com> Message-ID: <55D777FC.6000201@oracle.com> Thanks, Christian! Glad it will be useful. On 8/21/2015 2:12 PM, Christian Tornqvist wrote: > Hi Rachel, > > This looks good, thanks for fixing this. > > Thanks, > Christian > > > >> On Aug 21, 2015, at 11:41 AM, Rachel Protacio wrote: >> >> Hello, everyone! I've just started with the Hotspot Runtime team - please take a look at this change. 
>> ---- >> Summary: Linux thread id error reports changed back to decimal >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8133561 >> Webrev: http://cr.openjdk.java.net/~coleenp/8133561/ >> >> Testing: I visually verified the result with error logs and Show MessageBoxOnError. Passed jtreg hotspot/test/runtime and RBT "quick" tests. >> >> Thank you! >> Rachel From jiangli.zhou at oracle.com Sat Aug 22 01:47:28 2015 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 21 Aug 2015 18:47:28 -0700 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55D742E7.9080208@oracle.com> References: <55C103A4.1060505@oracle.com> <1438781844.2378.60.camel@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <1439377259.2324.27.camel@oracle.com> <55CCF0DC.2000800@oracle.com> <1440079572.2347.11.camel@oracle.com> <55D5E048.5070404@oracle.com> <8566C008-5A60-4376-94C4-968E1A475769@oracle.com> <55D742E7.9080208@oracle.com> Message-ID: Hi Dmitry, On Aug 21, 2015, at 8:25 AM, Dmitry Dmitriev wrote: > Hello Jiangli, > > This looks good to me, but I'm not a reviewer. Thanks. > > Also, I have question to you. Probably you can clarify me one moment. > num_ranges and string_ranges are modified only in code under "#if INCLUDE_ALL_GCS", so it make sense to include all usage of these variables also under "#if INCLUDE_ALL_GCS"? I mean following functions: > FileMapInfo::fixup_string_regions() and new FileMapInfo::dealloc_string_regions() function. Agreed.
Here is the updated webrev: http://cr.openjdk.java.net/~jiangli/8131734/webrev.02/src/share/vm/memory/filemap.cpp.sdiff.html. I also reverified JPRT builds with the new #ifdef changes. Thanks for the detailed review! Jiangli > > Thank you, > Dmitry > > On 20.08.2015 22:55, Jiangli Zhou wrote: >> Hi Dmitry, >> >> Here is the updated runtime webrev that reflects Tom's latest GC changes. >> >> http://cr.openjdk.java.net/~jiangli/8131734/webrev.01/ >> >> I renamed the FileMapInfo::unmap_string_regions() to FileMapInfo::dealloc_string_regions(), which only deallocates the archived string region from the java heap without unmapping. The unmapping is handled by the GC system as the archived string region is part of the java heap. I also added dealloc_string_regions() call to the case where the string region verification fails. >> >> Thanks, >> Jiangli >> >> On Aug 20, 2015, at 7:12 AM, Tom Benson wrote: >> >>> Hi Thomas, >>> OK, thanks! >>> Tom >>> >>> On 8/20/2015 10:06 AM, Thomas Schatzl wrote: >>>> Hi Tom, >>>> >>>> sorry for the delay... >>>> >>>> On Thu, 2015-08-13 at 15:32 -0400, Tom Benson wrote: >>>>> Hi Thomas, >>>>> >>>>> On 8/12/2015 7:00 AM, Thomas Schatzl wrote: >>>>>> Hi, >>>>>> >>>>>> On Tue, 2015-08-11 at 13:43 -0400, Tom Benson wrote: >>>>>>> Hi, >>>>>>> On 8/7/2015 10:56 AM, Tom Benson wrote: >>>>>>> After some discussion, I've changed the definition and name of >>>>>>> free_archive_regions. Now called dealloc_archive_regions, it uncommits >>>>>>> the specified regions, unmapping the memory, rather than adding them to >>>>>>> the free list. This means the CDS code will no longer do the unmapping >>>>>>> on verification failures. >>>>>>> >>>>>>> Updated full and incremental webrevs of the GC code are at: >>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ >>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ >>>>>>> >>>>>>> Tested with JPRT and running benchmarks with the dealloc_ performed >>>>>>> explicitly.
Jiangli also tested the original failing cases, and will be >>>>>>> posting an updated webrev. >>>>>> - is it possible that shrink_by() uses shrink_at()? This would avoid two >>>>>> paths that uncommit regions like expand_by()/expand_at()? >>>>> OK, I made the change. I didn't do it originally because the asserts I >>>>> wanted to add for the call from g1CollectedHeap seemed superfluous for >>>>> the other call, and shrink_at was so small. Now shrink_at takes a >>>>> region count as well. >>>>> >>>>> Updated full and incremental webrevs are at: >>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03/ >>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03.vs.02/ >>>>> >>>>> >>>> Looks good. >>>> >>>>>> - I think the change should call at least HeapRegion::hr_clear() on the >>>>>> region to remove or reset any auxiliary data structures, if not >>>>>> G1CollectedHeap::free_region() (without adding the region to the free >>>>>> list). >>>>>> Since the HeapRegion* is not deallocated by the uncommit, this may cause >>>>>> strange behavior later when the region is reused. >>>>> I don't think calling hr_clear should be necessary... If it is, we >>>>> should be doing it in shrink_by as well, and I don't think we are. I >>>>> don't see how a HeapRegion can be 'reused' without having gone through >>>>> the constructor when expand_ asks (indirectly) for 'new HeapRegion', and >>>>> that does an hr_clear() as well as the rest of init. Or am I missing >>>>> something there? >>>> Leave it as is. I thought that a full gc (which is the only case where >>>> the heap shrinks at the moment) also clears the remset of these regions >>>> at least. >>>> >>>> It should, I filed JDK-8134048 for looking in this issue. >>>> >>>> Looks good. 
>>>> >>>> Thanks, >>>> Thomas >>>> >>>> > From dmitry.dmitriev at oracle.com Sun Aug 23 20:51:21 2015 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Sun, 23 Aug 2015 23:51:21 +0300 Subject: RFR: 8132725: Memory leak in Arguments::add_property function In-Reply-To: <55CC4D88.2030601@oracle.com> References: <55CC4D88.2030601@oracle.com> Message-ID: <55DA3249.8030205@oracle.com> Hello, Can I please get review and sponsor for this fix? Thanks! Dmitry On 13.08.2015 10:55, Dmitry Dmitriev wrote: > Hello, > > Please review this fix which remove memory leak in > Arguments::add_property function. Also, I need a sponsor for this fix, > who can push it. > > Arguments::add_property function allocate memory for key and value. > Then key and values are passed to the PropertyList_unique_add function > which use SystemProperty class to add or update property value. > SystemProperty class maintains it's own copy of key and value and thus > copy passed key and value. Therefore key and value must be freed in > add_property function(with exception for value in case of > "java.vendor.url.bug" and "sun.java.command" properties). > > In this fix I allocate memory only for key when passed property > contains value. If passed property not contains value, then I not > allocate memory for key and use passed property string. Value also > extracted from passed property string instead of allocating. To > accomplish that I changed declaration of "value" in several functions > from "char *" to "const char *" since value is not modified in these > functions(PropertyList_* functions, SystemProperty class methods). > > Processing of "java.vendor.url.bug" and "sun.java.command" properties > also corrected. Now when these properties redefined, then code checks > if memory was allocated for special variables of these > properties(checking that not contains default value) and free it. 
> > Webrev: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.00/ > > JBS: https://bugs.openjdk.java.net/browse/JDK-8132725 > Tested: JPRT(hotspot test set), hotspot all, vm.quick > > Thanks, > Dmitry From dmitry.dmitriev at oracle.com Sun Aug 23 20:54:20 2015 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Sun, 23 Aug 2015 23:54:20 +0300 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: References: <55C103A4.1060505@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <1439377259.2324.27.camel@oracle.com> <55CCF0DC.2000800@oracle.com> <1440079572.2347.11.camel@oracle.com> <55D5E048.5070404@oracle.com> <8566C008-5A60-4376-94C4-968E1A475769@oracle.com> <55D742E7.9080208@oracle.com> Message-ID: <55DA32FC.8000405@oracle.com> Hi Jiangli, Looks good to me! Thank you, Dmitry On 22.08.2015 4:47, Jiangli Zhou wrote: > Hi Dmitry, > > On Aug 21, 2015, at 8:25 AM, Dmitry Dmitriev wrote: > >> Hello Jiangli, >> >> This looks good to me, but I'm not a reviewer. > Thanks. > >> Also, I have question to you. Probably you can clarify me one moment. >> num_ranges and string_ranges are modified only in code under "#if INCLUDE_ALL_GCS", so it make sense to include all usage of these variables also under "#if INCLUDE_ALL_GCS"? I mean following functions: >> FileMapInfo::fixup_string_regions() and new FileMapInfo::dealloc_string_regions() function. > Agreed. 
Here is the updated webrev: http://cr.openjdk.java.net/~jiangli/8131734/webrev.02/src/share/vm/memory/filemap.cpp.sdiff.html. > > I also reverified JPRT builds with the new #ifdef changes. > > Thanks for the detailed review! > > Jiangli > >> Thank you, >> Dmitry >> >> On 20.08.2015 22:55, Jiangli Zhou wrote: >>> Hi Dmitry, >>> >>> Here is the updated runtime webrev that reflects Tom's latest GC changes. >>> >>> http://cr.openjdk.java.net/~jiangli/8131734/webrev.01/ >>> >>> I renamed the FileMapInfo::unmap_string_regions() to FileMapInfo::dealloc_string_regions(), which only deallocates the archived string region from the java heap without unmapping. The unmapping is handled by the GC system as the archived string region is part of the java heap. I also added dealloc_string_regions() call to the case where the string region verification fails. >>> >>> Thanks, >>> Jiangli >>> >>> On Aug 20, 2015, at 7:12 AM, Tom Benson wrote: >>> >>>> Hi Thomas, >>>> OK, thanks! >>>> Tom >>>> >>>> On 8/20/2015 10:06 AM, Thomas Schatzl wrote: >>>>> Hi Tom, >>>>> >>>>> sorry for the delay... >>>>> >>>>> On Thu, 2015-08-13 at 15:32 -0400, Tom Benson wrote: >>>>>> Hi Thomas, >>>>>> >>>>>> On 8/12/2015 7:00 AM, Thomas Schatzl wrote: >>>>>>> Hi, >>>>>>> >>>>>>> On Tue, 2015-08-11 at 13:43 -0400, Tom Benson wrote: >>>>>>>> Hi, >>>>>>>> On 8/7/2015 10:56 AM, Tom Benson wrote: >>>>>>>> After some discussion, I've changed the definition and name of >>>>>>>> free_archive_regions. Now called dealloc_archive_regions, it uncommits >>>>>>>> the specified regions, unmapping the memory, rather than adding them to >>>>>>>> the free list. This means the CDS code will no longer do the unmapping >>>>>>>> on verification failures.
>>>>>>>> >>>>>>>> Updated full and incremental webrevs of the GC code are at: >>>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ >>>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ >>>>>>>> >>>>>>>> Tested with JPRT and running benchmarks with the dealloc_ performed >>>>>>>> explicitly. Jiangli also tested the original failing cases, and will be >>>>>>>> posting an updated webrev. >>>>>>> - is it possible that shrink_by() uses shrink_at()? This would avoid two >>>>>>> paths that uncommit regions like expand_by()/expand_at()? >>>>>> OK, I made the change. I didn't do it originally because the asserts I >>>>>> wanted to add for the call from g1CollectedHeap seemed superfluous for >>>>>> the other call, and shrink_at was so small. Now shrink_at takes a >>>>>> region count as well. >>>>>> >>>>>> Updated full and incremental webrevs are at: >>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03/ >>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03.vs.02/ >>>>>> >>>>>> >>>>> Looks good. >>>>> >>>>>>> - I think the change should call at least HeapRegion::hr_clear() on the >>>>>>> region to remove or reset any auxiliary data structures, if not >>>>>>> G1CollectedHeap::free_region() (without adding the region to the free >>>>>>> list). >>>>>>> Since the HeapRegion* is not deallocated by the uncommit, this may cause >>>>>>> strange behavior later when the region is reused. >>>>>> I don't think calling hr_clear should be necessary... If it is, we >>>>>> should be doing it in shrink_by as well, and I don't think we are. I >>>>>> don't see how a HeapRegion can be 'reused' without having gone through >>>>>> the constructor when expand_ asks (indirectly) for 'new HeapRegion', and >>>>>> that does an hr_clear() as well as the rest of init. Or am I missing >>>>>> something there? >>>>> Leave it as is. 
I thought that a full gc (which is the only case where >>>>> the heap shrinks at the moment) also clears the remset of these regions >>>>> at least. >>>>> >>>>> It should, I filed JDK-8134048 for looking in this issue. >>>>> >>>>> Looks good. >>>>> >>>>> Thanks, >>>>> Thomas >>>>> >>>>> From ioi.lam at oracle.com Sun Aug 23 23:13:05 2015 From: ioi.lam at oracle.com (Ioi Lam) Date: Sun, 23 Aug 2015 16:13:05 -0700 Subject: RFR: 8132725: Memory leak in Arguments::add_property function In-Reply-To: <55CC4D88.2030601@oracle.com> References: <55CC4D88.2030601@oracle.com> Message-ID: <55DA5381.9080004@oracle.com> Hi Dmitry, Is this change part of 8132725? 3904 jint code = set_aggressive_opts_flags(); 3905 if (code != JNI_OK) { 3906 return code; 3907 } 1041 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { >> also check (_java_vendor_url_bug != NULL) for sanity? Also, there's a lot of duplicated "if (eq != NULL) { FreeHeap((void *)key);}". Maybe these can be consolidated with a "goto"? I know lots of people hate goto, but it will make the cleanup less error-prone: bool Arguments::add_property(const char* prop) { .... bool status = false; .... char *_java_command_new = os::strdup(value, mtInternal); if (_java_command_new == NULL) { goto done; }else { if (_java_command != NULL) { os::free(_java_command); } _java_command = _java_command_new; } .... } // Create new property and add at the end of the list PropertyList_unique_add(&_system_properties, key, value); } status = true; done: if (key != prop) { // SystemProperty copy passed value, thus free previously allocated // memory FreeHeap((void *)key); } return status; } Also, using (key != prop) would make the code clearer than (eq != NULL). Thanks - Ioi On 8/13/15 12:55 AM, Dmitry Dmitriev wrote: > Hello, > > Please review this fix which remove memory leak in > Arguments::add_property function. Also, I need a sponsor for this fix, > who can push it. > > Arguments::add_property function allocate memory for key and value.
> Then key and values are passed to the PropertyList_unique_add function > which use SystemProperty class to add or update property value. > SystemProperty class maintains it's own copy of key and value and thus > copy passed key and value. Therefore key and value must be freed in > add_property function(with exception for value in case of > "java.vendor.url.bug" and "sun.java.command" properties). > > In this fix I allocate memory only for key when passed property > contains value. If passed property not contains value, then I not > allocate memory for key and use passed property string. Value also > extracted from passed property string instead of allocating. To > accomplish that I changed declaration of "value" in several functions > from "char *" to "const char *" since value is not modified in these > functions(PropertyList_* functions, SystemProperty class methods). > > Processing of "java.vendor.url.bug" and "sun.java.command" properties > also corrected. Now when these properties redefined, then code checks > if memory was allocated for special variables of these > properties(checking that not contains default value) and free it. > > Webrev: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.00/ > > JBS: https://bugs.openjdk.java.net/browse/JDK-8132725 > Tested: JPRT(hotspot test set), hotspot all, vm.quick > > Thanks, > Dmitry From Alan.Bateman at oracle.com Mon Aug 24 06:43:24 2015 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Mon, 24 Aug 2015 07:43:24 +0100 Subject: RFR: Hotspot jimage API In-Reply-To: <1838AD23-5934-4358-8163-553F0CC43407@oracle.com> References: <435C5C1C-ACAB-498B-8114-C19B404A69FB@oracle.com> <1838AD23-5934-4358-8163-553F0CC43407@oracle.com> Message-ID: <55DABD0C.7070800@oracle.com> On 20/08/2015 18:16, Jim Laskey (Oracle) wrote: > External link is here http://cr.openjdk.java.net/~jlaskey/jake/HotSpotJImageAPI.pdf > > This mostly good except for JIMAGE_PackageToModule which assumes a unique mapping. 
So I think this needs to be dropped or re-examined. -Alan From james.laskey at oracle.com Mon Aug 24 13:06:12 2015 From: james.laskey at oracle.com (Jim Laskey (Oracle)) Date: Mon, 24 Aug 2015 10:06:12 -0300 Subject: RFR: Hotspot jimage API In-Reply-To: <55DABD0C.7070800@oracle.com> References: <435C5C1C-ACAB-498B-8114-C19B404A69FB@oracle.com> <1838AD23-5934-4358-8163-553F0CC43407@oracle.com> <55DABD0C.7070800@oracle.com> Message-ID: I'll see if I can float that balloon today > On Aug 24, 2015, at 3:43 AM, Alan Bateman wrote: > > > > On 20/08/2015 18:16, Jim Laskey (Oracle) wrote: >> External link is here http://cr.openjdk.java.net/~jlaskey/jake/HotSpotJImageAPI.pdf >> >> > This mostly good except for JIMAGE_PackageToModule which assumes a unique mapping. So I think this needs to be dropped or re-examined. > > -Alan From dmitry.dmitriev at oracle.com Mon Aug 24 13:21:37 2015 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Mon, 24 Aug 2015 16:21:37 +0300 Subject: RFR: 8132725: Memory leak in Arguments::add_property function In-Reply-To: <55DA5381.9080004@oracle.com> References: <55CC4D88.2030601@oracle.com> <55DA5381.9080004@oracle.com> Message-ID: <55DB1A61.7020508@oracle.com> Hi Ioi, Thank you for comments! Please, see my answers inline. On 24.08.2015 2:13, Ioi Lam wrote: > Hi Dmitry, > > Is this change part of 8132725? > > 3904 jint code = set_aggressive_opts_flags(); > 3905 if (code != JNI_OK) { > 3906 return code; > 3907 } Yes, set_aggressive_opts_flags not check return value of add_property function, so I add check to the set_aggressive_opts_flags()(lines 1911-1913 in new arguments.cpp) and thus now it returns jint. > > > 1041 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { > > >> also check (_java_vendor_url_bug != NULL) for sanity? I think that this is unnecessary in this case, because _java_vendor_url_bug can not be NULL. _java_vendor_url_bug initialized to DEFAULT_VENDOR_URL_BUG and changed only in add_property function.
Before new value is assigned to _java_vendor_url_bug it's check for not NULL. Thus, I think that check (_java_vendor_url_bug != NULL) is unnecessary in this case. > > > Also, there's a lot of duplicated "if (eq != NULL) { FreeHeap((void > *)key);}". Maybe these can be consolidated with a "goto"? I know lots > of people haye goto but it will make the clean up less error prone: Thank you for this proposal. Since "goto" is not widely used in Hotspot code I decided to refactor current implementation to avoid duplication of "if (eq != NULL) { FreeHeap((void *)key);}". > > bool Arguments::add_property(const char* prop) { > .... > bool status = false; > .... > char *_java_command_new = os::strdup(value, mtInternal); > if (_java_command_new == NULL) { > goto done; > }else { > if (_java_command != NULL) { > os::free(_java_command); > } > _java_command = _java_command_new; > } > .... > } > // Create new property and add at the end of the list > PropertyList_unique_add(&_system_properties, key, value); > } > status = true; > > done: > if (key != prop) { > // SystemProperty copy passed value, thus free previously allocated > // memory > FreeHeap((void *)key); > } > return status; > } > > Also, using (key != prop) would make the code clearer than (eq != NULL). Fixed! webrev 01: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01/ webrev 01 vs 00: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01.vs.00/ Thank you, Dmitry > > Thanks > - Ioi > > On 8/13/15 12:55 AM, Dmitry Dmitriev wrote: >> Hello, >> >> Please review this fix which remove memory leak in >> Arguments::add_property function. Also, I need a sponsor for this >> fix, who can push it. >> >> Arguments::add_property function allocate memory for key and value. >> Then key and values are passed to the PropertyList_unique_add >> function which use SystemProperty class to add or update property >> value. SystemProperty class maintains it's own copy of key and value >> and thus copy passed key and value. 
Therefore key and value must be >> freed in add_property function(with exception for value in case of >> "java.vendor.url.bug" and "sun.java.command" properties). >> >> In this fix I allocate memory only for key when passed property >> contains value. If passed property not contains value, then I not >> allocate memory for key and use passed property string. Value also >> extracted from passed property string instead of allocating. To >> accomplish that I changed declaration of "value" in several functions >> from "char *" to "const char *" since value is not modified in these >> functions(PropertyList_* functions, SystemProperty class methods). >> >> Processing of "java.vendor.url.bug" and "sun.java.command" properties >> also corrected. Now when these properties redefined, then code checks >> if memory was allocated for special variables of these >> properties(checking that not contains default value) and free it. >> >> Webrev: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.00/ >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8132725 >> Tested: JPRT(hotspot test set), hotspot all, vm.quick >> >> Thanks, >> Dmitry > From coleen.phillimore at oracle.com Mon Aug 24 15:18:50 2015 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 24 Aug 2015 11:18:50 -0400 Subject: RFR: 8133561: linux thread id should be reported in decimal in the error reports now In-Reply-To: <55D746A2.1070808@oracle.com> References: <55D7458A.5070100@oracle.com> <55D746A2.1070808@oracle.com> Message-ID: <55DB35DA.8050909@oracle.com> Hi Rachel, Welcome to the group! This is a good change. I verified that for the other platforms, decimal output makes more sense. Thanks, Coleen On 8/21/15 11:41 AM, Rachel Protacio wrote: > Hello, everyone! I've just started with the Hotspot Runtime team - > please take a look at this change. 
> ---- > Summary: Linux thread id error reports changed back to decimal > > Bug: https://bugs.openjdk.java.net/browse/JDK-8133561 > Webrev: http://cr.openjdk.java.net/~coleenp/8133561/ > > Testing: I visually verified the result with error logs and Show > MessageBoxOnError. Passed jtreg hotspot/test/runtime and RBT "quick" > tests. > > Thank you! > Rachel From rachel.protacio at oracle.com Mon Aug 24 15:19:12 2015 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Mon, 24 Aug 2015 11:19:12 -0400 Subject: RFR: 8133561: linux thread id should be reported in decimal in the error reports now In-Reply-To: <55DB35DA.8050909@oracle.com> References: <55D7458A.5070100@oracle.com> <55D746A2.1070808@oracle.com> <55DB35DA.8050909@oracle.com> Message-ID: <55DB35F0.2070403@oracle.com> Great. Thank you, Coleen! Rachel On 8/24/2015 11:18 AM, Coleen Phillimore wrote: > > Hi Rachel, > > Welcome to the group! This is a good change. I verified that for the > other platforms, decimal output makes more sense. > > Thanks, > Coleen > > On 8/21/15 11:41 AM, Rachel Protacio wrote: >> Hello, everyone! I've just started with the Hotspot Runtime team - >> please take a look at this change. >> ---- >> Summary: Linux thread id error reports changed back to decimal >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8133561 >> Webrev: http://cr.openjdk.java.net/~coleenp/8133561/ >> >> Testing: I visually verified the result with error logs and Show >> MessageBoxOnError. Passed jtreg hotspot/test/runtime and RBT "quick" >> tests. >> >> Thank you! 
>> Rachel > From jiangli.zhou at oracle.com Mon Aug 24 16:47:56 2015 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Mon, 24 Aug 2015 09:47:56 -0700 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55DA32FC.8000405@oracle.com> References: <55C103A4.1060505@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <1439377259.2324.27.camel@oracle.com> <55CCF0DC.2000800@oracle.com> <1440079572.2347.11.camel@oracle.com> <55D5E048.5070404@oracle.com> <8566C008-5A60-4376-94C4-968E1A475769@oracle.com> <55D742E7.9080208@oracle.com> <55DA32FC.8000405@oracle.com> Message-ID: <0D4F4950-644A-4CBB-A4E4-F3EC16044A8C@oracle.com> Thanks, Dmitry! Jiangli On Aug 23, 2015, at 1:54 PM, Dmitry Dmitriev wrote: > Hi Jiangli, > > Looks good to me! > > Thank you, > Dmitry > > On 22.08.2015 4:47, Jiangli Zhou wrote: >> Hi Dmitry, >> >> On Aug 21, 2015, at 8:25 AM, Dmitry Dmitriev wrote: >> >>> Hello Jiangli, >>> >>> This looks good to me, but I'm not a reviewer. >> Thanks. >> >>> Also, I have question to you. Probably you can clarify me one moment. >>> num_ranges and string_ranges are modified only in code under "#if INCLUDE_ALL_GCS", so it make sense to include all usage of these variables also under "#if INCLUDE_ALL_GCS"? I mean following functions: >>> FileMapInfo::fixup_string_regions() and new FileMapInfo::dealloc_string_regions() function. >> Agreed.
Here is the updated webrev: http://cr.openjdk.java.net/~jiangli/8131734/webrev.02/src/share/vm/memory/filemap.cpp.sdiff.html. >> >> I also reverified JPRT builds with the new #ifdef changes. >> >> Thanks for the detailed review! >> >> Jiangli >> >>> Thank you, >>> Dmitry >>> >>> On 20.08.2015 22:55, Jiangli Zhou wrote: >>>> Hi Dmitry, >>>> >>>> Here is the updated runtime webrev that reflects Tom's latest GC changes. >>>> >>>> http://cr.openjdk.java.net/~jiangli/8131734/webrev.01/ >>>> >>>> I renamed the FileMapInfo::unmap_string_regions() to FileMapInfo::dealloc_string_regions(), which only deallocates the archived string region from the java heap without unmapping. The unmapping is handled by the GC system as the archived string region is part of the java heap. I also added dealloc_string_regions() call to the case where the string region verification fails. >>>> >>>> Thanks, >>>> Jiangli >>>> >>>> On Aug 20, 2015, at 7:12 AM, Tom Benson wrote: >>>> >>>>> Hi Thomas, >>>>> OK, thanks! >>>>> Tom >>>>> >>>>> On 8/20/2015 10:06 AM, Thomas Schatzl wrote: >>>>>> Hi Tom, >>>>>> >>>>>> sorry for the delay... >>>>>> >>>>>> On Thu, 2015-08-13 at 15:32 -0400, Tom Benson wrote: >>>>>>> Hi Thomas, >>>>>>> >>>>>>> On 8/12/2015 7:00 AM, Thomas Schatzl wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> On Tue, 2015-08-11 at 13:43 -0400, Tom Benson wrote: >>>>>>>>> Hi, >>>>>>>>> On 8/7/2015 10:56 AM, Tom Benson wrote: >>>>>>>>> After some discussion, I've changed the definition and name of >>>>>>>>> free_archive_regions. Now called dealloc_archive_regions, it uncommits >>>>>>>>> the specified regions, unmapping the memory, rather than adding them to >>>>>>>>> the free list. This means the CDS code will no longer do the unmapping >>>>>>>>> on verification failures.
>>>>>>>>> >>>>>>>>> Updated full and incremental webrevs of the GC code are at: >>>>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ >>>>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ >>>>>>>>> >>>>>>>>> Tested with JPRT and running benchmarks with the dealloc_ performed >>>>>>>>> explicitly. Jiangli also tested the original failing cases, and will be >>>>>>>>> posting an updated webrev. >>>>>>>> - is it possible that shrink_by() uses shrink_at()? This would avoid two >>>>>>>> paths that uncommit regions like expand_by()/expand_at()? >>>>>>> OK, I made the change. I didn't do it originally because the asserts I >>>>>>> wanted to add for the call from g1CollectedHeap seemed superfluous for >>>>>>> the other call, and shrink_at was so small. Now shrink_at takes a >>>>>>> region count as well. >>>>>>> >>>>>>> Updated full and incremental webrevs are at: >>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03/ >>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03.vs.02/ >>>>>>> >>>>>>> >>>>>> Looks good. >>>>>> >>>>>>>> - I think the change should call at least HeapRegion::hr_clear() on the >>>>>>>> region to remove or reset any auxiliary data structures, if not >>>>>>>> G1CollectedHeap::free_region() (without adding the region to the free >>>>>>>> list). >>>>>>>> Since the HeapRegion* is not deallocated by the uncommit, this may cause >>>>>>>> strange behavior later when the region is reused. >>>>>>> I don't think calling hr_clear should be necessary... If it is, we >>>>>>> should be doing it in shrink_by as well, and I don't think we are. I >>>>>>> don't see how a HeapRegion can be 'reused' without having gone through >>>>>>> the constructor when expand_ asks (indirectly) for 'new HeapRegion', and >>>>>>> that does an hr_clear() as well as the rest of init. Or am I missing >>>>>>> something there? >>>>>> Leave it as is. 
I thought that a full gc (which is the only case where >>>>>> the heap shrinks at the moment) also clears the remset of these regions >>>>>> at least. >>>>>> >>>>>> It should, I filed JDK-8134048 for looking in this issue. >>>>>> >>>>>> Looks good. >>>>>> >>>>>> Thanks, >>>>>> Thomas >>>>>> >>>>>> > From ioi.lam at oracle.com Mon Aug 24 17:42:50 2015 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 24 Aug 2015 10:42:50 -0700 Subject: RFR: 8132725: Memory leak in Arguments::add_property function In-Reply-To: <55DB1A61.7020508@oracle.com> References: <55CC4D88.2030601@oracle.com> <55DA5381.9080004@oracle.com> <55DB1A61.7020508@oracle.com> Message-ID: <55DB579A.9000903@oracle.com> Hi Dmitry, The new changes look good. For defensive programming, I would suggest adding an assert here: 1035 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { assert(_java_vendor_url_bug != NULL, "......"); 1036 os::free((void *)_java_vendor_url_bug); I can sponsor the change, but we still need a Reviewer for this change. Thanks - Ioi On 8/24/15 6:21 AM, Dmitry Dmitriev wrote: > Hi Ioi, > > Thank you for comments! Please, see my answers inline. > > On 24.08.2015 2:13, Ioi Lam wrote: >> Hi Dmitry, >> >> Is this change part of 8132725? >> >> 3904 jint code = set_aggressive_opts_flags(); >> 3905 if (code != JNI_OK) { >> 3906 return code; >> 3907 } > Yes, set_aggressive_opts_flags not check return value of add_property > function, so I add check to the set_aggressive_opts_flags()(lines > 1911-1913 in new arguments.cpp) and thus now it returns jint. > >> >> >> 1041 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { >> >> >> also check (_java_vendor_url_bug != NULL) for sanity? > I think that this is unnecessary in this case, because > _java_vendor_url_bug can not be NULL. _java_vendor_url_bug initialized > to DEFAULT_VENDOR_URL_BUG and changed only in add_property function. > Before new value is assigned to _java_vendor_url_bug it's check for > not NULL. 
Thus, I think that check (_java_vendor_url_bug != NULL) is > unnecessary in this case. > >> >> >> Also, there's a lot of duplicated "if (eq != NULL) { FreeHeap((void >> *)key);}". Maybe these can be consolidated with a "goto"? I know lots >> of people haye goto but it will make the clean up less error prone: > Thank you for this proposal. Since "goto" is not widely used in > Hotspot code I decided to refactor current implementation to avoid > duplication of "if (eq != NULL) { FreeHeap((void *)key);}". > >> >> bool Arguments::add_property(const char* prop) { >> .... >> bool status = false; >> .... >> char *_java_command_new = os::strdup(value, mtInternal); >> if (_java_command_new == NULL) { >> goto done; >> }else { >> if (_java_command != NULL) { >> os::free(_java_command); >> } >> _java_command = _java_command_new; >> } >> .... >> } >> // Create new property and add at the end of the list >> PropertyList_unique_add(&_system_properties, key, value); >> } >> status = true; >> >> done: >> if (key != prop) { >> // SystemProperty copy passed value, thus free previously allocated >> // memory >> FreeHeap((void *)key); >> } >> return status; >> } >> >> Also, using (key != prop) would make the code clearer than (eq != NULL). > Fixed! > > webrev 01: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01/ > > webrev 01 vs 00: > http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01.vs.00/ > > > Thank you, > Dmitry >> >> Thanks >> - Ioi >> >> On 8/13/15 12:55 AM, Dmitry Dmitriev wrote: >>> Hello, >>> >>> Please review this fix which remove memory leak in >>> Arguments::add_property function. Also, I need a sponsor for this >>> fix, who can push it. >>> >>> Arguments::add_property function allocate memory for key and value. >>> Then key and values are passed to the PropertyList_unique_add >>> function which use SystemProperty class to add or update property >>> value. SystemProperty class maintains it's own copy of key and value >>> and thus copy passed key and value. 
Therefore key and value must be >>> freed in add_property function(with exception for value in case of >>> "java.vendor.url.bug" and "sun.java.command" properties). >>> >>> In this fix I allocate memory only for key when passed property >>> contains value. If passed property not contains value, then I not >>> allocate memory for key and use passed property string. Value also >>> extracted from passed property string instead of allocating. To >>> accomplish that I changed declaration of "value" in several >>> functions from "char *" to "const char *" since value is not >>> modified in these functions(PropertyList_* functions, SystemProperty >>> class methods). >>> >>> Processing of "java.vendor.url.bug" and "sun.java.command" >>> properties also corrected. Now when these properties redefined, then >>> code checks if memory was allocated for special variables of these >>> properties(checking that not contains default value) and free it. >>> >>> Webrev: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.00/ >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8132725 >>> Tested: JPRT(hotspot test set), hotspot all, vm.quick >>> >>> Thanks, >>> Dmitry >> > From jiangli.zhou at oracle.com Mon Aug 24 19:17:09 2015 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Mon, 24 Aug 2015 12:17:09 -0700 Subject: RFR: 8132725: Memory leak in Arguments::add_property function In-Reply-To: <55DB1A61.7020508@oracle.com> References: <55CC4D88.2030601@oracle.com> <55DA5381.9080004@oracle.com> <55DB1A61.7020508@oracle.com> Message-ID: Hi Dmitry, My comment is really just an opinion about the style. The 'key' only needs to be a separate string when calling PropertyList_unique_add(), if the strcmp() comparisons are changed to strncmp() with 'prop' as the argument. That means we don't need to allocate the 'key' string until PropertyList_unique_add() is called. And the 'key' can be freed right after the PropertyList_unique_add() call.
That probably would make the code easier to read. Then the '_java_command_new' and '_java_vendor_url_bug_new' allocation failure cases can just return with 'false' immediately. The 'status' would probably no longer be needed. As there is no issue with the correctness, please feel free to keep the existing code. One minor issue with the code is the indentation at line 1023. Thanks, Jiangli On Aug 24, 2015, at 6:21 AM, Dmitry Dmitriev wrote: > Hi Ioi, > > Thank you for comments! Please, see my answers inline. > > On 24.08.2015 2:13, Ioi Lam wrote: >> Hi Dmitry, >> >> Is this change part of 8132725? >> >> 3904 jint code = set_aggressive_opts_flags(); >> 3905 if (code != JNI_OK) { >> 3906 return code; >> 3907 } > Yes, set_aggressive_opts_flags not check return value of add_property function, so I add check to the set_aggressive_opts_flags()(lines 1911-1913 in new arguments.cpp) and thus now it returns jint. > >> >> >> 1041 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { >> >> >> also check (_java_vendor_url_bug != NULL) for sanity? > I think that this is unnecessary in this case, because _java_vendor_url_bug can not be NULL. _java_vendor_url_bug initialized to DEFAULT_VENDOR_URL_BUG and changed only in add_property function. Before new value is assigned to _java_vendor_url_bug it's check for not NULL. Thus, I think that check (_java_vendor_url_bug != NULL) is unnecessary in this case. > >> >> >> Also, there's a lot of duplicated "if (eq != NULL) { FreeHeap((void *)key);}". Maybe these can be consolidated with a "goto"? I know lots of people hate goto but it will make the clean up less error prone: > Thank you for this proposal. Since "goto" is not widely used in Hotspot code I decided to refactor current implementation to avoid duplication of "if (eq != NULL) { FreeHeap((void *)key);}". > >> >> bool Arguments::add_property(const char* prop) { >> .... >> bool status = false; >> ....
>> char *_java_command_new = os::strdup(value, mtInternal); >> if (_java_command_new == NULL) { >> goto done; >> }else { >> if (_java_command != NULL) { >> os::free(_java_command); >> } >> _java_command = _java_command_new; >> } >> .... >> } >> // Create new property and add at the end of the list >> PropertyList_unique_add(&_system_properties, key, value); >> } >> status = true; >> >> done: >> if (key != prop) { >> // SystemProperty copy passed value, thus free previously allocated >> // memory >> FreeHeap((void *)key); >> } >> return status; >> } >> >> Also, using (key != prop) would make the code clearer than (eq != NULL). > Fixed! > > webrev 01: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01/ > webrev 01 vs 00: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01.vs.00/ > > Thank you, > Dmitry >> >> Thanks >> - Ioi >> >> On 8/13/15 12:55 AM, Dmitry Dmitriev wrote: >>> Hello, >>> >>> Please review this fix which remove memory leak in Arguments::add_property function. Also, I need a sponsor for this fix, who can push it. >>> >>> Arguments::add_property function allocate memory for key and value. Then key and values are passed to the PropertyList_unique_add function which use SystemProperty class to add or update property value. SystemProperty class maintains it's own copy of key and value and thus copy passed key and value. Therefore key and value must be freed in add_property function(with exception for value in case of "java.vendor.url.bug" and "sun.java.command" properties). >>> >>> In this fix I allocate memory only for key when passed property contains value. If passed property not contains value, then I not allocate memory for key and use passed property string. Value also extracted from passed property string instead of allocating. To accomplish that I changed declaration of "value" in several functions from "char *" to "const char *" since value is not modified in these functions(PropertyList_* functions, SystemProperty class methods). 
>>> >>> Processing of "java.vendor.url.bug" and "sun.java.command" properties also corrected. Now when these properties redefined, then code checks if memory was allocated for special variables of these properties(checking that not contains default value) and free it. >>> >>> Webrev: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.00/ >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8132725 >>> Tested: JPRT(hotspot test set), hotspot all, vm.quick >>> >>> Thanks, >>> Dmitry From dmitry.dmitriev at oracle.com Tue Aug 25 12:27:31 2015 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Tue, 25 Aug 2015 15:27:31 +0300 Subject: RFR: 8132725: Memory leak in Arguments::add_property function In-Reply-To: <55DB579A.9000903@oracle.com> References: <55CC4D88.2030601@oracle.com> <55DA5381.9080004@oracle.com> <55DB1A61.7020508@oracle.com> <55DB579A.9000903@oracle.com> Message-ID: <55DC5F33.6060401@oracle.com> Hi Ioi, Thank you for review and sponsorship! Still need a Reviewer please. I added assert. Also I fix indention on line 1023 and change "char *var_name" to "char* var_name" to match style which used in this function. webrev 02: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.02/ webrev 02 vs 01: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.02.vs.01/ Thanks, Dmitry On 24.08.2015 20:42, Ioi Lam wrote: > Hi Dmitry, > > The new changes look good. > > For defensive programming, I would suggest adding an assert here: > > 1035 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { > assert(_java_vendor_url_bug != NULL, "......"); > 1036 os::free((void *)_java_vendor_url_bug); > > I can sponsor the change, but we still need a Reviewer for this change. > > Thanks > - Ioi > > On 8/24/15 6:21 AM, Dmitry Dmitriev wrote: >> Hi Ioi, >> >> Thank you for comments! Please, see my answers inline. >> >> On 24.08.2015 2:13, Ioi Lam wrote: >>> Hi Dmitry, >>> >>> Is this change part of 8132725? 
>>> >>> 3904 jint code = set_aggressive_opts_flags(); >>> 3905 if (code != JNI_OK) { >>> 3906 return code; >>> 3907 } >> Yes, set_aggressive_opts_flags not check return value of add_property >> function, so I add check to the set_aggressive_opts_flags()(lines >> 1911-1913 in new arguments.cpp) and thus now it returns jint. >> >>> >>> >>> 1041 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { >>> >>> >> also check (_java_vendor_url_bug != NULL) for sanity? >> I think that this is unnecessary in this case, because >> _java_vendor_url_bug can not be NULL. _java_vendor_url_bug >> initialized to DEFAULT_VENDOR_URL_BUG and changed only in >> add_property function. Before new value is assigned to >> _java_vendor_url_bug it's check for not NULL. Thus, I think that >> check (_java_vendor_url_bug != NULL) is unnecessary in this case. >> >>> >>> >>> Also, there's a lot of duplicated "if (eq != NULL) { FreeHeap((void >>> *)key);}". Maybe these can be consolidated with a "goto"? I know >>> lots of people haye goto but it will make the clean up less error >>> prone: >> Thank you for this proposal. Since "goto" is not widely used in >> Hotspot code I decided to refactor current implementation to avoid >> duplication of "if (eq != NULL) { FreeHeap((void *)key);}". >> >>> >>> bool Arguments::add_property(const char* prop) { >>> .... >>> bool status = false; >>> .... >>> char *_java_command_new = os::strdup(value, mtInternal); >>> if (_java_command_new == NULL) { >>> goto done; >>> }else { >>> if (_java_command != NULL) { >>> os::free(_java_command); >>> } >>> _java_command = _java_command_new; >>> } >>> .... 
>>> } >>> // Create new property and add at the end of the list >>> PropertyList_unique_add(&_system_properties, key, value); >>> } >>> status = true; >>> >>> done: >>> if (key != prop) { >>> // SystemProperty copy passed value, thus free previously >>> allocated >>> // memory >>> FreeHeap((void *)key); >>> } >>> return status; >>> } >>> >>> Also, using (key != prop) would make the code clearer than (eq != >>> NULL). >> Fixed! >> >> webrev 01: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01/ >> >> webrev 01 vs 00: >> http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01.vs.00/ >> >> >> Thank you, >> Dmitry >>> >>> Thanks >>> - Ioi >>> >>> On 8/13/15 12:55 AM, Dmitry Dmitriev wrote: >>>> Hello, >>>> >>>> Please review this fix which remove memory leak in >>>> Arguments::add_property function. Also, I need a sponsor for this >>>> fix, who can push it. >>>> >>>> Arguments::add_property function allocate memory for key and value. >>>> Then key and values are passed to the PropertyList_unique_add >>>> function which use SystemProperty class to add or update property >>>> value. SystemProperty class maintains it's own copy of key and >>>> value and thus copy passed key and value. Therefore key and value >>>> must be freed in add_property function(with exception for value in >>>> case of "java.vendor.url.bug" and "sun.java.command" properties). >>>> >>>> In this fix I allocate memory only for key when passed property >>>> contains value. If passed property not contains value, then I not >>>> allocate memory for key and use passed property string. Value also >>>> extracted from passed property string instead of allocating. To >>>> accomplish that I changed declaration of "value" in several >>>> functions from "char *" to "const char *" since value is not >>>> modified in these functions(PropertyList_* functions, >>>> SystemProperty class methods). >>>> >>>> Processing of "java.vendor.url.bug" and "sun.java.command" >>>> properties also corrected. 
Now when these properties redefined, >>>> then code checks if memory was allocated for special variables of >>>> these properties(checking that not contains default value) and free >>>> it. >>>> >>>> Webrev: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.00/ >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8132725 >>>> Tested: JPRT(hotspot test set), hotspot all, vm.quick >>>> >>>> Thanks, >>>> Dmitry >>> >> > From dmitry.dmitriev at oracle.com Tue Aug 25 12:31:02 2015 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Tue, 25 Aug 2015 15:31:02 +0300 Subject: RFR: 8132725: Memory leak in Arguments::add_property function In-Reply-To: References: <55CC4D88.2030601@oracle.com> <55DA5381.9080004@oracle.com> <55DB1A61.7020508@oracle.com> Message-ID: <55DC6006.9010406@oracle.com> Hi Jiangli, Thank you for the proposal. I thought about that and decided to leave the code as is, because in that case we would need to add extra checks around strncmp to avoid matching junk at the end of the property key. I fixed the indentation! Thanks! Dmitry On 24.08.2015 22:17, Jiangli Zhou wrote: > Hi Dmitry, > > My comment is really just an opinion about the style. The 'key' only > needs to be a separate string when calling PropertyList_unique_add(), > if the strcmp() comparisons are changed to strncmp() with 'prop' as > the argument. That means we don't need to allocate the 'key' string > until PropertyList_unique_add() is called. And the 'key' can be freed > right after the PropertyList_unique_add() call. That probably would > make the code easier to read. Then the '_java_command_new' and > '_java_vendor_url_bug_new' allocation failure cases can just return > with 'false' immediately. The 'status' would probably no longer be > needed. As there is no issue with the correctness, please feel free to > keep the existing code. > > One minor issue with the code is the indentation at line 1023.
> > Thanks, > Jiangli > > > On Aug 24, 2015, at 6:21 AM, Dmitry Dmitriev > > wrote: > >> Hi Ioi, >> >> Thank you for comments! Please, see my answers inline. >> >> On 24.08.2015 2:13, Ioi Lam wrote: >>> Hi Dmitry, >>> >>> Is this change part of 8132725? >>> >>> 3904 jint code = set_aggressive_opts_flags(); >>> 3905 if (code != JNI_OK) { >>> 3906 return code; >>> 3907 } >> Yes, set_aggressive_opts_flags not check return value of add_property >> function, so I add check to the set_aggressive_opts_flags()(lines >> 1911-1913 in new arguments.cpp) and thus now it returns jint. >> >>> >>> >>> 1041 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { >>> >>> >> also check (_java_vendor_url_bug != NULL) for sanity? >> I think that this is unnecessary in this case, because >> _java_vendor_url_bug can not be NULL. _java_vendor_url_bug >> initialized to DEFAULT_VENDOR_URL_BUG and changed only in >> add_property function. Before new value is assigned to >> _java_vendor_url_bug it's check for not NULL. Thus, I think that >> check (_java_vendor_url_bug != NULL) is unnecessary in this case. >> >>> >>> >>> Also, there's a lot of duplicated "if (eq != NULL) { FreeHeap((void >>> *)key);}". Maybe these can be consolidated with a "goto"? I know >>> lots of people haye goto but it will make the clean up less error prone: >> Thank you for this proposal. Since "goto" is not widely used in >> Hotspot code I decided to refactor current implementation to avoid >> duplication of "if (eq != NULL) { FreeHeap((void *)key);}". >> >>> >>> bool Arguments::add_property(const char* prop) { >>> .... >>> bool status = false; >>> .... >>> char *_java_command_new = os::strdup(value, mtInternal); >>> if (_java_command_new == NULL) { >>> goto done; >>> }else { >>> if (_java_command != NULL) { >>> os::free(_java_command); >>> } >>> _java_command = _java_command_new; >>> } >>> .... 
>>> } >>> // Create new property and add at the end of the list >>> PropertyList_unique_add(&_system_properties, key, value); >>> } >>> status = true; >>> >>> done: >>> if (key != prop) { >>> // SystemProperty copy passed value, thus free previously allocated >>> // memory >>> FreeHeap((void *)key); >>> } >>> return status; >>> } >>> >>> Also, using (key != prop) would make the code clearer than (eq != NULL). >> Fixed! >> >> webrev 01:http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01/ >> >> webrev 01 vs >> 00:http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01.vs.00/ >> >> >> Thank you, >> Dmitry >>> >>> Thanks >>> - Ioi >>> >>> On 8/13/15 12:55 AM, Dmitry Dmitriev wrote: >>>> Hello, >>>> >>>> Please review this fix which remove memory leak in >>>> Arguments::add_property function. Also, I need a sponsor for this >>>> fix, who can push it. >>>> >>>> Arguments::add_property function allocate memory for key and value. >>>> Then key and values are passed to the PropertyList_unique_add >>>> function which use SystemProperty class to add or update property >>>> value. SystemProperty class maintains it's own copy of key and >>>> value and thus copy passed key and value. Therefore key and value >>>> must be freed in add_property function(with exception for value in >>>> case of "java.vendor.url.bug" and "sun.java.command" properties). >>>> >>>> In this fix I allocate memory only for key when passed property >>>> contains value. If passed property not contains value, then I not >>>> allocate memory for key and use passed property string. Value also >>>> extracted from passed property string instead of allocating. To >>>> accomplish that I changed declaration of "value" in several >>>> functions from "char *" to "const char *" since value is not >>>> modified in these functions(PropertyList_* functions, >>>> SystemProperty class methods). >>>> >>>> Processing of "java.vendor.url.bug" and "sun.java.command" >>>> properties also corrected. 
Now when these properties redefined, >>>> then code checks if memory was allocated for special variables of >>>> these properties(checking that not contains default value) and free it. >>>> >>>> Webrev: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.00/ >>>> >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8132725 >>>> Tested: JPRT(hotspot test set), hotspot all, vm.quick >>>> >>>> Thanks, >>>> Dmitry > From goetz.lindenmaier at sap.com Tue Aug 25 12:35:48 2015 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 25 Aug 2015 12:35:48 +0000 Subject: RFR: 8134396: Check upper limit of flags setting numbers of stack protection pages. Message-ID: <4295855A5C1DE049A61835A1887419CC2D0199BB@DEWDFEMB12A.global.corp.sap> Hi, I detected a problem with the values of the flags setting the numbers of stack protection pages. TestOptionsWithRanges shows that illegal values for red and yellow pages are possible. In JavaThread::create_stack_guard_pages(), setting StackRedPages=92233720368547753 and StackYellowPages=1 yields len = (92233720368547753 + 1) * os::vm_page_size() = 0x8000000000000000 * os::vm_page_size() = 0 Thus not protecting any pages of the stack. The check in os::init_2() succeeds, as there the Shadow pages are considered, too, so that the overflowed value is > 0. On linux, the VM stops with the misleading message "OpenJDK 64-Bit Server VM warning: Attempt to allocate stack guard pages failed." On aix, we get assertions and SIGSEGVs. This fix implements a ConstraintFunc for the three flags involved, checking that no overflow can happen. This is a very nice new feature! http://cr.openjdk.java.net/~goetz/webrevs/8134396-StGPg/webrev.01/ Please review this change. I please need a sponsor. Alternatively, one could set a smaller upper limit in the range. This limit would have to be small enough to fulfill the property for any page size possible on 32 bit systems. Best regards, Goetz. 
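The wrap-around described in the message above can be sketched in isolation. This is only an illustration under stated assumptions, not the code from the webrev: the names guard_zone_len and guard_zone_fits, the 4096-byte page size, and the page counts are hypothetical and were picked so that the product is exactly 2^64; the actual ConstraintFunc and flag types in HotSpot may look different.

```cpp
#include <cstdint>

// Hypothetical stand-in for the guard zone length computation:
// (red + yellow) pages times the page size, in unsigned 64-bit arithmetic.
// Unsigned overflow wraps around silently, which is the failure mode here.
constexpr uint64_t guard_zone_len(uint64_t red_pages, uint64_t yellow_pages,
                                  uint64_t page_size) {
  return (red_pages + yellow_pages) * page_size;
}

// Hypothetical constraint check: true only if neither the sum nor the
// product in guard_zone_len() can wrap around.
constexpr bool guard_zone_fits(uint64_t red_pages, uint64_t yellow_pages,
                               uint64_t page_size) {
  return page_size != 0
      && red_pages <= UINT64_MAX - yellow_pages
      && (red_pages + yellow_pages) <= UINT64_MAX / page_size;
}

// Assumed values for the example: with a 4096-byte page,
// (2^52 - 1 + 1) * 2^12 == 2^64, which wraps to 0.
constexpr uint64_t kPageSize = 4096;
constexpr uint64_t kRedPages = (uint64_t{1} << 52) - 1;

static_assert(guard_zone_len(kRedPages, 1, kPageSize) == 0,
              "the unchecked length wraps to zero, so no page would be protected");
static_assert(!guard_zone_fits(kRedPages, 1, kPageSize),
              "the constraint rejects the overflowing combination");
static_assert(guard_zone_fits(1, 2, kPageSize),
              "ordinary settings pass the constraint");
```

Checking with a division rather than multiplying and comparing afterwards keeps the check itself free of overflow, and it works for any page size, which is the property a fixed upper limit in the flag range would have to approximate.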
From jiangli.zhou at oracle.com Tue Aug 25 17:09:10 2015 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 25 Aug 2015 10:09:10 -0700 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55DA32FC.8000405@oracle.com> References: <55C103A4.1060505@oracle.com> <55C21DC3.5090305@oracle.com> <1438785532.2378.73.camel@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <1439377259.2324.27.camel@oracle.com> <55CCF0DC.2000800@oracle.com> <1440079572.2347.11.camel@oracle.com> <55D5E048.5070404@oracle.com> <8566C008-5A60-4376-94C4-968E1A475769@oracle.com> <55D742E7.9080208@oracle.com> <55DA32FC.8000405@oracle.com> Message-ID: <0359932D-CDA2-4FC7-9C38-E9C7770655C4@oracle.com> Could someone also help review the runtime part? I need one more Reviewer. http://cr.openjdk.java.net/~jiangli/8131734/webrev.02/ Thanks, Jiangli > On Aug 23, 2015, at 1:54 PM, Dmitry Dmitriev wrote: > > Hi Jiangli, > > Looks good to me! > > Thank you, > Dmitry > > On 22.08.2015 4:47, Jiangli Zhou wrote: >> Hi Dmitry, >> >> On Aug 21, 2015, at 8:25 AM, Dmitry Dmitriev wrote: >> >>> Hello Jiangli, >>> >>> This looks good to me, but I'm not a reviewer. >> Thanks. >> >>> Also, I have question to you. Probably you can clarify me one moment. >>> num_ranges and string_ranges are modified only in code under "#if INCLUDE_ALL_GCS", so it make sense to include all usage of these variables also under "#if INCLUDE_ALL_GCS"? I mean following functions: >>> FileMapInfo::fixup_string_regions() and new FileMapInfo::dealloc_string_regions() function. >> Agreed.
Here is the updated webrev: http://cr.openjdk.java.net/~jiangli/8131734/webrev.02/src/share/vm/memory/filemap.cpp.sdiff.html. >> >> I also reverified JPRT builds with the new #ifdef changes. >> >> Thanks for the detailed review! >> >> Jiangli >> >>> Thank you, >>> Dmitry >>> >>> On 20.08.2015 22:55, Jiangli Zhou wrote: >>>> Hi Dmitry, >>>> >>>> Here is the updated runtime webrev that reflects Tom's latest GC changes. >>>> >>>> http://cr.openjdk.java.net/~jiangli/8131734/webrev.01/ >>>> >>>> I renamed the FileMapInfo::unmap_string_regions() to FileMapInfo::dealloc_string_regions(), which only deallocates the archived string region from the java heap without unmapping. The unmapping is handled by the GC system as the archived string region is part of the java heap. I also added dealloc_string_regions() call to the case where the string region verification fails. >>>> >>>> Thanks, >>>> Jiangli >>>> >>>> On Aug 20, 2015, at 7:12 AM, Tom Benson wrote: >>>> >>>>> Hi Thomas, >>>>> OK, thanks! >>>>> Tom >>>>> >>>>> On 8/20/2015 10:06 AM, Thomas Schatzl wrote: >>>>>> Hi Tom, >>>>>> >>>>>> sorry for the delay... >>>>>> >>>>>> On Thu, 2015-08-13 at 15:32 -0400, Tom Benson wrote: >>>>>>> Hi Thomas, >>>>>>> >>>>>>> On 8/12/2015 7:00 AM, Thomas Schatzl wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> On Tue, 2015-08-11 at 13:43 -0400, Tom Benson wrote: >>>>>>>>> Hi, >>>>>>>>> On 8/7/2015 10:56 AM, Tom Benson wrote: >>>>>>>>> After some discussion, I've changed the definition and name of >>>>>>>>> free_archive_regions. Now called dealloc_archive_regions, it uncommits >>>>>>>>> the specified regions, unmapping the memory, rather than adding them to >>>>>>>>> the free list. This means the CDS code will no longer do the unmapping >>>>>>>>> on verification failures.
>>>>>>>>> >>>>>>>>> Updated full and incremental webrevs of the GC code are at: >>>>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ >>>>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ >>>>>>>>> >>>>>>>>> Tested with JPRT and running benchmarks with the dealloc_ performed >>>>>>>>> explicitly. Jiangli also tested the original failing cases, and will be >>>>>>>>> posting an updated webrev. >>>>>>>> - is it possible that shrink_by() uses shrink_at()? This would avoid two >>>>>>>> paths that uncommit regions like expand_by()/expand_at()? >>>>>>> OK, I made the change. I didn't do it originally because the asserts I >>>>>>> wanted to add for the call from g1CollectedHeap seemed superfluous for >>>>>>> the other call, and shrink_at was so small. Now shrink_at takes a >>>>>>> region count as well. >>>>>>> >>>>>>> Updated full and incremental webrevs are at: >>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03/ >>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03.vs.02/ >>>>>>> >>>>>>> >>>>>> Looks good. >>>>>> >>>>>>>> - I think the change should call at least HeapRegion::hr_clear() on the >>>>>>>> region to remove or reset any auxiliary data structures, if not >>>>>>>> G1CollectedHeap::free_region() (without adding the region to the free >>>>>>>> list). >>>>>>>> Since the HeapRegion* is not deallocated by the uncommit, this may cause >>>>>>>> strange behavior later when the region is reused. >>>>>>> I don't think calling hr_clear should be necessary... If it is, we >>>>>>> should be doing it in shrink_by as well, and I don't think we are. I >>>>>>> don't see how a HeapRegion can be 'reused' without having gone through >>>>>>> the constructor when expand_ asks (indirectly) for 'new HeapRegion', and >>>>>>> that does an hr_clear() as well as the rest of init. Or am I missing >>>>>>> something there? >>>>>> Leave it as is. 
I thought that a full gc (which is the only case where >>>>>> the heap shrinks at the moment) also clears the remset of these regions >>>>>> at least. >>>>>> >>>>>> It should, I filed JDK-8134048 for looking in this issue. >>>>>> >>>>>> Looks good. >>>>>> >>>>>> Thanks, >>>>>> Thomas >>>>>> >>>>>> > From harold.seigel at oracle.com Tue Aug 25 19:08:23 2015 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 25 Aug 2015 15:08:23 -0400 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <0359932D-CDA2-4FC7-9C38-E9C7770655C4@oracle.com> References: <55C103A4.1060505@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <1439377259.2324.27.camel@oracle.com> <55CCF0DC.2000800@oracle.com> <1440079572.2347.11.camel@oracle.com> <55D5E048.5070404@oracle.com> <8566C008-5A60-4376-94C4-968E1A475769@oracle.com> <55D742E7.9080208@oracle.com> <55DA32FC.8000405@oracle.com> <0359932D-CDA2-4FC7-9C38-E9C7770655C4@oracle.com> Message-ID: <55DCBD27.1030504@oracle.com> Hi Jiangli, The changes look good. Thanks, Harold On 8/25/2015 1:09 PM, Jiangli Zhou wrote: > Could someone also help review the runtime part? I need one more Reviewer. > > http://cr.openjdk.java.net/~jiangli/8131734/webrev.02/ > > Thanks, > Jiangli > > >> On Aug 23, 2015, at 1:54 PM, Dmitry Dmitriev wrote: >> >> Hi Jiangli, >> >> Looks good to me!
>> >> Thank you, >> Dmitry >> >> On 22.08.2015 4:47, Jiangli Zhou wrote: >>> Hi Dmitry, >>> >>> On Aug 21, 2015, at 8:25 AM, Dmitry Dmitriev wrote: >>> >>>> Hello Jiangli, >>>> >>>> This looks good to me, but I'm not a reviewer. >>> Thanks. >>> >>>> Also, I have question to you. Probably you can clarify me one moment. >>>> num_ranges and string_ranges are modified only in code under "#if INCLUDE_ALL_GCS", so it make sense to include all usage of these variables also under "#if INCLUDE_ALL_GCS"? I mean following functions: >>>> FileMapInfo::fixup_string_regions() and new FileMapInfo::dealloc_string_regions() function. >>> Agreed. Here is the updated webrev: http://cr.openjdk.java.net/~jiangli/8131734/webrev.02/src/share/vm/memory/filemap.cpp.sdiff.html. >>> >>> I also reverified JPRT builds with the new #ifdef changes. >>> >>> Thanks for the detailed review! >>> >>> Jiangli >>> >>>> Thank you, >>>> Dmitry >>>> >>>> On 20.08.2015 22:55, Jiangli Zhou wrote: >>>>> Hi Dmitry, >>>>> >>>>> Here is the updated runtime webrev that reflects Tom?s latest GC changes. >>>>> >>>>> http://cr.openjdk.java.net/~jiangli/8131734/webrev.01/ >>>>> >>>>> I renamed the FileMapInfo::unmap_string_regions() to FileMapInfo::dealloc_string_regions(), which only deallocates the archived string region from the java heap without unmapping. The unmapping is handled by the GC system as the archived string region is part of the java heap. I also added dealloc_string_regions() call to the case where the string region verification fails. >>>>> >>>>> Thanks, >>>>> Jiangli >>>>> >>>>> On Aug 20, 2015, at 7:12 AM, Tom Benson wrote: >>>>> >>>>>> Hi Thomas, >>>>>> OK, thanks! >>>>>> Tom >>>>>> >>>>>> On 8/20/2015 10:06 AM, Thomas Schatzl wrote: >>>>>>> Hi Tom, >>>>>>> >>>>>>> sorry for the delay... 
>>>>>>> >>>>>>> On Thu, 2015-08-13 at 15:32 -0400, Tom Benson wrote: >>>>>>>> Hi Thomas, >>>>>>>> >>>>>>>> On 8/12/2015 7:00 AM, Thomas Schatzl wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> On Tue, 2015-08-11 at 13:43 -0400, Tom Benson wrote: >>>>>>>>>> Hi, >>>>>>>>>> On 8/7/2015 10:56 AM, Tom Benson wrote: >>>>>>>>>> After some discussion, I've changed the definition and name of >>>>>>>>>> free_archive_regions. Now called dealloc_archive_regions, it uncommits >>>>>>>>>> the specified regions, unmapping the memory, rather than adding them to >>>>>>>>>> the free list. This means the CDS code will no longer do the unmapping >>>>>>>>>> on verification failures. >>>>>>>>>> >>>>>>>>>> Updated full and incremental webrevs of the GC code are at: >>>>>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ >>>>>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ >>>>>>>>>> >>>>>>>>>> Tested with JPRT and running benchmarks with the dealloc_ performed >>>>>>>>>> explicitly. Jiangli also tested the original failing cases, and will be >>>>>>>>>> posting an updated webrev. >>>>>>>>> - is it possible that shrink_by() uses shrink_at()? This would avoid two >>>>>>>>> paths that uncommit regions like expand_by()/expand_at()? >>>>>>>> OK, I made the change. I didn't do it originally because the asserts I >>>>>>>> wanted to add for the call from g1CollectedHeap seemed superfluous for >>>>>>>> the other call, and shrink_at was so small. Now shrink_at takes a >>>>>>>> region count as well. >>>>>>>> >>>>>>>> Updated full and incremental webrevs are at: >>>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03/ >>>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03.vs.02/ >>>>>>>> >>>>>>>> >>>>>>> Looks good. 
>>>>>>> >>>>>>>>> - I think the change should call at least HeapRegion::hr_clear() on the >>>>>>>>> region to remove or reset any auxiliary data structures, if not >>>>>>>>> G1CollectedHeap::free_region() (without adding the region to the free >>>>>>>>> list). >>>>>>>>> Since the HeapRegion* is not deallocated by the uncommit, this may cause >>>>>>>>> strange behavior later when the region is reused. >>>>>>>> I don't think calling hr_clear should be necessary... If it is, we >>>>>>>> should be doing it in shrink_by as well, and I don't think we are. I >>>>>>>> don't see how a HeapRegion can be 'reused' without having gone through >>>>>>>> the constructor when expand_ asks (indirectly) for 'new HeapRegion', and >>>>>>>> that does an hr_clear() as well as the rest of init. Or am I missing >>>>>>>> something there? >>>>>>> Leave it as is. I thought that a full gc (which is the only case where >>>>>>> the heap shrinks at the moment) also clears the remset of these regions >>>>>>> at least. >>>>>>> >>>>>>> It should, I filed JDK-8134048 for looking in this issue. >>>>>>> >>>>>>> Looks good. 
>>>>>>> >>>>>>> Thanks, >>>>>>> Thomas >>>>>>> >>>>>>> From jiangli.zhou at oracle.com Tue Aug 25 20:39:02 2015 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 25 Aug 2015 13:39:02 -0700 Subject: RFR (S): 8131734: Add free_archive_regions support to G1 for -Xshared:auto In-Reply-To: <55DCBD27.1030504@oracle.com> References: <55C103A4.1060505@oracle.com> <55C248FF.6010308@oracle.com> <55C26249.1090902@oracle.com> <55C2751B.8070400@oracle.com> <1438847329.2009.9.camel@oracle.com> <55C36B31.7000908@oracle.com> <1438871723.2474.37.camel@oracle.com> <55C373F6.9060609@oracle.com> <55C3D8BB.9000603@oracle.com> <48C0E571-0D7B-4D30-85DF-E31D99092A7C@oracle.com> <30B0248C-3BEA-4FAE-861F-DAAE19F56B45@oracle.com> <55C4B806.7050504@oracle.com> <55C4C720.5030903@oracle.com> <55CA3430.3070300@oracle.com> <1439377259.2324.27.camel@oracle.com> <55CCF0DC.2000800@oracle.com> <1440079572.2347.11.camel@oracle.com> <55D5E048.5070404@oracle.com> <8566C008-5A60-4376-94C4-968E1A475769@oracle.com> <55D742E7.9080208@oracle.com> <55DA32FC.8000405@! oracle.com> <0359932D-CDA2-4FC7-9C38-E9C7770655C4@oracle.com> <55D! CBD27.1030504@oracle.com> Message-ID: <93C0EF4C-7ACA-474E-ACEC-46C2DCB93387@oracle.com> Thanks, Harold! Jiangli > On Aug 25, 2015, at 12:08 PM, harold seigel wrote: > > Hi Jiangli, > > The changes look good. > > Thanks, Harold > > On 8/25/2015 1:09 PM, Jiangli Zhou wrote: >> Could someone also help review the runtime part? I need one more Reviewer. >> >> http://cr.openjdk.java.net/~jiangli/8131734/webrev.02/ >> >> Thanks, >> Jiangli >> >> >>> On Aug 23, 2015, at 1:54 PM, Dmitry Dmitriev wrote: >>> >>> Hi Jiangli, >>> >>> Looks good to me! >>> >>> Thank you, >>> Dmitry >>> >>> On 22.08.2015 4:47, Jiangli Zhou wrote: >>>> Hi Dmitry, >>>> >>>> On Aug 21, 2015, at 8:25 AM, Dmitry Dmitriev wrote: >>>> >>>>> Hello Jiangli, >>>>> >>>>> This looks good to me, but I'm not a reviewer. >>>> Thanks. >>>> >>>>> Also, I have question to you. Probably you can clarify me one moment. 
>>>>> num_ranges and string_ranges are modified only in code under "#if INCLUDE_ALL_GCS", so it make sense to include all usage of these variables also under "#if INCLUDE_ALL_GCS"? I mean following functions: >>>>> FileMapInfo::fixup_string_regions() and new FileMapInfo::dealloc_string_regions() function. >>>> Agreed. Here is the updated webrev: http://cr.openjdk.java.net/~jiangli/8131734/webrev.02/src/share/vm/memory/filemap.cpp.sdiff.html. >>>> >>>> I also reverified JPRT builds with the new #ifdef changes. >>>> >>>> Thanks for the detailed review! >>>> >>>> Jiangli >>>> >>>>> Thank you, >>>>> Dmitry >>>>> >>>>> On 20.08.2015 22:55, Jiangli Zhou wrote: >>>>>> Hi Dmitry, >>>>>> >>>>>> Here is the updated runtime webrev that reflects Tom's latest GC changes. >>>>>> >>>>>> http://cr.openjdk.java.net/~jiangli/8131734/webrev.01/ >>>>>> >>>>>> I renamed the FileMapInfo::unmap_string_regions() to FileMapInfo::dealloc_string_regions(), which only deallocates the archived string region from the java heap without unmapping. The unmapping is handled by the GC system as the archived string region is part of the java heap. I also added dealloc_string_regions() call to the case where the string region verification fails. >>>>>> >>>>>> Thanks, >>>>>> Jiangli >>>>>> >>>>>> On Aug 20, 2015, at 7:12 AM, Tom Benson wrote: >>>>>> >>>>>>> Hi Thomas, >>>>>>> OK, thanks! >>>>>>> Tom >>>>>>> >>>>>>> On 8/20/2015 10:06 AM, Thomas Schatzl wrote: >>>>>>>> Hi Tom, >>>>>>>> >>>>>>>> sorry for the delay... >>>>>>>> >>>>>>>> On Thu, 2015-08-13 at 15:32 -0400, Tom Benson wrote: >>>>>>>>> Hi Thomas, >>>>>>>>> >>>>>>>>> On 8/12/2015 7:00 AM, Thomas Schatzl wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> On Tue, 2015-08-11 at 13:43 -0400, Tom Benson wrote: >>>>>>>>>>> Hi, >>>>>>>>>>> On 8/7/2015 10:56 AM, Tom Benson wrote: >>>>>>>>>>> After some discussion, I've changed the definition and name of >>>>>>>>>>> free_archive_regions.
Now called dealloc_archive_regions, it uncommits >>>>>>>>>>> the specified regions, unmapping the memory, rather than adding them to >>>>>>>>>>> the free list. This means the CDS code will no longer do the unmapping >>>>>>>>>>> on verification failures. >>>>>>>>>>> >>>>>>>>>>> Updated full and incremental webrevs of the GC code are at: >>>>>>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02/ >>>>>>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.02.vs.01/ >>>>>>>>>>> >>>>>>>>>>> Tested with JPRT and running benchmarks with the dealloc_ performed >>>>>>>>>>> explicitly. Jiangli also tested the original failing cases, and will be >>>>>>>>>>> posting an updated webrev. >>>>>>>>>> - is it possible that shrink_by() uses shrink_at()? This would avoid two >>>>>>>>>> paths that uncommit regions like expand_by()/expand_at()? >>>>>>>>> OK, I made the change. I didn't do it originally because the asserts I >>>>>>>>> wanted to add for the call from g1CollectedHeap seemed superfluous for >>>>>>>>> the other call, and shrink_at was so small. Now shrink_at takes a >>>>>>>>> region count as well. >>>>>>>>> >>>>>>>>> Updated full and incremental webrevs are at: >>>>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03/ >>>>>>>>> http://cr.openjdk.java.net/~tbenson/8131734/webrev.03.vs.02/ >>>>>>>>> >>>>>>>>> >>>>>>>> Looks good. >>>>>>>> >>>>>>>>>> - I think the change should call at least HeapRegion::hr_clear() on the >>>>>>>>>> region to remove or reset any auxiliary data structures, if not >>>>>>>>>> G1CollectedHeap::free_region() (without adding the region to the free >>>>>>>>>> list). >>>>>>>>>> Since the HeapRegion* is not deallocated by the uncommit, this may cause >>>>>>>>>> strange behavior later when the region is reused. >>>>>>>>> I don't think calling hr_clear should be necessary... If it is, we >>>>>>>>> should be doing it in shrink_by as well, and I don't think we are. 
I >>>>>>>>> don't see how a HeapRegion can be 'reused' without having gone through >>>>>>>>> the constructor when expand_ asks (indirectly) for 'new HeapRegion', and >>>>>>>>> that does an hr_clear() as well as the rest of init. Or am I missing >>>>>>>>> something there? >>>>>>>> Leave it as is. I thought that a full gc (which is the only case where >>>>>>>> the heap shrinks at the moment) also clears the remset of these regions >>>>>>>> at least. >>>>>>>> >>>>>>>> It should, I filed JDK-8134048 for looking in this issue. >>>>>>>> >>>>>>>> Looks good. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Thomas >>>>>>>> >>>>>>>> > From daniel.daugherty at oracle.com Tue Aug 25 21:08:26 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 25 Aug 2015 15:08:26 -0600 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() Message-ID: <55DCD94A.30705@oracle.com> Greetings, I have a "fix" for a long standing race between JVM shutdown and the JVM statistics subsystem: JDK-8049304 race between VM_Exit and _sync_FutileWakeups->inc() https://bugs.openjdk.java.net/browse/JDK-8049304 Webrev URL: http://cr.openjdk.java.net/~dcubed/8049304-webrev/0-jdk9-hs-rt/ Testing: Aurora Adhoc RT-SVC nightly batch Aurora Adhoc vm.tmtools batch Kim's repro sequence for JDK-8049304 Kim's repro sequence for JDK-8129978 JPRT -testset hotspot This "fix": - adds a volatile flag to record whether PerfDataManager is holding data (PerfData objects) - adds PerfDataManager::has_PerfData() to return the flag - changes the Java monitor subsystem's use of PerfData to check both allocation of the monitor subsystem specific PerfData object and the new PerfDataManager::has_PerfData() return value If the global 'UsePerfData' option is false, the system works as it did before. If 'UsePerfData' is true (the default on non-embedded systems), the Java monitor subsystem will allocate a number of PerfData objects to record information. 
The objects will record information about Java monitor subsystem until the JVM shuts down. When the JVM starts to shutdown, the new PerfDataManager flag will change to false and the Java monitor subsystem will stop using the PerfData objects. This is the new behavior. As noted in the comments I added to the code, the race is still present; I'm just changing the order and the timing to reduce the likelihood of the crash. Thanks, in advance, for any comments, questions or suggestions. Dan From gerard.ziemski at oracle.com Wed Aug 26 14:22:16 2015 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Wed, 26 Aug 2015 09:22:16 -0500 Subject: RFR: 8134396: Check upper limit of flags setting numbers of stack protection pages. In-Reply-To: <4295855A5C1DE049A61835A1887419CC2D0199BB@DEWDFEMB12A.global.corp.sap> References: <4295855A5C1DE049A61835A1887419CC2D0199BB@DEWDFEMB12A.global.corp.sap> Message-ID: <55DDCB98.2020605@oracle.com> hi Goetz, I have an upcoming patch that addresses stack pages and all the other runtime flags that have their ranges/constraints unimplemented. Could you withdraw your change for the moment and give me a chance to get my (large) commit in and then we can verify that your concerns are met please? cheers On 08/25/2015 07:35 AM, Lindenmaier, Goetz wrote: > Hi, > > I detected a problem with the values of the flags setting the numbers of stack protection pages. > TestOptionsWithRanges shows that illegal values for red and yellow pages are possible. > > In JavaThread::create_stack_guard_pages(), > setting StackRedPages=92233720368547753 and StackYellowPages=1 yields > > len = (92233720368547753 + 1) * os::vm_page_size() > = 0x8000000000000000 * os::vm_page_size() > = 0 > > Thus not protecting any pages of the stack. > > The check in os::init_2() succeeds, as there the Shadow pages are considered, too, so that the overflowed value is > 0. 
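
The wrap-around in the arithmetic quoted above can be illustrated with a small overflow check. This is only a sketch under stated assumptions: guard_zone_bytes is a hypothetical helper, not the ConstraintFunc from the webrev, and the page counts and page size are passed as plain 64-bit integers instead of being read from StackRedPages, StackYellowPages and os::vm_page_size().

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not HotSpot code): computes the guard zone size in
// bytes and reports whether the computation stayed within 64 bits.
static bool guard_zone_bytes(uint64_t red_pages, uint64_t yellow_pages,
                             uint64_t page_size, uint64_t* out_len) {
  const uint64_t max = std::numeric_limits<uint64_t>::max();
  if (page_size == 0) {
    return false;                        // nothing sensible to compute
  }
  if (red_pages > max - yellow_pages) {
    return false;                        // the addition would wrap
  }
  const uint64_t pages = red_pages + yellow_pages;
  if (pages > max / page_size) {
    return false;                        // the multiplication would wrap
  }
  *out_len = pages * page_size;
  return true;
}
```

A constraint function for the flags could reject any setting for which a check of this kind fails, instead of letting the product wrap to a small value or to zero.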
> > On linux, the VM stops with the misleading message > "OpenJDK 64-Bit Server VM warning: Attempt to allocate stack guard pages failed." > On aix, we get assertions and SIGSEGVs. > > This fix implements a ConstraintFunc for the three flags involved, checking that no overflow > can happen. This is a very nice new feature! > http://cr.openjdk.java.net/~goetz/webrevs/8134396-StGPg/webrev.01/ > > Please review this change. I please need a sponsor. > > Alternatively, one could set a smaller upper limit in the range. This limit would have to be small > enough to fulfill the property for any page size possible on 32 bit systems. > > Best regards, > Goetz. > > > From goetz.lindenmaier at sap.com Wed Aug 26 14:25:32 2015 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 26 Aug 2015 14:25:32 +0000 Subject: RFR: 8134396: Check upper limit of flags setting numbers of stack protection pages. In-Reply-To: <55DDCB98.2020605@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2D0199BB@DEWDFEMB12A.global.corp.sap> <55DDCB98.2020605@oracle.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2D019DDE@DEWDFEMB12A.global.corp.sap> Hi Gerard, yes, sure, I can wait. Do you have a webrev on review? If so I could test whether it resolves the issue. Best regards, Goetz. -----Original Message----- From: gerard ziemski [mailto:gerard.ziemski at oracle.com] Sent: Wednesday, 26 August 2015 16:22 To: Lindenmaier, Goetz; hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR: 8134396: Check upper limit of flags setting numbers of stack protection pages. hi Goetz, I have an upcoming patch that addresses stack pages and all the other runtime flags that have their ranges/constraints unimplemented. Could you withdraw your change for the moment and give me a chance to get my (large) commit in and then we can verify that your concerns are met please?
cheers On 08/25/2015 07:35 AM, Lindenmaier, Goetz wrote: > Hi, > > I detected a problem with the values of the flags setting the numbers of stack protection pages. > TestOptionsWithRanges shows that illegal values for red and yellow pages are possible. > > In JavaThread::create_stack_guard_pages(), > setting StackRedPages=92233720368547753 and StackYellowPages=1 yields > > len = (92233720368547753 + 1) * os::vm_page_size() > = 0x8000000000000000 * os::vm_page_size() > = 0 > > Thus not protecting any pages of the stack. > > The check in os::init_2() succeeds, as there the Shadow pages are considered, too, so that the overflowed value is > 0. > > On linux, the VM stops with the misleading message > "OpenJDK 64-Bit Server VM warning: Attempt to allocate stack guard pages failed." > On aix, we get assertions and SIGSEGVs. > > This fix implements a ConstraintFunc for the three flags involved, checking that no overflow > can happen. This is a very nice new feature! > http://cr.openjdk.java.net/~goetz/webrevs/8134396-StGPg/webrev.01/ > > Please review this change. I please need a sponsor. > > Alternatively, one could set a smaller upper limit in the range. This limit would have to be small > enough to fulfill the property for any page size possible on 32 bit systems. > > Best regards, > Goetz. > > > From gerard.ziemski at oracle.com Wed Aug 26 15:01:57 2015 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Wed, 26 Aug 2015 10:01:57 -0500 Subject: RFR: 8134396: Check upper limit of flags setting numbers of stack protection pages. In-Reply-To: <4295855A5C1DE049A61835A1887419CC2D019DDE@DEWDFEMB12A.global.corp.sap> References: <4295855A5C1DE049A61835A1887419CC2D0199BB@DEWDFEMB12A.global.corp.sap> <55DDCB98.2020605@oracle.com> <4295855A5C1DE049A61835A1887419CC2D019DDE@DEWDFEMB12A.global.corp.sap> Message-ID: <55DDD4E5.1040708@oracle.com> hi Goetz, I don't have the webrev ready yet, but I will make sure to include you when I post it. 
cheers On 08/26/2015 09:25 AM, Lindenmaier, Goetz wrote: > Hi Gerard, > > yes, sure, I can wait. > Do you have a webrev on review? If so I could test whether it resolves the > issue. > > Best regards, > Goetz. > > -----Original Message----- > From: gerard ziemski [mailto:gerard.ziemski at oracle.com] > Sent: Wednesday, 26 August 2015 16:22 > To: Lindenmaier, Goetz; hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR: 8134396: Check upper limit of flags setting numbers of stack protection pages. > > hi Goetz, > > I have an upcoming patch that addresses stack pages and all the other runtime flags that have their ranges/constraints > unimplemented. Could you withdraw your change for the moment and give me a chance to get my (large) commit in and then > we can verify that your concerns are met please? > > > cheers > > On 08/25/2015 07:35 AM, Lindenmaier, Goetz wrote: >> Hi, >> >> I detected a problem with the values of the flags setting the numbers of stack protection pages. >> TestOptionsWithRanges shows that illegal values for red and yellow pages are possible. >> >> In JavaThread::create_stack_guard_pages(), >> setting StackRedPages=92233720368547753 and StackYellowPages=1 yields >> >> len = (92233720368547753 + 1) * os::vm_page_size() >> = 0x8000000000000000 * os::vm_page_size() >> = 0 >> >> Thus not protecting any pages of the stack. >> >> The check in os::init_2() succeeds, as there the Shadow pages are considered, too, so that the overflowed value is > 0. >> >> On linux, the VM stops with the misleading message >> "OpenJDK 64-Bit Server VM warning: Attempt to allocate stack guard pages failed." >> On aix, we get assertions and SIGSEGVs. >> >> This fix implements a ConstraintFunc for the three flags involved, checking that no overflow >> can happen. This is a very nice new feature! >> http://cr.openjdk.java.net/~goetz/webrevs/8134396-StGPg/webrev.01/ >> >> Please review this change. I please need a sponsor.
>> >> Alternatively, one could set a smaller upper limit in the range. This limit would have to be small >> enough to fulfill the property for any page size possible on 32 bit systems. >> >> Best regards, >> Goetz. >> >> >> > > From dmitry.dmitriev at oracle.com Wed Aug 26 18:05:18 2015 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Wed, 26 Aug 2015 21:05:18 +0300 Subject: RFR: 8132725: Memory leak in Arguments::add_property function In-Reply-To: <55DC5F33.6060401@oracle.com> References: <55CC4D88.2030601@oracle.com> <55DA5381.9080004@oracle.com> <55DB1A61.7020508@oracle.com> <55DB579A.9000903@oracle.com> <55DC5F33.6060401@oracle.com> Message-ID: <55DDFFDE.5020101@oracle.com> Hello, Still need a Reviewer. Can someone review this patch? Thank you! Dmitry On 25.08.2015 15:27, Dmitry Dmitriev wrote: > Hi Ioi, > > Thank you for review and sponsorship! Still need a Reviewer please. > > I added assert. Also I fix indention on line 1023 and change "char > *var_name" to "char* var_name" to match style which used in this > function. > > webrev 02: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.02/ > > webrev 02 vs 01: > http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.02.vs.01/ > > > Thanks, > Dmitry > > On 24.08.2015 20:42, Ioi Lam wrote: >> Hi Dmitry, >> >> The new changes look good. >> >> For defensive programming, I would suggest adding an assert here: >> >> 1035 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { >> assert(_java_vendor_url_bug != NULL, "......"); >> 1036 os::free((void *)_java_vendor_url_bug); >> >> I can sponsor the change, but we still need a Reviewer for this change. >> >> Thanks >> - Ioi >> >> On 8/24/15 6:21 AM, Dmitry Dmitriev wrote: >>> Hi Ioi, >>> >>> Thank you for comments! Please, see my answers inline. >>> >>> On 24.08.2015 2:13, Ioi Lam wrote: >>>> Hi Dmitry, >>>> >>>> Is this change part of 8132725? 
>>>> >>>> 3904 jint code = set_aggressive_opts_flags(); >>>> 3905 if (code != JNI_OK) { >>>> 3906 return code; >>>> 3907 } >>> Yes, set_aggressive_opts_flags not check return value of >>> add_property function, so I add check to the >>> set_aggressive_opts_flags()(lines 1911-1913 in new arguments.cpp) >>> and thus now it returns jint. >>> >>>> >>>> >>>> 1041 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { >>>> >>>> >> also check (_java_vendor_url_bug != NULL) for sanity? >>> I think that this is unnecessary in this case, because >>> _java_vendor_url_bug can not be NULL. _java_vendor_url_bug >>> initialized to DEFAULT_VENDOR_URL_BUG and changed only in >>> add_property function. Before new value is assigned to >>> _java_vendor_url_bug it's check for not NULL. Thus, I think that >>> check (_java_vendor_url_bug != NULL) is unnecessary in this case. >>> >>>> >>>> >>>> Also, there's a lot of duplicated "if (eq != NULL) { FreeHeap((void >>>> *)key);}". Maybe these can be consolidated with a "goto"? I know >>>> lots of people hate goto but it will make the clean up less error >>>> prone: >>> Thank you for this proposal. Since "goto" is not widely used in >>> Hotspot code I decided to refactor current implementation to avoid >>> duplication of "if (eq != NULL) { FreeHeap((void *)key);}". >>> >>>> >>>> bool Arguments::add_property(const char* prop) { >>>> .... >>>> bool status = false; >>>> .... >>>> char *_java_command_new = os::strdup(value, mtInternal); >>>> if (_java_command_new == NULL) { >>>> goto done; >>>> }else { >>>> if (_java_command != NULL) { >>>> os::free(_java_command); >>>> } >>>> _java_command = _java_command_new; >>>> } >>>> ....
>>>> } >>>> // Create new property and add at the end of the list >>>> PropertyList_unique_add(&_system_properties, key, value); >>>> } >>>> status = true; >>>> >>>> done: >>>> if (key != prop) { >>>> // SystemProperty copy passed value, thus free previously >>>> allocated >>>> // memory >>>> FreeHeap((void *)key); >>>> } >>>> return status; >>>> } >>>> >>>> Also, using (key != prop) would make the code clearer than (eq != >>>> NULL). >>> Fixed! >>> >>> webrev 01: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01/ >>> >>> webrev 01 vs 00: >>> http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01.vs.00/ >>> >>> >>> Thank you, >>> Dmitry >>>> >>>> Thanks >>>> - Ioi >>>> >>>> On 8/13/15 12:55 AM, Dmitry Dmitriev wrote: >>>>> Hello, >>>>> >>>>> Please review this fix which remove memory leak in >>>>> Arguments::add_property function. Also, I need a sponsor for this >>>>> fix, who can push it. >>>>> >>>>> Arguments::add_property function allocate memory for key and >>>>> value. Then key and values are passed to the >>>>> PropertyList_unique_add function which use SystemProperty class to >>>>> add or update property value. SystemProperty class maintains it's >>>>> own copy of key and value and thus copy passed key and value. >>>>> Therefore key and value must be freed in add_property >>>>> function(with exception for value in case of "java.vendor.url.bug" >>>>> and "sun.java.command" properties). >>>>> >>>>> In this fix I allocate memory only for key when passed property >>>>> contains value. If passed property not contains value, then I not >>>>> allocate memory for key and use passed property string. Value also >>>>> extracted from passed property string instead of allocating. To >>>>> accomplish that I changed declaration of "value" in several >>>>> functions from "char *" to "const char *" since value is not >>>>> modified in these functions(PropertyList_* functions, >>>>> SystemProperty class methods). 
>>>>> >>>>> Processing of "java.vendor.url.bug" and "sun.java.command" >>>>> properties also corrected. Now when these properties redefined, >>>>> then code checks if memory was allocated for special variables of >>>>> these properties(checking that not contains default value) and >>>>> free it. >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.00/ >>>>> >>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8132725 >>>>> Tested: JPRT(hotspot test set), hotspot all, vm.quick >>>>> >>>>> Thanks, >>>>> Dmitry >>>> >>> >> > From kim.barrett at oracle.com Wed Aug 26 21:00:04 2015 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 26 Aug 2015 17:00:04 -0400 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <55DCD94A.30705@oracle.com> References: <55DCD94A.30705@oracle.com> Message-ID: On Aug 25, 2015, at 5:08 PM, Daniel D. Daugherty wrote: > > Greetings, > > I have a "fix" for a long standing race between JVM shutdown and the > JVM statistics subsystem: > > JDK-8049304 race between VM_Exit and _sync_FutileWakeups->inc() > https://bugs.openjdk.java.net/browse/JDK-8049304 > > Webrev URL: http://cr.openjdk.java.net/~dcubed/8049304-webrev/0-jdk9-hs-rt/ Looking at the webrev, I was initially surprised at the limited number of places that are being changed to check the new predicate for PerfData validity. (The webrev only changes the Java monitor subsystem to call the new predicate, and the RFR even says that, so I really shouldn't have been surprised.) There are lots of places that use PerfData. Don't they all need to be updated? I was hoping the answer would be no, but after reviewing the bug thread and code, I think that hope was in vain. The problem is that exit_globals (and so, ultimately, perfMemory_exit and the proposed modification in PerfDataManager::destroy) is called from a variety of contexts, some of which may have concurrent threads that may touch non-monitor PerfData. 
A crash in the GC is just one example of the places where this situation could arise. Unfortunately, this puts me back in the position of thinking we should just leak the memory when we're on our way to process exit. That is, callers of exit_globals indicate (via a new flag argument) whether we're on the way to process exit, and that flag gets passed down to perfMemory_exit, which elides the call to PerfDataManager::destroy (but still must call PerfMemory::destroy) when we're exiting the process. Dan, sorry for reversing positions on you, but I'd been so focused on the monitor crashes that we were looking at that I missed the wider scope of the problem. From daniel.daugherty at oracle.com Wed Aug 26 21:15:14 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 26 Aug 2015 15:15:14 -0600 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: References: <55DCD94A.30705@oracle.com> Message-ID: <55DE2C62.2@oracle.com> Kim, Thanks for the review! On 8/26/15 3:00 PM, Kim Barrett wrote: > On Aug 25, 2015, at 5:08 PM, Daniel D. Daugherty wrote: >> Greetings, >> >> I have a "fix" for a long standing race between JVM shutdown and the >> JVM statistics subsystem: >> >> JDK-8049304 race between VM_Exit and _sync_FutileWakeups->inc() >> https://bugs.openjdk.java.net/browse/JDK-8049304 >> >> Webrev URL: http://cr.openjdk.java.net/~dcubed/8049304-webrev/0-jdk9-hs-rt/ > Looking at the webrev, I was initially surprised at the limited number > of places that are being changed to check the new predicate for > PerfData validity. (The webrev only changes the Java monitor subsystem > to call the new predicate, and the RFR even says that, so I really > shouldn't have been surprised.) There are lots of places that use > PerfData. Don't they all need to be updated? I was hoping the answer > would be no, but after reviewing the bug thread and code, I think that > hope was in vain. 
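
The predicate check discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not the webrev: SketchCounter, g_has_perf_data, g_futile_wakeups and the sketch_* functions are hypothetical stand-ins for a PerfData counter, the flag behind PerfDataManager::has_PerfData(), and the monitor subsystem's use of _sync_FutileWakeups, and std::atomic is used here only to keep the example self-contained.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a PerfData counter; not the HotSpot class.
struct SketchCounter {
  std::atomic<int64_t> value{0};
  void inc() { value.fetch_add(1, std::memory_order_relaxed); }
};

// Stand-in for the flag returned by PerfDataManager::has_PerfData().
static std::atomic<bool> g_has_perf_data{false};
// Stand-in for a monitor subsystem counter such as _sync_FutileWakeups.
static std::atomic<SketchCounter*> g_futile_wakeups{nullptr};

static void sketch_create() {
  g_futile_wakeups.store(new SketchCounter());
  g_has_perf_data.store(true);
}

static void sketch_destroy() {
  g_has_perf_data.store(false);               // first tell users to stop
  delete g_futile_wakeups.exchange(nullptr);  // then release the counter
}

// Returns true if the counter was updated.
static bool sketch_record_futile_wakeup() {
  SketchCounter* c = g_futile_wakeups.load();
  if (c != nullptr && g_has_perf_data.load()) {
    // A concurrent sketch_destroy() can still run between the check above
    // and the call below: the window is narrower, not closed.
    c->inc();
    return true;
  }
  return false;
}
```

Every other user of such counters would need the same two-part check, which is the scope question raised in the review, and the comment inside sketch_record_futile_wakeup marks the window that the flag alone does not close.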
I have only seen sightings of this crash in the monitor subsystem. Do you know of sightings with other PerfData usage? > The problem is that exit_globals (and so, ultimately, perfMemory_exit > and the proposed modification in PerfDataManager::destroy) is called > from a variety of contexts, some of which may have concurrent threads > that may touch non-monitor PerfData. A crash in the GC is just one > example of the places where this situation could arise. > > Unfortunately, this puts me back in the position of thinking we should > just leak the memory when we're on our way to process exit. That is, > callers of exit_globals indicate (via a new flag argument) whether > we're on the way to process exit, and that flag gets passed down to > perfMemory_exit, which elides the call to PerfDataManager::destroy > (but still must call PerfMemory::destroy) when we're exiting the > process. Sorry, I'm not in favor of leaking memory. > Dan, sorry for reversing positions on you, but I'd been so focused on > the monitor crashes that we were looking at that I missed the wider > scope of the problem. It's a little disappointing, but I'll live. :-) Dan From coleen.phillimore at oracle.com Wed Aug 26 21:57:27 2015 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 26 Aug 2015 17:57:27 -0400 Subject: RFR: 8132725: Memory leak in Arguments::add_property function In-Reply-To: <55DDFFDE.5020101@oracle.com> References: <55CC4D88.2030601@oracle.com> <55DA5381.9080004@oracle.com> <55DB1A61.7020508@oracle.com> <55DB579A.9000903@oracle.com> <55DC5F33.6060401@oracle.com> <55DDFFDE.5020101@oracle.com> Message-ID: <55DE3647.3000405@oracle.com> + char* tmp_key = AllocateHeap(key_len + 1, mtInternal); + + if (tmp_key == NULL) { + return false; } AllocateHeap will call vm_exit_out_of_memory if it fails, and not return NULL. You have to add AllocFailStrategy::RETURN_NULL Otherwise, this seems good. Thanks for not adding a goto. 
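
The key handling under review, combined with an allocation that reports failure by returning NULL, might look roughly like the sketch below. It is an illustration under assumptions, not the patch: add_property_sketch and g_props are hypothetical, std::malloc stands in for AllocateHeap called with AllocFailStrategy::RETURN_NULL, and a std::map stands in for the SystemProperty list, which keeps its own copies of key and value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <map>
#include <string>

// Stand-in for the system property list; it copies key and value on insert.
static std::map<std::string, std::string> g_props;

// Hypothetical sketch of the key/value split; returns false on allocation failure.
static bool add_property_sketch(const char* prop) {
  const char* eq = std::strchr(prop, '=');
  const char* key = prop;   // no '=': use the passed string as the key
  const char* value = "";
  if (eq != nullptr) {
    const std::size_t key_len = static_cast<std::size_t>(eq - prop);
    // std::malloc returns a null pointer on failure, so the check is meaningful.
    char* tmp_key = static_cast<char*>(std::malloc(key_len + 1));
    if (tmp_key == nullptr) {
      return false;
    }
    std::memcpy(tmp_key, prop, key_len);
    tmp_key[key_len] = '\0';
    key = tmp_key;
    value = eq + 1;         // value points into prop, no allocation needed
  }
  g_props[key] = value;     // the container copies both strings
  if (key != prop) {
    std::free(const_cast<char*>(key));  // free only what was allocated here
  }
  return true;
}
```

With an allocator that exits the VM on failure, the NULL branch would never be taken, which is why the failure strategy has to be requested explicitly.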
Coleen On 8/26/15 2:05 PM, Dmitry Dmitriev wrote: > Hello, > > Still need a Reviewer. Can someone review this patch? Thank you! > > Dmitry > > On 25.08.2015 15:27, Dmitry Dmitriev wrote: >> Hi Ioi, >> >> Thank you for review and sponsorship! Still need a Reviewer please. >> >> I added assert. Also I fix indention on line 1023 and change "char >> *var_name" to "char* var_name" to match style which used in this >> function. >> >> webrev 02: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.02/ >> >> webrev 02 vs 01: >> http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.02.vs.01/ >> >> >> Thanks, >> Dmitry >> >> On 24.08.2015 20:42, Ioi Lam wrote: >>> Hi Dmitry, >>> >>> The new changes look good. >>> >>> For defensive programming, I would suggest adding an assert here: >>> >>> 1035 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { >>> assert(_java_vendor_url_bug != NULL, "......"); >>> 1036 os::free((void *)_java_vendor_url_bug); >>> >>> I can sponsor the change, but we still need a Reviewer for this change. >>> >>> Thanks >>> - Ioi >>> >>> On 8/24/15 6:21 AM, Dmitry Dmitriev wrote: >>>> Hi Ioi, >>>> >>>> Thank you for comments! Please, see my answers inline. >>>> >>>> On 24.08.2015 2:13, Ioi Lam wrote: >>>>> Hi Dmitry, >>>>> >>>>> Is this change part of 8132725? >>>>> >>>>> 3904 jint code = set_aggressive_opts_flags(); >>>>> 3905 if (code != JNI_OK) { >>>>> 3906 return code; >>>>> 3907 } >>>> Yes, set_aggressive_opts_flags not check return value of >>>> add_property function, so I add check to the >>>> set_aggressive_opts_flags()(lines 1911-1913 in new arguments.cpp) >>>> and thus now it returns jint. >>>> >>>>> >>>>> >>>>> 1041 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { >>>>> >>>>> >> also check (_java_vendor_url_bug != NULL) for sanity? >>>> I think that this is unnecessary in this case, because >>>> _java_vendor_url_bug can not be NULL. 
_java_vendor_url_bug >>>> initialized to DEFAULT_VENDOR_URL_BUG and changed only in >>>> add_property function. Before new value is assigned to >>>> _java_vendor_url_bug it's check for not NULL. Thus, I think that >>>> check (_java_vendor_url_bug != NULL) is unnecessary in this case. >>>> >>>>> >>>>> >>>>> Also, there's a lot of duplicated "if (eq != NULL) { >>>>> FreeHeap((void *)key);}". Maybe these can be consolidated with a >>>>> "goto"? I know lots of people hate goto but it will make the clean >>>>> up less error prone: >>>> Thank you for this proposal. Since "goto" is not widely used in >>>> Hotspot code I decided to refactor current implementation to avoid >>>> duplication of "if (eq != NULL) { FreeHeap((void *)key);}". >>>> >>>>> >>>>> bool Arguments::add_property(const char* prop) { >>>>> .... >>>>> bool status = false; >>>>> .... >>>>> char *_java_command_new = os::strdup(value, mtInternal); >>>>> if (_java_command_new == NULL) { >>>>> goto done; >>>>> }else { >>>>> if (_java_command != NULL) { >>>>> os::free(_java_command); >>>>> } >>>>> _java_command = _java_command_new; >>>>> } >>>>> .... >>>>> } >>>>> // Create new property and add at the end of the list >>>>> PropertyList_unique_add(&_system_properties, key, value); >>>>> } >>>>> status = true; >>>>> >>>>> done: >>>>> if (key != prop) { >>>>> // SystemProperty copy passed value, thus free previously >>>>> allocated >>>>> // memory >>>>> FreeHeap((void *)key); >>>>> } >>>>> return status; >>>>> } >>>>> >>>>> Also, using (key != prop) would make the code clearer than (eq != >>>>> NULL). >>>> Fixed!
>>>>
>>>> webrev 01: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01/
>>>>
>>>> webrev 01 vs 00:
>>>> http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01.vs.00/
>>>>
>>>>
>>>> Thank you,
>>>> Dmitry
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>> On 8/13/15 12:55 AM, Dmitry Dmitriev wrote:
>>>>>> Hello,
>>>>>>
>>>>>> Please review this fix which remove memory leak in
>>>>>> Arguments::add_property function. Also, I need a sponsor for this
>>>>>> fix, who can push it.
>>>>>>
>>>>>> Arguments::add_property function allocate memory for key and
>>>>>> value. Then key and values are passed to the
>>>>>> PropertyList_unique_add function which use SystemProperty class
>>>>>> to add or update property value. SystemProperty class maintains
>>>>>> it's own copy of key and value and thus copy passed key and
>>>>>> value. Therefore key and value must be freed in add_property
>>>>>> function(with exception for value in case of
>>>>>> "java.vendor.url.bug" and "sun.java.command" properties).
>>>>>>
>>>>>> In this fix I allocate memory only for key when passed property
>>>>>> contains value. If passed property not contains value, then I not
>>>>>> allocate memory for key and use passed property string. Value
>>>>>> also extracted from passed property string instead of allocating.
>>>>>> To accomplish that I changed declaration of "value" in several
>>>>>> functions from "char *" to "const char *" since value is not
>>>>>> modified in these functions(PropertyList_* functions,
>>>>>> SystemProperty class methods).
>>>>>>
>>>>>> Processing of "java.vendor.url.bug" and "sun.java.command"
>>>>>> properties also corrected. Now when these properties redefined,
>>>>>> then code checks if memory was allocated for special variables of
>>>>>> these properties(checking that not contains default value) and
>>>>>> free it.
>>>>>>
>>>>>> Webrev: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.00/
>>>>>>
>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8132725
>>>>>> Tested: JPRT(hotspot test set), hotspot all, vm.quick
>>>>>>
>>>>>> Thanks,
>>>>>> Dmitry
>>>>>
>>>>
>>>
>>
>

From kim.barrett at oracle.com Wed Aug 26 22:02:09 2015
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 26 Aug 2015 18:02:09 -0400
Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc()
In-Reply-To: <55DE2C62.2@oracle.com>
References: <55DCD94A.30705@oracle.com> <55DE2C62.2@oracle.com>
Message-ID:

On Aug 26, 2015, at 5:15 PM, Daniel D. Daugherty wrote:
>
> On 8/26/15 3:00 PM, Kim Barrett wrote:
>> ...There are lots of places that use
>> PerfData. Don't they all need to be updated? I was hoping the answer
>> would be no, but after reviewing the bug thread and code, I think that
>> hope was in vain.
>
> I have only seen sightings of this crash in the monitor subsystem.
> Do you know of sightings with other PerfData usage?

I don't, but I haven't been looking. Also, the monitor subsystem gets
hit a lot more heavily than any other PerfData I've looked at, and
sightings in the monitor subsystem are pretty rare.

I'm pretty sure I could demonstrate one by patching in some sleeps and
flag spin-waits in order to achieve the necessary state.

>> Unfortunately, this puts me back in the position of thinking we should
>> just leak the memory when we're on our way to process exit. ...
>
> Sorry, I'm not in favor of leaking memory.

We're on our way to process exit, where all such sins are forgiven.

We already leak like a bucket without a bottom (sieves being too fine
to accurately model the situation) when performing an abnormal exit.

Actually, a possible upside to this leak would be that the PerfData will
be present in a core file. That could maybe even be useful.

From daniel.daugherty at oracle.com Wed Aug 26 22:46:39 2015
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 26 Aug 2015 16:46:39 -0600
Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc()
In-Reply-To:
References: <55DCD94A.30705@oracle.com> <55DE2C62.2@oracle.com>
Message-ID: <55DE41CF.1030002@oracle.com>

On 8/26/15 4:02 PM, Kim Barrett wrote:
> On Aug 26, 2015, at 5:15 PM, Daniel D. Daugherty wrote:
>> On 8/26/15 3:00 PM, Kim Barrett wrote:
>>> ...There are lots of places that use
>>> PerfData. Don't they all need to be updated? I was hoping the answer
>>> would be no, but after reviewing the bug thread and code, I think that
>>> hope was in vain.
>> I have only seen sightings of this crash in the monitor subsystem.
>> Do you know of sightings with other PerfData usage?
> I don't, but I haven't been looking. Also, the monitor subsystem gets
> hit a lot more heavily than any other PerfData I've looked at, and
> sightings in the monitor subsystem are pretty rare.

I'm pretty confident we have the monitor case well under control
and I'd like to move forward with this fix and see what else
shakes out (if anything). The other idea that you had about the
SIGSEGV signal handler would be the next place I'd look at if we
continue to see issues with this type of race.

> I'm pretty sure I could demonstrate one by patching in some sleeps and
> flag spin-waits in order to achieve the necessary state.

I have no doubt since you came up with the debugging code to
make the monitor subsystem race easily reproducible. However,
I can't find any other PerfData sighting so I'm wondering if
non-monitor subsystem races are more rare.

>>> Unfortunately, this puts me back in the position of thinking we should
>>> just leak the memory when we're on our way to process exit. ...
>> Sorry, I'm not in favor of leaking memory.
> We're on our way to process exit, where all such sins are forgiven.

I thought the general idea was that we're trying to reduce our
memory leaks so that we can eventually reach the stretch goal
of being able to restart the VM in the same process. I don't
want to add another memory leak to the system without a very
compelling reason to do so. So far I'm not convinced especially
without any existing bugs pointing to PerfData races with
non-monitor subsystem usage.

> We already leak like a bucket without a bottom (sieves being too fine
> to accurately model the situation) when performing an abnormal exit.

But we're not talking about an abnormal exit here.

> Actually, a possible upside to this leak would be that the PerfData will
> be present in a core file. That could maybe even be useful.

I'm pretty sure if you have a PerfData server attached, then we
leave the PerfData objects laying around so they can be harvested.
However, I'm not really a user of the PerfData stuff so I don't
know that for certain.

Dan

From david.holmes at oracle.com Thu Aug 27 04:26:03 2015
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 27 Aug 2015 14:26:03 +1000
Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc()
In-Reply-To: <55DCD94A.30705@oracle.com>
References: <55DCD94A.30705@oracle.com>
Message-ID: <55DE915B.9020605@oracle.com>

Hi Dan,

On 26/08/2015 7:08 AM, Daniel D. Daugherty wrote:
> Greetings,
>
> I have a "fix" for a long standing race between JVM shutdown and the
> JVM statistics subsystem:
>
> JDK-8049304 race between VM_Exit and _sync_FutileWakeups->inc()
> https://bugs.openjdk.java.net/browse/JDK-8049304
>
> Webrev URL: http://cr.openjdk.java.net/~dcubed/8049304-webrev/0-jdk9-hs-rt/
>
> Testing: Aurora Adhoc RT-SVC nightly batch
>          Aurora Adhoc vm.tmtools batch
>          Kim's repro sequence for JDK-8049304
>          Kim's repro sequence for JDK-8129978
>          JPRT -testset hotspot
>
> This "fix":
>
> - adds a volatile flag to record whether PerfDataManager is holding
>   data (PerfData objects)
> - adds PerfDataManager::has_PerfData() to return the flag
> - changes the Java monitor subsystem's use of PerfData to
>   check both allocation of the monitor subsystem specific
>   PerfData object and the new PerfDataManager::has_PerfData()
>   return value
>
> If the global 'UsePerfData' option is false, the system works as
> it did before. If 'UsePerfData' is true (the default on non-embedded
> systems), the Java monitor subsystem will allocate a number of
> PerfData objects to record information. The objects will record
> information about Java monitor subsystem until the JVM shuts down.
>
> When the JVM starts to shutdown, the new PerfDataManager flag will
> change to false and the Java monitor subsystem will stop using the
> PerfData objects. This is the new behavior. As noted in the comments
> I added to the code, the race is still present; I'm just changing
> the order and the timing to reduce the likelihood of the crash.

Right. To sum up: the basic problem is that the PerfData objects are
deallocated at the safepoint established for VM termination, but those
objects can actually be used by threads that are in a safepoint-safe
state: in particular within the low-level synchronization code. As you
say this fix narrows the window where a crash can occur, but can not
close it. If a thread is descheduled after the check of hasPerfData it
can still access the PerfData object when it resumes, which may be after
the object was deallocated. There's no true fix here without introducing
synchronization (which would have to be even lower-level to avoid
reentrant use of the same code we're fixing!) and the overhead of that
would be prohibitive for these perf counters.

In response to Kim's concern about other code that uses PerfData objects
I think you would have to examine those uses to see which, if any, can
occur from either a non-JavaThread, or from within the code where a
thread is considered safepoint-safe. I'm inclined to agree that given we
have not seen issues with such code, either it does not exist or is
extremely unlikely to hit this issue. Given the "fix" is itself only
narrowing the window it doesn't seem necessary to address code that
already has a narrower window.

That all said "leaking" the PerfData objects seems no less unpleasant a
"fix". There are so many obstacles in the way of being able to unload
and re-load the JVM that I do not think this makes the position
measurably worse. In fact I can imagine that if we were to allow for
such behaviour we would need to be able to terminate threads and reclaim
all their resources (like Monitor instances), at which point it would
also become easy to deallocate shared memory like PerfData objects.

I'll leave it up to you which way to go. As it stands this is Reviewed.

Thanks,
David

> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan
>

From ioi.lam at oracle.com Thu Aug 27 05:27:59 2015
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 26 Aug 2015 22:27:59 -0700
Subject: RFR: 8132725: Memory leak in Arguments::add_property function
In-Reply-To: <55DE3647.3000405@oracle.com>
References: <55CC4D88.2030601@oracle.com> <55DA5381.9080004@oracle.com> <55DB1A61.7020508@oracle.com> <55DB579A.9000903@oracle.com> <55DC5F33.6060401@oracle.com> <55DDFFDE.5020101@oracle.com> <55DE3647.3000405@oracle.com>
Message-ID: <55DE9FDF.7030701@oracle.com>

On the topic of goto, some people like to do this:

    do {
      if (...) {
        break;
      }
      ...
      if (...) {
        break;
      }
    } while (0);
    // "break" will "goto" here

Will this be less of an eyesore than "goto"?

- Ioi

On 8/26/15 2:57 PM, Coleen Phillimore wrote:
>
> +  char* tmp_key = AllocateHeap(key_len + 1, mtInternal);
> +
> +  if (tmp_key == NULL) {
> +    return false;
>    }
>
> AllocateHeap will call vm_exit_out_of_memory if it fails, and not
> return NULL. You have to add AllocFailStrategy::RETURN_NULL
>
> Otherwise, this seems good.
>
> Thanks for not adding a goto.
>
> Coleen

From david.holmes at oracle.com Thu Aug 27 06:51:23 2015
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 27 Aug 2015 16:51:23 +1000
Subject: RFR: 8132725: Memory leak in Arguments::add_property function
In-Reply-To: <55DE9FDF.7030701@oracle.com>
References: <55CC4D88.2030601@oracle.com> <55DA5381.9080004@oracle.com> <55DB1A61.7020508@oracle.com> <55DB579A.9000903@oracle.com> <55DC5F33.6060401@oracle.com> <55DDFFDE.5020101@oracle.com> <55DE3647.3000405@oracle.com> <55DE9FDF.7030701@oracle.com>
Message-ID: <55DEB36B.2070907@oracle.com>

On 27/08/2015 3:27 PM, Ioi Lam wrote:
> On the topic of goto, some people like to do this:
>
>     do {
>       if (...) {
>         break;
>       }
>       ...
>       if (...) {
>         break;
>       }
>     } while (0);
>     // "break" will "goto" here
>
> Will this be less of an eyesore than "goto"?

No! A goto by any other name ... :) Might as well just use a goto if you
are going to resort to such an ugly structure just to avoid using goto.

David

> - Ioi

From dmitry.dmitriev at oracle.com Thu Aug 27 11:43:58 2015
From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev)
Date: Thu, 27 Aug 2015 14:43:58 +0300
Subject: RFR: 8132725: Memory leak in Arguments::add_property function
In-Reply-To: <55DE3647.3000405@oracle.com>
References: <55CC4D88.2030601@oracle.com> <55DA5381.9080004@oracle.com> <55DB1A61.7020508@oracle.com> <55DB579A.9000903@oracle.com> <55DC5F33.6060401@oracle.com> <55DDFFDE.5020101@oracle.com> <55DE3647.3000405@oracle.com>
Message-ID: <55DEF7FE.4070806@oracle.com>

Hello Coleen,

Thank you for review and hint about AllocateHeap. I remove check for
'tmp_key'. In this case new code behave as old code, i.e. call
'vm_exit_out_of_memory' if it fails. Also, I change 'os::strdup' in
'add_property' function to 'os::strdup_check_oom' to achieve the same
thing, i.e. behave as old code. In these case we don't need 'status'
variable.
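
For readers following the ownership rule this review converged on, here is a minimal standalone sketch. It is not the actual arguments.cpp code. The map `g_properties` and the helper `allocate_or_exit` are hypothetical stand-ins, assumed for illustration only: the first plays the role of a property list that keeps its own copies of key and value, the second the role of an allocator that terminates the process on failure instead of returning NULL. The default empty value for a property without '=' is likewise an assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <map>
#include <string>

// Hypothetical stand-in for the VM's property list. It stores its own
// copies of key and value, which is what lets the caller free its key.
static std::map<std::string, std::string> g_properties;

// Hypothetical stand-in for an allocator that never returns NULL:
// on failure it terminates the process, so callers need no NULL check.
static char* allocate_or_exit(size_t size) {
  char* p = static_cast<char*>(std::malloc(size));
  if (p == nullptr) {
    std::abort();
  }
  return p;
}

// Accepts "key" or "key=value".
static void add_property(const char* prop) {
  assert(prop != nullptr);
  const char* eq = std::strchr(prop, '=');
  const char* key = prop;   // no '=': reuse the caller's string as the key
  const char* value = "";   // assumed default when no value is given
  if (eq != nullptr) {
    // Only the key needs a private, NUL-terminated copy.
    size_t key_len = static_cast<size_t>(eq - prop);
    char* tmp_key = allocate_or_exit(key_len + 1);
    std::memcpy(tmp_key, prop, key_len);
    tmp_key[key_len] = '\0';
    key = tmp_key;
    value = eq + 1;         // points into prop, nothing allocated
  }
  g_properties[key] = value;  // the store copies both strings
  if (key != prop) {
    // The key was allocated above and the store has its own copy.
    std::free(const_cast<char*>(key));
  }
}
```

Calling `add_property("java.vendor=Example")` allocates and frees exactly one temporary key, while `add_property("flag.only")` allocates nothing. With an allocator that exits on failure there is no error path left to report, which is presumably why a 'status' variable becomes unnecessary.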
webrev 03: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.03/

webrev 03 vs 02: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.03.vs.02/

Thank you,
Dmitry

On 27.08.2015 0:57, Coleen Phillimore wrote:
>
> +  char* tmp_key = AllocateHeap(key_len + 1, mtInternal);
> +
> +  if (tmp_key == NULL) {
> +    return false;
>    }
>
> AllocateHeap will call vm_exit_out_of_memory if it fails, and not
> return NULL. You have to add AllocFailStrategy::RETURN_NULL
>
> Otherwise, this seems good.
>
> Thanks for not adding a goto.
>
> Coleen

From daniel.daugherty at oracle.com Thu Aug 27 13:01:01 2015
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 27 Aug 2015 07:01:01 -0600
Subject: RFR: 8132725: Memory leak in Arguments::add_property function
In-Reply-To: <55DEB36B.2070907@oracle.com>
References: <55CC4D88.2030601@oracle.com> <55DA5381.9080004@oracle.com> <55DB1A61.7020508@oracle.com> <55DB579A.9000903@oracle.com> <55DC5F33.6060401@oracle.com> <55DDFFDE.5020101@oracle.com> <55DE3647.3000405@oracle.com> <55DE9FDF.7030701@oracle.com> <55DEB36B.2070907@oracle.com>
Message-ID: <55DF0A0D.7010102@oracle.com>

On 8/27/15 12:51 AM, David Holmes wrote:
> On 27/08/2015 3:27 PM, Ioi Lam wrote:
>> On the topic of goto, some people like to do this:
>>
>>     do {
>>       if (...) {
>>         break;
>>       }
>>       ...
>>       if (...) {
>>         break;
>>       }
>>     } while (0);
>>     // "break" will "goto" here
>>
>> Will this be less of an eyesore than "goto"?
>
> No! A goto by any other name ... :) Might as well just use a goto if
> you are going to resort to such an ugly structure just to avoid using
> goto.

Slightly less ugly than goto in my opinion.
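
To make the two styles under debate concrete, here is a small self-contained sketch. Both functions and their names are hypothetical illustrations, not HotSpot code. Each one funnels every early exit through a single cleanup point: one uses a `done:` label, the other the `do { ... } while (0)` idiom, where `break` plays the role of the jump.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical check: true if text is non-empty and contains '='.
// Variant 1: early exits jump to one cleanup label.
static bool check_with_goto(const char* text) {
  assert(text != nullptr);
  bool status = false;
  char* copy = static_cast<char*>(std::malloc(std::strlen(text) + 1));
  if (copy == nullptr) {
    return false;  // nothing to clean up yet
  }
  std::strcpy(copy, text);
  if (copy[0] == '\0') {
    goto done;
  }
  if (std::strchr(copy, '=') == nullptr) {
    goto done;
  }
  status = true;
done:
  std::free(copy);  // single cleanup point
  return status;
}

// Variant 2: same control flow, with "break" leaving a loop that
// runs exactly once.
static bool check_with_do_while(const char* text) {
  assert(text != nullptr);
  bool status = false;
  char* copy = static_cast<char*>(std::malloc(std::strlen(text) + 1));
  if (copy == nullptr) {
    return false;
  }
  std::strcpy(copy, text);
  do {
    if (copy[0] == '\0') {
      break;
    }
    if (std::strchr(copy, '=') == nullptr) {
      break;
    }
    status = true;
  } while (0);
  std::free(copy);  // "break" lands here
  return status;
}
```

The two variants behave identically for every input; the choice is one of readability. One practical difference in C++ is that a `goto` may not jump past the initialization of a local variable, so the label form needs its locals declared before the first jump, as done above.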
Dan > > David > >> - Ioi >> >> >> On 8/26/15 2:57 PM, Coleen Phillimore wrote: >>> >>> + char* tmp_key = AllocateHeap(key_len + 1, mtInternal); >>> + >>> + if (tmp_key == NULL) { >>> + return false; >>> } >>> >>> AllocateHeap will call vm_exit_out_of_memory if it fails, and not >>> return NULL. You have to add AllocFailStrategy::RETURN_NULL >>> >>> Otherwise, this seems good. >>> >>> Thanks for not adding a goto. >>> >>> Coleen >>> >>> >>> On 8/26/15 2:05 PM, Dmitry Dmitriev wrote: >>>> Hello, >>>> >>>> Still need a Reviewer. Can someone review this patch? Thank you! >>>> >>>> Dmitry >>>> >>>> On 25.08.2015 15:27, Dmitry Dmitriev wrote: >>>>> Hi Ioi, >>>>> >>>>> Thank you for review and sponsorship! Still need a Reviewer please. >>>>> >>>>> I added assert. Also I fix indention on line 1023 and change "char >>>>> *var_name" to "char* var_name" to match style which used in this >>>>> function. >>>>> >>>>> webrev 02: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.02/ >>>>> >>>>> webrev 02 vs 01: >>>>> http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.02.vs.01/ >>>>> >>>>> >>>>> Thanks, >>>>> Dmitry >>>>> >>>>> On 24.08.2015 20:42, Ioi Lam wrote: >>>>>> Hi Dmitry, >>>>>> >>>>>> The new changes look good. >>>>>> >>>>>> For defensive programming, I would suggest adding an assert here: >>>>>> >>>>>> 1035 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { >>>>>> assert(_java_vendor_url_bug != NULL, "......"); >>>>>> 1036 os::free((void *)_java_vendor_url_bug); >>>>>> >>>>>> I can sponsor the change, but we still need a Reviewer for this >>>>>> change. >>>>>> >>>>>> Thanks >>>>>> - Ioi >>>>>> >>>>>> On 8/24/15 6:21 AM, Dmitry Dmitriev wrote: >>>>>>> Hi Ioi, >>>>>>> >>>>>>> Thank you for comments! Please, see my answers inline. >>>>>>> >>>>>>> On 24.08.2015 2:13, Ioi Lam wrote: >>>>>>>> Hi Dmitry, >>>>>>>> >>>>>>>> Is this change part of 8132725? 
>>>>>>>> >>>>>>>> 3904 jint code = set_aggressive_opts_flags(); >>>>>>>> 3905 if (code != JNI_OK) { >>>>>>>> 3906 return code; >>>>>>>> 3907 } >>>>>>> Yes, set_aggressive_opts_flags not check return value of >>>>>>> add_property function, so I add check to the >>>>>>> set_aggressive_opts_flags()(lines 1911-1913 in new arguments.cpp) >>>>>>> and thus now it returns jint. >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> 1041 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { >>>>>>>> >>>>>>>> >> also check (_java_vendor_url_bug != NULL) for sanity? >>>>>>> I think that this is unnecessary in this case, because >>>>>>> _java_vendor_url_bug can not be NULL. _java_vendor_url_bug >>>>>>> initialized to DEFAULT_VENDOR_URL_BUG and changed only in >>>>>>> add_property function. Before new value is assigned to >>>>>>> _java_vendor_url_bug it's check for not NULL. Thus, I think that >>>>>>> check (_java_vendor_url_bug != NULL) is unnecessary in this case. >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Also, there's a lot of duplicated "if (eq != NULL) { >>>>>>>> FreeHeap((void *)key);}". Maybe these can be consolidated with a >>>>>>>> "goto"? I know lots of people haye goto but it will make the >>>>>>>> clean up less error prone: >>>>>>> Thank you for this proposal. Since "goto" is not widely used in >>>>>>> Hotspot code I decided to refactor current implementation to avoid >>>>>>> duplication of "if (eq != NULL) { FreeHeap((void *)key);}". >>> >>>>>>> >>>>>>>> >>>>>>>> bool Arguments::add_property(const char* prop) { >>>>>>>> .... >>>>>>>> bool status = false; >>>>>>>> .... >>>>>>>> char *_java_command_new = os::strdup(value, mtInternal); >>>>>>>> if (_java_command_new == NULL) { >>>>>>>> goto done; >>>>>>>> }else { >>>>>>>> if (_java_command != NULL) { >>>>>>>> os::free(_java_command); >>>>>>>> } >>>>>>>> _java_command = _java_command_new; >>>>>>>> } >>>>>>>> .... 
>>>>>>>> } >>>>>>>> // Create new property and add at the end of the list >>>>>>>> PropertyList_unique_add(&_system_properties, key, value); >>>>>>>> } >>>>>>>> status = true; >>>>>>>> >>>>>>>> done: >>>>>>>> if (key != prop) { >>>>>>>> // SystemProperty copy passed value, thus free previously >>>>>>>> allocated >>>>>>>> // memory >>>>>>>> FreeHeap((void *)key); >>>>>>>> } >>>>>>>> return status; >>>>>>>> } >>>>>>>> >>>>>>>> Also, using (key != prop) would make the code clearer than (eq != >>>>>>>> NULL). >>>>>>> Fixed! >>>>>>> >>>>>>> webrev 01: >>>>>>> http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01/ >>>>>>> >>>>>>> webrev 01 vs 00: >>>>>>> http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01.vs.00/ >>>>>>> >>>>>>> >>>>>>> Thank you, >>>>>>> Dmitry >>>>>>>> >>>>>>>> Thanks >>>>>>>> - Ioi >>>>>>>> >>>>>>>> On 8/13/15 12:55 AM, Dmitry Dmitriev wrote: >>>>>>>>> Hello, >>>>>>>>> >>>>>>>>> Please review this fix which remove memory leak in >>>>>>>>> Arguments::add_property function. Also, I need a sponsor for >>>>>>>>> this fix, who can push it. >>>>>>>>> >>>>>>>>> Arguments::add_property function allocate memory for key and >>>>>>>>> value. Then key and values are passed to the >>>>>>>>> PropertyList_unique_add function which use SystemProperty class >>>>>>>>> to add or update property value. SystemProperty class maintains >>>>>>>>> it's own copy of key and value and thus copy passed key and >>>>>>>>> value. Therefore key and value must be freed in add_property >>>>>>>>> function(with exception for value in case of >>>>>>>>> "java.vendor.url.bug" and "sun.java.command" properties). >>>>>>>>> >>>>>>>>> In this fix I allocate memory only for key when passed property >>>>>>>>> contains value. If passed property not contains value, then I >>>>>>>>> not allocate memory for key and use passed property string. >>>>>>>>> Value also extracted from passed property string instead of >>>>>>>>> allocating. 
>>>>>>>>> To accomplish that I changed declaration of "value"
>>>>>>>>> in several functions from "char *" to "const char *" since
>>>>>>>>> value is not modified in these functions(PropertyList_*
>>>>>>>>> functions, SystemProperty class methods).
>>>>>>>>>
>>>>>>>>> Processing of "java.vendor.url.bug" and "sun.java.command"
>>>>>>>>> properties also corrected. Now when these properties redefined,
>>>>>>>>> then code checks if memory was allocated for special variables
>>>>>>>>> of these properties(checking that not contains default value)
>>>>>>>>> and free it.
>>>>>>>>>
>>>>>>>>> Webrev: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.00/
>>>>>>>>>
>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8132725
>>>>>>>>> Tested: JPRT(hotspot test set), hotspot all, vm.quick
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Dmitry

From ioi.lam at oracle.com Thu Aug 27 14:21:13 2015
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 27 Aug 2015 07:21:13 -0700
Subject: RFR: 8132725: Memory leak in Arguments::add_property function
In-Reply-To: <55DEF7FE.4070806@oracle.com>
References: <55CC4D88.2030601@oracle.com> <55DA5381.9080004@oracle.com>
 <55DB1A61.7020508@oracle.com> <55DB579A.9000903@oracle.com>
 <55DC5F33.6060401@oracle.com> <55DDFFDE.5020101@oracle.com>
 <55DE3647.3000405@oracle.com> <55DEF7FE.4070806@oracle.com>
Message-ID: <55DF1CD9.4000801@oracle.com>

Hi Dmitry,

Maybe you can also get rid of tmp_key?

- Ioi

On 8/27/15 4:43 AM, Dmitry Dmitriev wrote:
> Hello Coleen,
>
> Thank you for review and hint about AllocateHeap. I remove check for
> 'tmp_key'. In this case new code behave as old code, i.e. call
> 'vm_exit_out_of_memory' if it fails. Also, I change 'os::strdup' in
> 'add_property' function to 'os::strdup_check_oom' to achieve the same
> thing, i.e. behave as old code. In these case we don't need 'status'
> variable.
>
> webrev 03: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.03/
>
> webrev 03 vs 02:
> http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.03.vs.02/
>
> Thank you,
> Dmitry

From dmitry.dmitriev at oracle.com Thu Aug 27 14:31:26 2015
From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev)
Date: Thu, 27 Aug 2015 17:31:26 +0300
Subject: RFR: 8132725: Memory leak in Arguments::add_property function
In-Reply-To: <55DF1CD9.4000801@oracle.com>
References: <55CC4D88.2030601@oracle.com> <55DA5381.9080004@oracle.com>
 <55DB1A61.7020508@oracle.com> <55DB579A.9000903@oracle.com>
 <55DC5F33.6060401@oracle.com> <55DDFFDE.5020101@oracle.com>
 <55DE3647.3000405@oracle.com> <55DEF7FE.4070806@oracle.com>
 <55DF1CD9.4000801@oracle.com>
Message-ID: <55DF1F3E.4070902@oracle.com>

Hi Ioi,

In this fix I changed the type of 'key' from 'char*' to 'const char*'.
This allows me to assign the passed 'prop' to 'key' without a cast when
'prop' doesn't contain a value. Thus, I need a temporary variable to
extract the key when the passed property contains a value.
So, to get rid of 'tmp_key' I would need to change the type of 'key'
back to 'char*' and cast 'prop' to 'char*' when no value is passed in
'prop'. But I think it is safer to leave 'key' as 'const char*'. What
do you think about that?
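A minimal sketch of the ownership pattern described above, assuming plain malloc/free in place of HotSpot's AllocateHeap/FreeHeap and a hypothetical PropertyStore standing in for the SystemProperty list (these names are illustrative and are not taken from the webrev). It shows 'key' staying 'const char*', 'tmp_key' being needed only when the property contains '=', and the (key != prop) test deciding whether anything has to be freed. Unlike webrev 03, the sketch checks the allocation result, because malloc returns NULL on failure instead of exiting.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for the property list: like SystemProperty in the
// thread, it keeps its own copies of the key and value it is given.
struct PropertyStore {
  std::string last_key;
  std::string last_value;
  void add(const char* key, const char* value) {
    last_key = key;      // std::string copies the characters
    last_value = value;
  }
};

// Sketch only: splits "key=value" without allocating memory for the value.
bool add_property_sketch(PropertyStore& store, const char* prop) {
  const char* eq = std::strchr(prop, '=');
  const char* key = prop;   // no '=': borrow the caller's string, no cast
  const char* value = "";

  if (eq != nullptr) {
    // The property contains a value, so a writable temporary is needed
    // to build the NUL-terminated key.
    std::size_t key_len = static_cast<std::size_t>(eq - prop);
    char* tmp_key = static_cast<char*>(std::malloc(key_len + 1));
    if (tmp_key == nullptr) {
      return false;         // assumption: plain malloc can fail
    }
    std::memcpy(tmp_key, prop, key_len);
    tmp_key[key_len] = '\0';
    key = tmp_key;
    value = eq + 1;         // points into 'prop', nothing allocated
  }

  store.add(key, value);    // the store copies both strings

  if (key != prop) {
    // The key was allocated above and the store has its own copy.
    std::free(const_cast<char*>(key));
  }
  return true;
}
```

With this shape, a property such as "java.vendor=Acme" allocates and frees exactly one temporary key, while a property with no '=' allocates nothing.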
Thank you,
Dmitry

From ioi.lam at oracle.com Thu Aug 27 14:35:59 2015
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 27 Aug 2015 07:35:59 -0700
Subject: RFR: 8132725: Memory leak in Arguments::add_property function
In-Reply-To: <55DF1F3E.4070902@oracle.com>
References: <55CC4D88.2030601@oracle.com> <55DA5381.9080004@oracle.com>
 <55DB1A61.7020508@oracle.com> <55DB579A.9000903@oracle.com>
 <55DC5F33.6060401@oracle.com> <55DDFFDE.5020101@oracle.com>
 <55DE3647.3000405@oracle.com> <55DEF7FE.4070806@oracle.com>
 <55DF1CD9.4000801@oracle.com> <55DF1F3E.4070902@oracle.com>
Message-ID: <55DF204F.7080203@oracle.com>

Hmmm, that sounds fine then. Thanks for the explanation.

- Ioi

From daniel.daugherty at oracle.com Thu Aug 27 15:51:19 2015
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 27 Aug 2015 09:51:19 -0600
Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc()
In-Reply-To: <55DE915B.9020605@oracle.com>
References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com>
Message-ID: <55DF31F7.7050609@oracle.com>

Hi David!

Thanks for chiming in on this thread!
Replies embedded below as usual...

On 8/26/15 10:26 PM, David Holmes wrote:
> Hi Dan,
>
> On 26/08/2015 7:08 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I have a "fix" for a long standing race between JVM shutdown and the
>> JVM statistics subsystem:
>>
>> JDK-8049304 race between VM_Exit and _sync_FutileWakeups->inc()
>> https://bugs.openjdk.java.net/browse/JDK-8049304
>>
>> Webrev URL:
>> http://cr.openjdk.java.net/~dcubed/8049304-webrev/0-jdk9-hs-rt/
>>
>> Testing: Aurora Adhoc RT-SVC nightly batch
>>          Aurora Adhoc vm.tmtools batch
>>          Kim's repro sequence for JDK-8049304
>>          Kim's repro sequence for JDK-8129978
>>          JPRT -testset hotspot
>>
>> This "fix":
>>
>> - adds a volatile flag to record whether PerfDataManager is holding
>>   data (PerfData objects)
>> - adds PerfDataManager::has_PerfData() to return the flag
>> - changes the Java monitor subsystem's use of PerfData to
>>   check both allocation of the monitor subsystem specific
>>   PerfData object and the new PerfDataManager::has_PerfData()
>>   return value
>>
>> If the global 'UsePerfData' option is false, the system works as
>> it did before. If 'UsePerfData' is true (the default on non-embedded
>> systems), the Java monitor subsystem will allocate a number of
>> PerfData objects to record information. The objects will record
>> information about Java monitor subsystem until the JVM shuts down.
>>
>> When the JVM starts to shutdown, the new PerfDataManager flag will
>> change to false and the Java monitor subsystem will stop using the
>> PerfData objects. This is the new behavior. As noted in the comments
>> I added to the code, the race is still present; I'm just changing
>> the order and the timing to reduce the likelihood of the crash.
>
> Right.
> To sum up: the basic problem is that the PerfData objects are
> deallocated at the safepoint established for VM termination, but those
> objects can actually be used by threads that are in a safepoint-safe
> state: in particular within the low-level synchronization code.
>
> As you say this fix narrows the window where a crash can occur, but
> can not close it. If a thread is descheduled after the check of
> hasPerfData it can still access the PerfData object when it resumes,
> which may be after the object was deallocated. There's no true fix
> here without introducing synchronization (which would have to be even
> lower-level to avoid reentrant use of the same code we're fixing!) and
> the overhead of that would be prohibitive for these perf counters.
>
> In response to Kim's concern about other code that uses PerfData
> objects I think you would have to examine those uses to see which, if
> any, can occur from either a non-JavaThread, or from within the code
> where a thread is considered safepoint-safe. I'm inclined to agree
> that given we have not seen issues with such code, either it does not
> exist or is extremely unlikely to hit this issue. Given the "fix" is
> itself only narrowing the window it doesn't seem necessary to address
> code that already has a narrower window.
>
> That all said "leaking" the PerfData objects seems no less unpleasant
> a "fix". There are so many obstacles in the way of being able to
> unload and re-load the JVM that I do not think this makes the position
> measurably worse. In fact I can imagine that if we were to allow for
> such behaviour we would need to be able to terminate threads and
> reclaim all their resources (like Monitor instances), at which point
> it would also become easy to deallocate shared memory like PerfData
> objects.
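For readers following along, the check-then-use window summarized above can be sketched roughly as follows. This is only an illustration under stated assumptions: CounterManager, Counter and guarded_inc are hypothetical names, and std::atomic stands in for the volatile flag in the webrev. It is not the HotSpot code.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a PerfData counter. Like a perf counter, the
// increment itself is deliberately unsynchronized.
struct Counter {
  long value = 0;
  void inc() { ++value; }
};

// Hypothetical stand-in for PerfDataManager: owns the counter and publishes
// a flag saying whether the counter may still be used.
class CounterManager {
 public:
  static void create() {
    counter_.store(new Counter(), std::memory_order_release);
    has_data_.store(true, std::memory_order_release);
  }

  static void destroy() {
    // Publish "no data" first, then free. A thread that already passed
    // the has_data() check may still be about to touch the counter, so
    // this only narrows the window; it does not close it.
    has_data_.store(false, std::memory_order_release);
    delete counter_.exchange(nullptr, std::memory_order_acq_rel);
  }

  static bool has_data() { return has_data_.load(std::memory_order_acquire); }
  static Counter* counter() { return counter_.load(std::memory_order_acquire); }

 private:
  static std::atomic<bool> has_data_;
  static std::atomic<Counter*> counter_;
};

std::atomic<bool> CounterManager::has_data_{false};
std::atomic<Counter*> CounterManager::counter_{nullptr};

// Guarded update: check both that the counter was allocated and that the
// manager still holds data, as the proposed change does for the monitor
// subsystem's counters.
bool guarded_inc() {
  Counter* c = CounterManager::counter();
  if (c != nullptr && CounterManager::has_data()) {
    // If the thread is descheduled here and destroy() runs before it
    // resumes, the next line touches freed memory: the remaining race.
    c->inc();
    return true;
  }
  return false;
}
```

Single-threaded, the guard behaves as expected: guarded_inc() returns false before create(), true afterwards, and false again after destroy(). The comment inside guarded_inc() marks the spot where the remaining race lives.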
Here's what I wrote in the bug report before I started this review cycle:

    Daniel Daugherty added a comment - 2015-08-21 20:40

    Continued investigating VM shutdown race:

    JDK-8049304 race between VM_Exit and _sync_FutileWakeups->inc()
    JDK-8129978 SIGSEGV when parsing command line options

    - Thanks to Kim for providing easy reproduction instructions for
      both bugs; I've tweaked the repro code a bit
    - The "correct" solution is to add a locking/memory ordering
      mechanism to ensure that PerfData is only used when valid.
      The locking/memory ordering would slow down the PerfData
      mechanism for every update. Ouch!
    - The "fast but safe" solution is to leak the PerfData memory
      and not clean them up at VM shutdown. We're trying to clean
      up the code base so the idea of intentionally leaking memory
      makes me cringe.
    - The solution I'm investigating is between "fast but safe" and
      "correct". I'm adding a PerfDataManager.has_PerfData() function
      that returns true when PerfDataManager is holding PerfData
      objects and false when none have been allocated or when they
      have been freed at VM shutdown. The flag holding the state is
      volatile and I use release_store() to change it so that
      publication is visible more quickly. On the VM shutdown path,
      I also do a 1ms sleep after setting the flag and before freeing
      the memory.
    - The idea is that the 1ms sleep will give any threads that saw
      PerfDataManager.has_PerfData() == true a chance to do their
      operation on the PerfData object before VM shutdown thread
      frees it.

So I think we're all roughly on the same page here:

1) We don't like the current system because we keep getting these
   shutdown race crashes. Of course, a new one came in early this AM:

   JDK-8134566 java/lang/invoke/LFCaching/LFMultiThreadCachingTest.java
               crashes in monitor synchronization code
   https://bugs.openjdk.java.net/browse/JDK-8134566

2) We don't like the "correct" solution because it would slow down
   the performance counters and possibly skew the very data we are
   trying to gather.
Kim has also pointed out that adding more locking in a subsystem used by higher level locking is risky. 3) We don't like the "fast but safe" solution of leaking the PerfData memory. We try to make ourselves feel better about this by saying there are plenty of other leaks in the VM... slippery slope? 4) We don't like the proposed solution because the race still exists and we could continue to see failures like these. Only they would be more rare and possibly harder to spot. 5) Off-thread Kim and I have been talking about adding logic to the signal handler filters to detect a SIGSEGV that comes from use of a now freed PerfData object. We're mulling on the idea, but have not determined if it is even possible or an acceptable idea... Hopefully, the above accurately sums up our options... > I'll leave it up to you which way to go. As it stands this is Reviewed. Thanks! Here's my proposed plan: 1) I'd like to move forward with this change in order to reduce the occurrences of this crasher. Yes, I'm getting tired of seeing and analyzing them. 2) If we see PerfData crashes in the future with non-monitor subsystem PerfData usage, then we look at adding has_PerfData() calls to that subsystem. 3) If we see PerfData crashes in the future in the monitor subsystem, then that indicates that the theoretical race is real or I missed protecting a PerfData usage with has_PerfData(). If the race is real, then we examine these alternatives: - leak the PerfData objects on the JVM shutdown path, i.e., switch to the "fast but safe" solution - add signal handler support to make PerfData SIGSEGVs benign What do folks think? Dan > > Thanks, > David > >> Thanks, in advance, for any comments, questions or suggestions. 
>>
>> Dan
>>

From kim.barrett at oracle.com Thu Aug 27 17:47:50 2015
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 27 Aug 2015 13:47:50 -0400
Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc()
In-Reply-To: <55DF31F7.7050609@oracle.com>
References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF31F7.7050609@oracle.com>
Message-ID: <1457BA32-6B88-4259-8E10-0F4CA93C8856@oracle.com>

On Aug 27, 2015, at 11:51 AM, Daniel D. Daugherty wrote:
>
> 3) We don't like the "fast but safe" solution of leaking the PerfData
> memory. We try to make ourselves feel better about this by saying
> there are plenty of other leaks in the VM... slippery slope?

"Leak" is perhaps misleading and overly pejorative. A better
description of what I've suggested is "pass the buck for memory
cleanup to the OS". When we're on our way to process exit, we know
the OS will deal with our memory resources. We already (and would
still) clean up relevant non-memory resources (like shared memory
files for the PerfData) - I'm not suggesting any change there.
perfMemory_exit already has that separation of memory vs non-memory
resource cleanup.

I submit that in many situations when we're on our way to process
exit, the sleep being introduced by this change may actually have a
significant negative impact. If we're on our way to dumping a core
file for debugging an error, putting a sleep along the way just allows
other threads more time to run further from the state where the error
occurred. I wish we were running less code rather than more in that
situation.

If we're not on our way to process exit, and instead want to achieve a
state where we can restart the VM or unload the VM code, we clearly
need to make sure that other cleanup has been done, such as bringing
all threads to quiescence and eventually tearing them down.
But if we don't have other threads running then the problem of some thread trying to touch the PerfData after we've destroyed it simply doesn't happen. And indeed, there is at least the beginnings of code to do that sort of thing; see jni_DestroyJavaVM. And what I've proposed is that we do the PerfData memory cleanup exactly and only when that's the goal state, since in that case it is safe and necessary to do the PerfData memory cleanup. From daniel.daugherty at oracle.com Thu Aug 27 17:59:03 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 27 Aug 2015 11:59:03 -0600 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <1457BA32-6B88-4259-8E10-0F4CA93C8856@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF31F7.7050609@oracle.com> <1457BA32-6B88-4259-8E10-0F4CA93C8856@oracle.com> Message-ID: <55DF4FE7.9050509@oracle.com> On 8/27/15 11:47 AM, Kim Barrett wrote: > On Aug 27, 2015, at 11:51 AM, Daniel D. Daugherty wrote: >> 3) We don't like the "fast but safe" solution of leaking the PerfData >> memory. We try to make ourselves feel better about this by saying >> there are plenty of other leaks in the VM... slippery slope? > "Leak" is perhaps misleading and overly perjorative. A better > description of what I've suggested is "pass the buck for memory > cleanup to the OS". When we're on our way to process exit, we know > the OS will deal with our memory resources. We already (and would > still) clean up relevant non-memory resources (like shared memory > files for the PerfData) - I'm not suggesting any change there. > perfMemory_exit already has that separation of memory vs non-memory > resource cleanup. > > I submit that in many situations when we're on our way to process > exit, the sleep being introduced by this change may actually have a > significant negative impact. 
If we're on our way to dumping a core > file for debugging an error, putting a sleep along the way just allows > other threads more time to run further from the state where the error > occurred. I wish we were running less code rather than more in that > situation. > > If we're not on our way to process exit, and instead want to achieve a > state where we can restart the VM or unload the VM code, we clearly > need to make sure that other cleanup has been done, such as bringing > all threads to quiescence and eventually tearing them down. But if we > don't have other threads running then the problem of some thread > trying to touch the PerfData after we've destroyed it simply doesn't > happen. > > And indeed, there is at least the beginnings of code to do that sort > of thing; see jni_DestroyJavaVM. And what I've proposed is that we do > the PerfData memory cleanup exactly and only when that's the goal > state, since in that case it is safe and necessary to do the PerfData > memory cleanup. > Kim, you keep changing your position. You've gone from acknowledging that this would be a "leak" that is likely detectible by a leak detection tool and saying that we'd have to figure out a way to shut it up to what you have written above. I have no idea what to think anymore. Dan From david.holmes at oracle.com Thu Aug 27 21:26:41 2015 From: david.holmes at oracle.com (David Holmes) Date: Fri, 28 Aug 2015 07:26:41 +1000 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <1457BA32-6B88-4259-8E10-0F4CA93C8856@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF31F7.7050609@oracle.com> <1457BA32-6B88-4259-8E10-0F4CA93C8856@oracle.com> Message-ID: <55DF8091.8040608@oracle.com> On 28/08/2015 3:47 AM, Kim Barrett wrote: > On Aug 27, 2015, at 11:51 AM, Daniel D. Daugherty wrote: >> >> 3) We don't like the "fast but safe" solution of leaking the PerfData >> memory. 
We try to make ourselves feel better about this by saying >> there are plenty of other leaks in the VM... slippery slope? > > "Leak" is perhaps misleading and overly perjorative. A better > description of what I've suggested is "pass the buck for memory > cleanup to the OS". When we're on our way to process exit, we know > the OS will deal with our memory resources. We already (and would > still) clean up relevant non-memory resources (like shared memory > files for the PerfData) - I'm not suggesting any change there. > perfMemory_exit already has that separation of memory vs non-memory > resource cleanup. I agree this is not a "leak" in the regular sense it is simply memory not explicitly returned before process exit - of which we already have a lot. Hence I'm not opposed to simply making PerfData objects "permanent". > I submit that in many situations when we're on our way to process > exit, the sleep being introduced by this change may actually have a > significant negative impact. If we're on our way to dumping a core > file for debugging an error, putting a sleep along the way just allows > other threads more time to run further from the state where the error > occurred. I wish we were running less code rather than more in that > situation. If we're on our way to dumping a core file, the error was in the current thread. But point noted other threads will continue longer and potentially hit secondary errors. > If we're not on our way to process exit, and instead want to achieve a > state where we can restart the VM or unload the VM code, we clearly > need to make sure that other cleanup has been done, such as bringing > all threads to quiescence and eventually tearing them down. But if we > don't have other threads running then the problem of some thread > trying to touch the PerfData after we've destroyed it simply doesn't > happen. 
Right - once we have solved the problem of how to terminate all the existing threads and reclaim their resources, then reclaiming a shared resource like the PerfData objects is "trivial". > And indeed, there is at least the beginnings of code to do that sort > of thing; see jni_DestroyJavaVM. And what I've proposed is that we do I think you misunderstand what jni_DestroyJavaVM is doing. It implements the normal lifecycle management for the JVM - which is that the JVM keeps running as long as there is one non-daemon thread running. So the only waiting in that method is for the non-daemon thread count to hit zero, at which point VM termination is initiated. But there can be a zillion daemon threads still running (application and VM). Cheers, David > the PerfData memory cleanup exactly and only when that's the goal > state, since in that case it is safe and necessary to do the PerfData > memory cleanup. > From daniel.daugherty at oracle.com Thu Aug 27 21:42:06 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 27 Aug 2015 15:42:06 -0600 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <55DE915B.9020605@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> Message-ID: <55DF842E.1010407@oracle.com> Sorry for starting another e-mail thread fork in an already complicated review... I was mulling on this piece of the thread over lunch (my TZ): > Right. To sum up: the basic problem is that the PerfData objects > are deallocated at the safepoint established for VM termination, > but those objects can actually be used by threads that are in a > safepoint-safe state: in particular within the low-level > synchronization code. and that got me thinking about safepoints and VM exit/shutdown/termination. The problem that I'm trying to solve is the one where we have a test that has passed for all intents and purposes. It has logged success in a message to stdout, stderr, or a log file. 
The test is exiting with a successful exit code (exit_code == 0 for some, exit_code == 95 for others)... On the way out the door, the test crashes with a SIGSEGV. In these failures, the VM was doing a normal shutdown. In the case where main() has fallen off the end, jni_DestroyJavaVM() is called. In the case where System.exit() is called from Java then vm_exit() is called. src/share/vm/runtime/java.cpp: 503 void vm_exit(int code) { 511 if (VMThread::vm_thread() != NULL) { 512 // Fire off a VM_Exit operation to bring VM to a safepoint and exit 513 VM_Exit op(code); 514 if (thread->is_Java_thread()) 515 ((JavaThread*)thread)->set_thread_state(_thread_in_vm); 516 VMThread::execute(&op); src/share/vm/runtime/vm_operations.cpp: 443 void VM_Exit::doit() { 457 exit_globals(); exit_globals() calls perfMemory_exit() which calls PerfDataManager::destroy() which sets the flag that says there is no more PerfData... So a System.exit() call results in setting the new flag at a safepoint. The jni_DestroyJavaVM() is more complicated: src/share/vm/prims/jni.cpp: 4081 jint JNICALL jni_DestroyJavaVM(JavaVM *vm) { 4105 if (Threads::destroy_vm()) { src/share/vm/runtime/thread.cpp: 3890 bool Threads::destroy_vm() { 3896 // Wait until we are the last non-daemon thread to execute 3897 { MutexLocker nu(Threads_lock); 3898 while (Threads::number_of_non_daemon_threads() > 1) After this while-loop there are no non-daemon JavaThreads. 3926 // Stop VM thread. 3927 { 3928 // 4945125 The vm thread comes to a safepoint during exit. 3929 // GC vm_operations can get caught at the safepoint, and the 3930 // heap is unparseable if they are caught. Grab the Heap_lock 3931 // to prevent this. The GC vm_operations will not be able to 3932 // queue until after the vm thread is dead. After this point, 3933 // we'll never emerge out of the safepoint before the VM exits. 
3934 3935 MutexLocker ml(Heap_lock); 3936 3937 VMThread::wait_for_vm_thread_exit(); During VMThread exit, the system was brought to a safepoint. All daemon JavaThreads are at a safepoint. 3966 // exit_globals() will delete tty 3967 exit_globals(); exit_globals() calls perfMemory_exit() which calls PerfDataManager::destroy() which sets the flag that says there is no more PerfData... So a jni_DestroyJavaVM() call results in setting the flag at a safepoint. Whew!!! So the two "normal" exit paths both set the new flag while the system is at a safepoint (as David H said). The remaining question is whether the Java monitor PerfData usage can happen in parallel while the system is at a safepoint. Fortunately, I converted all of the Java monitor subsystem's usage of PerfData to use a new macro called OM_PERFDATA_OP so I can easily visit them all: src/share/vm/runtime/objectMonitor.cpp: ObjectMonitor::enter() OM_PERFDATA_OP(ContendedLockAttempts, inc()) at the end to record a contended monitor enter. The thread is not seen as safepoint-safe at this point so no conflict. src/share/vm/runtime/objectMonitor.cpp: ObjectMonitor::EnterI() OM_PERFDATA_OP(FutileWakeups, inc()) after returning from a park() call and not being able to acquire the lock. The thread is in state _thread_blocked so the thread is seen as safepoint-safe so we may have a conflict. src/share/vm/runtime/objectMonitor.cpp: ObjectMonitor::ReenterI() OM_PERFDATA_OP(FutileWakeups, inc()) after returning from a park() call and not being able to acquire the lock. The thread is in state _thread_blocked so the thread is seen as safepoint-safe so we may have a conflict. src/share/vm/runtime/objectMonitor.cpp: ObjectMonitor::ExitEpilog() OM_PERFDATA_OP(Parks, inc()) at the end of the function after releasing ownership of the monitor. The thread is not seen as safepoint-safe at this point so no conflict. 
src/share/vm/runtime/objectMonitor.cpp: ObjectMonitor::notify()
    OM_PERFDATA_OP(Notifications, inc(1)) at the end of the function
    after internal INotify() has returned. The thread is not seen as
    safepoint-safe at this point so no conflict.

src/share/vm/runtime/objectMonitor.cpp: ObjectMonitor::notifyAll()
    OM_PERFDATA_OP(Notifications, inc(tally)) at the end of the function
    after internal INotify() while-loop has finished. The thread is not
    seen as safepoint-safe at this point so no conflict.

src/share/vm/runtime/synchronizer.cpp: ObjectSynchronizer::quick_notify()
    OM_PERFDATA_OP(Notifications, inc(tally)) after internal INotify()
    has been called on one or more monitors. The thread is not seen as
    safepoint-safe at this point so no conflict.

src/share/vm/runtime/synchronizer.cpp: ObjectSynchronizer::inflate()
    OM_PERFDATA_OP(Inflations, inc()) in two places after we've
    successfully inflated a stack lock into an ObjectMonitor. The thread
    is not seen as safepoint-safe at this point so no conflict.

src/share/vm/runtime/synchronizer.cpp: ObjectSynchronizer::deflate_idle_monitors()
    OM_PERFDATA_OP(Deflations, inc(nScavenged)) and
    OM_PERFDATA_OP(MonExtant, set_value(nInCirculation)) after we've
    scavenged the global list. This function is only called at a
    safepoint as a periodic cleanup task. No conflict because the
    periodic task stuff is already disabled.

So the only two conflicts that I see are the futile wakeup in EnterI()
and ReenterI(). In both of those cases, the thread was in park() and
had to be spuriously unpark()'ed or intentionally unpark()'ed after
being chosen as the successor for a contended monitor. Additionally,
the associated monitor has to be held by another thread, which is why
this is called a futile wakeup.

This brings us full circle... The futile wakeup case happens to be the
original sighting that motivated me to file this bug back on 2014.07.03.
The EnterI() futile wakeup is also the case that Kim's debug repro code
tickles for this bug.
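For readers following the OM_PERFDATA_OP walk-through, here is a minimal sketch of what a guard macro of that kind could look like. It is an illustration only: GUARDED_PERF_OP, PerfCounterSketch, PerfManagerSketch and MonitorStatsSketch are hypothetical names, not the HotSpot definitions, and the acquire load on the flag is an assumption. It shows only the two checks described in the review: the subsystem's counter was allocated, and the manager still reports that it is holding data.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for illustration only; not the HotSpot classes.
struct PerfCounterSketch {
    long value = 0;
    void inc() { ++value; }
    void inc(long n) { value += n; }
    void set_value(long v) { value = v; }
};

struct PerfManagerSketch {
    static std::atomic<bool> has_data;
    // Assumed to mirror the role of PerfDataManager::has_PerfData().
    static bool has_perf_data() {
        return has_data.load(std::memory_order_acquire);
    }
};
std::atomic<bool> PerfManagerSketch::has_data{false};

struct MonitorStatsSketch {
    static PerfCounterSketch* futile_wakeups;
    static PerfCounterSketch* notifications;
};
PerfCounterSketch* MonitorStatsSketch::futile_wakeups = nullptr;
PerfCounterSketch* MonitorStatsSketch::notifications = nullptr;

// Apply 'op' to a counter only if it was allocated and the manager still
// holds data. This narrows the shutdown window; it does not close it.
#define GUARDED_PERF_OP(field, op)                      \
    do {                                                \
        if (MonitorStatsSketch::field != nullptr &&     \
            PerfManagerSketch::has_perf_data()) {       \
            MonitorStatsSketch::field->op;              \
        }                                               \
    } while (0)

// Returns the sum of all updates that actually reached a counter.
long run_guard_demo() {
    PerfCounterSketch wakeups, notes;
    MonitorStatsSketch::futile_wakeups = &wakeups;
    MonitorStatsSketch::notifications = &notes;

    PerfManagerSketch::has_data.store(true, std::memory_order_release);
    GUARDED_PERF_OP(futile_wakeups, inc());
    GUARDED_PERF_OP(notifications, inc(3));

    // Simulated shutdown: flag flips, later updates are skipped.
    PerfManagerSketch::has_data.store(false, std::memory_order_release);
    GUARDED_PERF_OP(futile_wakeups, inc());
    assert(wakeups.value == 1);

    long total = wakeups.value + notes.value;
    MonitorStatsSketch::futile_wakeups = nullptr;
    MonitorStatsSketch::notifications = nullptr;
    return total;
}
```

Routing every call site through one macro is what makes an audit like the one in this message practical: each use can be found with a single search.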
Kim's repro case for this bug requires a CTRL-C, but I think that
results in a mostly orderly shutdown of the VM. Since the test case
requires an interactive java session, I tested the fix with delays in
place for 50 runs without a failure two different times. So the EnterI()
futile wakeup case is tested with delays in place and I can't get the
bug to reproduce anymore. Without the fix and with the delays, the crash
happens every time.

For the other bug that Kim did repro code for: JDK-8129978, that crash
happened in the ObjectMonitor inflation code path and that case is
covered by the OM_PERFDATA_OP(Inflations, inc()) notes above. The flag
is set at a safepoint, the safepoint ends, the stack lock is inflated
into an ObjectMonitor while the system is not at a safepoint, the flag
is seen and the PerfData is not used so no crash. I did 1000 runs with
the delays in place to verify that the repro for JDK-8129978 no longer
crashes.

Short version (if you read this far, you should be laughing at this :-))

I still think this is a good "fix" for this problem. I'm not yet
convinced that we need to take the path where we "leak" PerfData
objects.

Thanks for reading this far... :-)

Dan


On 8/26/15 10:26 PM, David Holmes wrote:
> Hi Dan,
>
> On 26/08/2015 7:08 AM, Daniel D.
Daugherty wrote: >> Greetings, >> >> I have a "fix" for a long standing race between JVM shutdown and the >> JVM statistics subsystem: >> >> JDK-8049304 race between VM_Exit and _sync_FutileWakeups->inc() >> https://bugs.openjdk.java.net/browse/JDK-8049304 >> >> Webrev URL: >> http://cr.openjdk.java.net/~dcubed/8049304-webrev/0-jdk9-hs-rt/ >> >> Testing: Aurora Adhoc RT-SVC nightly batch >> Aurora Adhoc vm.tmtools batch >> Kim's repro sequence for JDK-8049304 >> Kim's repro sequence for JDK-8129978 >> JPRT -testset hotspot >> >> This "fix": >> >> - adds a volatile flag to record whether PerfDataManager is holding >> data (PerfData objects) >> - adds PerfDataManager::has_PerfData() to return the flag >> - changes the Java monitor subsystem's use of PerfData to >> check both allocation of the monitor subsystem specific >> PerfData object and the new PerfDataManager::has_PerfData() >> return value >> >> If the global 'UsePerfData' option is false, the system works as >> it did before. If 'UsePerfData' is true (the default on non-embedded >> systems), the Java monitor subsystem will allocate a number of >> PerfData objects to record information. The objects will record >> information about Java monitor subsystem until the JVM shuts down. >> >> When the JVM starts to shutdown, the new PerfDataManager flag will >> change to false and the Java monitor subsystem will stop using the >> PerfData objects. This is the new behavior. As noted in the comments >> I added to the code, the race is still present; I'm just changing >> the order and the timing to reduce the likelihood of the crash. > > Right. To sum up: the basic problem is that the PerfData objects are > deallocated at the safepoint established for VM termination, but those > objects can actually be used by threads that are in a safepoint-safe > state: in particular within the low-level synchronization code. > > As you say this fix narrows the window where a crash can occur, but > can not close it. 
If a thread is descheduled after the check of > hasPerfData it can still access the PerfData object when it resumes, > which may be after the object was deallocated. There's no true fix > here without introducing synchronization (which would have to be even > lower-level to avoid reentrant use of the same code we're fixing!) and > the overhead of that would be prohibitive for these perf counters. > > In response to Kim's concern about other code that uses PerfData > objects I think you would have to examine those uses to see which, if > any, can occur from either a non-JavaThread, or from within the code > where a thread is considered safepoint-safe. I'm inclined to agree > that given we have not seen issues with such code, either it does not > exist or is extremely unlikely to hit this issue. Given the "fix" is > itself only narrowing the window it doesn't seem necessary to address > code that already has a narrower window. > > That all said "leaking" the PerfData objects seems no less unpleasant > a "fix". There are so many obstacles in the way of being able to > unload and re-load the JVM that I do not think this makes the position > measurably worse. In fact I can imagine that if we were to allow for > such behaviour we would need to be able to terminate threads and > reclaim all their resources (like Monitor instances), at which point > it would also become easy to deallocate shared memory like PerfData > objects. > > I'll leave it up to you which way to go. As it stands this is Reviewed. > > Thanks, > David > >> Thanks, in advance, for any comments, questions or suggestions. 
>>
>> Dan
>>
>
>

From kim.barrett at oracle.com Fri Aug 28 00:16:39 2015
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 27 Aug 2015 20:16:39 -0400
Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc()
In-Reply-To: <55DF842E.1010407@oracle.com>
References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF842E.1010407@oracle.com>
Message-ID: <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com>

On Aug 27, 2015, at 5:42 PM, Daniel D. Daugherty wrote:
>
> Sorry for starting another e-mail thread fork in an already complicated
> review...

OK, that was fascinating. No, really, I mean it.

It made me realize that we've been arguing and talking past each other
in part because we're really dealing with two distinct though closely
related bugs here.

I've been primarily thinking about the case where we're calling
vm_abort / os::abort, where we presently delete the PerfData
memory even though there can be arbitrary other threads running. This
was the case in JDK-8129978, which is how I got involved here in the
first place. In that bug we were in vm_exit_during_initialization and
had called perfMemory_exit when some thread attempted to inflate a
monitor (which is not one of the conflicting cases discussed by Dan).

The problem Dan has been looking at, JDK-8049304, is about a "normal"
VM shutdown. In this case, the problem is that we believe it is safe
to delete the PerfData, because we've safepointed, and yet some thread
unexpectedly runs and attempts to touch the deleted data anyway.

I think Dan's proposed fix (mostly) avoids the specific instance of
JDK-8129978, but doesn't solve the more general problem of abnormal
exit deleting the PerfData while some running thread is touching some
non-monitor-related part of that data. My proposal to leave it to the
OS to deal with memory cleanup on process exit would deal with this
case.

I think Dan's proposed fix (mostly) avoids problems like JDK-8049304.
And the approach I've been talking about doesn't help at all for this
case. But I wonder if Dan's proposed fix can be improved. A "futile
wakeup" case doesn't seem to me like one which requires super-high
performance. Would it be ok, in the two problematic cases that Dan
identified, to use some kind of atomic / locking protocol with the
cleanup? Or is the comment for the counter increment in EnterI (and
only there) correct that it's important to avoid a lock or atomics
here (and presumably in ReenterI too).

From ron.durbin at oracle.com Fri Aug 28 13:56:28 2015
From: ron.durbin at oracle.com (Ron Durbin)
Date: Fri, 28 Aug 2015 06:56:28 -0700 (PDT)
Subject: RFR (M) round 2 8061999 Enhance VM option parsing to allow options to be specified
Message-ID: <32b8e18a-c363-4d5a-bb10-a54250cf4aa1@default>

Here is the round 2 webrev for 8061999. Due to the large number of
conflicts with other bug fixes in the cmd options area and the
resulting refactoring of this fix, a delta webrev is not provided
relative to round 1 because it wouldn't make any sense.

Webrev link: http://cr.openjdk.java.net/~rdurbin/8061999_OCR2_JDK9_webrev
RFE link: https://bugs.openjdk.java.net/browse/JDK-8061999

This RFE allows a file to be specified that holds VM Options that
would otherwise be specified on the command line or in an environment
variable. Only one options file may be specified on the command line
and no options file may be specified in either of the following
environment variables: "JAVA_TOOL_OPTIONS" or "_JAVA_OPTIONS".

The options file feature supports all VM options currently supported
on the command line, except the options file option. The option to
specify an options file is "-XX:VMOptionsFile=". The options file
feature supports an options file up to 1024 bytes in size.

This feature has been tested on:

OS: Solaris, MAC, Windows, Linux

Tests:
Manual unit tests
JPRT with -testset hotspot (including the SQE proposed test coverage for this feature.)
Aurora,(Big Apps, JTREG,Tonga), Runtime SVC Nightly From coleen.phillimore at oracle.com Fri Aug 28 14:27:21 2015 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Fri, 28 Aug 2015 10:27:21 -0400 Subject: RFR: 8132725: Memory leak in Arguments::add_property function In-Reply-To: <55DEF7FE.4070806@oracle.com> References: <55CC4D88.2030601@oracle.com> <55DA5381.9080004@oracle.com> <55DB1A61.7020508@oracle.com> <55DB579A.9000903@oracle.com> <55DC5F33.6060401@oracle.com> <55DDFFDE.5020101@oracle.com> <55DE3647.3000405@oracle.com> <55DEF7FE.4070806@oracle.com> Message-ID: <55E06FC9.5010202@oracle.com> This latest version looks good to me. Thank you for adding the const's! Coleen On 8/27/15 7:43 AM, Dmitry Dmitriev wrote: > Hello Coleen, > > Thank you for review and hint about AllocateHeap. I remove check for > 'tmp_key'. In this case new code behave as old code, i.e. call > 'vm_exit_out_of_memory' if it fails. Also, I change 'os::strdup' in > 'add_property' function to 'os::strdup_check_oom' to achieve the same > thing, i.e. behave as old code. In these case we don't need 'status' > variable. > > webrev 03: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.03/ > > webrev 03 vs 02: > http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.03.vs.02/ > > > Thank you, > Dmitry > > On 27.08.2015 0:57, Coleen Phillimore wrote: >> >> + char* tmp_key = AllocateHeap(key_len + 1, mtInternal); >> + >> + if (tmp_key == NULL) { >> + return false; >> } >> >> AllocateHeap will call vm_exit_out_of_memory if it fails, and not >> return NULL. You have to add AllocFailStrategy::RETURN_NULL >> >> Otherwise, this seems good. >> >> Thanks for not adding a goto. >> >> Coleen >> >> >> On 8/26/15 2:05 PM, Dmitry Dmitriev wrote: >>> Hello, >>> >>> Still need a Reviewer. Can someone review this patch? Thank you! >>> >>> Dmitry >>> >>> On 25.08.2015 15:27, Dmitry Dmitriev wrote: >>>> Hi Ioi, >>>> >>>> Thank you for review and sponsorship! Still need a Reviewer please. 
>>>> >>>> I added assert. Also I fix indention on line 1023 and change "char >>>> *var_name" to "char* var_name" to match style which used in this >>>> function. >>>> >>>> webrev 02: http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.02/ >>>> >>>> webrev 02 vs 01: >>>> http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.02.vs.01/ >>>> >>>> >>>> Thanks, >>>> Dmitry >>>> >>>> On 24.08.2015 20:42, Ioi Lam wrote: >>>>> Hi Dmitry, >>>>> >>>>> The new changes look good. >>>>> >>>>> For defensive programming, I would suggest adding an assert here: >>>>> >>>>> 1035 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { >>>>> assert(_java_vendor_url_bug != NULL, "......"); >>>>> 1036 os::free((void *)_java_vendor_url_bug); >>>>> >>>>> I can sponsor the change, but we still need a Reviewer for this >>>>> change. >>>>> >>>>> Thanks >>>>> - Ioi >>>>> >>>>> On 8/24/15 6:21 AM, Dmitry Dmitriev wrote: >>>>>> Hi Ioi, >>>>>> >>>>>> Thank you for comments! Please, see my answers inline. >>>>>> >>>>>> On 24.08.2015 2:13, Ioi Lam wrote: >>>>>>> Hi Dmitry, >>>>>>> >>>>>>> Is this change part of 8132725? >>>>>>> >>>>>>> 3904 jint code = set_aggressive_opts_flags(); >>>>>>> 3905 if (code != JNI_OK) { >>>>>>> 3906 return code; >>>>>>> 3907 } >>>>>> Yes, set_aggressive_opts_flags not check return value of >>>>>> add_property function, so I add check to the >>>>>> set_aggressive_opts_flags()(lines 1911-1913 in new arguments.cpp) >>>>>> and thus now it returns jint. >>>>>> >>>>>>> >>>>>>> >>>>>>> 1041 if (_java_vendor_url_bug != DEFAULT_VENDOR_URL_BUG) { >>>>>>> >>>>>>> >> also check (_java_vendor_url_bug != NULL) for sanity? >>>>>> I think that this is unnecessary in this case, because >>>>>> _java_vendor_url_bug can not be NULL. _java_vendor_url_bug >>>>>> initialized to DEFAULT_VENDOR_URL_BUG and changed only in >>>>>> add_property function. Before new value is assigned to >>>>>> _java_vendor_url_bug it's check for not NULL. 
Thus, I think that >>>>>> check (_java_vendor_url_bug != NULL) is unnecessary in this case. >>>>>> >>>>>>> >>>>>>> >>>>>>> Also, there's a lot of duplicated "if (eq != NULL) { >>>>>>> FreeHeap((void *)key);}". Maybe these can be consolidated with a >>>>>>> "goto"? I know lots of people haye goto but it will make the >>>>>>> clean up less error prone: >>>>>> Thank you for this proposal. Since "goto" is not widely used in >>>>>> Hotspot code I decided to refactor current implementation to >>>>>> avoid duplication of "if (eq != NULL) { FreeHeap((void *)key);}". >> >>>>>> >>>>>>> >>>>>>> bool Arguments::add_property(const char* prop) { >>>>>>> .... >>>>>>> bool status = false; >>>>>>> .... >>>>>>> char *_java_command_new = os::strdup(value, mtInternal); >>>>>>> if (_java_command_new == NULL) { >>>>>>> goto done; >>>>>>> }else { >>>>>>> if (_java_command != NULL) { >>>>>>> os::free(_java_command); >>>>>>> } >>>>>>> _java_command = _java_command_new; >>>>>>> } >>>>>>> .... >>>>>>> } >>>>>>> // Create new property and add at the end of the list >>>>>>> PropertyList_unique_add(&_system_properties, key, value); >>>>>>> } >>>>>>> status = true; >>>>>>> >>>>>>> done: >>>>>>> if (key != prop) { >>>>>>> // SystemProperty copy passed value, thus free previously >>>>>>> allocated >>>>>>> // memory >>>>>>> FreeHeap((void *)key); >>>>>>> } >>>>>>> return status; >>>>>>> } >>>>>>> >>>>>>> Also, using (key != prop) would make the code clearer than (eq >>>>>>> != NULL). >>>>>> Fixed! >>>>>> >>>>>> webrev 01: >>>>>> http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01/ >>>>>> >>>>>> webrev 01 vs 00: >>>>>> http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.01.vs.00/ >>>>>> >>>>>> >>>>>> Thank you, >>>>>> Dmitry >>>>>>> >>>>>>> Thanks >>>>>>> - Ioi >>>>>>> >>>>>>> On 8/13/15 12:55 AM, Dmitry Dmitriev wrote: >>>>>>>> Hello, >>>>>>>> >>>>>>>> Please review this fix which remove memory leak in >>>>>>>> Arguments::add_property function. 
Also, I need a sponsor for >>>>>>>> this fix, who can push it. >>>>>>>> >>>>>>>> Arguments::add_property function allocate memory for key and >>>>>>>> value. Then key and values are passed to the >>>>>>>> PropertyList_unique_add function which use SystemProperty class >>>>>>>> to add or update property value. SystemProperty class maintains >>>>>>>> it's own copy of key and value and thus copy passed key and >>>>>>>> value. Therefore key and value must be freed in add_property >>>>>>>> function(with exception for value in case of >>>>>>>> "java.vendor.url.bug" and "sun.java.command" properties). >>>>>>>> >>>>>>>> In this fix I allocate memory only for key when passed property >>>>>>>> contains value. If passed property not contains value, then I >>>>>>>> not allocate memory for key and use passed property string. >>>>>>>> Value also extracted from passed property string instead of >>>>>>>> allocating. To accomplish that I changed declaration of "value" >>>>>>>> in several functions from "char *" to "const char *" since >>>>>>>> value is not modified in these functions(PropertyList_* >>>>>>>> functions, SystemProperty class methods). >>>>>>>> >>>>>>>> Processing of "java.vendor.url.bug" and "sun.java.command" >>>>>>>> properties also corrected. Now when these properties redefined, >>>>>>>> then code checks if memory was allocated for special variables >>>>>>>> of these properties(checking that not contains default value) >>>>>>>> and free it. 
>>>>>>>> >>>>>>>> Webrev: >>>>>>>> http://cr.openjdk.java.net/~ddmitriev/8132725/webrev.00/ >>>>>>>> >>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8132725 >>>>>>>> Tested: JPRT(hotspot test set), hotspot all, vm.quick >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Dmitry >>>>>>> >>>>>> >>>>> >>>> >>> >> > From tom.benson at oracle.com Fri Aug 28 14:45:08 2015 From: tom.benson at oracle.com (Tom Benson) Date: Fri, 28 Aug 2015 10:45:08 -0400 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF842E.1010407@oracle.com> <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com> Message-ID: <55E073F4.5050906@oracle.com> Hi, One more pair of eyes on this. 8^) On 8/27/2015 8:16 PM, Kim Barrett wrote: > On Aug 27, 2015, at 5:42 PM, Daniel D. Daugherty wrote: >> Sorry for starting another e-mail thread fork in an already complicated >> review... > OK, that was fascinating. No, really, I mean it. > > It made me realize that we've been arguing and talking past each other > in part because we're really dealing with two distinct though closely > related bugs here. > > I've been primarily thinking about the case where we're calling > vm_abort / os::abort, where the we presently delete the PerfData > memory even though there can be arbitrary other threads running. This > was the case in JDK-8129978, which is how I got involved here in the > first place. In that bug we were in vm_exit_during_initialization and > had called perfMemory_exit when some thread attempted to inflate a > monitor (which is not one of the conflicting cases discussed by Dan). > > The problem Dan has been looking at, JDK-8049304, is about a "normal" > VM shutdown. In this case, the problem is that we believe it is safe > to delete the PerfData, because we've safepointed, and yet some thread > unexpectedly runs and attempts to touch the deleted data anyway. 
> > I think Dan's proposed fix (mostly) avoids the specific instance of > JDK-8129978, but doesn't solve the more general problem of abnormal > exit deleting the PerfData while some running thread is touching some > non-monitor-related part of that data. My proposal to leave it to the > OS to deal with memory cleanup on process exit would deal with this > case. > > I think Dan's proposed fix (mostly) avoids problems like JDK-8049304. > And the approach I've been talking about doesn't help at all for this > case. But I wonder if Dan's proposed fix can be improved. A "futile > wakeup" case doesn't seem to me like one which requires super-high > performance. Would it be ok, in the two problematic cases that Dan > identified, to use some kind of atomic / locking protocol with the > cleanup? Or is the comment for the counter increment in EnterI (and > only there) correct that it's important to avoid a lock or atomics > here (and presumably in ReenterI too). > I notice that EnterI/ReenterI both end with OrderAccess::fence(). Can the potential update of _sync_FutileWakeups be delayed until that point, to take advantage of the fence to make the sync hole even smaller? You've got a release() (and a short nap!) with the store in PerfDataManager::destroy() to try to close the window somewhat. But I think rather than the release_store() you used, you want a store, followed by a release(). release_store() puts a fence before the store to ensure earlier updates are seen before the current one, no? Also, I think the comment above that release_store() could be clarified. It is fine as is if you're familiar with this bug report and discussion, but... I think it should explicitly say there is still a very small window for the lack of true synchronization to cause a failure. And perhaps that the release_store() (or store/release()) is not half of an acquire/release pair.
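As a rough illustration of the ordering question above, here is a minimal sketch using C++11 std::atomic in place of HotSpot's OrderAccess. The names Counter, perf_init, try_inc and perf_destroy are hypothetical and do not come from the webrev. It clears the flag with a plain store followed by a full fence before the counters are freed; as discussed in the thread, a small window remains in which a thread that has already read the flag as true can still touch freed memory.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a PerfData counter; not the HotSpot class.
struct Counter { long value = 0; };

static std::atomic<bool> has_perf_data{false};
static Counter* futile_wakeups = nullptr;

void perf_init() {
  futile_wakeups = new Counter();
  // Publish: the initializing writes above are ordered before the flag
  // becomes visible as 'true' to a thread that does an acquire load.
  has_perf_data.store(true, std::memory_order_release);
}

// Best-effort increment, skipped once the flag is observed as cleared.
// The plain increment mirrors the unsynchronized counters in the thread;
// in ISO C++ it is a data race if called concurrently, so this sketch is
// only exercised single-threaded.
bool try_inc() {
  if (!has_perf_data.load(std::memory_order_acquire)) return false;
  futile_wakeups->value++;  // still racy against perf_destroy()
  return true;
}

void perf_destroy() {
  // Store first, then a full fence: the cleared flag is ordered before
  // the later operations in this thread that free the counter.
  has_perf_data.store(false, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);
  delete futile_wakeups;
  futile_wakeups = nullptr;
}
```

The fence narrows the window but does not close it; only a real lock or a handshake with the incrementing threads would do that.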
Tom From daniel.daugherty at oracle.com Fri Aug 28 15:13:02 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 28 Aug 2015 09:13:02 -0600 Subject: RFR (M) round 2 8061999 Enhance VM option parsing to allow options to be specified In-Reply-To: <32b8e18a-c363-4d5a-bb10-a54250cf4aa1@default> References: <32b8e18a-c363-4d5a-bb10-a54250cf4aa1@default> Message-ID: <55E07A7E.8000503@oracle.com> On 8/28/15 7:56 AM, Ron Durbin wrote: > Here is the round 2 webrev for 8061999. > > Due to the large number of conflicts with other bug fixes in the cmd options > area and the resulting refactoring of this fix, a delta webrev is not provided > relative to round 1 because it wouldn't make any sense. > > Webrev link: http://cr.openjdk.java.net/~rdurbin/8061999_OCR2_JDK9_webrev src/share/vm/runtime/arguments.hpp No comments src/share/vm/runtime/arguments.cpp Thanks for switching the buffer parse algorithm to be buffer length terminated instead of relying on NULL termination. I think that makes the algorithm more robust in terms of buffer overruns and it should make dropping the OPTION_BUFFER_SIZE restriction in the future easier. L3757: *flags_file = strdup(tail); This should be 'os::strdup((char *)tail);' L3843: // If there's a VMOptionFile, parse that (also can set flags_file) Typo: 'VMOptionFile' -> 'VMOptionsFile' Thumbs up modulo the two minor changes above. I don't need to see another webrev for just the above two changes. Dan > > RFE link: https://bugs.openjdk.java.net/browse/JDK-8061999 > > This RFE allows a file to be specified that holds VM Options that > would otherwise be specified on the command line or in an environment variable. > Only one options file may be specified on the command line and no options file > may be specified in either of the following environment variables > "JAVA_TOOL_OPTIONS" or "_JAVA_OPTIONS". > > The options file feature supports all VM options currently supported on > the command line, except the options file option. 
The option to specify an > options file is "-XX:VMOptionsFile=". > The options file feature supports an options file up to 1024 bytes in size, > > This feature has been tested on: > OS: > Solaris, MAC, Windows, Linux > Tests: > Manual unit tests > JPRT with -testset hotspot (including the SQE proposed test coverage for this feature.) > Aurora,(Big Apps, JTREG,Tonga), Runtime SVC Nightly From daniel.daugherty at oracle.com Fri Aug 28 15:52:33 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 28 Aug 2015 09:52:33 -0600 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF842E.1010407@oracle.com> <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com> Message-ID: <55E083C1.8090709@oracle.com> On 8/27/15 6:16 PM, Kim Barrett wrote: > On Aug 27, 2015, at 5:42 PM, Daniel D. Daugherty wrote: >> Sorry for starting another e-mail thread fork in an already complicated >> review... > OK, that was fascinating. No, really, I mean it. :-) > It made me realize that we've been arguing and talking past each other > in part because we're really dealing with two distinct though closely > related bugs here. > > I've been primarily thinking about the case where we're calling > vm_abort / os::abort, where the we presently delete the PerfData > memory even though there can be arbitrary other threads running. This > was the case in JDK-8129978, which is how I got involved here in the > first place. In that bug we were in vm_exit_during_initialization and > had called perfMemory_exit when some thread attempted to inflate a > monitor (which is not one of the conflicting cases discussed by Dan). Right. All abnormal exit cases would be considered conflicting in what I wrote yesterday. > The problem Dan has been looking at, JDK-8049304, is about a "normal" > VM shutdown. 
In this case, the problem is that we believe it is safe > to delete the PerfData, because we've safepointed, and yet some thread > unexpectedly runs and attempts to touch the deleted data anyway. Right, I was concentrating on "normal". > I think Dan's proposed fix (mostly) avoids the specific instance of > JDK-8129978, but doesn't solve the more general problem of abnormal > exit deleting the PerfData while some running thread is touching some > non-monitor-related part of that data. My proposal to leave it to the > OS to deal with memory cleanup on process exit would deal with this > case. Now I'm starting to see that you are/were focused on a different problem. OK. Definitely should look at what we can do here. If we can avoid crashing a second time while dealing with another crash/ abnormal exit that would be good. > I think Dan's proposed fix (mostly) avoids problems like JDK-8049304. > And the approach I've been talking about doesn't help at all for this > case. But I wonder if Dan's proposed fix can be improved. A "futile > wakeup" case doesn't seem to me like one which requires super-high > performance. Would it be ok, in the two problematic cases that Dan > identified, to use some kind of atomic / locking protocol with the > cleanup? Or is the comment for the counter increment in EnterI (and > only there) correct that it's important to avoid a lock or atomics > here (and presumably in ReenterI too). This comment: 570 // Keep a tally of the # of futile wakeups. 571 // Note that the counter is not protected by a lock or updated by atomics. 572 // That is by design - we trade "lossy" counters which are exposed to 573 // races during updates for a lower probe effect. and this comment: 732 // Keep a tally of the # of futile wakeups. 733 // Note that the counter is not protected by a lock or updated by atomics. 734 // That is by design - we trade "lossy" counters which are exposed to 735 // races during updates for a lower probe effect. 
are not really specific to the monitor subsystem. I think the comments are generally true about the perf counters. As we discussed earlier in the thread, generally updating the perf counters with syncs or locks will cost and potentially perturb the things we are trying to count. So I think what you're proposing is putting a lock protocol around the setting of the flag and then have the non-safepoint-safe uses grab that lock while the safepoint-safe uses skip the lock because they can rely on the safepoint protocol in the "normal" exit case. Do I have this right? Dan From daniel.daugherty at oracle.com Fri Aug 28 16:27:39 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 28 Aug 2015 10:27:39 -0600 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <55E073F4.5050906@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF842E.1010407@oracle.com> <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com> <55E073F4.5050906@oracle.com> Message-ID: <55E08BFB.70507@oracle.com> On 8/28/15 8:45 AM, Tom Benson wrote: > Hi, > One more pair of eyes on this. 8^) Hi Tom! Thanks for reviewing and welcome to the party... > > On 8/27/2015 8:16 PM, Kim Barrett wrote: >> On Aug 27, 2015, at 5:42 PM, Daniel D. Daugherty >> wrote: >>> Sorry for starting another e-mail thread fork in an already complicated >>> review... >> OK, that was fascinating. No, really, I mean it. >> >> It made me realize that we've been arguing and talking past each other >> in part because we're really dealing with two distinct though closely >> related bugs here. >> >> I've been primarily thinking about the case where we're calling >> vm_abort / os::abort, where the we presently delete the PerfData >> memory even though there can be arbitrary other threads running. This >> was the case in JDK-8129978, which is how I got involved here in the >> first place. 
In that bug we were in vm_exit_during_initialization and >> had called perfMemory_exit when some thread attempted to inflate a >> monitor (which is not one of the conflicting cases discussed by Dan). >> >> The problem Dan has been looking at, JDK-8049304, is about a "normal" >> VM shutdown. In this case, the problem is that we believe it is safe >> to delete the PerfData, because we've safepointed, and yet some thread >> unexpectedly runs and attempts to touch the deleted data anyway. >> >> I think Dan's proposed fix (mostly) avoids the specific instance of >> JDK-8129978, but doesn't solve the more general problem of abnormal >> exit deleting the PerfData while some running thread is touching some >> non-monitor-related part of that data. My proposal to leave it to the >> OS to deal with memory cleanup on process exit would deal with this >> case. >> >> I think Dan's proposed fix (mostly) avoids problems like JDK-8049304. >> And the approach I've been talking about doesn't help at all for this >> case. But I wonder if Dan's proposed fix can be improved. A "futile >> wakeup" case doesn't seem to me like one which requires super-high >> performance. Would it be ok, in the two problematic cases that Dan >> identified, to use some kind of atomic / locking protocol with the >> cleanup? Or is the comment for the counter increment in EnterI (and >> only there) correct that it's important to avoid a lock or atomics >> here (and presumably in ReenterI too). >> > > I notice that EnteriI/ReenterI both end with OrderAccess::fence(). Can > the potential update of _sync_FutileWakeups be delayed until that > point, to take advantage of the fence to make the sync hole even smaller? Not easily with EnterI() since there is one optional optimization between the OM_PERFDATA_OP(FutileWakeups, inc()) call and the OrderAccess::fence() call and that would result in lost FutileWakeups increments. 
Not easily in ReenterI() either: the OM_PERFDATA_OP(FutileWakeups, inc()) call is at the bottom of the for-loop and the OrderAccess::fence() call at the end of the function is outside the loop. This would result in lost FutileWakeups increments. So in ReenterI() the OM_PERFDATA_OP(FutileWakeups, inc()) call immediately follows an OrderAccess::fence() call. Doesn't that make that increment as "safe" as it can be without having a real lock? > You've got a release() (and a short nap!) with the store in > PerfDataManager::destroy() to try to close the window somewhat. Yes, I modeled that after: src/share/vm/runtime/perfMemory.cpp: 83 void PerfMemory::initialize() { 156 OrderAccess::release_store(&_initialized, 1); 157 } > But I think rather than the release_store() you used, you want a > store, followed by a release(). release_store() puts a fence before > the store to ensure earlier updates are seen before the current one, no? Yup, and I see I got my reasoning wrong. The code I modeled is right because you want to flush all the inits and it's OK if the _initialized transition from '0' -> '1' is lazily seen. For my shutdown use, we are transitioning from '1' -> '0' and we need that to be seen proactively so: OrderAccess::release_store(&_has_PerfData, 0); OrderAccess::storeload(); which is modeled after _owner field transitions from non-zero -> NULL in ObjectMonitor.cpp. > > Also, I think the comment above that release_store() could be > clarified. It is fine as is if you're familiar with this bug report > and discussion, but... I think it should explicitly say there is > still a very small window for the lack of true synchronization to > cause a failure. And perhaps that the release_store() (or > store/release()) is not half of an acquire/release pair. Here's the existing comment: 286 // Clear the flag before we free the PerfData counters.
Thus begins 287 // the race between this thread and another thread that has just 288 // queried PerfDataManager::has_PerfData() and gotten back 'true'. 289 // The hope is that the other thread will finish its PerfData 290 // manipulation before we free the memory. The two alternatives 291 // are 1) leak the PerfData memory or 2) do some form of ordered 292 // access before every PerfData operation. I think it pretty clearly states that there is still a race here. And I think that option 2 covers that we're not doing completely safe ordered access. I'm not sure how to make this comment more clear, but if you have specific suggestions... Dan > > Tom From james.laskey at oracle.com Fri Aug 28 17:05:04 2015 From: james.laskey at oracle.com (Jim Laskey (Oracle)) Date: Fri, 28 Aug 2015 14:05:04 -0300 Subject: RFR: 8087181: Move native jimage code to its own library (maybe libjimage) (against hs-rt) Message-ID: <2D6069D1-9D31-4CF4-9A06-093B2548590F@oracle.com> https://bugs.openjdk.java.net/browse/JDK-8087181 http://cr.openjdk.java.net/~jlaskey/hs-rt/webrev-top http://cr.openjdk.java.net/~jlaskey/hs-rt/webrev-jdk http://cr.openjdk.java.net/~jlaskey/hs-rt/webrev-hotspot From tom.benson at oracle.com Fri Aug 28 17:57:39 2015 From: tom.benson at oracle.com (Tom Benson) Date: Fri, 28 Aug 2015 13:57:39 -0400 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <55E08BFB.70507@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF842E.1010407@oracle.com> <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com> <55E073F4.5050906@oracle.com> <55E08BFB.70507@oracle.com> Message-ID: <55E0A113.2090106@oracle.com> Hi Dan, On 8/28/2015 12:27 PM, Daniel D. Daugherty wrote: > On 8/28/15 8:45 AM, Tom Benson wrote: >> Hi, >> One more pair of eyes on this. 8^) > > Hi Tom! > > Thanks for reviewing and welcome to the party... > > >> >> On 8/27/2015 8:16 PM, Kim Barrett wrote: >>> On Aug 27, 2015, at 5:42 PM, Daniel D. 
Daugherty >>> wrote: >>>> Sorry for starting another e-mail thread fork in an already >>>> complicated >>>> review... >>> OK, that was fascinating. No, really, I mean it. >>> >>> It made me realize that we've been arguing and talking past each other >>> in part because we're really dealing with two distinct though closely >>> related bugs here. >>> >>> I've been primarily thinking about the case where we're calling >>> vm_abort / os::abort, where the we presently delete the PerfData >>> memory even though there can be arbitrary other threads running. This >>> was the case in JDK-8129978, which is how I got involved here in the >>> first place. In that bug we were in vm_exit_during_initialization and >>> had called perfMemory_exit when some thread attempted to inflate a >>> monitor (which is not one of the conflicting cases discussed by Dan). >>> >>> The problem Dan has been looking at, JDK-8049304, is about a "normal" >>> VM shutdown. In this case, the problem is that we believe it is safe >>> to delete the PerfData, because we've safepointed, and yet some thread >>> unexpectedly runs and attempts to touch the deleted data anyway. >>> >>> I think Dan's proposed fix (mostly) avoids the specific instance of >>> JDK-8129978, but doesn't solve the more general problem of abnormal >>> exit deleting the PerfData while some running thread is touching some >>> non-monitor-related part of that data. My proposal to leave it to the >>> OS to deal with memory cleanup on process exit would deal with this >>> case. >>> >>> I think Dan's proposed fix (mostly) avoids problems like JDK-8049304. >>> And the approach I've been talking about doesn't help at all for this >>> case. But I wonder if Dan's proposed fix can be improved. A "futile >>> wakeup" case doesn't seem to me like one which requires super-high >>> performance. Would it be ok, in the two problematic cases that Dan >>> identified, to use some kind of atomic / locking protocol with the >>> cleanup? 
Or is the comment for the counter increment in EnterI (and >>> only there) correct that it's important to avoid a lock or atomics >>> here (and presumably in ReenterI too). >>> >> >> I notice that EnteriI/ReenterI both end with OrderAccess::fence(). >> Can the potential update of _sync_FutileWakeups be delayed until that >> point, to take advantage of the fence to make the sync hole even >> smaller? > > Not easily with EnterI() since there is one optional optimization > between the OM_PERFDATA_OP(FutileWakeups, inc()) call and the > OrderAccess::fence() call and that would result in lost FutileWakeups > increments. > > Not easily in ReenterI(), the OM_PERFDATA_OP(FutileWakeups, inc()) call > is at the bottom of the for-loop and the OrderAccess::fence() call at > the end of the function is outside the loop. This would result in lost > FutileWakeups increments. Yes, you'd have to keep a local count, and then add the total outside the loop for both Enter/Reenter, after the fence. But I see what you mean about the other exit paths in Enter. (The more I look at this code, the more I remember it... BTW, are those knob_ setting defaults ever going to be moved to a platform specific-module? That was my beef (well, one of them) in 2 different ports. Or is the goal to make monitors so well self-tuning that they can go away? Sorry for the digression... 8^)) At any rate, as you say, perhaps it's not worth it to leverage the fences, though it could be done. > > So in ReenterI() the OM_PERFDATA_OP(FutileWakeups, inc()) call > immediately > follows an OrderAccess::fence() call. Doesn't that make that increment as > "safe" as it can be without having a real lock? > > >> You've got a release() (and and short nap!) with the store in >> PerfDataManager::destroy() to try to close the window somewhat. 
> > Yes, I modeled that after: > > src/share/vm/runtime/perfMemory.cpp: > > 83 void PerfMemory::initialize() { > > 156 OrderAccess::release_store(&_initialized, 1); > 157 } > > >> But I think rather than the release_store() you used, you want a >> store, followed by a release(). release_store() puts a fence before >> the store to ensure earlier updates are seen before the current one, no? > > Yup, and I see I got my reasoning wrong. The code I modeled > is right because you want to flush all the inits and it's OK > if the _initialized transition from '0' -> '1' is lazily seen. > > For my shutdown use, we are transitioning from '1' -> '0' and > we need that to be seen proactively so: > > OrderAccess::release_store(&_has_PerfData, 0); > OrderAccess::storeload(); > > which is modeled after _owner field transitions from non-zeo > -> NULL in ObjectMonitor.cpp > It's not clear to me why the store needs to be a release_store in this case, as long as the storeload() follows it. You're not protecting any earlier writes. ? > >> >> Also, I think the comment above that release_store() could be >> clarified. It is fine as is if you're familiar with this bug report >> and discussion, but... I think it should explicitly say there is >> still a very small window for the lack of true synchronization to >> cause a failure. And perhaps that the release_store() (or >> store/release()) is not half of an acquire/release pair. > > Here's the existing comment: > > 286 // Clear the flag before we free the PerfData counters. Thus > begins > 287 // the race between this thread and another thread that has > just > 288 // queried PerfDataManager::has_PerfData() and gotten back > 'true'. > 289 // The hope is that the other thread will finish its PerfData > 290 // manipulation before we free the memory. The two alternatives > 291 // are 1) leak the PerfData memory or 2) do some form of > ordered > 292 // access before every PerfData operation. 
> > I think it pretty clearly states that there is still a race here. > And I think that option 2 covers that we're not doing completely > safe ordered access. I'm not sure how to make this comment more > clear, but if you have specific suggestions... > OK, I guess it's subjective. Tom > Dan > > >> >> Tom > From daniel.daugherty at oracle.com Fri Aug 28 18:12:03 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 28 Aug 2015 12:12:03 -0600 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <55E0A113.2090106@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF842E.1010407@oracle.com> <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com> <55E073F4.5050906@oracle.com> <55E08BFB.70507@oracle.com> <55E0A113.2090106@oracle.com> Message-ID: <55E0A473.3010900@oracle.com> On 8/28/15 11:57 AM, Tom Benson wrote: > Hi Dan, > > On 8/28/2015 12:27 PM, Daniel D. Daugherty wrote: >> On 8/28/15 8:45 AM, Tom Benson wrote: >>> Hi, >>> One more pair of eyes on this. 8^) >> >> Hi Tom! >> >> Thanks for reviewing and welcome to the party... >> >> >>> >>> On 8/27/2015 8:16 PM, Kim Barrett wrote: >>>> On Aug 27, 2015, at 5:42 PM, Daniel D. Daugherty >>>> wrote: >>>>> Sorry for starting another e-mail thread fork in an already >>>>> complicated >>>>> review... >>>> OK, that was fascinating. No, really, I mean it. >>>> >>>> It made me realize that we've been arguing and talking past each other >>>> in part because we're really dealing with two distinct though closely >>>> related bugs here. >>>> >>>> I've been primarily thinking about the case where we're calling >>>> vm_abort / os::abort, where the we presently delete the PerfData >>>> memory even though there can be arbitrary other threads running. This >>>> was the case in JDK-8129978, which is how I got involved here in the >>>> first place. 
In that bug we were in vm_exit_during_initialization and >>>> had called perfMemory_exit when some thread attempted to inflate a >>>> monitor (which is not one of the conflicting cases discussed by Dan). >>>> >>>> The problem Dan has been looking at, JDK-8049304, is about a "normal" >>>> VM shutdown. In this case, the problem is that we believe it is safe >>>> to delete the PerfData, because we've safepointed, and yet some thread >>>> unexpectedly runs and attempts to touch the deleted data anyway. >>>> >>>> I think Dan's proposed fix (mostly) avoids the specific instance of >>>> JDK-8129978, but doesn't solve the more general problem of abnormal >>>> exit deleting the PerfData while some running thread is touching some >>>> non-monitor-related part of that data. My proposal to leave it to the >>>> OS to deal with memory cleanup on process exit would deal with this >>>> case. >>>> >>>> I think Dan's proposed fix (mostly) avoids problems like JDK-8049304. >>>> And the approach I've been talking about doesn't help at all for this >>>> case. But I wonder if Dan's proposed fix can be improved. A "futile >>>> wakeup" case doesn't seem to me like one which requires super-high >>>> performance. Would it be ok, in the two problematic cases that Dan >>>> identified, to use some kind of atomic / locking protocol with the >>>> cleanup? Or is the comment for the counter increment in EnterI (and >>>> only there) correct that it's important to avoid a lock or atomics >>>> here (and presumably in ReenterI too). >>>> >>> >>> I notice that EnteriI/ReenterI both end with OrderAccess::fence(). >>> Can the potential update of _sync_FutileWakeups be delayed until >>> that point, to take advantage of the fence to make the sync hole >>> even smaller? >> >> Not easily with EnterI() since there is one optional optimization >> between the OM_PERFDATA_OP(FutileWakeups, inc()) call and the >> OrderAccess::fence() call and that would result in lost FutileWakeups >> increments. 
>> >> Not easily in ReenterI(), the OM_PERFDATA_OP(FutileWakeups, inc()) call >> is at the bottom of the for-loop and the OrderAccess::fence() call at >> the end of the function is outside the loop. This would result in lost >> FutileWakeups increments. > Yes, you'd have to keep a local count, and then add the total outside > the loop for both Enter/Reenter, after the fence. But I see what you > mean about the other exit paths in Enter. (The more I look at this > code, the more I remember it... I'm sure that is a wonderfully loving memory too! > BTW, are those knob_ setting defaults ever going to be moved to a > platform specific-module? That was my beef (well, one of them) in 2 > different ports. Or is the goal to make monitors so well self-tuning > that they can go away? Sorry for the digression... 8^)) Dice is working on another idea to move tuning to a separate loadable module which is why we deferred the "adaptive spin" and "SpinPause on SPARC" buckets for the Contended Locking project. > At any rate, as you say, perhaps it's not worth it to leverage the > fences, though it could be done. OK so we're agreed on no change here. > >> >> So in ReenterI() the OM_PERFDATA_OP(FutileWakeups, inc()) call >> immediately >> follows an OrderAccess::fence() call. Doesn't that make that >> increment as >> "safe" as it can be without having a real lock? >> >> >>> You've got a release() (and and short nap!) with the store in >>> PerfDataManager::destroy() to try to close the window somewhat. >> >> Yes, I modeled that after: >> >> src/share/vm/runtime/perfMemory.cpp: >> >> 83 void PerfMemory::initialize() { >> >> 156 OrderAccess::release_store(&_initialized, 1); >> 157 } >> >> >>> But I think rather than the release_store() you used, you want a >>> store, followed by a release(). release_store() puts a fence >>> before the store to ensure earlier updates are seen before the >>> current one, no? >> >> Yup, and I see I got my reasoning wrong. 
The code I modeled >> is right because you want to flush all the inits and it's OK >> if the _initialized transition from '0' -> '1' is lazily seen. >> >> For my shutdown use, we are transitioning from '1' -> '0' and >> we need that to be seen proactively so: >> >> OrderAccess::release_store(&_has_PerfData, 0); >> OrderAccess::storeload(); >> >> which is modeled after _owner field transitions from non-zero >> -> NULL in ObjectMonitor.cpp >> > > It's not clear to me why the store needs to be a release_store in this > case, as long as the storeload() follows it. You're not protecting > any earlier writes. ? I'm following the model that Dice uses for ObjectMonitors when we change the _owner field from non-NULL -> NULL. There are some long comments in the ObjectMonitor.cpp stuff that talk about why it is done this way. I'm planning to file a cleanup bug for the non-NULL -> NULL transition stuff because the comments and code are not consistent. > >> >>> >>> Also, I think the comment above that release_store() could be >>> clarified. It is fine as is if you're familiar with this bug report >>> and discussion, but... I think it should explicitly say there is >>> still a very small window for the lack of true synchronization to >>> cause a failure. And perhaps that the release_store() (or >>> store/release()) is not half of an acquire/release pair. >> >> Here's the existing comment: >> >> 286 // Clear the flag before we free the PerfData counters. >> Thus begins >> 287 // the race between this thread and another thread that has >> just >> 288 // queried PerfDataManager::has_PerfData() and gotten back >> 'true'. >> 289 // The hope is that the other thread will finish its PerfData >> 290 // manipulation before we free the memory. The two >> alternatives >> 291 // are 1) leak the PerfData memory or 2) do some form of >> ordered >> 292 // access before every PerfData operation. >> >> I think it pretty clearly states that there is still a race here. 
>> And I think that option 2 covers that we're not doing completely >> safe ordered access. I'm not sure how to make this comment more >> clear, but if you have specific suggestions... >> > OK, I guess it's subjective. I'll always take wording tweaks so if something occurs to you later on... :-) Dan > Tom > >> Dan >> >> >>> >>> Tom >> > From tom.benson at oracle.com Fri Aug 28 18:24:44 2015 From: tom.benson at oracle.com (Tom Benson) Date: Fri, 28 Aug 2015 14:24:44 -0400 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <55E0A473.3010900@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF842E.1010407@oracle.com> <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com> <55E073F4.5050906@oracle.com> <55E08BFB.70507@oracle.com> <55E0A113.2090106@oracle.com> <55E0A473.3010900@oracle.com> Message-ID: <55E0A76C.4000101@oracle.com> Hi Dan, On 8/28/2015 2:12 PM, Daniel D. Daugherty wrote: > . . . >>> >>> Not easily in ReenterI(), the OM_PERFDATA_OP(FutileWakeups, inc()) call >>> is at the bottom of the for-loop and the OrderAccess::fence() call at >>> the end of the function is outside the loop. This would result in lost >>> FutileWakeups increments. >> Yes, you'd have to keep a local count, and then add the total outside >> the loop for both Enter/Reenter, after the fence. But I see what you >> mean about the other exit paths in Enter. (The more I look at this >> code, the more I remember it... > > I'm sure that is a wonderfully loving memory too! Absolutely! 8^) > > >> BTW, are those knob_ setting defaults ever going to be moved to a >> platform specific-module? That was my beef (well, one of them) in 2 >> different ports. Or is the goal to make monitors so well self-tuning >> that they can go away? Sorry for the digression... 
8^)) > > Dice is working on another idea to move tuning to a separate loadable > module which is why we deferred the "adaptive spin" and "SpinPause on > SPARC" buckets for the Contended Locking project. > > >> At any rate, as you say, perhaps it's not worth it to leverage the >> fences, though it could be done. > > OK so we're agreed on no change here. > > >> >>> >>> So in ReenterI() the OM_PERFDATA_OP(FutileWakeups, inc()) call >>> immediately >>> follows an OrderAccess::fence() call. Doesn't that make that >>> increment as >>> "safe" as it can be without having a real lock? >>> >>> >>>> You've got a release() (and and short nap!) with the store in >>>> PerfDataManager::destroy() to try to close the window somewhat. >>> >>> Yes, I modeled that after: >>> >>> src/share/vm/runtime/perfMemory.cpp: >>> >>> 83 void PerfMemory::initialize() { >>> >>> 156 OrderAccess::release_store(&_initialized, 1); >>> 157 } >>> >>> >>>> But I think rather than the release_store() you used, you want a >>>> store, followed by a release(). release_store() puts a fence >>>> before the store to ensure earlier updates are seen before the >>>> current one, no? >>> >>> Yup, and I see I got my reasoning wrong. The code I modeled >>> is right because you want to flush all the inits and it's OK >>> if the _initialized transition from '0' -> '1' is lazily seen. >>> >>> For my shutdown use, we are transitioning from '1' -> '0' and >>> we need that to be seen proactively so: >>> >>> OrderAccess::release_store(&_has_PerfData, 0); >>> OrderAccess::storeload(); >>> >>> which is modeled after _owner field transitions from non-zero >>> -> NULL in ObjectMonitor.cpp >>> >> >> It's not clear to me why the store needs to be a release_store in >> this case, as long as the storeload() follows it. You're not >> protecting any earlier writes. ? > > I'm following the model that Dice uses for ObjectMonitors when we > change the _owner field from non-NULL -> NULL. 
There are some long > comments in the ObjectMonitor.cpp stuff that talk about why it is > done this way. I'm planning to file a cleanup bug for the non-NULL > -> NULL transition stuff because the comments and code are not > consistent. > But that's in lock exit, so the release is needed to ensure all outstanding writes are seen before the owner is set to null. There's nothing analogous in this case. However, since this will be executed once per JVM shutdown, having an extra release() isn't a big deal... Tom From kim.barrett at oracle.com Fri Aug 28 18:27:37 2015 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 28 Aug 2015 14:27:37 -0400 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <55E083C1.8090709@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF842E.1010407@oracle.com> <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com> <55E083C1.8090709@oracle.com> Message-ID: <6EDED012-FD41-4879-A9E5-3DFA6245FCAD@oracle.com> On Aug 28, 2015, at 11:52 AM, Daniel D. Daugherty wrote: > > This comment: > > 570 // Keep a tally of the # of futile wakeups. > 571 // Note that the counter is not protected by a lock or updated by atomics. > 572 // That is by design - we trade "lossy" counters which are exposed to > 573 // races during updates for a lower probe effect. > > and this comment: > > 732 // Keep a tally of the # of futile wakeups. > 733 // Note that the counter is not protected by a lock or updated by atomics. > 734 // That is by design - we trade "lossy" counters which are exposed to > 735 // races during updates for a lower probe effect. > > are not really specific to the monitor subsystem. I think > the comments are generally true about the perf counters. Yes, but oddly placed here. > As we discussed earlier in the thread, generally updating the perf > counters with syncs or locks will cost and potentially perturb the > things we are trying to count. Yes. 
> So I think what you're proposing is putting a lock protocol around > the setting of the flag and then have the non-safepoint-safe uses > grab that lock while the safepoint-safe uses skip the lock because > they can rely on the safepoint protocol in the "normal" exit case. > > Do I have this right? Yes. My question is, does the extra overhead matter in these specific cases. And the locking mechanism might be some clever use of atomics rather than any sort of "standard" mutex. And the safepoint-safe uses not only skip the lock, but don't need to check the flag at all. From tom.benson at oracle.com Fri Aug 28 18:29:58 2015 From: tom.benson at oracle.com (Tom Benson) Date: Fri, 28 Aug 2015 14:29:58 -0400 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <55E08BFB.70507@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF842E.1010407@oracle.com> <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com> <55E073F4.5050906@oracle.com> <55E08BFB.70507@oracle.com> Message-ID: <55E0A8A6.5040101@oracle.com> Hi again, Just noticed I skipped this question in your reply: > > So in ReenterI() the OM_PERFDATA_OP(FutileWakeups, inc()) call > immediately > follows an OrderAccess::fence() call. Doesn't that make that increment as > "safe" as it can be without having a real lock? > > Yes - odd that I noticed the fence after the update in that code, and not the one right before it!
Thanks, Tom From mikhailo.seledtsov at oracle.com Fri Aug 28 19:38:02 2015 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Fri, 28 Aug 2015 12:38:02 -0700 Subject: RFR(S): JDK-8133180 - [TESTBUG] runtime/SharedArchiveFile/SharedStrings.java failed with WhiteBox.class : no such file or directory In-Reply-To: <55D28802.4070306@oracle.com> References: <55CE5C1C.7050002@oracle.com> <402AD339-8F17-44CB-9B91-36C27DE6437F@oracle.com> <55D28802.4070306@oracle.com> Message-ID: <55E0B89A.7060304@oracle.com> Hi Jiangli, I am back to this task after working on other stuff. As I started experimenting with your suggestions, I have found one potential problem. The fallback strategy that you suggest could create an ambiguity. For instance, there may be cases where both @build+ClassFileInstaller and @compile are used. The CFI will place classes in the current working directory, aka ".", whereas @compile will place them under JTwork/../currentTest/classes/.. In such cases, the test author will probably need to create 2 jars, one for classes in the local directory, and another for classes in the "@compile", and may want to be explicit what goes where, instead of JarBuilder trying to guess and leading to possible ambiguity and confusion. The key to solving this problem is to never assume the location of classes produced by @build. If using @build, the classes MUST be copied to the "." using ClassFileInstaller, since @build does not guarantee the presence of these classes in a specific location. Perhaps the best solution here is to require all classes used by the JarBuilder to be in the "." directory, and require the test author to use ClassFileInstaller when necessary. In such cases, if the class is not in the test's current directory, the error will become obvious immediately instead of under some occasional conditions in the test system. It will also greatly simplify the tests and will make them more robust. Please let me know comments/objections on such an approach.
Thank you, Misha On 8/17/2015 6:18 PM, Mikhailo Seledtsov wrote: > Hi Jiangli, > > Thank you for your suggestion; it should make testing more robust. > I will try your suggestion; if I do not see any undesired side effects > I will rerun full testset and re-submit the updated review. > > Thank you, > Misha > > On 8/14/2015 5:43 PM, Jiangli Zhou wrote: >> Hi Misha, >> >> I have one suggestion. Instead of searching either the "work" or >> "classes" directory depending on the "classesInWorkDir" argument, >> how about searching both directories? So the BasicJarBuilder.build() >> method would use the "classes" directory first, then try the "work" >> directory if the file cannot be found in "classes". That might be a >> more robust solution. >> >> Thanks, >> Jiangli >> >> On Aug 14, 2015, at 2:22 PM, Mikhailo Seledtsov >> wrote: >> >>> Please review this fix to the CDS test bug. See the comments in the >>> bug for details. >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8133180 >>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8133180.00/ >>> Testing: >>> - ran the reproducer discussed in the bug description >>> rm -Rf JT* test >>> jtreg >>> /media/data3/hg/9/work01/hs-rt/hotspot/test/testlibrary_tests/ctw/JarDirTest.java >>> jtreg >>> /media/data3/hg/9/work01/hs-rt/hotspot/test/runtime/SharedArchiveFile/SharedStrings.java >>> >>> - ran the CDS tests in concurrent mode >>> - running CDS tests via multi-platform build-and-test system >>> (in progress) >>> >>> Thank you, >>> Misha > From daniel.daugherty at oracle.com Fri Aug 28 19:46:52 2015 From: daniel.daugherty at oracle.com (Daniel D.
Daugherty) Date: Fri, 28 Aug 2015 13:46:52 -0600 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <6EDED012-FD41-4879-A9E5-3DFA6245FCAD@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF842E.1010407@oracle.com> <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com> <55E083C1.8090709@oracle.com> <6EDED012-FD41-4879-A9E5-3DFA6245FCAD@oracle.com> Message-ID: <55E0BAAC.8040300@oracle.com> On 8/28/15 12:27 PM, Kim Barrett wrote: > On Aug 28, 2015, at 11:52 AM, Daniel D. Daugherty wrote: >> This comment: >> >> 570 // Keep a tally of the # of futile wakeups. >> 571 // Note that the counter is not protected by a lock or updated by atomics. >> 572 // That is by design - we trade "lossy" counters which are exposed to >> 573 // races during updates for a lower probe effect. >> >> and this comment: >> >> 732 // Keep a tally of the # of futile wakeups. >> 733 // Note that the counter is not protected by a lock or updated by atomics. >> 734 // That is by design - we trade "lossy" counters which are exposed to >> 735 // races during updates for a lower probe effect. >> >> are not really specific to the monitor subsystem. I think >> the comments are generally true about the perf counters. > Yes, but oddly placed here. > >> As we discussed earlier in the thread, generally updating the perf >> counters with syncs or locks will cost and potentially perturb the >> things we are trying to count. > Yes. > >> So I think what you're proposing is putting a lock protocol around >> the setting of the flag and then have the non-safepoint-safe uses >> grab that lock while the safepoint-safe uses skip the lock because >> they can rely on the safepoint protocol in the "normal" exit case. >> >> Do I have this right? > Yes. My question is, does the extra overhead matter in these specific cases. > And the locking mechanism might be some clever use of atomics rather than > any sort of "standard" mutex.
I figure the lightest I can get away with is an acquire. There's an existing lock for the perf stuff, but I don't want to use a full-blown mutex... > And the safepoint-safe uses not only skip the lock, Agreed. > but don't need to check the > flag at all. I don't agree with this part. Until the VMThread exits and raises the permanent safepoint barrier for remaining daemon threads, I don't think I can guarantee that we won't go to a safepoint, clear the flag & free the memory, and then return from that safepoint... which would allow a daemon thread to access the now freed memory... Dan From daniel.daugherty at oracle.com Fri Aug 28 19:48:08 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 28 Aug 2015 13:48:08 -0600 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <55E0A76C.4000101@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF842E.1010407@oracle.com> <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com> <55E073F4.5050906@oracle.com> <55E08BFB.70507@oracle.com> <55E0A113.2090106@oracle.com> <55E0A473.3010900@oracle.com> <55E0A76C.4000101@oracle.com> Message-ID: <55E0BAF8.9000303@oracle.com> On 8/28/15 12:24 PM, Tom Benson wrote: > Hi Dan, > > > On 8/28/2015 2:12 PM, Daniel D. Daugherty wrote: >> . . . >>>> >>>> Not easily in ReenterI(), the OM_PERFDATA_OP(FutileWakeups, inc()) >>>> call >>>> is at the bottom of the for-loop and the OrderAccess::fence() call at >>>> the end of the function is outside the loop. This would result in lost >>>> FutileWakeups increments. >>> Yes, you'd have to keep a local count, and then add the total >>> outside the loop for both Enter/Reenter, after the fence. But I see >>> what you mean about the other exit paths in Enter. (The more I look >>> at this code, the more I remember it... >> >> I'm sure that is a wonderfully loving memory too! > > Absolutely!
8^) > >> >> >>> BTW, are those knob_ setting defaults ever going to be moved to a >>> platform specific-module? That was my beef (well, one of them) in >>> 2 different ports. Or is the goal to make monitors so well >>> self-tuning that they can go away? Sorry for the digression... 8^)) >> >> Dice is working on another idea to move tuning to a separate loadable >> module which is why we deferred the "adaptive spin" and "SpinPause on >> SPARC" buckets for the Contended Locking project. >> >> >>> At any rate, as you say, perhaps it's not worth it to leverage the >>> fences, though it could be done. >> >> OK so we're agreed on no change here. >> >> >>> >>>> >>>> So in ReenterI() the OM_PERFDATA_OP(FutileWakeups, inc()) call >>>> immediately >>>> follows an OrderAccess::fence() call. Doesn't that make that >>>> increment as >>>> "safe" as it can be without having a real lock? >>>> >>>> >>>>> You've got a release() (and and short nap!) with the store in >>>>> PerfDataManager::destroy() to try to close the window somewhat. >>>> >>>> Yes, I modeled that after: >>>> >>>> src/share/vm/runtime/perfMemory.cpp: >>>> >>>> 83 void PerfMemory::initialize() { >>>> >>>> 156 OrderAccess::release_store(&_initialized, 1); >>>> 157 } >>>> >>>> >>>>> But I think rather than the release_store() you used, you want a >>>>> store, followed by a release(). release_store() puts a fence >>>>> before the store to ensure earlier updates are seen before the >>>>> current one, no? >>>> >>>> Yup, and I see I got my reasoning wrong. The code I modeled >>>> is right because you want to flush all the inits and it's OK >>>> if the _initialized transition from '0' -> '1' is lazily seen. 
>>>> >>>> For my shutdown use, we are transitioning from '1' -> '0' and >>>> we need that to be seen proactively so: >>>> >>>> OrderAccess::release_store(&_has_PerfData, 0); >>>> OrderAccess::storeload(); >>>> >>>> which is modeled after _owner field transitions from non-zero >>>> -> NULL in ObjectMonitor.cpp >>>> >>> >>> It's not clear to me why the store needs to be a release_store in >>> this case, as long as the storeload() follows it. You're not >>> protecting any earlier writes. ? >> >> I'm following the model that Dice uses for ObjectMonitors when we >> change the _owner field from non-NULL -> NULL. There are some long >> comments in the ObjectMonitor.cpp stuff that talk about why it is >> done this way. I'm planning to file a cleanup bug for the non-NULL >> -> NULL transition stuff because the comments and code are not >> consistent. >> > > But that's in lock exit, so the release is needed to ensure all > outstanding writes are seen before the owner is set to null. There's > nothing analogous in this case. True... I'll mull on it since I'm tweaking the code again anyway... Dan > > However, since this will be executed once per JVM shutdown, having an > extra release() isn't a big deal... > Tom > > > From daniel.daugherty at oracle.com Fri Aug 28 19:49:00 2015 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 28 Aug 2015 13:49:00 -0600 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <55E0A8A6.5040101@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF842E.1010407@oracle.com> <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com> <55E073F4.5050906@oracle.com> <55E08BFB.70507@oracle.com> <55E0A8A6.5040101@oracle.com> Message-ID: <55E0BB2C.2000402@oracle.com> On 8/28/15 12:29 PM, Tom Benson wrote: > Hi again, > Just noticed I skipped this question in your reply: > >> >> So in ReenterI() the OM_PERFDATA_OP(FutileWakeups, inc()) call >> immediately >> follows an OrderAccess::fence() call. Doesn't that make that >> increment as >> "safe" as it can be without having a real lock? >> >> > Yes - odd that I noticed the fence after the update in that code, and > not the one right before it! That's because you didn't uncross your eyes before reading that part of the monitor code... :-) Dan > Thanks, > Tom > From jiangli.zhou at oracle.com Fri Aug 28 21:48:37 2015 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 28 Aug 2015 14:48:37 -0700 Subject: RFR(S): JDK-8133180 - [TESTBUG] runtime/SharedArchiveFile/SharedStrings.java failed with WhiteBox.class : no such file or directory In-Reply-To: <55E0B89A.7060304@oracle.com> References: <55CE5C1C.7050002@oracle.com> <402AD339-8F17-44CB-9B91-36C27DE6437F@oracle.com> <55D28802.4070306@oracle.com> <55E0B89A.7060304@oracle.com> Message-ID: Hi Misha, Your proposal sounds good. Thanks, Jiangli > On Aug 28, 2015, at 12:38 PM, Mikhailo Seledtsov wrote: > > Hi Jiangli, > > I am back to this task after working on other stuff. > As I started experimenting with your suggestions, I have found one potential problem. > The fallback strategy that you suggest could create an ambiguity. > > For instance, there may be cases where both @build+ClassFileInstaller and @compile is used. 
The CFI will place classes in the current working directory, aka ".", whereas @compile will place them under JTwork/../currentTest/classes/.. > In such cases, the test author will probably need to create 2 jars, one for classes in the local directory, and another for classes in the "@compile", and may want to be explicit what goes where, instead of JarBuilder trying to guess and leading to possible ambiguity and confusion. > > The key to solving this problem is to never assume the location of classes produced by @build. If using @build, the classes MUST be copied to the "." using ClassFileInstaller, since @build does not guarantee the presence of these classes in a specific location. Perhaps, the best solution here is to require all classes used by the JarBuilder to be in the "." directory, and require the test author to use ClassFileInstaller when necessary. In such cases, if the class is not in the test's current directory, the error will become obvious immediately instead of under some occasional conditions in the test system. It will also greatly simplify the tests and will make them more robust. > > Please let me know comments/objections on such approach. > > Thank you, > Misha > > > On 8/17/2015 6:18 PM, Mikhailo Seledtsov wrote: >> Hi Jiangli, >> >> Thank you for your suggestion; it should make testing more robust. >> I will try your suggestion; if I do not see any undesired side effects I will rerun full testset and re-submit the updated review. >> >> Thank you, >> Misha >> >> On 8/14/2015 5:43 PM, Jiangli Zhou wrote: >>> Hi Misha, >>> >>> I have one suggestion. Instead of searching either the "work" or "classes" directory depending on the "classesInWorkDir" argument, how about searching both directories? So the BasicJarBuilder.build() method would use the "classes" directory first, then try the "work" directory if the file cannot be found in "classes". That might be a more robust solution.
>>> >>> Thanks, >>> Jiangli >>> >>> On Aug 14, 2015, at 2:22 PM, Mikhailo Seledtsov wrote: >>> >>>> Please review this fix to the CDS test bug. See the comments in the bug for details. >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8133180 >>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8133180.00/ >>>> Testing: >>>> - ran the reproducer discussed in the bug description >>>> rm -Rf JT* test >>>> jtreg /media/data3/hg/9/work01/hs-rt/hotspot/test/testlibrary_tests/ctw/JarDirTest.java >>>> jtreg /media/data3/hg/9/work01/hs-rt/hotspot/test/runtime/SharedArchiveFile/SharedStrings.java >>>> >>>> - ran the CDS tests in concurrent mode >>>> - running CDS tests via multi-platform build-and-test system >>>> (in progress) >>>> >>>> Thank you, >>>> Misha >> > From mikhailo.seledtsov at oracle.com Fri Aug 28 23:09:14 2015 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Fri, 28 Aug 2015 16:09:14 -0700 Subject: RFR(S): JDK-8133180 - [TESTBUG] runtime/SharedArchiveFile/SharedStrings.java failed with WhiteBox.class : no such file or directory In-Reply-To: References: <55CE5C1C.7050002@oracle.com> <402AD339-8F17-44CB-9B91-36C27DE6437F@oracle.com> <55D28802.4070306@oracle.com> <55E0B89A.7060304@oracle.com> Message-ID: <55E0EA1A.3080405@oracle.com> Jiangli, Thank you for a quick reply. I have updated the webrev accordingly: http://cr.openjdk.java.net/~mseledtsov/8133180.01/ Testing: - ran the test with b75 (internal and openjdk) - ran the "easy reproducer" discussed in a bug rm -Rf JT* test jtreg //hs-rt/hotspot/test/testlibrary_tests/ctw/JarDirTest.java jtreg //hs-rt/hotspot/test/runtime/SharedArchiveFile/SharedStrings.java - RBT of CDS tests - in progress Thank you, Misha On 8/28/2015 2:48 PM, Jiangli Zhou wrote: > Hi Misha, > > Your proposal sounds good. > > Thanks, > Jiangli > >> On Aug 28, 2015, at 12:38 PM, Mikhailo Seledtsov wrote: >> >> Hi Jiangli, >> >> I am back to this task after working on other stuff. 
>> As I started experimenting with your suggestions, I have found one potential problem. >> The fallback strategy that you suggest could create an ambiguity. >> >> For instance, there may be cases where both @build+ClassFileInstaller and @compile are used. The CFI will place classes in the current working directory, aka ".", whereas @compile will place them under JTwork/../currentTest/classes/.. >> In such cases, the test author will probably need to create 2 jars, one for classes in the local directory, and another for classes in the "@compile", and may want to be explicit what goes where, instead of JarBuilder trying to guess and leading to possible ambiguity and confusion. >> >> The key to solving this problem is to never assume the location of classes produced by @build. If using @build, the classes MUST be copied to the "." using ClassFileInstaller, since @build does not guarantee the presence of these classes in a specific location. Perhaps, the best solution here is to require all classes used by the JarBuilder to be in the "." directory, and require the test author to use ClassFileInstaller when necessary. In such cases, if the class is not in the test's current directory, the error will become obvious immediately instead of under some occasional conditions in the test system. It will also greatly simplify the tests and will make them more robust. >> >> Please let me know comments/objections on such approach. >> >> Thank you, >> Misha >> >> >> On 8/17/2015 6:18 PM, Mikhailo Seledtsov wrote: >>> Hi Jiangli, >>> >>> Thank you for your suggestion; it should make testing more robust. >>> I will try your suggestion; if I do not see any undesired side effects I will rerun full testset and re-submit the updated review. >>> >>> Thank you, >>> Misha >>> >>> On 8/14/2015 5:43 PM, Jiangli Zhou wrote: >>>> Hi Misha, >>>> >>>> I have one suggestion. Instead of searching either the "work" or "classes" directory depending on the "classesInWorkDir"
argument, how about searching both directories? So the BasicJarBuilder.build() method would use the "classes" directory first, then try the "work" directory if the file cannot be found in "classes". That might be a more robust solution. >>>> >>>> Thanks, >>>> Jiangli >>>> >>>> On Aug 14, 2015, at 2:22 PM, Mikhailo Seledtsov wrote: >>>> >>>>> Please review this fix to the CDS test bug. See the comments in the bug for details. >>>>> >>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8133180 >>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8133180.00/ >>>>> Testing: >>>>> - ran the reproducer discussed in the bug description >>>>> rm -Rf JT* test >>>>> jtreg /media/data3/hg/9/work01/hs-rt/hotspot/test/testlibrary_tests/ctw/JarDirTest.java >>>>> jtreg /media/data3/hg/9/work01/hs-rt/hotspot/test/runtime/SharedArchiveFile/SharedStrings.java >>>>> >>>>> - ran the CDS tests in concurrent mode >>>>> - running CDS tests via multi-platform build-and-test system >>>>> (in progress) >>>>> >>>>> Thank you, >>>>> Misha From david.holmes at oracle.com Mon Aug 31 00:59:21 2015 From: david.holmes at oracle.com (David Holmes) Date: Mon, 31 Aug 2015 10:59:21 +1000 Subject: RFR (M) round 2 8061999 Enhance VM option parsing to allow options to be specified In-Reply-To: <55E07A7E.8000503@oracle.com> References: <32b8e18a-c363-4d5a-bb10-a54250cf4aa1@default> <55E07A7E.8000503@oracle.com> Message-ID: <55E3A6E9.4080903@oracle.com> On 29/08/2015 1:13 AM, Daniel D. Daugherty wrote: > On 8/28/15 7:56 AM, Ron Durbin wrote: >> Here is the round 2 webrev for 8061999. >> >> Due to the large number of conflicts with other bug fixes in the cmd >> options >> area and the resulting refactoring of this fix, a delta webrev is not >> provided >> relative to round 1 because it wouldn't make any sense.
>> >> Webrev link: http://cr.openjdk.java.net/~rdurbin/8061999_OCR2_JDK9_webrev > > src/share/vm/runtime/arguments.hpp > No comments > > src/share/vm/runtime/arguments.cpp > Thanks for switching the buffer parse algorithm to be buffer length > terminated instead of relying on NULL termination. I think that > makes the algorithm more robust in terms of buffer overruns and > it should make dropping the OPTION_BUFFER_SIZE restriction in the > future easier. > > L3757: *flags_file = strdup(tail); > This should be 'os::strdup((char *)tail);' When do these strdup'd values get freed? I couldn't keep track of the calling sequence. Thanks, David > L3843: // If there's a VMOptionFile, parse that (also can set > flags_file) > Typo: 'VMOptionFile' -> 'VMOptionsFile' > > > Thumbs up modulo the two minor changes above. I don't need > to see another webrev for just the above two changes. > > Dan > > >> >> RFE link: https://bugs.openjdk.java.net/browse/JDK-8061999 >> >> This RFE allows a file to be specified that holds VM Options that >> would otherwise be specified on the command line or in an environment >> variable. >> Only one options file may be specified on the command line and no >> options file >> may be specified in either of the following environment variables >> "JAVA_TOOL_OPTIONS" or "_JAVA_OPTIONS". >> >> The options file feature supports all VM options currently supported on >> the command line, except the options file option. The option to >> specify an >> options file is "-XX:VMOptionsFile=". >> The options file feature supports an options file up to 1024 bytes in >> size, >> >> This feature has been tested on: >> OS: >> Solaris, MAC, Windows, Linux >> Tests: >> Manual unit tests >> JPRT with -testset hotspot (including the SQE proposed test >> coverage for this feature.) 
>> Aurora,(Big Apps, JTREG,Tonga), Runtime SVC Nightly > From david.holmes at oracle.com Mon Aug 31 01:30:10 2015 From: david.holmes at oracle.com (David Holmes) Date: Mon, 31 Aug 2015 11:30:10 +1000 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <55E08BFB.70507@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF842E.1010407@oracle.com> <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com> <55E073F4.5050906@oracle.com> <55E08BFB.70507@oracle.com> Message-ID: <55E3AE22.6050309@oracle.com> Hi Dan, On 29/08/2015 2:27 AM, Daniel D. Daugherty wrote: > On 8/28/15 8:45 AM, Tom Benson wrote: >> Hi, >> One more pair of eyes on this. 8^) > > Hi Tom! > > Thanks for reviewing and welcome to the party... > > >> >> On 8/27/2015 8:16 PM, Kim Barrett wrote: >>> On Aug 27, 2015, at 5:42 PM, Daniel D. Daugherty >>> wrote: >>>> Sorry for starting another e-mail thread fork in an already complicated >>>> review... >>> OK, that was fascinating. No, really, I mean it. >>> >>> It made me realize that we've been arguing and talking past each other >>> in part because we're really dealing with two distinct though closely >>> related bugs here. >>> >>> I've been primarily thinking about the case where we're calling >>> vm_abort / os::abort, where the we presently delete the PerfData >>> memory even though there can be arbitrary other threads running. This >>> was the case in JDK-8129978, which is how I got involved here in the >>> first place. In that bug we were in vm_exit_during_initialization and >>> had called perfMemory_exit when some thread attempted to inflate a >>> monitor (which is not one of the conflicting cases discussed by Dan). >>> >>> The problem Dan has been looking at, JDK-8049304, is about a "normal" >>> VM shutdown. 
In this case, the problem is that we believe it is safe >>> to delete the PerfData, because we've safepointed, and yet some thread >>> unexpectedly runs and attempts to touch the deleted data anyway. >>> >>> I think Dan's proposed fix (mostly) avoids the specific instance of >>> JDK-8129978, but doesn't solve the more general problem of abnormal >>> exit deleting the PerfData while some running thread is touching some >>> non-monitor-related part of that data. My proposal to leave it to the >>> OS to deal with memory cleanup on process exit would deal with this >>> case. >>> >>> I think Dan's proposed fix (mostly) avoids problems like JDK-8049304. >>> And the approach I've been talking about doesn't help at all for this >>> case. But I wonder if Dan's proposed fix can be improved. A "futile >>> wakeup" case doesn't seem to me like one which requires super-high >>> performance. Would it be ok, in the two problematic cases that Dan >>> identified, to use some kind of atomic / locking protocol with the >>> cleanup? Or is the comment for the counter increment in EnterI (and >>> only there) correct that it's important to avoid a lock or atomics >>> here (and presumably in ReenterI too). >>> >> >> I notice that EnteriI/ReenterI both end with OrderAccess::fence(). Can >> the potential update of _sync_FutileWakeups be delayed until that >> point, to take advantage of the fence to make the sync hole even smaller? > > Not easily with EnterI() since there is one optional optimization > between the OM_PERFDATA_OP(FutileWakeups, inc()) call and the > OrderAccess::fence() call and that would result in lost FutileWakeups > increments. > > Not easily in ReenterI(), the OM_PERFDATA_OP(FutileWakeups, inc()) call > is at the bottom of the for-loop and the OrderAccess::fence() call at > the end of the function is outside the loop. This would result in lost > FutileWakeups increments. 
> > So in ReenterI() the OM_PERFDATA_OP(FutileWakeups, inc()) call immediately > follows an OrderAccess::fence() call. Doesn't that make that increment as > "safe" as it can be without having a real lock? > > >> You've got a release() (and and short nap!) with the store in >> PerfDataManager::destroy() to try to close the window somewhat. > > Yes, I modeled that after: > > src/share/vm/runtime/perfMemory.cpp: > > 83 void PerfMemory::initialize() { > > 156 OrderAccess::release_store(&_initialized, 1); > 157 } > > >> But I think rather than the release_store() you used, you want a >> store, followed by a release(). release_store() puts a fence before >> the store to ensure earlier updates are seen before the current one, no? > > Yup, and I see I got my reasoning wrong. The code I modeled > is right because you want to flush all the inits and it's OK > if the _initialized transition from '0' -> '1' is lazily seen. > > For my shutdown use, we are transitioning from '1' -> '0' and > we need that to be seen proactively so: Nit: OrderAccess "barriers" enforce ordering constraints but don't in general provide any guarantees about visibility - ie they are not necessarily "flushes". So while it may be true on some platforms by virtue of the underlying barrier mechanism, in general they don't change when a write becomes visible and so there is nothing "proactive" about them. > OrderAccess::release_store(&_has_PerfData, 0); > OrderAccess::storeload(); I agree with Tom that the release is unnecessary - though harmless. The real ordering constraint here is that we preserve: _has_PerfData = 0; for which a storeload|storestore barrier after the write seems most appropriate. Though with the insertion of the sleep after the write there won't be any reordering anyway so explicit barriers seem redundant. 
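For anyone following along outside HotSpot, the ordering discussed above can be sketched with C++11 atomics. This is only an illustration under assumptions, not the code in the webrev: the names `has_perf_data`, `futile_wakeups`, `perf_init`, `perf_destroy` and `try_inc_futile_wakeups` are hypothetical stand-ins, and `std::atomic_thread_fence(std::memory_order_seq_cst)` is used as the closest portable analogue of `OrderAccess::storeload()`. The reader side still has the race window described in the thread; the sketch only shows where the plain store and the barrier sit.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-ins for the PerfData flag and one counter.
// None of these names come from HotSpot.
static std::atomic<int> has_perf_data{0};
static std::atomic<std::int64_t>* futile_wakeups{nullptr};

void perf_init() {
  futile_wakeups = new std::atomic<std::int64_t>(0);
  // Publish the flag after the counter exists; the 0 -> 1 transition
  // may be observed lazily by other threads.
  has_perf_data.store(1, std::memory_order_release);
}

// Returns true if the counter was bumped, false if the flag was already clear.
bool try_inc_futile_wakeups() {
  if (has_perf_data.load(std::memory_order_acquire) == 0) {
    return false;
  }
  // Race window: perf_destroy() may free the counter right here.
  // Deliberately "lossy": a relaxed load and store, not an atomic add.
  std::int64_t v = futile_wakeups->load(std::memory_order_relaxed);
  futile_wakeups->store(v + 1, std::memory_order_relaxed);
  return true;
}

std::int64_t read_futile_wakeups() {
  return futile_wakeups->load(std::memory_order_relaxed);
}

void perf_destroy() {
  // Plain store: there are no earlier writes to publish, so no release.
  has_perf_data.store(0, std::memory_order_relaxed);
  // Keep the store ordered before whatever follows (store-load ordering).
  std::atomic_thread_fence(std::memory_order_seq_cst);
  // The code under review naps briefly at this point to let in-flight
  // updaters finish; that sleep is omitted from this sketch.
  delete futile_wakeups;
  futile_wakeups = nullptr;
}
```

Single-threaded use is `perf_init()`, a few `try_inc_futile_wakeups()` calls, then `perf_destroy()`, after which further increments are refused. The fence constrains ordering only; as noted above, it does not make the cleared flag visible any sooner.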
Cheers, David ----- > which is modeled after _owner field transitions from non-zeo > -> NULL in ObjectMonitor.cpp > > >> >> Also, I think the comment above that release_store() could be >> clarified. It is fine as is if you're familiar with this bug report >> and discussion, but... I think it should explicitly say there is >> still a very small window for the lack of true synchronization to >> cause a failure. And perhaps that the release_store() (or >> store/release()) is not half of an acquire/release pair. > > Here's the existing comment: > > 286 // Clear the flag before we free the PerfData counters. Thus > begins > 287 // the race between this thread and another thread that has just > 288 // queried PerfDataManager::has_PerfData() and gotten back > 'true'. > 289 // The hope is that the other thread will finish its PerfData > 290 // manipulation before we free the memory. The two alternatives > 291 // are 1) leak the PerfData memory or 2) do some form of ordered > 292 // access before every PerfData operation. > > I think it pretty clearly states that there is still a race here. > And I think that option 2 covers that we're not doing completely > safe ordered access. I'm not sure how to make this comment more > clear, but if you have specific suggestions... > > Dan > > >> >> Tom > From jesper.wilhelmsson at oracle.com Mon Aug 31 13:56:54 2015 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Mon, 31 Aug 2015 15:56:54 +0200 Subject: FWD: RFR: 8134626 - Misc cleanups after generation array removal In-Reply-To: <55E07247.8020304@oracle.com> References: <55E07247.8020304@oracle.com> Message-ID: <55E45D26.5040000@oracle.com> Hi runtime, While working with the generation array removal I did a bunch of unrelated cleanups in the code I passed by. I split this out into a separate patch to keep the other changes clean. 
This is mostly GC code, but a few runtime files were touched as well and might cause merge conflicts with your favorite local repository, so I was advised to forward this RFR to the runtime list as well.

This change is split into three webrevs to make it easier to review. I intend to push all three as one change.

Bug: https://bugs.openjdk.java.net/browse/JDK-8134626

Increment 1: Renaming
http://cr.openjdk.java.net/~jwilhelm/8134626/webrev.00/inc1-renaming/

Changes the variable name gen to young_gen or old_gen depending on what the variable will contain. The few cases where the name gen is untouched are places where the variable actually can contain either generation.

Increment 2: Comments and indentation
http://cr.openjdk.java.net/~jwilhelm/8134626/webrev.00/inc2-comments_and_indentation/

* Fixes up alignment in lots of places.
* Inserts and removes empty lines where needed.
* Cleans up comments where we previously talked about older and younger generations. Fixes a few typos and clarifies some comments with respect to young/old generations and collections.
* Added spaces around some operators.

Increment 3: Code changes
http://cr.openjdk.java.net/~jwilhelm/8134626/webrev.00/inc3-code_changes/

* Merged strings that were split on several lines for no good reason.
* Added braces for if statements and for loops.
* Removed dead code.
* Moved variable initialization to initializer list in CollectedHeap constructor.
* Updated flag descriptions "youngest" -> "young".
Thanks, /Jesper From ron.durbin at oracle.com Mon Aug 31 16:10:19 2015 From: ron.durbin at oracle.com (Ron Durbin) Date: Mon, 31 Aug 2015 09:10:19 -0700 (PDT) Subject: RFR (M) round 2 8061999 Enhance VM option parsing to allow options to be specified In-Reply-To: <55E3A6E9.4080903@oracle.com> References: <32b8e18a-c363-4d5a-bb10-a54250cf4aa1@default> <55E07A7E.8000503@oracle.com> <55E3A6E9.4080903@oracle.com> Message-ID: <76c00b14-bb22-47ba-8a72-734d5081059c@default> David Thx for the review, I am reworking the code to free the strdup()'ed memory. Ron >-----Original Message----- >From: David Holmes >Sent: Sunday, August 30, 2015 6:59 PM >To: Ron Durbin >Cc: Daniel Daugherty; hotspot-runtime-dev at openjdk.java.net >Subject: Re: RFR (M) round 2 8061999 Enhance VM option parsing to allow options to be specified > >On 29/08/2015 1:13 AM, Daniel D. Daugherty wrote: >> On 8/28/15 7:56 AM, Ron Durbin wrote: >>> Here is the round 2 webrev for 8061999. >>> >>> Due to the large number of conflicts with other bug fixes in the cmd >>> options >>> area and the resulting refactoring of this fix, a delta webrev is not >>> provided >>> relative to round 1 because it wouldn't make any sense. >>> >>> Webrev link: http://cr.openjdk.java.net/~rdurbin/8061999_OCR2_JDK9_webrev >> >> src/share/vm/runtime/arguments.hpp >> No comments >> >> src/share/vm/runtime/arguments.cpp >> Thanks for switching the buffer parse algorithm to be buffer length >> terminated instead of relying on NULL termination. I think that >> makes the algorithm more robust in terms of buffer overruns and >> it should make dropping the OPTION_BUFFER_SIZE restriction in the >> future easier. >> >> L3757: *flags_file = strdup(tail); >> This should be 'os::strdup((char *)tail);' > >When do these strdup'd values get freed? I couldn't keep track of the >calling sequence. 
> >Thanks, >David > >> L3843: // If there's a VMOptionFile, parse that (also can set >> flags_file) >> Typo: 'VMOptionFile' -> 'VMOptionsFile' >> >> >> Thumbs up modulo the two minor changes above. I don't need >> to see another webrev for just the above two changes. >> >> Dan >> >> >>> >>> RFE link: https://bugs.openjdk.java.net/browse/JDK-8061999 >>> >>> This RFE allows a file to be specified that holds VM Options that >>> would otherwise be specified on the command line or in an environment >>> variable. >>> Only one options file may be specified on the command line and no >>> options file >>> may be specified in either of the following environment variables >>> "JAVA_TOOL_OPTIONS" or "_JAVA_OPTIONS". >>> >>> The options file feature supports all VM options currently supported on >>> the command line, except the options file option. The option to >>> specify an >>> options file is "-XX:VMOptionsFile=". >>> The options file feature supports an options file up to 1024 bytes in >>> size, >>> >>> This feature has been tested on: >>> OS: >>> Solaris, MAC, Windows, Linux >>> Tests: >>> Manual unit tests >>> JPRT with -testset hotspot (including the SQE proposed test >>> coverage for this feature.) >>> Aurora,(Big Apps, JTREG,Tonga), Runtime SVC Nightly >> From daniel.daugherty at oracle.com Mon Aug 31 16:58:04 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Aug 2015 10:58:04 -0600 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <55E3AE22.6050309@oracle.com> References: <55DCD94A.30705@oracle.com> <55DE915B.9020605@oracle.com> <55DF842E.1010407@oracle.com> <6C52FDE5-A4C0-4FC9-9114-83376C97FD88@oracle.com> <55E073F4.5050906@oracle.com> <55E08BFB.70507@oracle.com> <55E3AE22.6050309@oracle.com> Message-ID: <55E4879C.7090203@oracle.com> Hi David, Replies embedded below... On 8/30/15 7:30 PM, David Holmes wrote: > Hi Dan, > > On 29/08/2015 2:27 AM, Daniel D. 
Daugherty wrote: >> On 8/28/15 8:45 AM, Tom Benson wrote: >>> Hi, >>> One more pair of eyes on this. 8^) >> >> Hi Tom! >> >> Thanks for reviewing and welcome to the party... >> >> >>> >>> On 8/27/2015 8:16 PM, Kim Barrett wrote: >>>> On Aug 27, 2015, at 5:42 PM, Daniel D. Daugherty >>>> wrote: >>>>> Sorry for starting another e-mail thread fork in an already >>>>> complicated >>>>> review... >>>> OK, that was fascinating. No, really, I mean it. >>>> >>>> It made me realize that we've been arguing and talking past each other >>>> in part because we're really dealing with two distinct though closely >>>> related bugs here. >>>> >>>> I've been primarily thinking about the case where we're calling >>>> vm_abort / os::abort, where the we presently delete the PerfData >>>> memory even though there can be arbitrary other threads running. This >>>> was the case in JDK-8129978, which is how I got involved here in the >>>> first place. In that bug we were in vm_exit_during_initialization and >>>> had called perfMemory_exit when some thread attempted to inflate a >>>> monitor (which is not one of the conflicting cases discussed by Dan). >>>> >>>> The problem Dan has been looking at, JDK-8049304, is about a "normal" >>>> VM shutdown. In this case, the problem is that we believe it is safe >>>> to delete the PerfData, because we've safepointed, and yet some thread >>>> unexpectedly runs and attempts to touch the deleted data anyway. >>>> >>>> I think Dan's proposed fix (mostly) avoids the specific instance of >>>> JDK-8129978, but doesn't solve the more general problem of abnormal >>>> exit deleting the PerfData while some running thread is touching some >>>> non-monitor-related part of that data. My proposal to leave it to the >>>> OS to deal with memory cleanup on process exit would deal with this >>>> case. >>>> >>>> I think Dan's proposed fix (mostly) avoids problems like JDK-8049304. >>>> And the approach I've been talking about doesn't help at all for this >>>> case. 
But I wonder if Dan's proposed fix can be improved. A "futile >>>> wakeup" case doesn't seem to me like one which requires super-high >>>> performance. Would it be ok, in the two problematic cases that Dan >>>> identified, to use some kind of atomic / locking protocol with the >>>> cleanup? Or is the comment for the counter increment in EnterI (and >>>> only there) correct that it's important to avoid a lock or atomics >>>> here (and presumably in ReenterI too). >>>> >>> >>> I notice that EnteriI/ReenterI both end with OrderAccess::fence(). Can >>> the potential update of _sync_FutileWakeups be delayed until that >>> point, to take advantage of the fence to make the sync hole even >>> smaller? >> >> Not easily with EnterI() since there is one optional optimization >> between the OM_PERFDATA_OP(FutileWakeups, inc()) call and the >> OrderAccess::fence() call and that would result in lost FutileWakeups >> increments. >> >> Not easily in ReenterI(), the OM_PERFDATA_OP(FutileWakeups, inc()) call >> is at the bottom of the for-loop and the OrderAccess::fence() call at >> the end of the function is outside the loop. This would result in lost >> FutileWakeups increments. >> >> So in ReenterI() the OM_PERFDATA_OP(FutileWakeups, inc()) call >> immediately >> follows an OrderAccess::fence() call. Doesn't that make that >> increment as >> "safe" as it can be without having a real lock? >> >> >>> You've got a release() (and and short nap!) with the store in >>> PerfDataManager::destroy() to try to close the window somewhat. >> >> Yes, I modeled that after: >> >> src/share/vm/runtime/perfMemory.cpp: >> >> 83 void PerfMemory::initialize() { >> >> 156 OrderAccess::release_store(&_initialized, 1); >> 157 } >> >> >>> But I think rather than the release_store() you used, you want a >>> store, followed by a release(). release_store() puts a fence before >>> the store to ensure earlier updates are seen before the current one, >>> no? >> >> Yup, and I see I got my reasoning wrong. 
The code I modeled
>> is right because you want to flush all the inits and it's OK
>> if the _initialized transition from '0' -> '1' is lazily seen.
>>
>> For my shutdown use, we are transitioning from '1' -> '0' and
>> we need that to be seen proactively so:
>
> Nit: OrderAccess "barriers" enforce ordering constraints but don't in
> general provide any guarantees about visibility - ie they are not
> necessarily "flushes". So while it may be true on some platforms by
> virtue of the underlying barrier mechanism, in general they don't
> change when a write becomes visible and so there is nothing
> "proactive" about them.

My apologies for being sloppy with my wording. What I'm trying to do is put up a barrier so that once the deleting thread has changed the flag from '1' -> '0' another thread trying to read the flag won't see the '1' when it should see the '0'. In other words, I don't want the other thread's read of the flag to float above the setting of the flag to '0'.

>
>> OrderAccess::release_store(&_has_PerfData, 0);
>> OrderAccess::storeload();
>
> I agree with Tom that the release is unnecessary - though harmless.

I tried to switch the code to:

    OrderAccess::store(&_has_PerfData, 0);
    OrderAccess::storeload();

but that wouldn't compile on Solaris X64:

    Error: static OrderAccess::store(volatile int*, int) is not accessible from static PerfDataManager::destroy().

But then it occurred to me that I was trying too hard to use OrderAccess functions...

> The real ordering constraint here is that we preserve:
>
> _has_PerfData = 0;
>

I think I can switch to a straight "_has_PerfData = 0;" here and not use either OrderAccess::release_store() or OrderAccess::store().

I also reread OrderAccess.hpp again and convinced myself that an "OrderAccess::fence()" call is the only way to be sure here. I'm pretty sure that achieves 'storeload|storestore' barrier semantics...

Yes, I have a headache again...
:-) > for which a storeload|storestore barrier after the write seems most > appropriate. Though with the insertion of the sleep after the write > there won't be any reordering anyway so explicit barriers seem redundant. I didn't think that os::naked_short_sleep() would do anything to keep another thread's read from floating above the "_has_PerfData = 0;". What did I miss? My current plan: - switch from "OrderAccess::release_store(&_has_PerfData, 0);" to "_has_PerfData = 0;" - keep "OrderAccess::fence();" - retest and send out for another round of review Dan > > Cheers, > David > ----- > >> which is modeled after _owner field transitions from non-zeo >> -> NULL in ObjectMonitor.cpp >> >> >>> >>> Also, I think the comment above that release_store() could be >>> clarified. It is fine as is if you're familiar with this bug report >>> and discussion, but... I think it should explicitly say there is >>> still a very small window for the lack of true synchronization to >>> cause a failure. And perhaps that the release_store() (or >>> store/release()) is not half of an acquire/release pair. >> >> Here's the existing comment: >> >> 286 // Clear the flag before we free the PerfData counters. Thus >> begins >> 287 // the race between this thread and another thread that >> has just >> 288 // queried PerfDataManager::has_PerfData() and gotten back >> 'true'. >> 289 // The hope is that the other thread will finish its PerfData >> 290 // manipulation before we free the memory. The two >> alternatives >> 291 // are 1) leak the PerfData memory or 2) do some form of >> ordered >> 292 // access before every PerfData operation. >> >> I think it pretty clearly states that there is still a race here. >> And I think that option 2 covers that we're not doing completely >> safe ordered access. I'm not sure how to make this comment more >> clear, but if you have specific suggestions... 
>> >> Dan >> >> >>> >>> Tom >> From gerald.thornbrugh at oracle.com Mon Aug 31 19:27:52 2015 From: gerald.thornbrugh at oracle.com (Gerald Thornbrugh) Date: Mon, 31 Aug 2015 13:27:52 -0600 Subject: RFR: JDK-8134161 JVM is creating too many GC helper threads on T7/M7 linux/sparc platform Message-ID: <55E4AAB8.90908@oracle.com> Hi, There is a Linux/SPARC related issue in JDK8 and JDK9 where hotspot does not identify a SPARC T7/M7 platform correctly. Because hotspot does not identify the T7/M7 platform correctly it uses the default number of GC helper threads which is twice the number that is should use. The fix adds code to detect a M series platform from the /proc/cpuinfo file and sets the correct CPU features. Setting the correct features allow the correct number of GC helper threads to be allocated. The fix only impacts the Linux/SPARC code and does not impact any other platform/OS combination. I have completed successful JPRT test runs for each set of changes and have also verified that the change addresses the issue on Linux/SPARC M series hardware. Here is the bug: https://bugs.openjdk.java.net/browse/JDK-8134161 Here is the JDK9 webrev: http://cr.openjdk.java.net/~gthornbr/8134161/webrev/ Here is the JDK8 webrev: http://cr.openjdk.java.net/~gthornbr/8134674/webrev/ Please let me know if you have any questions or concerns. Thanks, Gerald Thornbrugh From coleen.phillimore at oracle.com Mon Aug 31 20:07:13 2015 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 31 Aug 2015 16:07:13 -0400 Subject: RFR (M) round 2 8061999 Enhance VM option parsing to allow options to be specified In-Reply-To: <32b8e18a-c363-4d5a-bb10-a54250cf4aa1@default> References: <32b8e18a-c363-4d5a-bb10-a54250cf4aa1@default> Message-ID: <55E4B3F1.3080803@oracle.com> Ron, I believe that the code in the insert function could be simpified as below. It doesn't need the count in the loop counter. 
3451   for (int count = 0, i = 0; count < length; i++) {

A simpler loop just counts the incoming arguments and pushes args_to_insert at position.

    for (int i = 0; i < args->nOptions; i++) {
      if (i == vm_args_position) {
        // insert the new options at position
        for (int j = 0; j < args_to_insert->nOptions; j++) {
          options->push(args_to_insert->options[j]);
        }
      } else {
        options->push(args->options[i]);
      }
    }

I don't see when this gets freed and you use the value before it gets back to the caller, so I don't see why it needs to be strdup'ed.

3775       // Save a copy of vm_options_file since we don't know
3776       // when args will get freed by the caller.
3777       *vm_options_file = os::strdup((char *)tail);

Also with -XX:Flags=

3755       // Save a copy of flags_file since we don't know
3756       // when args will get freed by the caller.
3757       *flags_file = strdup(tail);

These options are strdup-ed in ScopedVMInitArgs::set_args() and deallocated in the ~ScopedVMInitArgs destructor so should not be freed during the lifetime of the Arguments::parse() function. I think you said you found problems in testing with flags_file being deallocated but I don't see where that would have been.

thanks,
Coleen

On 8/28/15 9:56 AM, Ron Durbin wrote:
> Here is the round 2 webrev for 8061999.
>
> Due to the large number of conflicts with other bug fixes in the cmd options
> area and the resulting refactoring of this fix, a delta webrev is not provided
> relative to round 1 because it wouldn't make any sense.
>
> Webrev link: http://cr.openjdk.java.net/~rdurbin/8061999_OCR2_JDK9_webrev
>
> RFE link: https://bugs.openjdk.java.net/browse/JDK-8061999
>
> This RFE allows a file to be specified that holds VM Options that
> would otherwise be specified on the command line or in an environment variable.
> Only one options file may be specified on the command line and no options file
> may be specified in either of the following environment variables
> "JAVA_TOOL_OPTIONS" or "_JAVA_OPTIONS".
> > The options file feature supports all VM options currently supported on > the command line, except the options file option. The option to specify an > options file is "-XX:VMOptionsFile=". > The options file feature supports an options file up to 1024 bytes in size, > > This feature has been tested on: > OS: > Solaris, MAC, Windows, Linux > Tests: > Manual unit tests > JPRT with -testset hotspot (including the SQE proposed test coverage for this feature.) > Aurora,(Big Apps, JTREG,Tonga), Runtime SVC Nightly From daniel.daugherty at oracle.com Mon Aug 31 22:51:02 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Aug 2015 16:51:02 -0600 Subject: RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc() In-Reply-To: <55DCD94A.30705@oracle.com> References: <55DCD94A.30705@oracle.com> Message-ID: <55E4DA56.8030002@oracle.com> Greetings, I've updated the "fix" for this bug based on code review comments received in round 0. JDK-8049304 race between VM_Exit and _sync_FutileWakeups->inc() https://bugs.openjdk.java.net/browse/JDK-8049304 Webrev URL: http://cr.openjdk.java.net/~dcubed/8049304-webrev/1-jdk9-hs-rt/ The easiest way to re-review is to download the two patch files and view them in your favorite file merge tool: http://cr.openjdk.java.net/~dcubed/8049304-webrev/0-jdk9-hs-rt/hotspot.patch http://cr.openjdk.java.net/~dcubed/8049304-webrev/1-jdk9-hs-rt/hotspot.patch Testing: Aurora Adhoc RT-SVC nightly batch (in process) Aurora Adhoc vm.tmtools batch (in process) Kim's repro sequence for JDK-8049304 Kim's repro sequence for JDK-8129978 JPRT -testset hotspot Changes between round 0 and round 1: - add an 'is_safe' parameter to the OM_PERFDATA_OP macro; safepoint-safe callers can access _has_PerfData flag directly; non-safepoint-safe callers use a load-acquire to fetch the current _has_PerfData flag value - change PerfDataManager::destroy() to simply set _has_PerfData to zero (field is volatile) and then use a fence() to 
prevent any reordering of operations in any direction; it's only done once during VM shutdown so... - change perfMemory_exit() to only call PerfDataManager::destroy() when called at a safepoint and when the StatSample is not running; this means when the VM is aborting, we no longer have a race between the original crash report and this code path. I believe that I've addressed all comments from round 0. Thanks, in advance, for any comments, questions or suggestions. Dan On 8/25/15 3:08 PM, Daniel D. Daugherty wrote: > Greetings, > > I have a "fix" for a long standing race between JVM shutdown and the > JVM statistics subsystem: > > JDK-8049304 race between VM_Exit and _sync_FutileWakeups->inc() > https://bugs.openjdk.java.net/browse/JDK-8049304 > > Webrev URL: > http://cr.openjdk.java.net/~dcubed/8049304-webrev/0-jdk9-hs-rt/ > > Testing: Aurora Adhoc RT-SVC nightly batch > Aurora Adhoc vm.tmtools batch > Kim's repro sequence for JDK-8049304 > Kim's repro sequence for JDK-8129978 > JPRT -testset hotspot > > This "fix": > > - adds a volatile flag to record whether PerfDataManager is holding > data (PerfData objects) > - adds PerfDataManager::has_PerfData() to return the flag > - changes the Java monitor subsystem's use of PerfData to > check both allocation of the monitor subsystem specific > PerfData object and the new PerfDataManager::has_PerfData() > return value > > If the global 'UsePerfData' option is false, the system works as > it did before. If 'UsePerfData' is true (the default on non-embedded > systems), the Java monitor subsystem will allocate a number of > PerfData objects to record information. The objects will record > information about Java monitor subsystem until the JVM shuts down. > > When the JVM starts to shutdown, the new PerfDataManager flag will > change to false and the Java monitor subsystem will stop using the > PerfData objects. This is the new behavior. 
As noted in the comments > I added to the code, the race is still present; I'm just changing > the order and the timing to reduce the likelihood of the crash. > > Thanks, in advance, for any comments, questions or suggestions. > > Dan > > > From daniel.daugherty at oracle.com Mon Aug 31 22:59:58 2015 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 31 Aug 2015 16:59:58 -0600 Subject: RFR: JDK-8134161 JVM is creating too many GC helper threads on T7/M7 linux/sparc platform In-Reply-To: <55E4AAB8.90908@oracle.com> References: <55E4AAB8.90908@oracle.com> Message-ID: <55E4DC6E.5060807@oracle.com> On 8/31/15 1:27 PM, Gerald Thornbrugh wrote: > Hi, > > There is a Linux/SPARC related issue in JDK8 and JDK9 where hotspot > does not identify a SPARC T7/M7 > platform correctly. Because hotspot does not identify the T7/M7 > platform correctly it uses the default > number of GC helper threads which is twice the number that is should > use. The fix adds code to > detect a M series platform from the /proc/cpuinfo file and sets the > correct CPU features. Setting the > correct features allow the correct number of GC helper threads to be > allocated. > > The fix only impacts the Linux/SPARC code and does not impact any > other platform/OS combination. > > I have completed successful JPRT test runs for each set of changes and > have also verified that the change > addresses the issue on Linux/SPARC M series hardware. > > Here is the bug: > > https://bugs.openjdk.java.net/browse/JDK-8134161 > > Here is the JDK9 webrev: > > http://cr.openjdk.java.net/~gthornbr/8134161/webrev/ > src/os_cpu/linux_sparc/vm/vm_version_linux_sparc.cpp No comments. > Here is the JDK8 webrev: > > http://cr.openjdk.java.net/~gthornbr/8134674/webrev/ > src/os_cpu/linux_sparc/vm/vm_version_linux_sparc.cpp No comments. Thumbs up on both webrevs. Dan > > Please let me know if you have any questions or concerns. 
> > Thanks, > > Gerald Thornbrugh From gerald.thornbrugh at oracle.com Mon Aug 31 23:00:08 2015 From: gerald.thornbrugh at oracle.com (Gerald Thornbrugh) Date: Mon, 31 Aug 2015 17:00:08 -0600 Subject: RFR: JDK-8134161 JVM is creating too many GC helper threads on T7/M7 linux/sparc platform In-Reply-To: <55E4DC6E.5060807@oracle.com> References: <55E4AAB8.90908@oracle.com> <55E4DC6E.5060807@oracle.com> Message-ID: <55E4DC78.7030908@oracle.com> Hi Dan, Thanks! Jerry > On 8/31/15 1:27 PM, Gerald Thornbrugh wrote: >> Hi, >> >> There is a Linux/SPARC related issue in JDK8 and JDK9 where hotspot >> does not identify a SPARC T7/M7 >> platform correctly. Because hotspot does not identify the T7/M7 >> platform correctly it uses the default >> number of GC helper threads which is twice the number that is should >> use. The fix adds code to >> detect a M series platform from the /proc/cpuinfo file and sets the >> correct CPU features. Setting the >> correct features allow the correct number of GC helper threads to be >> allocated. >> >> The fix only impacts the Linux/SPARC code and does not impact any >> other platform/OS combination. >> >> I have completed successful JPRT test runs for each set of changes >> and have also verified that the change >> addresses the issue on Linux/SPARC M series hardware. >> >> Here is the bug: >> >> https://bugs.openjdk.java.net/browse/JDK-8134161 >> >> Here is the JDK9 webrev: >> >> http://cr.openjdk.java.net/~gthornbr/8134161/webrev/ >> > > src/os_cpu/linux_sparc/vm/vm_version_linux_sparc.cpp > No comments. > > >> Here is the JDK8 webrev: >> >> http://cr.openjdk.java.net/~gthornbr/8134674/webrev/ >> > > src/os_cpu/linux_sparc/vm/vm_version_linux_sparc.cpp > No comments. > > Thumbs up on both webrevs. > > Dan > > > >> >> Please let me know if you have any questions or concerns. 
>> >> Thanks, >> >> Gerald Thornbrugh > From vladimir.kozlov at oracle.com Mon Aug 31 23:01:37 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 31 Aug 2015 16:01:37 -0700 Subject: RFR: JDK-8134161 JVM is creating too many GC helper threads on T7/M7 linux/sparc platform In-Reply-To: <55E4AAB8.90908@oracle.com> References: <55E4AAB8.90908@oracle.com> Message-ID: <55E4DCD1.3060102@oracle.com> Looks good. Thanks, Vladimir On 8/31/15 12:27 PM, Gerald Thornbrugh wrote: > Hi, > > There is a Linux/SPARC related issue in JDK8 and JDK9 where hotspot does > not identify a SPARC T7/M7 > platform correctly. Because hotspot does not identify the T7/M7 > platform correctly it uses the default > number of GC helper threads which is twice the number that is should > use. The fix adds code to > detect a M series platform from the /proc/cpuinfo file and sets the > correct CPU features. Setting the > correct features allow the correct number of GC helper threads to be > allocated. > > The fix only impacts the Linux/SPARC code and does not impact any other > platform/OS combination. > > I have completed successful JPRT test runs for each set of changes and > have also verified that the change > addresses the issue on Linux/SPARC M series hardware. > > Here is the bug: > > https://bugs.openjdk.java.net/browse/JDK-8134161 > > Here is the JDK9 webrev: > > http://cr.openjdk.java.net/~gthornbr/8134161/webrev/ > > > Here is the JDK8 webrev: > > http://cr.openjdk.java.net/~gthornbr/8134674/webrev/ > > > Please let me know if you have any questions or concerns. 
> > Thanks, > > Gerald Thornbrugh From mikhailo.seledtsov at oracle.com Mon Aug 31 23:01:40 2015 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Mon, 31 Aug 2015 16:01:40 -0700 Subject: RFR(S): JDK-8133180 - [TESTBUG] runtime/SharedArchiveFile/SharedStrings.java failed with WhiteBox.class : no such file or directory In-Reply-To: <55E0EA1A.3080405@oracle.com> References: <55CE5C1C.7050002@oracle.com> <402AD339-8F17-44CB-9B91-36C27DE6437F@oracle.com> <55D28802.4070306@oracle.com> <55E0B89A.7060304@oracle.com> <55E0EA1A.3080405@oracle.com> Message-ID: <55E4DCD4.90206@oracle.com> Jiangli, When you have a chance could you please run a final review on the last change? http://cr.openjdk.java.net/~mseledtsov/8133180.01/ Also, could a Capital-R Reviewer review this change? (I believe I need a "R" reviewer for this change) Thank you, Misha On 8/28/2015 4:09 PM, Mikhailo Seledtsov wrote: > Jiangli, > > Thank you for a quick reply. > > I have updated the webrev accordingly: > http://cr.openjdk.java.net/~mseledtsov/8133180.01/ > > Testing: > - ran the test with b75 (internal and openjdk) > - ran the "easy reproducer" discussed in a bug > rm -Rf JT* test > jtreg //hs-rt/hotspot/test/testlibrary_tests/ctw/JarDirTest.java > jtreg > //hs-rt/hotspot/test/runtime/SharedArchiveFile/SharedStrings.java > - RBT of CDS tests - in progress > > Thank you, > Misha > > On 8/28/2015 2:48 PM, Jiangli Zhou wrote: >> Hi Misha, >> >> Your proposal sounds good. >> >> Thanks, >> Jiangli >> >>> On Aug 28, 2015, at 12:38 PM, Mikhailo Seledtsov >>> wrote: >>> >>> Hi Jiangli, >>> >>> I am back to this task after working on other stuff. >>> As I started experimenting with your suggestions, I have found one >>> potential problem. >>> The fallback strategy that you suggest could create an ambiguity. >>> >>> For instance, there may be cases where both >>> @build+ClassFileInstaller and @compile is used. 
The CFI will place
>>> classes in the current working directory, aka ".", whereas @compile
>>> will place them under JTwork/../currentTest/classes/..
>>> In such cases, the test author will probably need to create 2 jars,
>>> one for classes in the local directory, and another for classes in
>>> the "@compile", and may want to be explicit what goes where, instead
>>> of JarBuilder trying to guess and leading to possible ambiguity and
>>> confusion.
>>>
>>> The key to solving this problem is to never assume the location of
>>> classes produced by @build. If using @build, the classes MUST be
>>> copied to the "." using ClassFileInstaller, since @build does not
>>> guarantee the presence of these classes in a specific location.
>>> Perhaps, the best solution here is to require all classes used by
>>> the JarBuilder to be in the "." directory, and require the test
>>> author to use ClassFileInstaller when necessary. In such cases, if
>>> the class is not in the test's current directory, the error will
>>> become obvious immediately instead of under some occasional
>>> conditions in the test system. It will also greatly simplify the
>>> tests and will make them more robust.
>>>
>>> Please let me know comments/objections on such approach.
>>>
>>> Thank you,
>>> Misha
>>>
>>>
>>> On 8/17/2015 6:18 PM, Mikhailo Seledtsov wrote:
>>>> Hi Jiangli,
>>>>
>>>> Thank you for your suggestion; it should make testing more robust.
>>>> I will try your suggestion; if I do not see any undesired side
>>>> effects I will rerun full testset and re-submit the updated review.
>>>>
>>>> Thank you,
>>>> Misha
>>>>
>>>> On 8/14/2015 5:43 PM, Jiangli Zhou wrote:
>>>>> Hi Misha,
>>>>>
>>>>> I have one suggestion. Instead of searching either the "work" or
>>>>> "classes" directory depending on the "classesInWorkDir" argument,
>>>>> how about searching both directories? So the
>>>>> BasicJarBuilder.build() method would use the "classes" directory
>>>>> first, then try the "work"
directory if the file cannot be found
>>>>> in "classes". That might be a more robust solution.
>>>>>
>>>>> Thanks,
>>>>> Jiangli
>>>>>
>>>>> On Aug 14, 2015, at 2:22 PM, Mikhailo Seledtsov
>>>>> wrote:
>>>>>
>>>>>> Please review this fix to the CDS test bug. See the comments in
>>>>>> the bug for details.
>>>>>>
>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8133180
>>>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8133180.00/
>>>>>> Testing:
>>>>>> - ran the reproducer discussed in the bug description
>>>>>> rm -Rf JT* test
>>>>>> jtreg
>>>>>> /media/data3/hg/9/work01/hs-rt/hotspot/test/testlibrary_tests/ctw/JarDirTest.java
>>>>>> jtreg
>>>>>> /media/data3/hg/9/work01/hs-rt/hotspot/test/runtime/SharedArchiveFile/SharedStrings.java
>>>>>>
>>>>>> - ran the CDS tests in concurrent mode
>>>>>> - running CDS tests via multi-platform build-and-test system
>>>>>> (in progress)
>>>>>>
>>>>>> Thank you,
>>>>>> Misha
>

From bdrr1561 at gmail.com Mon Aug 3 13:34:33 2015
From: bdrr1561 at gmail.com (But)
Date: Mon, 03 Aug 2015 13:34:33 -0000
Subject: [9] RFR (M) 8054888: Runtime: Add Diagnostic Command that prints the class hierarchy
Message-ID: <8884700C-747F-43C2-B5C6-90246BDF3785@gmail.com>

???? ???? ??? iPhone ????? ???

From bdrr1561 at gmail.com Wed Aug 5 04:21:17 2015
From: bdrr1561 at gmail.com (Bdr Algamde)
Date: Wed, 05 Aug 2015 04:21:17 -0000
Subject: [9] RFR (M) 8054888: Runtime: Add Diagnostic Command that prints the class hierarchy
In-Reply-To: <8884700C-747F-43C2-B5C6-90246BDF3785@gmail.com>
References: <8884700C-747F-43C2-B5C6-90246BDF3785@gmail.com>
Message-ID:

??? ???????? ? ?????? ????, But ???:
>
>
> ???? ???? ??? iPhone ????? ???
>
-- ??? ??????? ?? Gmail Mobile

From ydwchina at gmail.com Fri Aug 21 07:37:14 2015
From: ydwchina at gmail.com (deven you)
Date: Fri, 21 Aug 2015 07:37:14 -0000
Subject: RFR: JDK-8080511 - Refresh of jimage support
In-Reply-To:
References:
Message-ID:

Hi Jim,

I have one question.
I see Hotspot already supports decompressing compressed resources, and
there is a method newCompressedResource in
jdk/src/java.base/share/classes/jdk/internal/jimage/ResourcePool.java for
creating a compressed resource, but I did not find any API that uses this
method, nor any compressed resource in bootmodules.jimage.

What I want to know is:

1. If I want to compress one resource in a certain module, what are the
steps? I assume I need to write some code which first gets the plugin and
compressed buffer and then passes them to newCompressedResource? If there
are compressed zip or jar files in a certain module, how does the relevant
code deal with this condition?

2. Is there any plan for bootmodules.jimage or other jimage files to
contain such compressed resources?

Thanks a lot!

2015-06-18 8:08 GMT+08:00 Jim Laskey (Oracle) :

> https://bugs.openjdk.java.net/browse/JDK-8080511
>
> This is a long overdue refresh of the jimage support in the JDK9-dev
> repo. This includes native support for reading jimage files, improved
> jrt-fs (java runtime file system) support for retrieving modules and
> packages from the runtime, and improved performance for langtools in the
> presence of jrt-fs.
>
> http://cr.openjdk.java.net/~jlaskey/hs-rt-jimage/webrev-top
> http://cr.openjdk.java.net/~jlaskey/hs-rt-jimage/webrev-jdk
> http://cr.openjdk.java.net/~jlaskey/hs-rt-jimage/webrev-hotspot
> http://cr.openjdk.java.net/~jlaskey/hs-rt-jimage/webrev-langtools
>
>
> Details:
>
> - jrt-fs provides access, via the nio FileSystem API, to the classes in a
> .jimage file, organized by module or by package.
> - Shared code for jimage support converted to native.
Currently residing
> in hotspot, but will migrate to its own jdk library
> https://bugs.openjdk.java.net/browse/JDK-8087181
> - A new archive abstraction for class/resource sources.
> - java based implementation layer for jimage reading to allow backport to
> JDK8 (jrt-fs.jar - IDE support.)
> - JNI support for jimage into hotspot.
> - White box tests written to exercise native jimage support.
>
>
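
For readers following the thread, here is a minimal sketch of the kind of jrt-fs access the Details list describes (classes reachable through the nio FileSystem API, organized by module). It assumes a JDK 9 or later runtime in which the `jrt:/` file system exposes a `/modules` directory; the layout in the 2015 webrev may have differed. The class and method names (`JrtList`, `moduleNames`) are hypothetical and are not taken from the webrev.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, for illustration only.
class JrtList {

    // Returns the sorted names of the entries directly under /modules
    // in the runtime's jrt file system (assumed layout: JDK 9 or later).
    static List<String> moduleNames() {
        // The jrt file system is built into the running JDK and is looked up by URI.
        FileSystem jrt = FileSystems.getFileSystem(URI.create("jrt:/"));
        try (Stream<Path> entries = Files.list(jrt.getPath("/modules"))) {
            return entries
                    .map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            // Rethrow unchecked so callers need no checked-exception handling.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Print one module name per line.
        moduleNames().forEach(System.out::println);
    }
}
```

Individual class files can be read the same way, for example with `Files.readAllBytes` on a path below `/modules/java.base`, again assuming that layout.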