From per.liden at oracle.com Mon Jun 2 08:38:44 2014 From: per.liden at oracle.com (Per Liden) Date: Mon, 02 Jun 2014 10:38:44 +0200 Subject: RFR(S): 8040807: G1: Enable G1CollectedHeap::stop() In-Reply-To: <538473BB.9070300@oracle.com> References: <537B43BC.2090004@oracle.com> <537B6A8A.5030607@oracle.com> <537DF73D.6080204@oracle.com> <538473BB.9070300@oracle.com> Message-ID: <538C3814.9000906@oracle.com> Ping! /Per On 05/27/2014 01:15 PM, Per Liden wrote: > Hi, > > I did some additional testing and eyeballing of this fix and noticed > that it would be a good idea to also tell concurrent mark to abort, > otherwise we will always wait until concurrent mark has finished, which > is unnecessary (and could potentially take some time if the live set is > large). So, I added a call to _cm->set_has_aborted() to abort any > ongoing concurrent mark. > > Updated webrev: > http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ > > Diff against previous webrev: > http://cr.openjdk.java.net/~pliden/8040807/webrev.diff_0vs1/ > > Testing: > Wrote a simple test to provoke a concurrent mark followed by an > immediate exit. With the first version of the patch, we would always > wait until concurrent mark completes. Now it will instead show a > concurrent-mark-abort, which happens much earlier. > > /Per > > On 05/22/2014 03:10 PM, Per Liden wrote: >> Thanks Jon! >> >> /Per >> >> On 2014-05-20 16:45, Jon Masamitsu wrote: >>> Looks good. >>> >>> Reviewed. >>> >>> Jon >>> >>> On 05/20/2014 04:59 AM, Per Liden wrote: >>>> Looking for a couple of reviews of this patch. >>>> >>>> Summary: This patch re-enables the controlled stopping of G1's >>>> concurrent threads at VM shutdown. This could potentially cause hangs >>>> during VM shutdown because the G1 marking threads could get stuck in >>>> various places and fail to terminate. JDK-8040803 and JDK-8040804 >>>> fixed these issues, so this is the final step to re-enable the actual >>>> stopping of those threads. This patch also moves the call to >>>> CollectedHeap::stop() a few lines down to group the GC related stuff >>>> together. It also adjusts/removes some comments that are no longer >>>> correct. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8040807 >>>> Webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.0/ >>>> >>>> Testing: >>>> - GC nightlies. 5 tests in this suite used to timeout because of the >>>> issue with hanging threads. They now pass. >>>> - JPRT >>>> >>>> Thanks! >>>> /Per >>> >> > From serkanozal86 at hotmail.com Sun Jun 1 18:42:56 2014 From: serkanozal86 at hotmail.com (Serkan Özal) Date: Sun, 1 Jun 2014 21:42:56 +0300 Subject: FW: Hiding Class Definitions from Compacting At GC Cycle In-Reply-To: References: Message-ID: Hi all, I am not sure whether this group is the right target for this mail, but I don't know a better one to ask :) I am currently working on an OffHeap solution and I have a problem with the "Compact" phase of GC. As I see it, at the "Compact" phase the location of classes may be changed. I tried class pinning with JNI via the "NewGlobalRef" method, but it doesn't prevent compacting. As I understand, it only hides the object from being garbage collected. In brief, is there any way to prevent compacting of a specific class definition (or object) at a GC cycle? Is there any bit, offset or field (such as mark_oop) in the object header that fully prevents a specific object or class from being compacted by GC? Thanks in advance.
-- Serkan ÖZAL -------------- next part -------------- An HTML attachment was scrubbed... URL: From bengt.rutisson at oracle.com Mon Jun 2 14:30:34 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Mon, 02 Jun 2014 16:30:34 +0200 Subject: RFR (M): JDK-8043239: G1: Missing post barrier in processing of j.l.ref.Reference objects Message-ID: <538C8A8A.1080309@oracle.com> Hi all, Can I have a couple of reviews for this change? http://cr.openjdk.java.net/~brutisso/8043239/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8043239 As described in the bug report the reference processor was missing a write barrier call when manipulating the discovered list. This has always been the case but it was hidden because at the end of the reference processing we went through the complete discovered list and dirtied all the missed cards because we did an (unnecessary) write barrier when we set the next field to be a self pointer pointing back at the reference object itself. The write barrier for setting the next field was removed since it was not needed, but that revealed the current bug. After some discussions and prototyping we came to the conclusion that there may be more barriers missing and that it is difficult to get the dirtying done the way our verification code assumes. A simpler solution seems to be to free the reference processing of all barriers and instead just make sure that we dirty all the right cards in the last pass. The proposed fix thus re-introduces the post barrier when we iterate over the discovered list. This time it uses the discovered field for the barrier to be more explicit about what is going on. Testing: JPRT, Kitchensink, 5 days GC test suite SPECjbb2013 Ad-hoc aurora run Specific reproducer that illustrated the problem. The specific reproducer was really good to pinpoint the problem but is hard to turn into a JTreg test. Many thanks go to StefanK for helping out with creating the reproducer. Thanks, Bengt
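For readers following the barrier discussion in this thread: a post (write) barrier in a generational collector records which part of the heap was just mutated, typically by dirtying a byte in a card table, so that a later scan can find references the collector would otherwise miss. Below is a minimal, self-contained Java sketch of that general idea, and of why a store of NULL can safely skip the barrier. It illustrates the technique only; it is not HotSpot's implementation, and every name in it is invented.

// Illustrative sketch of a card-marking post barrier (not HotSpot code).
public class CardTableSketch {
    static final int CARD_SHIFT = 9;   // 512-byte cards, a common choice
    static final byte CLEAN = 0, DIRTY = 1;

    final byte[] cards;

    CardTableSketch(int heapBytes) {
        cards = new byte[(heapBytes >> CARD_SHIFT) + 1];
    }

    // Barriered store: dirty the card covering the written field so a
    // later remembered-set scan re-examines it for missed references.
    void writeWithPostBarrier(int fieldAddress, Object value) {
        // ... the actual reference store would happen here ...
        cards[fieldAddress >> CARD_SHIFT] = DIRTY;
    }

    // Raw store: no card is dirtied. This is safe when storing null,
    // because a null field cannot hold a reference the GC could miss.
    void writeRaw(int fieldAddress, Object value) {
        // ... the actual reference store would happen here ...
    }
}

The bug described above is the dual case: a real reference store on the discovered list was done without the barrier, so its card stayed clean and a later scan could miss it.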
From per.liden at oracle.com Mon Jun 2 14:57:42 2014 From: per.liden at oracle.com (Per Liden) Date: Mon, 02 Jun 2014 16:57:42 +0200 Subject: RFR (M): JDK-8043239: G1: Missing post barrier in processing of j.l.ref.Reference objects In-Reply-To: <538C8A8A.1080309@oracle.com> References: <538C8A8A.1080309@oracle.com> Message-ID: <538C90E6.5080309@oracle.com> Looks good to me Bengt. Even if this means we sometimes dirty too many cards this looks like a much less error-prone approach, which I like. /Per On 06/02/2014 04:30 PM, Bengt Rutisson wrote: > > Hi all, > > Can I have a couple of reviews for this change? > > http://cr.openjdk.java.net/~brutisso/8043239/webrev.00/ > > https://bugs.openjdk.java.net/browse/JDK-8043239 > > As described in the bug report the reference processor was missing a > write barrier call when manipulating the discovered list. This has > always been the case but it was hidden because at the end of the > reference processing we went through the complete discovered list and > dirtied all the missed cards because we did an (unnecessary) write > barrier when we set the next field to be a self pointer > pointing back at the reference object itself. > > The write barrier for setting the next field was removed since it was > not needed, but that revealed the current bug. After some discussions > and prototyping we came to the conclusion that there may be more > barriers missing and that it is difficult to get the dirtying done the > way our verification code assumes. A simpler solution seems to be to > free the reference processing of all barriers and instead just make sure > that we dirty all the right cards in the last pass. > > The proposed fix thus re-introduces the post barrier when we iterate > over the discovered list. This time it uses the discovered field for the > barrier to be more explicit about what is going on. > > Testing: > JPRT, > Kitchensink, 5 days > GC test suite > SPECjbb2013 > Ad-hoc aurora run > Specific reproducer that illustrated the problem. > > The specific reproducer was really good to pinpoint the problem but is > hard to turn into a JTreg test. Many thanks go to StefanK for helping > out with creating the reproducer. > > Thanks, > Bengt From bengt.rutisson at oracle.com Mon Jun 2 15:22:30 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Mon, 02 Jun 2014 17:22:30 +0200 Subject: RFR (M): JDK-8043239: G1: Missing post barrier in processing of j.l.ref.Reference objects In-Reply-To: <538C90E6.5080309@oracle.com> References: <538C8A8A.1080309@oracle.com> <538C90E6.5080309@oracle.com> Message-ID: <538C96B6.6020705@oracle.com> On 6/2/14 4:57 PM, Per Liden wrote: > Looks good to me Bengt. > > Even if this means we sometimes dirty too many cards this looks like a > much less error-prone approach, which I like. Thanks for the quick review, Per! Bengt > > /Per > > On 06/02/2014 04:30 PM, Bengt Rutisson wrote: >> >> Hi all, >> >> Can I have a couple of reviews for this change? >> >> http://cr.openjdk.java.net/~brutisso/8043239/webrev.00/ >> >> https://bugs.openjdk.java.net/browse/JDK-8043239 >> >> As described in the bug report the reference processor was missing a >> write barrier call when manipulating the discovered list. This has >> always been the case but it was hidden because at the end of the >> reference processing we went through the complete discovered list and >> dirtied all the missed cards because we did an (unnecessary) write >> barrier when we set the next field to be a self pointer >> pointing back at the reference object itself. >> >> The write barrier for setting the next field was removed since it was >> not needed, but that revealed the current bug. After some discussions >> and prototyping we came to the conclusion that there may be more >> barriers missing and that it is difficult to get the dirtying done the >> way our verification code assumes. A simpler solution seems to be to >> free the reference processing of all barriers and instead just make sure >> that we dirty all the right cards in the last pass. >> >> The proposed fix thus re-introduces the post barrier when we iterate >> over the discovered list. This time it uses the discovered field for the >> barrier to be more explicit about what is going on. >> >> Testing: >> JPRT, >> Kitchensink, 5 days >> GC test suite >> SPECjbb2013 >> Ad-hoc aurora run >> Specific reproducer that illustrated the problem. >> >> The specific reproducer was really good to pinpoint the problem but is >> hard to turn into a JTreg test. Many thanks go to StefanK for helping >> out with creating the reproducer.
>> >> Thanks, >> Bengt > From rednaxelafx at gmail.com Mon Jun 2 18:27:52 2014 From: rednaxelafx at gmail.com (Krystal Mok) Date: Mon, 2 Jun 2014 11:27:52 -0700 Subject: Re: FW: Hiding Class Definitions from Compacting At GC Cycle In-Reply-To: References: Message-ID: Hi Serkan, Taobao developed something called "GCIH" (GC-Invisible Heap), which is also an off-heap solution, that might be similar to what you're trying to do. I was a part of the effort when I worked there. The JVM part of the source code of a very early version of the GCIH is available here: http://jvm.taobao.org/images/4/49/Jvm_gcih.patch We record the high-water mark whenever we touch a PermGen object when moving objects into GCIH:

+ if (p->is_klass() || p->is_perm()) {
+   if (GCInvisibleHeap::_top_klass_addr < p) {
+     GCInvisibleHeap::_top_klass_addr = p;
+   }
+   return;
+ }

And then there were multiple ways to do things. In this version of GCIH we would traverse all objects in GCIH to fix up their klass pointers after the PermGen has been compacted. There was another version that would simply prevent the GC from compacting the part of PermGen below our high-water mark. I'm not sure how it evolved after I left, but the implementation is much more stable now, so they might have come up with a better way to do it. Nonetheless, all these solutions require customizing the JVM internals, which might not be the thing you want to do. If you target your off-heap solution only at Java 8 or above, and only at HotSpot, however, then you don't have to worry about metadata objects moving around (at least for now). That's because the PermGen is removed from GC and moved into a piece of native memory called "Metaspace", so they're not subject to GC compaction anymore. I should mention both JRockit and IBM J9 don't have a PermGen even before Java 8, and their metadata objects are not subject to compaction either. I guess you're trying to be JVM-implementation-agnostic here, but since not all JVMs do compaction, and not all JVMs support object pinning, there's no cross-JVM compatible API that allows you to prevent metadata objects from being compacted. - Kris On Sun, Jun 1, 2014 at 11:42 AM, Serkan Özal wrote: > Hi all, > > I am not sure whether this group is the right target for this mail, but I > don't know a better one to ask :) > > I am currently working on an OffHeap solution and I have a problem with the > "Compact" phase of GC. As I see it, at the "Compact" phase the location of classes may > be changed. I tried class pinning with JNI via the "NewGlobalRef" method but it > doesn't prevent compacting. As I understand, it only hides the object from > being garbage collected. > In brief, is there any way to prevent compacting of a specific class > definition (or object) at a GC cycle? Is there any bit, offset or field (such as > mark_oop) in the object header that fully prevents a specific object or class > from being compacted by GC? > > Thanks in advance. > > -- > > Serkan ÖZAL > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
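As a concrete footnote to Kris' answer: the small program below illustrates, on HotSpot, that a strong reference (which is what a JNI NewGlobalRef amounts to) keeps an object alive but does not keep it in place. It peeks at raw reference bits through sun.misc.Unsafe, so it is deliberately implementation-dependent, and it assumes a 64-bit HotSpot VM started with -XX:-UseCompressedOops so that a reference occupies a plain 64-bit word. This is a demonstration sketch only, not something to rely on in production code.

import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class AddressMoves {
    public static void main(String[] args) throws Exception {
        // Standard reflective access to the Unsafe singleton.
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        Object o = new Object();
        Object[] holder = new Object[] { o };
        long offset = unsafe.arrayBaseOffset(Object[].class);

        long before = unsafe.getLong(holder, offset); // raw "address" bits of o
        System.gc();                                  // may relocate o (e.g. copying/compaction)
        long after = unsafe.getLong(holder, offset);

        System.out.println("before GC: 0x" + Long.toHexString(before));
        System.out.println("after  GC: 0x" + Long.toHexString(after));
        // If the two values differ, the object was moved; the strong
        // reference kept it alive, but did not pin its location.
    }
}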
From jon.masamitsu at oracle.com Mon Jun 2 21:22:14 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 02 Jun 2014 14:22:14 -0700 Subject: RFR(S): 8040807: G1: Enable G1CollectedHeap::stop() In-Reply-To: <538C3814.9000906@oracle.com> References: <537B43BC.2090004@oracle.com> <537B6A8A.5030607@oracle.com> <537DF73D.6080204@oracle.com> <538473BB.9070300@oracle.com> <538C3814.9000906@oracle.com> Message-ID: <538CEB06.7050707@oracle.com> On 06/02/2014 01:38 AM, Per Liden wrote: > Ping! > > /Per > > On 05/27/2014 01:15 PM, Per Liden wrote: >> Hi, >> >> I did some additional testing and eyeballing of this fix and noticed >> that it would be a good idea to also tell concurrent mark to abort, >> otherwise we will always wait until concurrent mark has finished, which >> is unnecessary (and could potentially take some time if the live set is >> large). So, I added a call to _cm->set_has_aborted() to abort any >> ongoing concurrent mark. >> >> Updated webrev: >> http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ >> >> Diff against previous webrev: >> http://cr.openjdk.java.net/~pliden/8040807/webrev.diff_0vs1/ Looks good. Reviewed. Jon >> >> Testing: >> Wrote a simple test to provoke a concurrent mark followed by an >> immediate exit. With the first version of the patch, we would always >> wait until concurrent mark completes. Now it will instead show a >> concurrent-mark-abort, which happens much earlier. >> >> /Per >> >> On 05/22/2014 03:10 PM, Per Liden wrote: >>> Thanks Jon! >>> >>> /Per >>> >>> On 2014-05-20 16:45, Jon Masamitsu wrote: >>>> Looks good. >>>> >>>> Reviewed. >>>> >>>> Jon >>>> >>>> On 05/20/2014 04:59 AM, Per Liden wrote: >>>>> Looking for a couple of reviews of this patch. >>>>> >>>>> Summary: This patch re-enables the controlled stopping of G1's >>>>> concurrent threads at VM shutdown. This could potentially cause hangs >>>>> during VM shutdown because the G1 marking threads could get stuck in >>>>> various places and fail to terminate. JDK-8040803 and JDK-8040804 >>>>> fixed these issues, so this is the final step to re-enable the actual >>>>> stopping of those threads. This patch also moves the call to >>>>> CollectedHeap::stop() a few lines down to group the GC related stuff >>>>> together. It also adjusts/removes some comments that are no longer >>>>> correct. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8040807 >>>>> Webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.0/ >>>>> >>>>> Testing: >>>>> - GC nightlies. 5 tests in this suite used to timeout because of the >>>>> issue with hanging threads. They now pass. >>>>> - JPRT >>>>> >>>>> Thanks! >>>>> /Per >>>> >>> >> > From vladimir.kozlov at oracle.com Mon Jun 2 21:55:15 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 02 Jun 2014 14:55:15 -0700 Subject: RFR(XS) : 8044575 : testlibrary_tests/whitebox/vm_flags/UintxTest.java failed: assert(!res || TypeEntriesAtCall::arguments_profiling_enabled()) failed: no profiling of arguments In-Reply-To: <538CE366.90806@oracle.com> References: <538CE366.90806@oracle.com> Message-ID: <538CF2C3.8090107@oracle.com> Hi Igor, Looks good to me but I would ask GC group to comment on this change.
Thanks, Vladimir On 6/2/14 1:49 PM, Igor Ignatyev wrote: > webrev: http://cr.openjdk.java.net/~iignatyev/8044575/webrev.00/ > 4 lines changed: 0 ins; 2 del; 2 mod; > > Hi all, > > Please review patch: > > Problem: > the test changes 'TypeProfileLevel' via WhiteBox during execution, but > 'TypeProfileLevel' isn't supposed to be changed and there's the asserts > based on that. the test w/ '-Xcomp and -XX:-TieredCompilation' triggers > one of these asserts. > > Fix: > - as a flag to change, the test uses 'VerifyGCStartAt' instead of > 'TypeProfileLevel'. 'VerifyGCStartAt' is safe to change during execution > - removed 'System.out.println' which was left by accident > > jbs: https://bugs.openjdk.java.net/browse/JDK-8044575 > testing: failing tests locally w/ different flags combinations From jon.masamitsu at oracle.com Mon Jun 2 22:25:43 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 02 Jun 2014 15:25:43 -0700 Subject: RFR (M): JDK-8043239: G1: Missing post barrier in processing of j.l.ref.Reference objects In-Reply-To: <538C8A8A.1080309@oracle.com> References: <538C8A8A.1080309@oracle.com> Message-ID: <538CF9E7.6020301@oracle.com> Bengt, http://cr.openjdk.java.net/~brutisso/8043239/webrev.00/src/share/vm/memory/referenceProcessor.cpp.frames.html The change to always use "set_next_raw()" here 520 java_lang_ref_Reference::set_next_raw(_ref, NULL); 521 } else { 522 java_lang_ref_Reference::set_next(_ref, NULL); 523 } was always the correct thing to use? Does not have to do with extra / unneeded write barrier? Looks good. Reviewed. Jon On 06/02/2014 07:30 AM, Bengt Rutisson wrote: > > Hi all, > > Can I have a couple of reviews for this change? > > http://cr.openjdk.java.net/~brutisso/8043239/webrev.00/ > > https://bugs.openjdk.java.net/browse/JDK-8043239 > > As described in the bug report the reference processor was missing a > write barrier call when manipulating the discovered list. This has > always been the case but it was hidden because at the end of the > reference processing we went through the complete discovered list and > dirtied all the missed cards because we did an (unnecessary) write > barrier when we set the next field to point to be a self pointer > pointing back at the reference object itself. > > The write barrier for setting the next field was removed since it was > not needed, but that revealed the current bug. After some discussions > and prototyping we came to the conclusion that there may be more > barriers missing and that it is difficult to get the dirtying done the > way our verification code assumes. A simpler solution seems to be to > free the reference processing of all barriers and instead just make > sure that we dirty all the right cards in the last pass. > > The proposed fix thus re-introduces the post barrier when we iterate > over the discovered list. This time it uses the discovered field for > the barrier to be more explicit about what is going on. > > Testing: > JPRT, > Kitchensink, 5 days > GC test suite > SPECjbb2013 > Ad-hoc aurora run > Specific reproducer that illustrated the problem. > > The specific reproducer was really good to pinpoint the problem but is > hard to turn in to a JTreg test. Many thanks go to StefanK for helping > out with creating the reproducer. 
> > Thanks, > Bengt From thomas.schatzl at oracle.com Tue Jun 3 07:15:54 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 03 Jun 2014 09:15:54 +0200 Subject: RFR(S): 8040807: G1: Enable G1CollectedHeap::stop() In-Reply-To: <538473BB.9070300@oracle.com> References: <537B43BC.2090004@oracle.com> <537B6A8A.5030607@oracle.com> <537DF73D.6080204@oracle.com> <538473BB.9070300@oracle.com> Message-ID: <1401779754.2592.0.camel@cirrus> Hi, On Tue, 2014-05-27 at 13:15 +0200, Per Liden wrote: > Hi, > > I did some additional testing and eyeballing of this fix and noticed > that it would be a good idea to also tell concurrent mark to abort, > otherwise we will always wait until concurrent mark has finished, which > is unnecessary (and could potentially take some time if the live set is > large). So, I added a call to _cm->set_has_aborted() to abort any > ongoing concurrent mark. > > Updated webrev: > http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ > > Diff against previous webrev: > http://cr.openjdk.java.net/~pliden/8040807/webrev.diff_0vs1/ > > Testing: > Wrote a simple test to provoke an concurrent mark followed by an > immediate exit. With the first version of the patch, we would always > wait until concurrent mark completes. Now it will instead show an > concurrent-mark-abort, which happens much earlier. Looks okay. Thomas From bengt.rutisson at oracle.com Tue Jun 3 08:04:57 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 03 Jun 2014 10:04:57 +0200 Subject: RFR(S): 8040807: G1: Enable G1CollectedHeap::stop() In-Reply-To: <538C3814.9000906@oracle.com> References: <537B43BC.2090004@oracle.com> <537B6A8A.5030607@oracle.com> <537DF73D.6080204@oracle.com> <538473BB.9070300@oracle.com> <538C3814.9000906@oracle.com> Message-ID: <538D81A9.9000109@oracle.com> Hi Per, Looks good. Bengt On 2014-06-02 10:38, Per Liden wrote: > Ping! > > /Per > > On 05/27/2014 01:15 PM, Per Liden wrote: >> Hi, >> >> I did some additional testing and eyeballing of this fix and noticed >> that it would be a good idea to also tell concurrent mark to abort, >> otherwise we will always wait until concurrent mark has finished, which >> is unnecessary (and could potentially take some time if the live set is >> large). So, I added a call to _cm->set_has_aborted() to abort any >> ongoing concurrent mark. >> >> Updated webrev: >> http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ >> >> Diff against previous webrev: >> http://cr.openjdk.java.net/~pliden/8040807/webrev.diff_0vs1/ >> >> Testing: >> Wrote a simple test to provoke an concurrent mark followed by an >> immediate exit. With the first version of the patch, we would always >> wait until concurrent mark completes. Now it will instead show an >> concurrent-mark-abort, which happens much earlier. >> >> /Per >> >> On 05/22/2014 03:10 PM, Per Liden wrote: >>> Thanks Jon! >>> >>> /Per >>> >>> On 2014-05-20 16:45, Jon Masamitsu wrote: >>>> Looks good. >>>> >>>> Reviewed. >>>> >>>> Jon >>>> >>>> On 05/20/2014 04:59 AM, Per Liden wrote: >>>>> Looking for a couple of reviews in this patch. >>>>> >>>>> Summary: This patch re-enables the controlled stopping of G1's >>>>> concurrent threads at VM shutdown. This could potentially cause hangs >>>>> during VM shutdown because the G1 marking threads could get stuck in >>>>> various places and fail to terminate. JDK-8040803 and JDK-8040804 >>>>> fixed these issues, so this is the final step to re-enable the actual >>>>> stopping of those threads. 
This patch also moves the call to >>>>> CollectedHeap::stop() a few lines down to group the GC related stuff >>>>> together. It also adjusts/removes some comments that are no longer >>>>> correct. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8040807 >>>>> Webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.0/ >>>>> >>>>> Testing: >>>>> - GC nightlies. 5 tests in this suite used to timeout because of the >>>>> issue with hanging threads. They now pass. >>>>> - JPRT >>>>> >>>>> Thanks! >>>>> /Per >>>> >>> >> > From bengt.rutisson at oracle.com Tue Jun 3 08:37:47 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 03 Jun 2014 10:37:47 +0200 Subject: RFR (M): JDK-8043239: G1: Missing post barrier in processing of j.l.ref.Reference objects In-Reply-To: <538CF9E7.6020301@oracle.com> References: <538C8A8A.1080309@oracle.com> <538CF9E7.6020301@oracle.com> Message-ID: <538D895B.7090903@oracle.com> Hi Jon, Thanks for the review! On 2014-06-03 00:25, Jon Masamitsu wrote: > Bengt, > > http://cr.openjdk.java.net/~brutisso/8043239/webrev.00/src/share/vm/memory/referenceProcessor.cpp.frames.html > > > The change to always use "set_next_raw()" here > > 520 java_lang_ref_Reference::set_next_raw(_ref, NULL);
> 521 } else {
> 522 java_lang_ref_Reference::set_next(_ref, NULL);
> 523 }
> > was always the correct thing to use? Does not have to do with > extra / unneeded write barrier? Right. Since we are writing NULL we don't need a post barrier and for G1 it is important to avoid the post barrier because it will dirty cards in a way that will make the card table verification fail. > > Looks good. > > Reviewed. Thanks! I will push this change, but Thomas suggested moving one of the comments that I added to be more visible. Here's what he suggested: http://cr.openjdk.java.net/~brutisso/8043239/webrev.00-01.diff/ Since it is only a comment change I will go ahead and push this. Would be nice to get nightly testing as soon as possible to be able to backport to 8u20. I hope that is ok with you. Here is the full webrev of what I will push: http://cr.openjdk.java.net/~brutisso/8043239/webrev.01/ Thanks, Bengt > > Jon > > On 06/02/2014 07:30 AM, Bengt Rutisson wrote: >> >> Hi all, >> >> Can I have a couple of reviews for this change? >> >> http://cr.openjdk.java.net/~brutisso/8043239/webrev.00/ >> >> https://bugs.openjdk.java.net/browse/JDK-8043239 >> >> As described in the bug report the reference processor was missing a >> write barrier call when manipulating the discovered list. This has >> always been the case but it was hidden because at the end of the >> reference processing we went through the complete discovered list and >> dirtied all the missed cards because we did an (unnecessary) write >> barrier when we set the next field to be a self pointer >> pointing back at the reference object itself. >> >> The write barrier for setting the next field was removed since it was >> not needed, but that revealed the current bug. After some discussions >> and prototyping we came to the conclusion that there may be more >> barriers missing and that it is difficult to get the dirtying done >> the way our verification code assumes. A simpler solution seems to be >> to free the reference processing of all barriers and instead just >> make sure that we dirty all the right cards in the last pass. >> >> The proposed fix thus re-introduces the post barrier when we iterate >> over the discovered list. This time it uses the discovered field for >> the barrier to be more explicit about what is going on. >> >> Testing: >> JPRT, >> Kitchensink, 5 days >> GC test suite >> SPECjbb2013 >> Ad-hoc aurora run >> Specific reproducer that illustrated the problem. >> >> The specific reproducer was really good to pinpoint the problem but >> is hard to turn into a JTreg test. Many thanks go to StefanK for >> helping >> out with creating the reproducer. >> >> Thanks, >> Bengt
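To make Jon's question and Bengt's answer more concrete for readers without the webrev at hand: in this code path the next field of a java.lang.ref.Reference is only ever set to NULL or to the reference object itself, and neither store can create a pointer the collector could lose track of, which is why the raw (barrier-free) setter is always safe there. A toy model of that reasoning, with invented names and no claim to match HotSpot's actual code:

// Toy model of the null-store / self-store argument (not HotSpot code).
final class ToyReference {
    ToyReference next;

    // A post barrier exists to record stores that might create a
    // cross-generation (or cross-region) pointer the GC must later find.
    // Storing null creates no pointer at all, and a self pointer has the
    // same source and target object, so it can never be a pointer from
    // old space into young space. Neither store needs to dirty a card.
    void clearNext()    { next = null; } // raw store is fine
    void markInactive() { next = this; } // raw store is fine
}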
From per.liden at oracle.com Tue Jun 3 08:50:34 2014 From: per.liden at oracle.com (Per Liden) Date: Tue, 03 Jun 2014 10:50:34 +0200 Subject: RFR(S): 8040807: G1: Enable G1CollectedHeap::stop() In-Reply-To: <538D81A9.9000109@oracle.com> References: <537B43BC.2090004@oracle.com> <537B6A8A.5030607@oracle.com> <537DF73D.6080204@oracle.com> <538473BB.9070300@oracle.com> <538C3814.9000906@oracle.com> <538D81A9.9000109@oracle.com> Message-ID: <538D8C5A.6020909@oracle.com> Thanks Jon, Thomas and Bengt! /Per On 06/03/2014 10:04 AM, Bengt Rutisson wrote: > > Hi Per, > > Looks good. > > Bengt > > On 2014-06-02 10:38, Per Liden wrote: >> Ping! >> >> /Per >> >> On 05/27/2014 01:15 PM, Per Liden wrote: >>> Hi, >>> >>> I did some additional testing and eyeballing of this fix and noticed >>> that it would be a good idea to also tell concurrent mark to abort, >>> otherwise we will always wait until concurrent mark has finished, which >>> is unnecessary (and could potentially take some time if the live set is >>> large). So, I added a call to _cm->set_has_aborted() to abort any >>> ongoing concurrent mark. >>> >>> Updated webrev: >>> http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ >>> >>> Diff against previous webrev: >>> http://cr.openjdk.java.net/~pliden/8040807/webrev.diff_0vs1/ >>> >>> Testing: >>> Wrote a simple test to provoke a concurrent mark followed by an >>> immediate exit. With the first version of the patch, we would always >>> wait until concurrent mark completes. Now it will instead show a >>> concurrent-mark-abort, which happens much earlier. >>> >>> /Per >>> >>> On 05/22/2014 03:10 PM, Per Liden wrote: >>>> Thanks Jon! >>>> >>>> /Per >>>> >>>> On 2014-05-20 16:45, Jon Masamitsu wrote: >>>>> Looks good. >>>>> >>>>> Reviewed. >>>>> >>>>> Jon >>>>> >>>>> On 05/20/2014 04:59 AM, Per Liden wrote: >>>>>> Looking for a couple of reviews of this patch. >>>>>> >>>>>> Summary: This patch re-enables the controlled stopping of G1's >>>>>> concurrent threads at VM shutdown. This could potentially cause hangs >>>>>> during VM shutdown because the G1 marking threads could get stuck in >>>>>> various places and fail to terminate. JDK-8040803 and JDK-8040804 >>>>>> fixed these issues, so this is the final step to re-enable the actual >>>>>> stopping of those threads. This patch also moves the call to >>>>>> CollectedHeap::stop() a few lines down to group the GC related stuff >>>>>> together. It also adjusts/removes some comments that are no longer >>>>>> correct. >>>>>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8040807 >>>>>> Webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.0/ >>>>>> >>>>>> Testing: >>>>>> - GC nightlies. 5 tests in this suite used to timeout because of the >>>>>> issue with hanging threads. They now pass. >>>>>> - JPRT >>>>>> >>>>>> Thanks!
>>>>>> /Per >>>>> >>>> >>> >> > From harvey at actenum.com Tue Jun 3 18:41:30 2014 From: harvey at actenum.com (Peter Harvey) Date: Tue, 3 Jun 2014 12:41:30 -0600 Subject: G1 GC consuming all CPU time Message-ID: I have an algorithm (at bottom of email) which builds a graph of 'Node' objects with random connections between them. It then repeatedly processes a queue of those Nodes, adding new Nodes to the queue as it goes. This is a single-threaded algorithm that will never terminate. Our actual production code is much more complex, but I've trimmed it down as much as possible. On Windows 7 with JRE 7u60, enabling the G1 garbage collector will cause the JRE to consume all 8 cores of my CPU. No other garbage collector does this. You can see the differences in CPU load in the example output below. It's also worth noting that "-verbose:gc" with the G1 garbage collector prints nothing after my algorithm starts. Presumably the G1 garbage collector is doing something (concurrent mark?), but it's not printing anything about it. When run with VM args "-XX:+UseG1GC -verbose:gc" I get output like this (note the huge CPU load value which should not be this high for a single-threaded algorithm on an 8 core CPU):

[GC pause (young) 62M->62M(254M), 0.0394214 secs]
[GC pause (young) 73M->83M(508M), 0.0302781 secs]
[GC pause (young) 106M->111M(1016M), 0.0442273 secs]
[GC pause (young) 157M->161M(1625M), 0.0660902 secs]
[GC pause (young) 235M->240M(2112M), 0.0907231 secs]
[GC pause (young) 334M->337M(2502M), 0.1356917 secs]
[GC pause (young) 448M->450M(2814M), 0.1219090 secs]
[GC pause (young) 574M->577M(3064M), 0.1778062 secs]
[GC pause (young) 712M->715M(3264M), 0.1878443 secs]
CPU Load Is -1.0

Start
Stop
Sleep
CPU Load Is 0.9196154547182949

Start
Stop
Sleep
CPU Load Is 0.9150735995043818

...

When run with VM args "-XX:+UseParallelGC -verbose:gc" I get output like this:

[GC 65536K->64198K(249344K), 0.0628289 secs]
[GC 129734K->127974K(314880K), 0.1583369 secs]
[Full GC 127974K->127630K(451072K), 0.9675224 secs]
[GC 258702K->259102K(451072K), 0.3543645 secs]
[Full GC 259102K->258701K(732672K), 1.8085702 secs]
[GC 389773K->390181K(790528K), 0.3332060 secs]
[GC 579109K->579717K(803328K), 0.5126388 secs]
[Full GC 579717K->578698K(1300480K), 4.0647303 secs]
[GC 780426K->780842K(1567232K), 0.4364933 secs]
CPU Load Is -1.0

Start
Stop
Sleep
CPU Load Is 0.03137771539054431

Start
Stop
Sleep
CPU Load Is 0.032351299224373145

...

When run with VM args "-verbose:gc" I get output like this:

[GC 69312K->67824K(251136K), 0.1533803 secs]
[GC 137136K->135015K(251136K), 0.0970460 secs]
[GC 137245K(251136K), 0.0095245 secs]
[GC 204327K->204326K(274368K), 0.1056259 secs]
[GC 273638K->273636K(343680K), 0.1081515 secs]
[GC 342948K->342946K(412992K), 0.1181966 secs]
[GC 412258K->412257K(482304K), 0.1126966 secs]
[GC 481569K->481568K(551808K), 0.1156015 secs]
[GC 550880K->550878K(620928K), 0.1184089 secs]
[GC 620190K->620189K(690048K), 0.1209312 secs]
[GC 689501K->689499K(759552K), 0.1199338 secs]
[GC 758811K->758809K(828864K), 0.1162532 secs]
CPU Load Is -1.0

Start
Stop
Sleep
CPU Load Is 0.10791719146608299

Start
[GC 821213K(828864K), 0.1966807 secs]
Stop
Sleep
CPU Load Is 0.1540065314146181

Start
Stop
Sleep
[GC 821213K(1328240K), 0.1962688 secs]
CPU Load Is 0.08427292195744103

...

Why is the G1 garbage collector consuming so much CPU time? Is it stuck in the mark phase as I am modifying the graph structure? I'm not a subscriber to the list, so please CC me in any response. Thanks, Peter.
--

import java.lang.management.ManagementFactory;
import com.sun.management.OperatingSystemMXBean;
import java.util.Random;

@SuppressWarnings("restriction")
public class Node {
    private static OperatingSystemMXBean os = (OperatingSystemMXBean)
            ManagementFactory.getOperatingSystemMXBean();

    private Node next;

    private Node[] others = new Node[10];

    public static void main(String[] args) throws InterruptedException {

        // Build a graph of Nodes
        Node head = buildGraph();

        while (true) {
            // Print CPU load for this process
            System.out.println("CPU Load Is " + os.getProcessCpuLoad());
            System.out.println();

            // Modify the graph
            System.out.println("Start");
            head = modifyGraph(head);
            System.out.println("Stop");

            // Sleep, as otherwise we tend to DoS the host computer...
            System.out.println("Sleep");
            Thread.sleep(1000);
        }
    }

    private static Node buildGraph() {

        // Create a collection of Node objects
        Node[] array = new Node[10000000];
        for (int i = 0; i < array.length; i++) {
            array[i] = new Node();
        }

        // Each Node refers to 10 other random Nodes
        Random random = new Random(12);
        for (int i = 0; i < array.length; i++) {
            for (int j = 0; j < array[i].others.length; j++) {
                int k = random.nextInt(array.length);
                array[i].others[j] = array[k];
            }
        }

        // The first Node serves as the head of a queue
        return array[0];
    }

    private static Node modifyGraph(Node head) {

        // Perform a million iterations
        for (int i = 0; i < 1000000; i++) {

            // Pop a Node off the head of the queue
            Node node = head;
            head = node.next;
            node.next = null;

            // Add the other Nodes to the head of the queue
            for (Node other : node.others) {
                other.next = head;
                head = other;
            }
        }
        return head;
    }

}

-- *Actenum Corporation* Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From harvey at actenum.com Tue Jun 3 18:49:09 2014 From: harvey at actenum.com (Peter Harvey) Date: Tue, 3 Jun 2014 12:49:09 -0600 Subject: G1 GC consuming all CPU time In-Reply-To: References: Message-ID: Small correction. The last example of output was with "-XX:+UseConcMarkSweepGC -verbose:gc". On Tue, Jun 3, 2014 at 12:41 PM, Peter Harvey wrote: > I have an algorithm (at bottom of email) which builds a graph of 'Node' > objects with random connections between them. It then repeatedly processes > a queue of those Nodes, adding new Nodes to the queue as it goes. This is a > single-threaded algorithm that will never terminate. Our actual production > code is much more complex, but I've trimmed it down as much as possible. > > On Windows 7 with JRE 7u60, enabling the G1 garbage collector will cause > the JRE to consume all 8 cores of my CPU. No other garbage collector does > this. You can see the differences in CPU load in the example output below. > It's also worth noting that "-verbose:gc" with the G1 garbage collector > prints nothing after my algorithm starts. Presumably the G1 garbage > collector is doing something (concurrent mark?), but it's not printing > anything about it.
> > When run with VM args "-XX:+UseG1GC -verbose:gc" I get output like this > (note the huge CPU load value which should not be this high for a > single-threaded algorithm on an 8 core CPU): > > [GC pause (young) 62M->62M(254M), 0.0394214 secs] > [GC pause (young) 73M->83M(508M), 0.0302781 secs] > [GC pause (young) 106M->111M(1016M), 0.0442273 secs] > [GC pause (young) 157M->161M(1625M), 0.0660902 secs] > [GC pause (young) 235M->240M(2112M), 0.0907231 secs] > [GC pause (young) 334M->337M(2502M), 0.1356917 secs] > [GC pause (young) 448M->450M(2814M), 0.1219090 secs] > [GC pause (young) 574M->577M(3064M), 0.1778062 secs] > [GC pause (young) 712M->715M(3264M), 0.1878443 secs] > CPU Load Is -1.0 > > Start > Stop > Sleep > CPU Load Is 0.9196154547182949 > > Start > Stop > Sleep > CPU Load Is 0.9150735995043818 > > ... > > > > When run with VM args "-XX:+UseParallelGC -verbose:gc" I get output like > this: > > [GC 65536K->64198K(249344K), 0.0628289 secs] > [GC 129734K->127974K(314880K), 0.1583369 secs] > [Full GC 127974K->127630K(451072K), 0.9675224 secs] > [GC 258702K->259102K(451072K), 0.3543645 secs] > [Full GC 259102K->258701K(732672K), 1.8085702 secs] > [GC 389773K->390181K(790528K), 0.3332060 secs] > [GC 579109K->579717K(803328K), 0.5126388 secs] > [Full GC 579717K->578698K(1300480K), 4.0647303 secs] > [GC 780426K->780842K(1567232K), 0.4364933 secs] > CPU Load Is -1.0 > > Start > Stop > Sleep > CPU Load Is 0.03137771539054431 > > Start > Stop > Sleep > CPU Load Is 0.032351299224373145 > > ... > > > > When run with VM args "-verbose:gc" I get output like this: > > [GC 69312K->67824K(251136K), 0.1533803 secs] > [GC 137136K->135015K(251136K), 0.0970460 secs] > [GC 137245K(251136K), 0.0095245 secs] > [GC 204327K->204326K(274368K), 0.1056259 secs] > [GC 273638K->273636K(343680K), 0.1081515 secs] > [GC 342948K->342946K(412992K), 0.1181966 secs] > [GC 412258K->412257K(482304K), 0.1126966 secs] > [GC 481569K->481568K(551808K), 0.1156015 secs] > [GC 550880K->550878K(620928K), 0.1184089 secs] > [GC 620190K->620189K(690048K), 0.1209312 secs] > [GC 689501K->689499K(759552K), 0.1199338 secs] > [GC 758811K->758809K(828864K), 0.1162532 secs] > CPU Load Is -1.0 > > Start > Stop > Sleep > CPU Load Is 0.10791719146608299 > > Start > [GC 821213K(828864K), 0.1966807 secs] > Stop > Sleep > CPU Load Is 0.1540065314146181 > > Start > Stop > Sleep > [GC 821213K(1328240K), 0.1962688 secs] > CPU Load Is 0.08427292195744103 > > ... > > > > Why is the G1 garbage collector consuming so much CPU time? Is it stuck in > the mark phase as I am modifying the graph structure? > > I'm not a subscriber to the list, so please CC me in any response. > > Thanks, > Peter. > > -- > > import java.lang.management.ManagementFactory; > import com.sun.management.OperatingSystemMXBean; > import java.util.Random; > > @SuppressWarnings("restriction") > public class Node { > private static OperatingSystemMXBean os = (OperatingSystemMXBean) > ManagementFactory.getOperatingSystemMXBean(); > > private Node next; > > private Node[] others = new Node[10]; > > public static void main(String[] args) throws InterruptedException { > > // Build a graph of Nodes > Node head = buildGraph(); > > while (true) { > // Print CPU load for this process > System.out.println("CPU Load Is " + os.getProcessCpuLoad()); > System.out.println(); > > // Modify the graph > System.out.println("Start"); > head = modifyGraph(head); > System.out.println("Stop"); > > // Sleep, as otherwise we tend to DoS the host computer... 
> System.out.println("Sleep"); > Thread.sleep(1000); > } > } > > private static Node buildGraph() { > > // Create a collection of Node objects > Node[] array = new Node[10000000]; > for (int i = 0; i < array.length; i++) { > array[i] = new Node(); > } > > // Each Node refers to 10 other random Nodes > Random random = new Random(12); > for (int i = 0; i < array.length; i++) { > for (int j = 0; j < array[i].others.length; j++) { > int k = random.nextInt(array.length); > array[i].others[j] = array[k]; > } > } > > // The first Node serves as the head of a queue > return array[0]; > } > > private static Node modifyGraph(Node head) { > > // Perform a million iterations > for (int i = 0; i < 1000000; i++) { > > // Pop a Node off the head of the queue > Node node = head; > head = node.next; > node.next = null; > > // Add the other Nodes to the head of the queue > for (Node other : node.others) { > other.next = head; > head = other; > } > } > return head; > } > > } > > -- > *Actenum Corporation* > Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | > www.actenum.com > -- *Actenum Corporation* Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From yiyeguhu at gmail.com Tue Jun 3 21:13:45 2014 From: yiyeguhu at gmail.com (Tao Mao) Date: Tue, 3 Jun 2014 14:13:45 -0700 Subject: G1 GC consuming all CPU time In-Reply-To: References: Message-ID: Hi Peter, What was your actual question? Try -XX:ParallelGCThreads= if you want less CPU usage from GC. Thanks. Tao On Tue, Jun 3, 2014 at 11:49 AM, Peter Harvey wrote: > Small correction. The last example of output was with > "-XX:+UseConcMarkSweepGC -verbose:gc". > > > On Tue, Jun 3, 2014 at 12:41 PM, Peter Harvey wrote: > >> I have an algorithm (at bottom of email) which builds a graph of 'Node' >> objects with random connections between them. It then repeatedly processes >> a queue of those Nodes, adding new Nodes to the queue as it goes. This is a >> single-threaded algorithm that will never terminate. Our actual production >> code is much more complex, but I've trimmed it down as much as possible. >> >> On Windows 7 with JRE 7u60, enabling the G1 garbage collector will cause >> the JRE to consume all 8 cores of my CPU. No other garbage collector does >> this. You can see the differences in CPU load in the example output below. >> It's also worth nothing that "-verbose:gc" with the G1 garbage collector >> prints nothing after my algorithm starts. Presumably the G1 garbage >> collector is doing something (concurrent mark?), but it's not printing >> anything about it. >> >> When run with VM args "-XX:+UseG1GC -verbose:gc" I get output like this >> (note the huge CPU load value which should not be this high for a >> single-threaded algorithm on an 8 core CPU): >> >> [GC pause (young) 62M->62M(254M), 0.0394214 secs] >> [GC pause (young) 73M->83M(508M), 0.0302781 secs] >> [GC pause (young) 106M->111M(1016M), 0.0442273 secs] >> [GC pause (young) 157M->161M(1625M), 0.0660902 secs] >> [GC pause (young) 235M->240M(2112M), 0.0907231 secs] >> [GC pause (young) 334M->337M(2502M), 0.1356917 secs] >> [GC pause (young) 448M->450M(2814M), 0.1219090 secs] >> [GC pause (young) 574M->577M(3064M), 0.1778062 secs] >> [GC pause (young) 712M->715M(3264M), 0.1878443 secs] >> CPU Load Is -1.0 >> >> Start >> Stop >> Sleep >> CPU Load Is 0.9196154547182949 >> >> Start >> Stop >> Sleep >> CPU Load Is 0.9150735995043818 >> >> ... 
>> >> >> >> When run with VM args "-XX:+UseParallelGC -verbose:gc" I get output like >> this: >> >> [GC 65536K->64198K(249344K), 0.0628289 secs] >> [GC 129734K->127974K(314880K), 0.1583369 secs] >> [Full GC 127974K->127630K(451072K), 0.9675224 secs] >> [GC 258702K->259102K(451072K), 0.3543645 secs] >> [Full GC 259102K->258701K(732672K), 1.8085702 secs] >> [GC 389773K->390181K(790528K), 0.3332060 secs] >> [GC 579109K->579717K(803328K), 0.5126388 secs] >> [Full GC 579717K->578698K(1300480K), 4.0647303 secs] >> [GC 780426K->780842K(1567232K), 0.4364933 secs] >> CPU Load Is -1.0 >> >> Start >> Stop >> Sleep >> CPU Load Is 0.03137771539054431 >> >> Start >> Stop >> Sleep >> CPU Load Is 0.032351299224373145 >> >> ... >> >> >> >> When run with VM args "-verbose:gc" I get output like this: >> >> [GC 69312K->67824K(251136K), 0.1533803 secs] >> [GC 137136K->135015K(251136K), 0.0970460 secs] >> [GC 137245K(251136K), 0.0095245 secs] >> [GC 204327K->204326K(274368K), 0.1056259 secs] >> [GC 273638K->273636K(343680K), 0.1081515 secs] >> [GC 342948K->342946K(412992K), 0.1181966 secs] >> [GC 412258K->412257K(482304K), 0.1126966 secs] >> [GC 481569K->481568K(551808K), 0.1156015 secs] >> [GC 550880K->550878K(620928K), 0.1184089 secs] >> [GC 620190K->620189K(690048K), 0.1209312 secs] >> [GC 689501K->689499K(759552K), 0.1199338 secs] >> [GC 758811K->758809K(828864K), 0.1162532 secs] >> CPU Load Is -1.0 >> >> Start >> Stop >> Sleep >> CPU Load Is 0.10791719146608299 >> >> Start >> [GC 821213K(828864K), 0.1966807 secs] >> Stop >> Sleep >> CPU Load Is 0.1540065314146181 >> >> Start >> Stop >> Sleep >> [GC 821213K(1328240K), 0.1962688 secs] >> CPU Load Is 0.08427292195744103 >> >> ... >> >> >> >> Why is the G1 garbage collector consuming so much CPU time? Is it stuck >> in the mark phase as I am modifying the graph structure? >> >> I'm not a subscriber to the list, so please CC me in any response. >> >> Thanks, >> Peter. >> >> -- >> >> import java.lang.management.ManagementFactory; >> import com.sun.management.OperatingSystemMXBean; >> import java.util.Random; >> >> @SuppressWarnings("restriction") >> public class Node { >> private static OperatingSystemMXBean os = (OperatingSystemMXBean) >> ManagementFactory.getOperatingSystemMXBean(); >> >> private Node next; >> >> private Node[] others = new Node[10]; >> >> public static void main(String[] args) throws InterruptedException { >> >> // Build a graph of Nodes >> Node head = buildGraph(); >> >> while (true) { >> // Print CPU load for this process >> System.out.println("CPU Load Is " + os.getProcessCpuLoad()); >> System.out.println(); >> >> // Modify the graph >> System.out.println("Start"); >> head = modifyGraph(head); >> System.out.println("Stop"); >> >> // Sleep, as otherwise we tend to DoS the host computer... 
>> System.out.println("Sleep"); >> Thread.sleep(1000); >> } >> } >> >> private static Node buildGraph() { >> >> // Create a collection of Node objects >> Node[] array = new Node[10000000]; >> for (int i = 0; i < array.length; i++) { >> array[i] = new Node(); >> } >> >> // Each Node refers to 10 other random Nodes >> Random random = new Random(12); >> for (int i = 0; i < array.length; i++) { >> for (int j = 0; j < array[i].others.length; j++) { >> int k = random.nextInt(array.length); >> array[i].others[j] = array[k]; >> } >> } >> >> // The first Node serves as the head of a queue >> return array[0]; >> } >> >> private static Node modifyGraph(Node head) { >> >> // Perform a million iterations >> for (int i = 0; i < 1000000; i++) { >> >> // Pop a Node off the head of the queue >> Node node = head; >> head = node.next; >> node.next = null; >> >> // Add the other Nodes to the head of the queue >> for (Node other : node.others) { >> other.next = head; >> head = other; >> } >> } >> return head; >> } >> >> } >> >> -- >> *Actenum Corporation* >> Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | >> www.actenum.com >> > > > -- > *Actenum Corporation* > Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | > www.actenum.com > -------------- next part -------------- An HTML attachment was scrubbed... URL:
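As a concrete illustration of Tao's suggestion (the thread counts below are examples only, and the class name assumes the Node test program posted earlier in the thread):

java -XX:+UseG1GC -XX:ParallelGCThreads=4 -XX:ConcGCThreads=1 -verbose:gc Node

ParallelGCThreads caps the stop-the-world GC worker threads and ConcGCThreads caps the concurrent marking threads. If the CPU time turns out to be going to G1's remembered-set refinement rather than to marking, -XX:G1ConcRefinementThreads can be capped in the same way.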
From yiyeguhu at gmail.com Tue Jun 3 21:16:52 2014 From: yiyeguhu at gmail.com (Tao Mao) Date: Tue, 3 Jun 2014 14:16:52 -0700 Subject: G1 GC consuming all CPU time In-Reply-To: References: Message-ID: And, use -XX:+PrintGCDetails -XX:+PrintGCTimeStamps to get more log. Thanks. -Tao On Tue, Jun 3, 2014 at 2:13 PM, Tao Mao wrote: > Hi Peter, > > What was your actual question? Try -XX:ParallelGCThreads= if you > want less CPU usage from GC. > > Thanks. > Tao > > > On Tue, Jun 3, 2014 at 11:49 AM, Peter Harvey wrote: > >> Small correction. The last example of output was with >> "-XX:+UseConcMarkSweepGC -verbose:gc". >> >> >> On Tue, Jun 3, 2014 at 12:41 PM, Peter Harvey wrote: >> >>> I have an algorithm (at bottom of email) which builds a graph of 'Node' >>> objects with random connections between them. It then repeatedly processes >>> a queue of those Nodes, adding new Nodes to the queue as it goes. This is a >>> single-threaded algorithm that will never terminate. Our actual production >>> code is much more complex, but I've trimmed it down as much as possible. >>> >>> On Windows 7 with JRE 7u60, enabling the G1 garbage collector will cause >>> the JRE to consume all 8 cores of my CPU. No other garbage collector does >>> this. You can see the differences in CPU load in the example output below. >>> It's also worth noting that "-verbose:gc" with the G1 garbage collector >>> prints nothing after my algorithm starts. Presumably the G1 garbage >>> collector is doing something (concurrent mark?), but it's not printing >>> anything about it. >>> >>> When run with VM args "-XX:+UseG1GC -verbose:gc" I get output like this >>> (note the huge CPU load value which should not be this high for a >>> single-threaded algorithm on an 8 core CPU): >>> >>> [GC pause (young) 62M->62M(254M), 0.0394214 secs] >>> [GC pause (young) 73M->83M(508M), 0.0302781 secs] >>> [GC pause (young) 106M->111M(1016M), 0.0442273 secs] >>> [GC pause (young) 157M->161M(1625M), 0.0660902 secs] >>> [GC pause (young) 235M->240M(2112M), 0.0907231 secs] >>> [GC pause (young) 334M->337M(2502M), 0.1356917 secs] >>> [GC pause (young) 448M->450M(2814M), 0.1219090 secs] >>> [GC pause (young) 574M->577M(3064M), 0.1778062 secs] >>> [GC pause (young) 712M->715M(3264M), 0.1878443 secs] >>> CPU Load Is -1.0 >>> >>> Start >>> Stop >>> Sleep >>> CPU Load Is 0.9196154547182949 >>> >>> Start >>> Stop >>> Sleep >>> CPU Load Is 0.9150735995043818 >>> >>> ... >>> >>> >>> >>> When run with VM args "-XX:+UseParallelGC -verbose:gc" I get output like >>> this: >>> >>> [GC 65536K->64198K(249344K), 0.0628289 secs] >>> [GC 129734K->127974K(314880K), 0.1583369 secs] >>> [Full GC 127974K->127630K(451072K), 0.9675224 secs] >>> [GC 258702K->259102K(451072K), 0.3543645 secs] >>> [Full GC 259102K->258701K(732672K), 1.8085702 secs] >>> [GC 389773K->390181K(790528K), 0.3332060 secs] >>> [GC 579109K->579717K(803328K), 0.5126388 secs] >>> [Full GC 579717K->578698K(1300480K), 4.0647303 secs] >>> [GC 780426K->780842K(1567232K), 0.4364933 secs] >>> CPU Load Is -1.0 >>> >>> Start >>> Stop >>> Sleep >>> CPU Load Is 0.03137771539054431 >>> >>> Start >>> Stop >>> Sleep >>> CPU Load Is 0.032351299224373145 >>> >>> ... >>> >>> >>> >>> When run with VM args "-verbose:gc" I get output like this: >>> >>> [GC 69312K->67824K(251136K), 0.1533803 secs] >>> [GC 137136K->135015K(251136K), 0.0970460 secs] >>> [GC 137245K(251136K), 0.0095245 secs] >>> [GC 204327K->204326K(274368K), 0.1056259 secs] >>> [GC 273638K->273636K(343680K), 0.1081515 secs] >>> [GC 342948K->342946K(412992K), 0.1181966 secs] >>> [GC 412258K->412257K(482304K), 0.1126966 secs] >>> [GC 481569K->481568K(551808K), 0.1156015 secs] >>> [GC 550880K->550878K(620928K), 0.1184089 secs] >>> [GC 620190K->620189K(690048K), 0.1209312 secs] >>> [GC 689501K->689499K(759552K), 0.1199338 secs] >>> [GC 758811K->758809K(828864K), 0.1162532 secs] >>> CPU Load Is -1.0 >>> >>> Start >>> Stop >>> Sleep >>> CPU Load Is 0.10791719146608299 >>> >>> Start >>> [GC 821213K(828864K), 0.1966807 secs] >>> Stop >>> Sleep >>> CPU Load Is 0.1540065314146181 >>> >>> Start >>> Stop >>> Sleep >>> [GC 821213K(1328240K), 0.1962688 secs] >>> CPU Load Is 0.08427292195744103 >>> >>> ... >>> >>> >>> >>> Why is the G1 garbage collector consuming so much CPU time? Is it stuck >>> in the mark phase as I am modifying the graph structure? >>> >>> I'm not a subscriber to the list, so please CC me in any response. >>> >>> Thanks, >>> Peter.
>>> >>> -- >>> >>> import java.lang.management.ManagementFactory; >>> import com.sun.management.OperatingSystemMXBean; >>> import java.util.Random; >>> >>> @SuppressWarnings("restriction") >>> public class Node { >>> private static OperatingSystemMXBean os = (OperatingSystemMXBean) >>> ManagementFactory.getOperatingSystemMXBean(); >>> >>> private Node next; >>> >>> private Node[] others = new Node[10]; >>> >>> public static void main(String[] args) throws InterruptedException { >>> >>> // Build a graph of Nodes >>> Node head = buildGraph(); >>> >>> while (true) { >>> // Print CPU load for this process >>> System.out.println("CPU Load Is " + os.getProcessCpuLoad()); >>> System.out.println(); >>> >>> // Modify the graph >>> System.out.println("Start"); >>> head = modifyGraph(head); >>> System.out.println("Stop"); >>> >>> // Sleep, as otherwise we tend to DoS the host computer... >>> System.out.println("Sleep"); >>> Thread.sleep(1000); >>> } >>> } >>> >>> private static Node buildGraph() { >>> >>> // Create a collection of Node objects >>> Node[] array = new Node[10000000]; >>> for (int i = 0; i < array.length; i++) { >>> array[i] = new Node(); >>> } >>> >>> // Each Node refers to 10 other random Nodes >>> Random random = new Random(12); >>> for (int i = 0; i < array.length; i++) { >>> for (int j = 0; j < array[i].others.length; j++) { >>> int k = random.nextInt(array.length); >>> array[i].others[j] = array[k]; >>> } >>> } >>> >>> // The first Node serves as the head of a queue >>> return array[0]; >>> } >>> >>> private static Node modifyGraph(Node head) { >>> >>> // Perform a million iterations >>> for (int i = 0; i < 1000000; i++) { >>> >>> // Pop a Node off the head of the queue >>> Node node = head; >>> head = node.next; >>> node.next = null; >>> >>> // Add the other Nodes to the head of the queue >>> for (Node other : node.others) { >>> other.next = head; >>> head = other; >>> } >>> } >>> return head; >>> } >>> >>> } >>> >>> -- >>> *Actenum Corporation* >>> Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | >>> www.actenum.com >>> >> >> >> >> -- >> *Actenum Corporation* >> Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | >> www.actenum.com >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From harvey at actenum.com Tue Jun 3 21:43:03 2014 From: harvey at actenum.com (Peter Harvey) Date: Tue, 3 Jun 2014 15:43:03 -0600 Subject: G1 GC consuming all CPU time In-Reply-To: References: Message-ID: Thanks for the response. Here are the additional logs. 
0.094: [GC pause (young), 0.0347877 secs]
   [Parallel Time: 34.1 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 94.2, Avg: 104.4, Max: 126.4, Diff: 32.2]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 3.3, Max: 25.0, Diff: 25.0, Sum: 26.6]
      [Update RS (ms): Min: 0.0, Avg: 2.1, Max: 5.3, Diff: 5.3, Sum: 16.7]
         [Processed Buffers: Min: 0, Avg: 2.3, Max: 9, Diff: 9, Sum: 18]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 1.8, Avg: 18.3, Max: 29.9, Diff: 28.2, Sum: 146.4]
      [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.6]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 1.9, Avg: 23.8, Max: 34.1, Diff: 32.2, Sum: 190.4]
      [GC Worker End (ms): Min: 128.2, Avg: 128.3, Max: 128.3, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.0 ms]
   [Other: 0.6 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.3 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.0 ms]
   [Eden: 24.0M(24.0M)->0.0B(11.0M) Survivors: 0.0B->3072.0K Heap: 62.1M(254.0M)->62.2M(254.0M)]
   [Times: user=0.09 sys=0.03, real=0.04 secs]
0.131: [GC pause (young), 0.0295093 secs]
   [Parallel Time: 28.1 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 130.9, Avg: 135.5, Max: 158.7, Diff: 27.8]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: 1.2]
      [Update RS (ms): Min: 0.0, Avg: 11.4, Max: 27.5, Diff: 27.5, Sum: 90.8]
         [Processed Buffers: Min: 0, Avg: 23.8, Max: 42, Diff: 42, Sum: 190]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 0.0, Avg: 11.7, Max: 17.1, Diff: 17.1, Sum: 93.8]
      [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.7]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 0.2, Avg: 23.5, Max: 28.1, Diff: 27.8, Sum: 187.7]
      [GC Worker End (ms): Min: 159.0, Avg: 159.0, Max: 159.0, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.1 ms]
   [Other: 1.3 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.0 ms]
   [Eden: 11.0M(11.0M)->0.0B(23.0M) Survivors: 3072.0K->2048.0K Heap: 73.2M(254.0M)->82.7M(508.0M)]
   [Times: user=0.19 sys=0.00, real=0.03 secs]
0.166: [GC pause (young), 0.0385523 secs]
   [Parallel Time: 35.9 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 166.4, Avg: 169.8, Max: 192.4, Diff: 25.9]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: 1.9]
      [Update RS (ms): Min: 0.0, Avg: 10.9, Max: 31.9, Diff: 31.9, Sum: 87.2]
         [Processed Buffers: Min: 0, Avg: 14.6, Max: 26, Diff: 26, Sum: 117]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [Object Copy (ms): Min: 3.5, Avg: 21.4, Max: 27.0, Diff: 23.4, Sum: 171.1]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.4]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 10.0, Avg: 32.6, Max: 35.9, Diff: 25.9, Sum: 260.7]
      [GC Worker End (ms): Min: 202.3, Avg: 202.4, Max: 202.4, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.0 ms]
   [Other: 2.6 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.0 ms]
   [Eden: 23.0M(23.0M)->0.0B(46.0M) Survivors: 2048.0K->4096.0K Heap: 105.7M(508.0M)->110.1M(1016.0M)]
   [Times: user=0.19 sys=0.00, real=0.04 secs]
0.222: [GC pause (young), 0.0558720 secs]
   [Parallel Time: 53.0 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 222.0, Avg: 222.2, Max: 222.5, Diff: 0.5]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: 1.5]
      [Update RS (ms): Min: 7.7, Avg: 8.7, Max: 10.9, Diff: 3.2, Sum: 69.4]
         [Processed Buffers: Min: 7, Avg: 8.5, Max: 12, Diff: 5, Sum: 68]
      [Scan RS (ms): Min: 0.0, Avg: 0.3, Max: 0.6, Diff: 0.6, Sum: 2.3]
      [Object Copy (ms): Min: 41.7, Avg: 43.6, Max: 44.3, Diff: 2.7, Sum: 348.5]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [GC Worker Total (ms): Min: 52.4, Avg: 52.7, Max: 52.9, Diff: 0.5, Sum: 421.8]
      [GC Worker End (ms): Min: 274.9, Avg: 274.9, Max: 274.9, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.0 ms]
   [Other: 2.8 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.0 ms]
   [Eden: 46.0M(46.0M)->0.0B(74.0M) Survivors: 4096.0K->7168.0K Heap: 156.1M(1016.0M)->158.6M(1625.0M)]
   [Times: user=0.48 sys=0.01, real=0.06 secs]
0.328: [GC pause (young), 0.0853794 secs]
   [Parallel Time: 82.8 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 327.9, Avg: 330.8, Max: 351.1, Diff: 23.2]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 2.0]
      [Update RS (ms): Min: 0.0, Avg: 5.5, Max: 8.3, Diff: 8.3, Sum: 43.9]
         [Processed Buffers: Min: 0, Avg: 2.3, Max: 3, Diff: 3, Sum: 18]
      [Scan RS (ms): Min: 0.0, Avg: 2.2, Max: 3.3, Diff: 3.3, Sum: 17.4]
      [Object Copy (ms): Min: 59.5, Avg: 71.8, Max: 73.7, Diff: 14.2, Sum: 574.7]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [GC Worker Total (ms): Min: 59.5, Avg: 79.8, Max: 82.7, Diff: 23.2, Sum: 638.4]
      [GC Worker End (ms): Min: 410.6, Avg: 410.7, Max: 410.7, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.1 ms]
   [Other: 2.6 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.1 ms]
   [Eden: 74.0M(74.0M)->0.0B(94.0M) Survivors: 7168.0K->11.0M Heap: 232.6M(1625.0M)->237.6M(2112.0M)]
   [Times: user=0.59 sys=0.00, real=0.09 secs]
0.447: [GC pause (young), 0.1239103 secs]
   [Parallel Time: 121.5 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 447.5, Avg: 447.7, Max: 448.5, Diff: 0.9]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.9]
      [Update RS (ms): Min: 26.5, Avg: 28.2, Max: 28.7, Diff: 2.2, Sum: 225.7]
         [Processed Buffers: Min: 38, Avg: 39.8, Max: 44, Diff: 6, Sum: 318]
      [Scan RS (ms): Min: 0.3, Avg: 0.7, Max: 1.9, Diff: 1.6, Sum: 5.3]
      [Object Copy (ms): Min: 92.1, Avg: 92.2, Max: 92.3, Diff: 0.2, Sum: 737.5]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.3]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [GC Worker Total (ms): Min: 120.6, Avg: 121.4, Max: 121.5, Diff: 0.9, Sum: 970.8]
      [GC Worker End (ms): Min: 569.0, Avg: 569.0, Max: 569.0, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.1 ms]
   [Other: 2.3 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.1 ms]
   [Eden: 94.0M(94.0M)->0.0B(111.0M) Survivors: 11.0M->14.0M Heap: 331.6M(2112.0M)->334.6M(2502.0M)]
   [Times: user=0.80 sys=0.05, real=0.12 secs]
0.599: [GC pause (young), 0.1479438 secs]
   [Parallel Time: 145.7 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 599.4, Avg: 599.5, Max: 599.8, Diff: 0.4]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.9]
      [Update RS (ms): Min: 41.8, Avg: 43.0, Max: 44.0, Diff: 2.1, Sum: 343.6]
         [Processed Buffers: Min: 67, Avg: 70.9, Max: 73, Diff: 6, Sum: 567]
      [Scan RS (ms): Min: 0.0, Avg: 0.8, Max: 1.9, Diff: 1.9, Sum: 6.2]
      [Object Copy (ms): Min: 101.3, Avg: 101.6, Max: 101.7, Diff: 0.3, Sum: 812.6]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [GC Worker Total (ms): Min: 145.2, Avg: 145.6, Max: 145.6, Diff: 0.4, Sum: 1164.6]
      [GC Worker End (ms): Min: 745.1, Avg: 745.1, Max: 745.1, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.1 ms]
   [Other: 2.2 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.1 ms]
   [Eden: 111.0M(111.0M)->0.0B(124.0M) Survivors: 14.0M->16.0M Heap: 445.6M(2502.0M)->448.6M(2814.0M)]
   [Times: user=1.20 sys=0.05, real=0.15 secs]
0.787: [GC pause (young), 0.1625321 secs]
   [Parallel Time: 160.0 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 786.6, Avg: 786.7, Max: 786.9, Diff: 0.4]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.8]
      [Update RS (ms): Min: 46.4, Avg: 47.0, Max: 49.0, Diff: 2.5, Sum: 376.0]
         [Processed Buffers: Min: 75, Avg: 78.0, Max: 79, Diff: 4, Sum: 624]
      [Scan RS (ms): Min: 0.0, Avg: 0.9, Max: 1.5, Diff: 1.5, Sum: 7.4]
      [Object Copy (ms): Min: 110.6, Avg: 111.7, Max: 112.0, Diff: 1.4, Sum: 893.5]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.3]
      [GC Worker Total (ms): Min: 159.6, Avg: 159.9, Max: 160.0, Diff: 0.4, Sum: 1279.0]
      [GC Worker End (ms): Min: 946.5, Avg: 946.5, Max: 946.6, Diff: 0.1]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.1 ms]
   [Other: 2.4 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.2 ms]
   [Eden: 124.0M(124.0M)->0.0B(135.0M) Survivors: 16.0M->18.0M Heap: 572.6M(2814.0M)->576.6M(3064.0M)]
   [Times: user=1.37 sys=0.00, real=0.16 secs]
0.981: [GC pause (young), 0.2063055 secs]
   [Parallel Time: 204.1 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 980.8, Avg: 980.9, Max: 981.0, Diff: 0.2]
      [Ext Root Scanning (ms): Min: 0.1, Avg: 0.3, Max: 0.3, Diff: 0.2, Sum: 2.1]
      [Update RS (ms): Min: 55.9, Avg: 57.8, Max: 58.8, Diff: 2.9, Sum: 462.8]
         [Processed Buffers: Min: 100, Avg: 101.5, Max: 103, Diff: 3, Sum: 812]
      [Scan RS (ms): Min: 0.0, Avg: 1.0, Max: 3.1, Diff: 3.1, Sum: 8.3]
      [Object Copy (ms): Min: 144.7, Avg: 144.8, Max: 144.9, Diff: 0.1, Sum: 1158.3]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.3]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [GC Worker Total (ms): Min: 203.8, Avg: 204.0, Max: 204.0, Diff: 0.2, Sum: 1631.9]
      [GC Worker End (ms): Min: 1184.9, Avg: 1184.9, Max: 1184.9, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.1 ms]
   [Other: 2.1 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.1 ms]
   [Eden: 135.0M(135.0M)->0.0B(143.0M) Survivors: 18.0M->20.0M Heap: 711.6M(3064.0M)->714.6M(3264.0M)]
   [Times: user=1.40 sys=0.11, real=0.21 secs]
CPU Load Is -1.0

Start
Stop
Sleep
CPU Load Is 0.9166222455142531

Start
Stop
Sleep
CPU Load Is 0.907013989900451

Start
Stop
Sleep
CPU Load Is 0.9085635227776081

Start
Stop
Sleep
CPU Load Is 0.909945506396622

Note that all the logged GC activity occurs during the construction of my graph of Nodes, which is *before* my algorithm (modifyGraph) starts. There is no log of GC activity once the algorithm starts, but there is significant (100%) CPU usage.

My questions are:

- Why is the G1 garbage collector consuming so much CPU time? What is it doing?
- Why is the G1 garbage collector not logging anything? The only reason I even know it's the garbage collector consuming my CPU time is that (a) I only see this behaviour when the G1 collector is enabled and (b) the load on the CPU correlates with the value of -XX:ParallelGCThreads.
- Are there particular object-graph structures that the G1 garbage collector will struggle with? Should complex graphs be considered bad coding practice?
- How can I write my code to avoid this behaviour in the G1 garbage collector? For example, if all my Nodes are in an array, will this fix it?
- Should this be considered a bug in the G1 garbage collector? This is far beyond 'a small increase in CPU usage'.

Just to demonstrate the issue further, I timed my calls to modifyGraph() and trialled different GC parameters:

- -XX:+UseG1GC -XX:ParallelGCThreads=1 took 82.393 seconds and CPU load was 0.1247
- -XX:+UseG1GC -XX:ParallelGCThreads=4 took 19.829 seconds and CPU load was 0.5960
- -XX:+UseG1GC -XX:ParallelGCThreads=8 took 14.815 seconds and CPU load was 0.9184
- -XX:+UseConcMarkSweepGC took 0.322 seconds and CPU load was 0.1119, regardless of the setting of -XX:ParallelGCThreads

So using the CMS GC made my application 44x faster (0.322 seconds versus 14.815 seconds) and placed roughly 1/8th of the load on the CPU (0.1119 versus 0.9184).

If my code represents some kind of hypothetical worst case for the G1 garbage collector, I think it should be documented and/or fixed somehow.

Regards,
Peter.
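A minimal harness along these lines is enough to gather the timings listed above (a sketch only; the exact measurement code was not posted). It wraps the modifyGraph() call from the Node program quoted further down in this thread, and would be run once per JVM invocation with each set of GC flags:

    // Hypothetical timing harness (not from the original post): time a
    // single modifyGraph() pass and report the elapsed wall-clock seconds.
    private static Node timedModifyGraph(Node head) {
        long start = System.nanoTime();
        head = modifyGraph(head);
        long elapsedNs = System.nanoTime() - start;
        System.out.printf("modifyGraph took %.3f seconds%n", elapsedNs / 1e9);
        return head;
    }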
On Tue, Jun 3, 2014 at 3:16 PM, Tao Mao wrote:

> And, use -XX:+PrintGCDetails -XX:+PrintGCTimeStamps to get more log.
> Thanks. -Tao
>
> On Tue, Jun 3, 2014 at 2:13 PM, Tao Mao wrote:
>
>> Hi Peter,
>>
>> What was your actual question? Try -XX:ParallelGCThreads= if you
>> want less CPU usage from GC.
>>
>> Thanks.
>> Tao
>>
>> On Tue, Jun 3, 2014 at 11:49 AM, Peter Harvey wrote:
>>
>>> Small correction. The last example of output was with
>>> "-XX:+UseConcMarkSweepGC -verbose:gc".
>>>
>>> On Tue, Jun 3, 2014 at 12:41 PM, Peter Harvey wrote:
>>>
>>>> I have an algorithm (at bottom of email) which builds a graph of 'Node'
>>>> objects with random connections between them. It then repeatedly processes
>>>> a queue of those Nodes, adding new Nodes to the queue as it goes. This is a
>>>> single-threaded algorithm that will never terminate. Our actual production
>>>> code is much more complex, but I've trimmed it down as much as possible.
>>>>
>>>> On Windows 7 with JRE 7u60, enabling the G1 garbage collector will
>>>> cause the JRE to consume all 8 cores of my CPU. No other garbage collector
>>>> does this. You can see the differences in CPU load in the example output
>>>> below. It's also worth noting that "-verbose:gc" with the G1 garbage
>>>> collector prints nothing after my algorithm starts. Presumably the G1
>>>> garbage collector is doing something (concurrent mark?), but it's not
>>>> printing anything about it.
>>>>
>>>> When run with VM args "-XX:+UseG1GC -verbose:gc" I get output like this
>>>> (note the huge CPU load value which should not be this high for a
>>>> single-threaded algorithm on an 8 core CPU):
>>>>
>>>> [GC pause (young) 62M->62M(254M), 0.0394214 secs]
>>>> [GC pause (young) 73M->83M(508M), 0.0302781 secs]
>>>> [GC pause (young) 106M->111M(1016M), 0.0442273 secs]
>>>> [GC pause (young) 157M->161M(1625M), 0.0660902 secs]
>>>> [GC pause (young) 235M->240M(2112M), 0.0907231 secs]
>>>> [GC pause (young) 334M->337M(2502M), 0.1356917 secs]
>>>> [GC pause (young) 448M->450M(2814M), 0.1219090 secs]
>>>> [GC pause (young) 574M->577M(3064M), 0.1778062 secs]
>>>> [GC pause (young) 712M->715M(3264M), 0.1878443 secs]
>>>> CPU Load Is -1.0
>>>>
>>>> Start
>>>> Stop
>>>> Sleep
>>>> CPU Load Is 0.9196154547182949
>>>>
>>>> Start
>>>> Stop
>>>> Sleep
>>>> CPU Load Is 0.9150735995043818
>>>>
>>>> ...
>>>>
>>>> When run with VM args "-XX:+UseParallelGC -verbose:gc" I get output
>>>> like this:
>>>>
>>>> [GC 65536K->64198K(249344K), 0.0628289 secs]
>>>> [GC 129734K->127974K(314880K), 0.1583369 secs]
>>>> [Full GC 127974K->127630K(451072K), 0.9675224 secs]
>>>> [GC 258702K->259102K(451072K), 0.3543645 secs]
>>>> [Full GC 259102K->258701K(732672K), 1.8085702 secs]
>>>> [GC 389773K->390181K(790528K), 0.3332060 secs]
>>>> [GC 579109K->579717K(803328K), 0.5126388 secs]
>>>> [Full GC 579717K->578698K(1300480K), 4.0647303 secs]
>>>> [GC 780426K->780842K(1567232K), 0.4364933 secs]
>>>> CPU Load Is -1.0
>>>>
>>>> Start
>>>> Stop
>>>> Sleep
>>>> CPU Load Is 0.03137771539054431
>>>>
>>>> Start
>>>> Stop
>>>> Sleep
>>>> CPU Load Is 0.032351299224373145
>>>>
>>>> ...
>>>>
>>>> When run with VM args "-verbose:gc" I get output like this:
>>>>
>>>> [GC 69312K->67824K(251136K), 0.1533803 secs]
>>>> [GC 137136K->135015K(251136K), 0.0970460 secs]
>>>> [GC 137245K(251136K), 0.0095245 secs]
>>>> [GC 204327K->204326K(274368K), 0.1056259 secs]
>>>> [GC 273638K->273636K(343680K), 0.1081515 secs]
>>>> [GC 342948K->342946K(412992K), 0.1181966 secs]
>>>> [GC 412258K->412257K(482304K), 0.1126966 secs]
>>>> [GC 481569K->481568K(551808K), 0.1156015 secs]
>>>> [GC 550880K->550878K(620928K), 0.1184089 secs]
>>>> [GC 620190K->620189K(690048K), 0.1209312 secs]
>>>> [GC 689501K->689499K(759552K), 0.1199338 secs]
>>>> [GC 758811K->758809K(828864K), 0.1162532 secs]
>>>> CPU Load Is -1.0
>>>>
>>>> Start
>>>> Stop
>>>> Sleep
>>>> CPU Load Is 0.10791719146608299
>>>>
>>>> Start
>>>> [GC 821213K(828864K), 0.1966807 secs]
>>>> Stop
>>>> Sleep
>>>> CPU Load Is 0.1540065314146181
>>>>
>>>> Start
>>>> Stop
>>>> Sleep
>>>> [GC 821213K(1328240K), 0.1962688 secs]
>>>> CPU Load Is 0.08427292195744103
>>>>
>>>> ...
>>>>
>>>> Why is the G1 garbage collector consuming so much CPU time? Is it stuck
>>>> in the mark phase as I am modifying the graph structure?
>>>>
>>>> I'm not a subscriber to the list, so please CC me in any response.
>>>>
>>>> Thanks,
>>>> Peter.
>>>> --
>>>>
>>>> import java.lang.management.ManagementFactory;
>>>> import com.sun.management.OperatingSystemMXBean;
>>>> import java.util.Random;
>>>>
>>>> @SuppressWarnings("restriction")
>>>> public class Node {
>>>>     private static OperatingSystemMXBean os = (OperatingSystemMXBean)
>>>>             ManagementFactory.getOperatingSystemMXBean();
>>>>
>>>>     private Node next;
>>>>
>>>>     private Node[] others = new Node[10];
>>>>
>>>>     public static void main(String[] args) throws InterruptedException {
>>>>
>>>>         // Build a graph of Nodes
>>>>         Node head = buildGraph();
>>>>
>>>>         while (true) {
>>>>             // Print CPU load for this process
>>>>             System.out.println("CPU Load Is " + os.getProcessCpuLoad());
>>>>             System.out.println();
>>>>
>>>>             // Modify the graph
>>>>             System.out.println("Start");
>>>>             head = modifyGraph(head);
>>>>             System.out.println("Stop");
>>>>
>>>>             // Sleep, as otherwise we tend to DoS the host computer...
>>>>             System.out.println("Sleep");
>>>>             Thread.sleep(1000);
>>>>         }
>>>>     }
>>>>
>>>>     private static Node buildGraph() {
>>>>
>>>>         // Create a collection of Node objects
>>>>         Node[] array = new Node[10000000];
>>>>         for (int i = 0; i < array.length; i++) {
>>>>             array[i] = new Node();
>>>>         }
>>>>
>>>>         // Each Node refers to 10 other random Nodes
>>>>         Random random = new Random(12);
>>>>         for (int i = 0; i < array.length; i++) {
>>>>             for (int j = 0; j < array[i].others.length; j++) {
>>>>                 int k = random.nextInt(array.length);
>>>>                 array[i].others[j] = array[k];
>>>>             }
>>>>         }
>>>>
>>>>         // The first Node serves as the head of a queue
>>>>         return array[0];
>>>>     }
>>>>
>>>>     private static Node modifyGraph(Node head) {
>>>>
>>>>         // Perform a million iterations
>>>>         for (int i = 0; i < 1000000; i++) {
>>>>
>>>>             // Pop a Node off the head of the queue
>>>>             Node node = head;
>>>>             head = node.next;
>>>>             node.next = null;
>>>>
>>>>             // Add the other Nodes to the head of the queue
>>>>             for (Node other : node.others) {
>>>>                 other.next = head;
>>>>                 head = other;
>>>>             }
>>>>         }
>>>>         return head;
>>>>     }
>>>> }
>>>>
>>>> --
>>>> *Actenum Corporation*
>>>> Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com
>>>
>>> --
>>> *Actenum Corporation*
>>> Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com
>>

--
*Actenum Corporation*
Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com

From claes.redestad at oracle.com  Tue Jun  3 22:57:45 2014
From: claes.redestad at oracle.com (Claes Redestad)
Date: Wed, 04 Jun 2014 00:57:45 +0200
Subject: G1 GC consuming all CPU time
In-Reply-To: 
References: 
Message-ID: <538E52E9.2010904@oracle.com>

Hi,

guessing it's due to the concurrent GC threads tripping over themselves: the microbenchmark is creating one big linked structure that will occupy most of the old gen, and then you're doing intense pointer updates which will trigger scans and updates of remembered sets etc. I actually don't know half the details and am mostly just guessing. :-)
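To make the remembered-set cost concrete, here is a toy model of what G1's post-write barrier conceptually does on every reference store. This is a simplified illustration only, not HotSpot's actual code; the names, card size, and region size are all assumptions:

    import java.util.ArrayDeque;
    import java.util.Queue;

    // Toy model: why every cross-region pointer store feeds work to the
    // concurrent refinement threads. All constants here are illustrative.
    class PostWriteBarrierSketch {
        static final int CARD_SIZE = 512;             // bytes of heap per card
        static final byte DIRTY = 1;
        final byte[] cardTable = new byte[1 << 20];   // one byte per card
        final Queue<Integer> dirtyCardQueue = new ArrayDeque<>();

        // Conceptually runs after every reference store "holder.field = value".
        void postWriteBarrier(long holderAddr, long valueAddr) {
            if (valueAddr == 0) {
                return;                               // null store: nothing to track
            }
            if (regionOf(holderAddr) == regionOf(valueAddr)) {
                return;                               // same region: no remembered-set work
            }
            int card = (int) (holderAddr / CARD_SIZE);
            if (cardTable[card] != DIRTY) {
                cardTable[card] = DIRTY;
                dirtyCardQueue.add(card);             // refinement threads drain this queue
            }                                         // and update remembered sets
        }

        long regionOf(long addr) {
            return addr >>> 20;                       // pretend regions are 1 MB
        }
    }

With ten random cross-region references per Node, almost every store in modifyGraph() takes the slow path in a model like this, which would explain why the refinement threads have so much to do.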
Converted your micro to a JMH micro to ease with experimenting[1] (hope you don't mind) then verified the regression reproduces:

Parallel:
java -jar target/microbenchmarks.jar -wi 3 -i 10 -f 1 .*G1GraphBench.*
~1625 ops/ms

G1:
java -XX:+UseG1GC -jar target/microbenchmarks.jar -wi 3 -i 10 -f 1 .*G1GraphBench.*
~12 ops/ms

Testing my hunch, let's try forcing the concurrent refinement to use only one thread:

java -XX:+UseG1GC -XX:-G1UseAdaptiveConcRefinement -XX:G1ConcRefinementThreads=1 -jar target/microbenchmarks.jar -wi 3 -i 10 -f 1 .*G1GraphBench.*
~1550 ops/ms

I guess we have a winner! I won't hazard to try and answer your questions about how this should be resolved - perhaps the adaptive policy can detect this corner case and scale down the number of refinement threads when they start interfering with each other, or something.

/Claes

[1]
package org.sample;

import org.openjdk.jmh.annotations.GenerateMicroBenchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

import java.util.Random;

@State(Scope.Thread)
public class G1GraphBench {
    private static class Node {
        private Node next;
        private Node[] others = new Node[10];
    }

    Node head = buildGraph();

    private static Node buildGraph() {
        // Create a collection of Node objects
        Node[] array = new Node[10000000];
        for (int i = 0; i < array.length; i++) {
            array[i] = new Node();
        }

        // Each Node refers to 10 other random Nodes
        Random random = new Random(12);
        for (int i = 0; i < array.length; i++) {
            for (int j = 0; j < array[i].others.length; j++) {
                int k = random.nextInt(array.length);
                array[i].others[j] = array[k];
            }
        }

        // The first Node serves as the head of a queue
        return array[0];
    }

    @GenerateMicroBenchmark
    public Node nodeBench() {
        Node node = head;
        head = node.next;
        node.next = null;

        // Add the other Nodes to the head of the queue
        for (Node other : node.others) {
            other.next = head;
            head = other;
        }
        return head;
    }
}

On 2014-06-03 23:43, Peter Harvey wrote:
> Thanks for the response. Here are the additional logs.
> [Peter's message of June 3 quoted in full here - the GC logs, questions,
> benchmark timings, and the Node program, all shown earlier in the thread -
> snipped]
From claes.redestad at oracle.com  Tue Jun  3 23:28:49 2014
From: claes.redestad at oracle.com (claes.redestad)
Date: Wed, 04 Jun 2014 01:28:49 +0200
Subject: G1 GC consuming all CPU time
Message-ID: 

At least CPU load is down, suggesting it's no longer a concurrency issue.

One thing that comes to mind is that G1 emits costly write and read barriers that heavily penalize interpreted code, while JMH generally avoids that benchmarking trap. Try extracting the loop body in your test into a method to help the JIT along and see if that evens out the playing field?

/Claes.
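The suggested refactoring would look roughly like this, using the modifyGraph() code from earlier in the thread (a sketch of the experiment only; whether it actually helps depends on what the JIT does with it):

    // Loop body extracted into its own small, hot method so the JIT can
    // compile it early instead of interpreting one large loop in main's path.
    private static Node popAndPush(Node head) {
        Node node = head;
        head = node.next;
        node.next = null;
        for (Node other : node.others) {   // push the neighbours back on the queue
            other.next = head;
            head = other;
        }
        return head;
    }

    private static Node modifyGraph(Node head) {
        for (int i = 0; i < 1000000; i++) {
            head = popAndPush(head);
        }
        return head;
    }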
:-) Converted your micro to a JMH micro to ease with experimenting[1] (hope you don't mind) then verified the regression reproduces: Parallel: java -jar target/microbenchmarks.jar -wi 3 -i 10 -f 1 .*G1GraphBench.* ~1625 ops/ms G1: java -XX:+UseG1GC -jar target/microbenchmarks.jar -wi 3 -i 10 -f 1 .*G1GraphBench.* ~12 ops/ms Testing my hunch, let's try forcing the concurrent refinement to use only one thread: java -XX:+UseG1GC -XX:-G1UseAdaptiveConcRefinement -XX:G1ConcRefinementThreads=1 -jar target/microbenchmarks.jar -wi 3 -i 10 -f 1 .*G1GraphBench.* ~1550 ops/ms I guess we have a winner! I won't hazard to try and answer your questions about how this should be resolved - perhaps the adaptive policy can detect this corner case and scale down the number of refinement threads when they start interfering with each other, or something. /Claes [1] package org.sample; import org.openjdk.jmh.annotations.GenerateMicroBenchmark; import org.openjdk.jmh.annotations.Scope; import org.openjdk.jmh.annotations.State; import java.util.Random; @State(Scope.Thread) public class G1GraphBench { private static class Node { private Node next; private Node[] others = new Node[10]; } Node head = buildGraph(); private static Node buildGraph() { // Create a collection of Node objects Node[] array = new Node[10000000]; for (int i = 0; i < array.length; i++) { array[i] = new Node(); } // Each Node refers to 10 other random Nodes Random random = new Random(12); for (int i = 0; i < array.length; i++) { for (int j = 0; j < array[i].others.length; j++) { int k = random.nextInt(array.length); array[i].others[j] = array[k]; } } // The first Node serves as the head of a queue return array[0]; } @GenerateMicroBenchmark public Node nodeBench() { Node node = head; head = node.next; node.next = null; // Add the other Nodes to the head of the queue for (Node other : node.others) { other.next = head; head = other; } return head; } } On 2014-06-03 23:43, Peter Harvey wrote: Thanks for the response. Here are the additional logs. 
0.094: [GC pause (young), 0.0347877 secs] [Parallel Time: 34.1 ms, GC Workers: 8] [GC Worker Start (ms): Min: 94.2, Avg: 104.4, Max: 126.4, Diff: 32.2] [Ext Root Scanning (ms): Min: 0.0, Avg: 3.3, Max: 25.0, Diff: 25.0, Sum: 26.6] [Update RS (ms): Min: 0.0, Avg: 2.1, Max: 5.3, Diff: 5.3, Sum: 16.7] [Processed Buffers: Min: 0, Avg: 2.3, Max: 9, Diff: 9, Sum: 18] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 1.8, Avg: 18.3, Max: 29.9, Diff: 28.2, Sum: 146.4] [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.6] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [GC Worker Total (ms): Min: 1.9, Avg: 23.8, Max: 34.1, Diff: 32.2, Sum: 190.4] [GC Worker End (ms): Min: 128.2, Avg: 128.3, Max: 128.3, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.0 ms] [Other: 0.6 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.3 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.0 ms] [Eden: 24.0M(24.0M)->0.0B(11.0M) Survivors: 0.0B->3072.0K Heap: 62.1M(254.0M)->62.2M(254.0M)] [Times: user=0.09 sys=0.03, real=0.04 secs] 0.131: [GC pause (young), 0.0295093 secs] [Parallel Time: 28.1 ms, GC Workers: 8] [GC Worker Start (ms): Min: 130.9, Avg: 135.5, Max: 158.7, Diff: 27.8] [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: 1.2] [Update RS (ms): Min: 0.0, Avg: 11.4, Max: 27.5, Diff: 27.5, Sum: 90.8] [Processed Buffers: Min: 0, Avg: 23.8, Max: 42, Diff: 42, Sum: 190] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 0.0, Avg: 11.7, Max: 17.1, Diff: 17.1, Sum: 93.8] [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.7] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [GC Worker Total (ms): Min: 0.2, Avg: 23.5, Max: 28.1, Diff: 27.8, Sum: 187.7] [GC Worker End (ms): Min: 159.0, Avg: 159.0, Max: 159.0, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.1 ms] [Other: 1.3 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.0 ms] [Eden: 11.0M(11.0M)->0.0B(23.0M) Survivors: 3072.0K->2048.0K Heap: 73.2M(254.0M)->82.7M(508.0M)] [Times: user=0.19 sys=0.00, real=0.03 secs] 0.166: [GC pause (young), 0.0385523 secs] [Parallel Time: 35.9 ms, GC Workers: 8] [GC Worker Start (ms): Min: 166.4, Avg: 169.8, Max: 192.4, Diff: 25.9] [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: 1.9] [Update RS (ms): Min: 0.0, Avg: 10.9, Max: 31.9, Diff: 31.9, Sum: 87.2] [Processed Buffers: Min: 0, Avg: 14.6, Max: 26, Diff: 26, Sum: 117] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [Object Copy (ms): Min: 3.5, Avg: 21.4, Max: 27.0, Diff: 23.4, Sum: 171.1] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.4] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [GC Worker Total (ms): Min: 10.0, Avg: 32.6, Max: 35.9, Diff: 25.9, Sum: 260.7] [GC Worker End (ms): Min: 202.3, Avg: 202.4, Max: 202.4, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.0 ms] [Other: 2.6 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.0 ms] [Eden: 23.0M(23.0M)->0.0B(46.0M) Survivors: 2048.0K->4096.0K Heap: 105.7M(508.0M)->110.1M(1016.0M)] [Times: user=0.19 sys=0.00, real=0.04 secs] 0.222: [GC pause (young), 0.0558720 secs] [Parallel Time: 53.0 ms, GC Workers: 8] [GC Worker Start (ms): Min: 222.0, Avg: 222.2, Max: 222.5, Diff: 0.5] [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: 1.5] [Update RS (ms): Min: 7.7, Avg: 8.7, Max: 10.9, Diff: 3.2, Sum: 69.4] [Processed 
Buffers: Min: 7, Avg: 8.5, Max: 12, Diff: 5, Sum: 68] [Scan RS (ms): Min: 0.0, Avg: 0.3, Max: 0.6, Diff: 0.6, Sum: 2.3] [Object Copy (ms): Min: 41.7, Avg: 43.6, Max: 44.3, Diff: 2.7, Sum: 348.5] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [GC Worker Total (ms): Min: 52.4, Avg: 52.7, Max: 52.9, Diff: 0.5, Sum: 421.8] [GC Worker End (ms): Min: 274.9, Avg: 274.9, Max: 274.9, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.0 ms] [Other: 2.8 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.0 ms] [Eden: 46.0M(46.0M)->0.0B(74.0M) Survivors: 4096.0K->7168.0K Heap: 156.1M(1016.0M)->158.6M(1625.0M)] [Times: user=0.48 sys=0.01, real=0.06 secs] 0.328: [GC pause (young), 0.0853794 secs] [Parallel Time: 82.8 ms, GC Workers: 8] [GC Worker Start (ms): Min: 327.9, Avg: 330.8, Max: 351.1, Diff: 23.2] [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 2.0] [Update RS (ms): Min: 0.0, Avg: 5.5, Max: 8.3, Diff: 8.3, Sum: 43.9] [Processed Buffers: Min: 0, Avg: 2.3, Max: 3, Diff: 3, Sum: 18] [Scan RS (ms): Min: 0.0, Avg: 2.2, Max: 3.3, Diff: 3.3, Sum: 17.4] [Object Copy (ms): Min: 59.5, Avg: 71.8, Max: 73.7, Diff: 14.2, Sum: 574.7] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [GC Worker Total (ms): Min: 59.5, Avg: 79.8, Max: 82.7, Diff: 23.2, Sum: 638.4] [GC Worker End (ms): Min: 410.6, Avg: 410.7, Max: 410.7, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.1 ms] [Other: 2.6 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.1 ms] [Eden: 74.0M(74.0M)->0.0B(94.0M) Survivors: 7168.0K->11.0M Heap: 232.6M(1625.0M)->237.6M(2112.0M)] [Times: user=0.59 sys=0.00, real=0.09 secs] 0.447: [GC pause (young), 0.1239103 secs] [Parallel Time: 121.5 ms, GC Workers: 8] [GC Worker Start (ms): Min: 447.5, Avg: 447.7, Max: 448.5, Diff: 0.9] [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.9] [Update RS (ms): Min: 26.5, Avg: 28.2, Max: 28.7, Diff: 2.2, Sum: 225.7] [Processed Buffers: Min: 38, Avg: 39.8, Max: 44, Diff: 6, Sum: 318] [Scan RS (ms): Min: 0.3, Avg: 0.7, Max: 1.9, Diff: 1.6, Sum: 5.3] [Object Copy (ms): Min: 92.1, Avg: 92.2, Max: 92.3, Diff: 0.2, Sum: 737.5] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.3] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [GC Worker Total (ms): Min: 120.6, Avg: 121.4, Max: 121.5, Diff: 0.9, Sum: 970.8] [GC Worker End (ms): Min: 569.0, Avg: 569.0, Max: 569.0, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.1 ms] [Other: 2.3 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.1 ms] [Eden: 94.0M(94.0M)->0.0B(111.0M) Survivors: 11.0M->14.0M Heap: 331.6M(2112.0M)->334.6M(2502.0M)] [Times: user=0.80 sys=0.05, real=0.12 secs] 0.599: [GC pause (young), 0.1479438 secs] [Parallel Time: 145.7 ms, GC Workers: 8] [GC Worker Start (ms): Min: 599.4, Avg: 599.5, Max: 599.8, Diff: 0.4] [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.9] [Update RS (ms): Min: 41.8, Avg: 43.0, Max: 44.0, Diff: 2.1, Sum: 343.6] [Processed Buffers: Min: 67, Avg: 70.9, Max: 73, Diff: 6, Sum: 567] [Scan RS (ms): Min: 0.0, Avg: 0.8, Max: 1.9, Diff: 1.9, Sum: 6.2] [Object Copy (ms): Min: 101.3, Avg: 101.6, Max: 101.7, Diff: 0.3, Sum: 812.6] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [GC Worker Other (ms): Min: 0.0, 
Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [GC Worker Total (ms): Min: 145.2, Avg: 145.6, Max: 145.6, Diff: 0.4, Sum: 1164.6] [GC Worker End (ms): Min: 745.1, Avg: 745.1, Max: 745.1, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.1 ms] [Other: 2.2 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.1 ms] [Eden: 111.0M(111.0M)->0.0B(124.0M) Survivors: 14.0M->16.0M Heap: 445.6M(2502.0M)->448.6M(2814.0M)] [Times: user=1.20 sys=0.05, real=0.15 secs] 0.787: [GC pause (young), 0.1625321 secs] [Parallel Time: 160.0 ms, GC Workers: 8] [GC Worker Start (ms): Min: 786.6, Avg: 786.7, Max: 786.9, Diff: 0.4] [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.8] [Update RS (ms): Min: 46.4, Avg: 47.0, Max: 49.0, Diff: 2.5, Sum: 376.0] [Processed Buffers: Min: 75, Avg: 78.0, Max: 79, Diff: 4, Sum: 624] [Scan RS (ms): Min: 0.0, Avg: 0.9, Max: 1.5, Diff: 1.5, Sum: 7.4] [Object Copy (ms): Min: 110.6, Avg: 111.7, Max: 112.0, Diff: 1.4, Sum: 893.5] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.3] [GC Worker Total (ms): Min: 159.6, Avg: 159.9, Max: 160.0, Diff: 0.4, Sum: 1279.0] [GC Worker End (ms): Min: 946.5, Avg: 946.5, Max: 946.6, Diff: 0.1] [Code Root Fixup: 0.0 ms] [Clear CT: 0.1 ms] [Other: 2.4 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.2 ms] [Eden: 124.0M(124.0M)->0.0B(135.0M) Survivors: 16.0M->18.0M Heap: 572.6M(2814.0M)->576.6M(3064.0M)] [Times: user=1.37 sys=0.00, real=0.16 secs] 0.981: [GC pause (young), 0.2063055 secs] [Parallel Time: 204.1 ms, GC Workers: 8] [GC Worker Start (ms): Min: 980.8, Avg: 980.9, Max: 981.0, Diff: 0.2] [Ext Root Scanning (ms): Min: 0.1, Avg: 0.3, Max: 0.3, Diff: 0.2, Sum: 2.1] [Update RS (ms): Min: 55.9, Avg: 57.8, Max: 58.8, Diff: 2.9, Sum: 462.8] [Processed Buffers: Min: 100, Avg: 101.5, Max: 103, Diff: 3, Sum: 812] [Scan RS (ms): Min: 0.0, Avg: 1.0, Max: 3.1, Diff: 3.1, Sum: 8.3] [Object Copy (ms): Min: 144.7, Avg: 144.8, Max: 144.9, Diff: 0.1, Sum: 1158.3] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.3] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [GC Worker Total (ms): Min: 203.8, Avg: 204.0, Max: 204.0, Diff: 0.2, Sum: 1631.9] [GC Worker End (ms): Min: 1184.9, Avg: 1184.9, Max: 1184.9, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.1 ms] [Other: 2.1 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.1 ms] [Eden: 135.0M(135.0M)->0.0B(143.0M) Survivors: 18.0M->20.0M Heap: 711.6M(3064.0M)->714.6M(3264.0M)] [Times: user=1.40 sys=0.11, real=0.21 secs] CPU Load Is -1.0 Start Stop Sleep CPU Load Is 0.9166222455142531 Start Stop Sleep CPU Load Is 0.907013989900451 Start Stop Sleep CPU Load Is 0.9085635227776081 Start Stop Sleep CPU Load Is 0.909945506396622 Note that all the logged GC occurs during the construction of my graph of Nodes, which is /before/ my algorithm (modifyGraph) starts, There is no log of GC activity once the algorithm starts, but there is significant (100%) CPU usage. My questions are: * Why is the G1 garbage collector consuming so much CPU time? What is it doing? * Why is the G1 garbage collector not logging anything? The only reason I even know it's the garbage collector consuming my CPU time is that (a) I only see this behaviour when the G1 collector is enabled and (b) the load on the CPU correlates with the value of -XX:ParallelGCThreads. 
* Are there particular object-graph structures that the G1 garbage collector will struggle with? Should complex graphs be considered bad coding practice? * How can I write my code to avoid this behaviour in the G1 garbage collector? For example, if all my Nodes are in an array, will this fix it? * Should this be considered a bug in the G1 garbage collector? This is far beyond 'a small increase in CPU usage'. Just to demonstrate the issue further, I timed my calls to modifyGraph() and trialled different GC parameters: * -XX:+UseG1GC -XX:ParallelGCThreads=1 took 82.393 seconds and CPU load was 0.1247 * -XX:+UseG1GC -XX:ParallelGCThreads=4 took 19.829 seconds and CPU load was 0.5960 * -XX:+UseG1GC -XX:ParallelGCThreads=8 took 14.815 seconds and CPU load was 0.9184 * -XX:+UseConcMarkSweepGC took 0.322 seconds and CPU load was 0.1119 regardless of the setting of -XX:ParallelGCThreads So using the CMS GC made my application 44x faster (14.815 seconds versus 0.322 seconds) and placed 1/8th of the load (0.9184 versus 0.1119) on the CPU. If my code represents some kind of hypothetical worst case for the G1 garbage collector, I think it should be documented and/or fixed somehow. Regards, Peter. On Tue, Jun 3, 2014 at 3:16 PM, Tao Mao > wrote: And, use ?XX:+PrintGCDetails ?XX:+PrintGCTimeStamps to get more log. Thanks. -Tao On Tue, Jun 3, 2014 at 2:13 PM, Tao Mao > wrote: Hi Peter, What was your actual question? Try -XX:ParallelGCThreads= if you want less CPU usage from GC. Thanks. Tao On Tue, Jun 3, 2014 at 11:49 AM, Peter Harvey > wrote: Small correction. The last example of output was with "-XX:+UseConcMarkSweepGC -verbose:gc". On Tue, Jun 3, 2014 at 12:41 PM, Peter Harvey > wrote: I have an algorithm (at bottom of email) which builds a graph of 'Node' objects with random connections between them. It then repeatedly processes a queue of those Nodes, adding new Nodes to the queue as it goes. This is a single-threaded algorithm that will never terminate. Our actual production code is much more complex, but I've trimmed it down as much as possible. On Windows 7 with JRE 7u60, enabling the G1 garbage collector will cause the JRE to consume all 8 cores of my CPU. No other garbage collector does this. You can see the differences in CPU load in the example output below. It's also worth nothing that "-verbose:gc" with the G1 garbage collector prints nothing after my algorithm starts. Presumably the G1 garbage collector is doing something (concurrent mark?), but it's not printing anything about it. When run with VM args "-XX:+UseG1GC -verbose:gc" I get output like this (note the huge CPU load value which should not be this high for a single-threaded algorithm on an 8 core CPU): [GC pause (young) 62M->62M(254M), 0.0394214 secs] [GC pause (young) 73M->83M(508M), 0.0302781 secs] [GC pause (young) 106M->111M(1016M), 0.0442273 secs] [GC pause (young) 157M->161M(1625M), 0.0660902 secs] [GC pause (young) 235M->240M(2112M), 0.0907231 secs] [GC pause (young) 334M->337M(2502M), 0.1356917 secs] [GC pause (young) 448M->450M(2814M), 0.1219090 secs] [GC pause (young) 574M->577M(3064M), 0.1778062 secs] [GC pause (young) 712M->715M(3264M), 0.1878443 secs] CPU Load Is -1.0 Start Stop Sleep CPU Load Is 0.9196154547182949 Start Stop Sleep CPU Load Is 0.9150735995043818 ... 
When run with VM args "-XX:+UseParallelGC -verbose:gc" I get output like this: [GC 65536K->64198K(249344K), 0.0628289 secs] [GC 129734K->127974K(314880K), 0.1583369 secs] [Full GC 127974K->127630K(451072K), 0.9675224 secs] [GC 258702K->259102K(451072K), 0.3543645 secs] [Full GC 259102K->258701K(732672K), 1.8085702 secs] [GC 389773K->390181K(790528K), 0.3332060 secs] [GC 579109K->579717K(803328K), 0.5126388 secs] [Full GC 579717K->578698K(1300480K), 4.0647303 secs] [GC 780426K->780842K(1567232K), 0.4364933 secs] CPU Load Is -1.0 Start Stop Sleep CPU Load Is 0.03137771539054431 Start Stop Sleep CPU Load Is 0.032351299224373145 ... When run with VM args "-verbose:gc" I get output like this: [GC 69312K->67824K(251136K), 0.1533803 secs] [GC 137136K->135015K(251136K), 0.0970460 secs] [GC 137245K(251136K), 0.0095245 secs] [GC 204327K->204326K(274368K), 0.1056259 secs] [GC 273638K->273636K(343680K), 0.1081515 secs] [GC 342948K->342946K(412992K), 0.1181966 secs] [GC 412258K->412257K(482304K), 0.1126966 secs] [GC 481569K->481568K(551808K), 0.1156015 secs] [GC 550880K->550878K(620928K), 0.1184089 secs] [GC 620190K->620189K(690048K), 0.1209312 secs] [GC 689501K->689499K(759552K), 0.1199338 secs] [GC 758811K->758809K(828864K), 0.1162532 secs] CPU Load Is -1.0 Start Stop Sleep CPU Load Is 0.10791719146608299 Start [GC 821213K(828864K), 0.1966807 secs] Stop Sleep CPU Load Is 0.1540065314146181 Start Stop Sleep [GC 821213K(1328240K), 0.1962688 secs] CPU Load Is 0.08427292195744103 ... Why is the G1 garbage collector consuming so much CPU time? Is it stuck in the mark phase as I am modifying the graph structure? I'm not a subscriber to the list, so please CC me in any response. Thanks, Peter. -- import java.lang.management.ManagementFactory; import com.sun.management.OperatingSystemMXBean; import java.util.Random; @SuppressWarnings("restriction") public class Node { private static OperatingSystemMXBean os = (OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean(); private Node next; private Node[] others = new Node[10]; public static void main(String[] args) throws InterruptedException { // Build a graph of Nodes Node head = buildGraph(); while (true) { // Print CPU load for this process System.out.println("CPU Load Is " + os.getProcessCpuLoad()); System.out.println(); // Modify the graph System.out.println("Start"); head = modifyGraph(head); System.out.println("Stop"); // Sleep, as otherwise we tend to DoS the host computer... 
System.out.println("Sleep"); Thread.sleep(1000); } } private static Node buildGraph() { // Create a collection of Node objects Node[] array = new Node[10000000]; for (int i = 0; i < array.length; i++) { array[i] = new Node(); } // Each Node refers to 10 other random Nodes Random random = new Random(12); for (int i = 0; i < array.length; i++) { for (int j = 0; j < array[i].others.length; j++) { int k = random.nextInt(array.length); array[i].others[j] = array[k]; } } // The first Node serves as the head of a queue return array[0]; } private static Node modifyGraph(Node head) { // Perform a million iterations for (int i = 0; i < 1000000; i++) { // Pop a Node off the head of the queue Node node = head; head = node.next; node.next = null; // Add the other Nodes to the head of the queue for (Node other : node.others) { other.next = head; head = other; } } return head; } } -- *Actenum Corporation* Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com -- *Actenum Corporation* Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com -- *Actenum Corporation* Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com -- Actenum Corporation Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From per.liden at oracle.com Wed Jun 4 09:21:56 2014 From: per.liden at oracle.com (Per Liden) Date: Wed, 04 Jun 2014 11:21:56 +0200 Subject: RFR(s): 8044768: Backout fix for JDK-8040807 Message-ID: <538EE534.2000600@oracle.com> Hi, Requesting reviews on this anti-delta to backout the fix for JDK-8040807. It turns out that there are still issues here, which for some reason didn't show up in the testing I did :( Bug: https://bugs.openjdk.java.net/browse/JDK-8044768 Webrev: http://cr.openjdk.java.net/~pliden/8044768/webrev.0/ Original bug: https://bugs.openjdk.java.net/browse/JDK-8040807 Original webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ Original review: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-May/010102.html Thanks! /Per From bengt.rutisson at oracle.com Wed Jun 4 10:35:02 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 04 Jun 2014 12:35:02 +0200 Subject: RFR(s): 8044768: Backout fix for JDK-8040807 In-Reply-To: <538EE534.2000600@oracle.com> References: <538EE534.2000600@oracle.com> Message-ID: <538EF656.1030306@oracle.com> Hi Per, Looks good. Bengt On 2014-06-04 11:21, Per Liden wrote: > Hi, > > Requesting reviews on this anti-delta to backout the fix for > JDK-8040807. It turns out that there are still issues here, which for > some reason didn't show up in the testing I did :( > > Bug: https://bugs.openjdk.java.net/browse/JDK-8044768 > Webrev: http://cr.openjdk.java.net/~pliden/8044768/webrev.0/ > > Original bug: https://bugs.openjdk.java.net/browse/JDK-8040807 > Original webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ > Original review: > http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-May/010102.html > > Thanks! > /Per From erik.helin at oracle.com Wed Jun 4 12:02:45 2014 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 04 Jun 2014 14:02:45 +0200 Subject: RFR(s): 8044768: Backout fix for JDK-8040807 In-Reply-To: <538EE534.2000600@oracle.com> References: <538EE534.2000600@oracle.com> Message-ID: <1938981.x4JAnBs7PM@ehelin-desktop> Hi Per, looks good. 
Thanks, Erik On Wednesday 04 June 2014 11.21.56 Per Liden wrote: > Hi, > > Requesting reviews on this anti-delta to backout the fix for > JDK-8040807. It turns out that there are still issues here, which for > some reason didn't show up in the testing I did :( > > Bug: https://bugs.openjdk.java.net/browse/JDK-8044768 > Webrev: http://cr.openjdk.java.net/~pliden/8044768/webrev.0/ > > Original bug: https://bugs.openjdk.java.net/browse/JDK-8040807 > Original webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ > Original review: > http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-May/010102.html > > Thanks! > /Per From per.liden at oracle.com Wed Jun 4 12:14:36 2014 From: per.liden at oracle.com (Per Liden) Date: Wed, 04 Jun 2014 14:14:36 +0200 Subject: RFR(s): 8044768: Backout fix for JDK-8040807 In-Reply-To: <1938981.x4JAnBs7PM@ehelin-desktop> References: <538EE534.2000600@oracle.com> <1938981.x4JAnBs7PM@ehelin-desktop> Message-ID: <538F0DAB.6050004@oracle.com> Thanks Bengt, Erik! /Per On 2014-06-04 14:02, Erik Helin wrote: > Hi Per, > > looks good. > > Thanks, > Erik > > On Wednesday 04 June 2014 11.21.56 Per Liden wrote: >> Hi, >> >> Requesting reviews on this anti-delta to backout the fix for >> JDK-8040807. It turns out that there are still issues here, which for >> some reason didn't show up in the testing I did :( >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8044768 >> Webrev: http://cr.openjdk.java.net/~pliden/8044768/webrev.0/ >> >> Original bug: https://bugs.openjdk.java.net/browse/JDK-8040807 >> Original webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ >> Original review: >> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-May/010102.html >> >> Thanks! >> /Per From andrey.x.zakharov at oracle.com Wed Jun 4 14:40:47 2014 From: andrey.x.zakharov at oracle.com (Andrey Zakharov) Date: Wed, 04 Jun 2014 18:40:47 +0400 Subject: RFR: 8041506 - The test gc/g1/TestHumongousShrinkHeap.java reports that memory is not de-committed In-Reply-To: <53835E9B.7080102@oracle.com> References: <5374A502.5030409@oracle.com> <53835E9B.7080102@oracle.com> Message-ID: <538F2FEF.30105@oracle.com> Hi, Dmitry. Thanks for corrections. Here is updated webrev: http://cr.openjdk.java.net/~fzhinkin/azakharov/8041946/webrev.00/ testing: http://aurora.ru.oracle.com/functional/faces/ChessBoard.xhtml?reportName=J2SEFailures¶meters=[batchNames]501230.ute.hs_jtreg.accept.full bug: https://bugs.openjdk.java.net/browse/JDK-8041946 On 26.05.2014 19:32, Dmitry Fazunenko wrote: > Hi Andrey, > > Sorry, it took too long from me to review you change. > I have several comments: > - overall fix looks good > - I think you need to change the subject: you fix 8041946, not 8041506 > - Replace 'TestHumongousShrinkHeap' with 'TestShrinkDefragmentedHeap' > - Make MemoryUsagePrinter as static inner class of test (to avoid > possible conflicts with other tests) > - It would be good if you provide more text description to the test, like > * allocate small objects mixed with humongous ones > "ssssHssssHssssHssssHssssH" > * release all allocated object except the last humongous one > ".............................................H" > * invoke gc and check that memory returned to the system (amount of > committed memory got down) Done > - I'm not sure that you can predict the expected amount of committed > memory at the end... 
I wouldn't use the expectedCommitted in the test > (there are many memory consumers, not only your test, so the final > committed should be either less or greater than expectedCommitted ) Well, I have tested it a lot with JFR command line options, on all platforms. I found a lag with JMX on Solaris, and just put sleep before measure. Also I replaced run/othervm with ProcessBuilder. I'm planning to replace it in other early our CMM tests. > - I think you don't need to touch 'test/TEST.groups'. There is > :needs_g1gc tests group (hs/test/closed/TEST.group) which lists all g1 > specific tests. > - Please provide information on how you tested your change. http://aurora.ru.oracle.com/functional/faces/ChessBoard.xhtml?reportName=J2SEFailures¶meters=[batchNames]501230.ute.hs_jtreg.accept.full Thanks > > Thanks, > Dima > > > On 15.05.2014 15:29, Andrey Zakharov wrote: >> Hi. >> To proper testing of free list sorting we need to defragment memory >> with small young and humongous objects >> This is test scenario: >> Make enough space for new objects to prevent it going old. >> - allocate bunch of small objects, and a bit of humongous >> several times. >> >> Free almost all of allocated stuff. Check that heap shrinks after GC. >> >> webrev: http://cr.openjdk.java.net/~jwilhelm/8041506/webrev.02/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8041506 >> >> Thanks. >> > From jon.masamitsu at oracle.com Thu Jun 5 17:53:40 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 05 Jun 2014 10:53:40 -0700 Subject: G1 GC consuming all CPU time Message-ID: <5390AEA4.4030605@oracle.com> Forwarding this for Peter Harvey because I don't know what happened to it (while it was waiting to be moderated). ============================================== Hi, I've constructed a more practical microbenchmark that demonstrates the previously-described issue in the G1 collector. The code at the end of this email will repeatedly prepend a million randomly-valued nodes to a doubly-linked list, and then apply a simple merge sort to that list. The merge sort manipulates many reference values, resulting in the same issues as described earlier. Using just -XX:+UseG1GC with no other options, the VM will seemingly try to balance the number of concurrent refinement threads. But no matter how many threads it chooses to use, performance is significantly degraded when compare to the CMS collector. When using the CMS collector my microbenchmark has output like: Took 752 to prepend 1000000 and then sort all 1000000 Took 2114 to prepend 1000000 and then sort all 2000000 Took 2672 to prepend 1000000 and then sort all 3000000 Took 2752 to prepend 1000000 and then sort all 4000000 Took 2056 to prepend 1000000 and then sort all 5000000 When using the G1 collector my microbenchmark has output like: Took 1693 to prepend 1000000 and then sort all 1000000 Took 5774 to prepend 1000000 and then sort all 2000000 Took 9546 to prepend 1000000 and then sort all 3000000 Took 15480 to prepend 1000000 and then sort all 4000000 Took 20235 to prepend 1000000 and then sort all 5000000 With the -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeRSetStats -XX:G1SummarizeRSetStatsPeriod=1 options enabled I get diagnostic output like: Concurrent RS processed 29981518 cards Of 117869 completed buffers: 18647 ( 15.8%) by conc RS threads. 99222 ( 84.2%) by mutator threads. Concurrent RS processed 68819227 cards Of 273465 completed buffers: 164272 ( 60.1%) by conc RS threads. 109193 ( 39.9%) by mutator threads. 
My original code was an extreme corner case of graph manipulation (though yes, we do ship a commercial product with that kind of code in it). I hope that 'merge sort on a linked list of random data' can serve as a more useful example of where the G1 collector will not perform well. From what I understand, any algorithm that modifies a large number of references connecting many small objects will bring out this behaviour in the G1 collector. For example, I would suspect that large reference-based heap structures (where inserted nodes have random values) may also cause issues for the G1 collector.

Regards,
Peter.

----

package linkedlist;

import java.util.Random;

public class List {

    // Node in the linked list
    public final static class Node {
        Node prev;
        Node next;
        double value;
    }

    // Random number generator
    private final Random random = new Random(12);

    // Dummy node for the head
    private final Node head = new Node();

    // Split the list at the given node, and sort the right-hand side
    private static Node splitAndSort(Node node, boolean ascending) {
        // Split the list at the given node
        if (node.prev != null)
            node.prev.next = null;
        node.prev = null;
        // Ensure we have at LEAST two elements
        if (node.next == null)
            return node;
        // Find the midpoint to split the list
        Node mid = node.next;
        Node end = node.next;
        do {
            end = end.next;
            if (end != null) {
                end = end.next;
                mid = mid.next;
            }
        } while (end != null);
        // Sort the two sides
        Node list2 = splitAndSort(mid, ascending);
        Node list1 = splitAndSort(node, ascending);
        // Merge the two lists (setting prev only)
        node = null;
        while (true) {
            if (list1 == null) {
                list2.prev = node;
                node = list2;
                break;
            } else if (list2 == null) {
                list1.prev = node;
                node = list1;
                break;
            } else if (ascending == (list1.value < list2.value)) {
                list2.prev = node;
                node = list2;
                list2 = list2.next;
            } else {
                list1.prev = node;
                node = list1;
                list1 = list1.next;
            }
        }
        // Fix all the nexts (based on the prevs)
        while (node.prev != null) {
            node.prev.next = node;
            node = node.prev;
        }
        return node;
    }

    // Sort the nodes in ascending order
    public void sortNodes() {
        if (head.next != null) {
            head.next = splitAndSort(head.next, true);
            head.next.prev = head;
        }
    }

    // Prepend a number of nodes with random values
    public void prependNodes(int count) {
        for (int i = 0; i < count; i++) {
            Node node = new Node();
            if (head.next != null) {
                node.next = head.next;
                head.next.prev = node;
            }
            node.value = random.nextDouble();
            node.prev = head;
            head.next = node;
        }
    }

    public static void main(String[] args) {
        List list = new List();
        int count = 0;
        long start = System.currentTimeMillis();
        while (true) {
            // Append a million random entries
            list.prependNodes(1000000);
            count += 1000000;
            // Sort the entire list
            list.sortNodes();
            // Print the time taken for this pass
            long end = System.currentTimeMillis();
            System.out.println("Took " + (end - start) + " to prepend 1000000 and then sort all " + count);
            start = end;
        }
    }
}

From John.Coomes at oracle.com Fri Jun 6 14:06:11 2014
From: John.Coomes at oracle.com (John Coomes)
Date: Fri, 6 Jun 2014 07:06:11 -0700
Subject: RFR(S): 8026396 - Remove information duplication in the collector policy
In-Reply-To: <5362F22A.1030505@oracle.com>
References: <5362F22A.1030505@oracle.com>
Message-ID: <21393.51923.883180.555814@mykonos.us.oracle.com>

Jesper Wilhelmsson (jesper.wilhelmsson at oracle.com) wrote:
> Hi,
>
> Another step towards cleaner collector policy code.
> > This cleanup removes the need to keep the generation sizing flags in sync with > the collector policy version of the same variables during setup. The collector > policy variables are initialized in the start and then used throughout the setup > code. In the end we write the values back to the flags if needed. > > This change builds upon the merged collector policy (8027643) currently in review. > > Webrev: http://cr.openjdk.java.net/~jwilhelm/8026396/webrev/ > > Bug: https://bugs.openjdk.java.net/browse/JDK-8026396 This looks good to me. There seems to be a chance for underflow here: 534 _min_gen1_size = MIN2(_initial_gen1_size, _min_heap_byte_size - _min_gen0_size); but your change is ok, since that also existed in the original code. -John From jesper.wilhelmsson at oracle.com Mon Jun 9 07:45:55 2014 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Mon, 09 Jun 2014 09:45:55 +0200 Subject: RFR(S): 8026396 - Remove information duplication in the collector policy In-Reply-To: <21393.51923.883180.555814@mykonos.us.oracle.com> References: <5362F22A.1030505@oracle.com> <21393.51923.883180.555814@mykonos.us.oracle.com> Message-ID: <53956633.5050107@oracle.com> Thanks John! I'll have a look at the underflow issue. /Jesper John Coomes skrev 6/6/14 16:06: > Jesper Wilhelmsson (jesper.wilhelmsson at oracle.com) wrote: >> Hi, >> >> Another step towards cleaner collector policy code. >> >> This cleanup removes the need to keep the generation sizing flags in sync with >> the collector policy version of the same variables during setup. The collector >> policy variables are initialized in the start and then used throughout the setup >> code. In the end we write the values back to the flags if needed. >> >> This change builds upon the merged collector policy (8027643) currently in review. >> >> Webrev: http://cr.openjdk.java.net/~jwilhelm/8026396/webrev/ >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8026396 > > This looks good to me. > > There seems to be a chance for underflow here: > > 534 _min_gen1_size = MIN2(_initial_gen1_size, _min_heap_byte_size - _min_gen0_size); > > but your change is ok, since that also existed in the original code. > > -John > From andrey.x.zakharov at oracle.com Mon Jun 9 14:31:29 2014 From: andrey.x.zakharov at oracle.com (Andrey Zakharov) Date: Mon, 09 Jun 2014 18:31:29 +0400 Subject: RFR: 8041946 - CMM Testing: 8u40 an allocated humongous object at the end of the heap should not prevents shrinking the heap Message-ID: <5395C541.1000207@oracle.com> Hi, everyone! Please, review this test for new feature in G1 - sorted free list which make possible shrinking of the defragmented heap. To proper testing of free list sorting we need to defragment memory with small young and humongous objects. This is test scenario: - make enough space for new objects to prevent it going old. - allocate bunch of small objects, and a bit of humongous several times (ssssHssssHssssHssssHssssHssssHssssHssssH) - free almost all of allocated stuff. Check that heap shrinks after GC. (-----------H) Webrev: http://cr.openjdk.java.net/~fzhinkin/azakharov/8041946/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8041946 I have tested it along all major platforms and it works fine. There is lag on Solaris MXBeans about memory usage, so I need sleep by 1s. It will be very nicely if somebody advice me about method which "flush" memory usage info to remove this sleep. Thanks. 
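[Editor's note: a sketch of the kind of "flush" Andrey asks about above. As far as I know there is no JMX call that forces the MemoryMXBean to refresh, so the usual workaround is to poll the committed size with a bounded timeout instead of sleeping a fixed second. The class name and the threshold/timeout parameters below are illustrative assumptions, not part of the reviewed test:]

import java.lang.management.ManagementFactory;

class CommittedMemoryPoller {
    // Poll the heap's committed size until it drops to the expected
    // value or the timeout expires, instead of sleeping a fixed 1s.
    static long waitForCommittedBelow(long expectedBytes, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        long committed;
        do {
            committed = ManagementFactory.getMemoryMXBean()
                                         .getHeapMemoryUsage()
                                         .getCommitted();
            if (committed <= expectedBytes) {
                break; // the shrink became visible through JMX
            }
            Thread.sleep(50); // short poll interval
        } while (System.currentTimeMillis() < deadline);
        return committed; // caller asserts on the last observed value
    }
}

[On platforms where the bean updates promptly this returns almost immediately; the one-second worst case then only applies when the Solaris lag is real.]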
From igor.ignatyev at oracle.com Tue Jun 10 14:48:52 2014 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 10 Jun 2014 18:48:52 +0400 Subject: RFR(XS) : 8044575 : testlibrary_tests/whitebox/vm_flags/UintxTest.java failed: assert(!res || TypeEntriesAtCall::arguments_profiling_enabled()) failed: no profiling of arguments In-Reply-To: <538CF2C3.8090107@oracle.com> References: <538CE366.90806@oracle.com> <538CF2C3.8090107@oracle.com> Message-ID: <53971AD4.2000001@oracle.com> Hi GC-people, Could some of you look at the change? Igor On 06/03/2014 01:55 AM, Vladimir Kozlov wrote: > Hi Igor, > > Looks good to me but I would ask GC group to comment on this change. > > Thanks, > Vladimir > > On 6/2/14 1:49 PM, Igor Ignatyev wrote: >> webrev: http://cr.openjdk.java.net/~iignatyev/8044575/webrev.00/ >> 4 lines changed: 0 ins; 2 del; 2 mod; >> >> Hi all, >> >> Please review patch: >> >> Problem: >> the test changes 'TypeProfileLevel' via WhiteBox during execution, but >> 'TypeProfileLevel' isn't supposed to be changed and there's the asserts >> based on that. the test w/ '-Xcomp and -XX:-TieredCompilation' triggers >> one of these asserts. >> >> Fix: >> - as a flag to change, the test uses 'VerifyGCStartAt' instead of >> 'TypeProfileLevel'. 'VerifyGCStartAt' is safe to change during execution >> - removed 'System.out.println' which was left by accident >> >> jbs: https://bugs.openjdk.java.net/browse/JDK-8044575 >> testing: failing tests locally w/ different flags combinations From dmitry.fazunenko at oracle.com Tue Jun 10 15:08:37 2014 From: dmitry.fazunenko at oracle.com (Dmitry Fazunenko) Date: Tue, 10 Jun 2014 19:08:37 +0400 Subject: RFR(XS) : 8044575 : testlibrary_tests/whitebox/vm_flags/UintxTest.java failed: assert(!res || TypeEntriesAtCall::arguments_profiling_enabled()) failed: no profiling of arguments In-Reply-To: <53971AD4.2000001@oracle.com> References: <538CE366.90806@oracle.com> <538CF2C3.8090107@oracle.com> <53971AD4.2000001@oracle.com> Message-ID: <53971F75.9050403@oracle.com> Looks good to me. On 10.06.2014 18:48, Igor Ignatyev wrote: > Hi GC-people, > > Could some of you look at the change? > > Igor > > On 06/03/2014 01:55 AM, Vladimir Kozlov wrote: >> Hi Igor, >> >> Looks good to me but I would ask GC group to comment on this change. >> >> Thanks, >> Vladimir >> >> On 6/2/14 1:49 PM, Igor Ignatyev wrote: >>> webrev: http://cr.openjdk.java.net/~iignatyev/8044575/webrev.00/ >>> 4 lines changed: 0 ins; 2 del; 2 mod; >>> >>> Hi all, >>> >>> Please review patch: >>> >>> Problem: >>> the test changes 'TypeProfileLevel' via WhiteBox during execution, but >>> 'TypeProfileLevel' isn't supposed to be changed and there's the asserts >>> based on that. the test w/ '-Xcomp and -XX:-TieredCompilation' triggers >>> one of these asserts. >>> >>> Fix: >>> - as a flag to change, the test uses 'VerifyGCStartAt' instead of >>> 'TypeProfileLevel'. 
'VerifyGCStartAt' is safe to change during >>> execution >>> - removed 'System.out.println' which was left by accident >>> >>> jbs: https://bugs.openjdk.java.net/browse/JDK-8044575 >>> testing: failing tests locally w/ different flags combinations From jon.masamitsu at oracle.com Tue Jun 10 16:09:49 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 10 Jun 2014 09:09:49 -0700 Subject: RFR(XS) : 8044575 : testlibrary_tests/whitebox/vm_flags/UintxTest.java failed: assert(!res || TypeEntriesAtCall::arguments_profiling_enabled()) failed: no profiling of arguments In-Reply-To: <53971AD4.2000001@oracle.com> References: <538CE366.90806@oracle.com> <538CF2C3.8090107@oracle.com> <53971AD4.2000001@oracle.com> Message-ID: <53972DCD.5030105@oracle.com> Igor, Does it matter that VerifyGCStartAt is a diagnostic flag? diagnostic(uintx, VerifyGCStartAt, 0, \ "GC invoke count where +VerifyBefore/AfterGC kicks in") \ Otherwise, looks good. Jon On 06/10/2014 07:48 AM, Igor Ignatyev wrote: > Hi GC-people, > > Could some of you look at the change? > > Igor > > On 06/03/2014 01:55 AM, Vladimir Kozlov wrote: >> Hi Igor, >> >> Looks good to me but I would ask GC group to comment on this change. >> >> Thanks, >> Vladimir >> >> On 6/2/14 1:49 PM, Igor Ignatyev wrote: >>> webrev: http://cr.openjdk.java.net/~iignatyev/8044575/webrev.00/ >>> 4 lines changed: 0 ins; 2 del; 2 mod; >>> >>> Hi all, >>> >>> Please review patch: >>> >>> Problem: >>> the test changes 'TypeProfileLevel' via WhiteBox during execution, but >>> 'TypeProfileLevel' isn't supposed to be changed and there's the asserts >>> based on that. the test w/ '-Xcomp and -XX:-TieredCompilation' triggers >>> one of these asserts. >>> >>> Fix: >>> - as a flag to change, the test uses 'VerifyGCStartAt' instead of >>> 'TypeProfileLevel'. 'VerifyGCStartAt' is safe to change during >>> execution >>> - removed 'System.out.println' which was left by accident >>> >>> jbs: https://bugs.openjdk.java.net/browse/JDK-8044575 >>> testing: failing tests locally w/ different flags combinations From igor.ignatyev at oracle.com Tue Jun 10 16:26:25 2014 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 10 Jun 2014 20:26:25 +0400 Subject: RFR(XS) : 8044575 : testlibrary_tests/whitebox/vm_flags/UintxTest.java failed: assert(!res || TypeEntriesAtCall::arguments_profiling_enabled()) failed: no profiling of arguments In-Reply-To: <53972DCD.5030105@oracle.com> References: <538CE366.90806@oracle.com> <538CF2C3.8090107@oracle.com> <53971AD4.2000001@oracle.com> <53972DCD.5030105@oracle.com> Message-ID: <539731B1.3000702@oracle.com> Jon, > Does it matter that VerifyGCStartAt is a diagnostic flag? no it doesn't. Jon/Vladimir/Dima, thanks for review. Igor On 06/10/2014 08:09 PM, Jon Masamitsu wrote: > Igor, > > Does it matter that VerifyGCStartAt is a diagnostic flag? > > diagnostic(uintx, VerifyGCStartAt, > 0, \ > "GC invoke count where +VerifyBefore/AfterGC kicks in") \ > > Otherwise, looks good. > > Jon > > On 06/10/2014 07:48 AM, Igor Ignatyev wrote: >> Hi GC-people, >> >> Could some of you look at the change? >> >> Igor >> >> On 06/03/2014 01:55 AM, Vladimir Kozlov wrote: >>> Hi Igor, >>> >>> Looks good to me but I would ask GC group to comment on this change. 
>>> >>> Thanks, >>> Vladimir >>> >>> On 6/2/14 1:49 PM, Igor Ignatyev wrote: >>>> webrev: http://cr.openjdk.java.net/~iignatyev/8044575/webrev.00/ >>>> 4 lines changed: 0 ins; 2 del; 2 mod; >>>> >>>> Hi all, >>>> >>>> Please review patch: >>>> >>>> Problem: >>>> the test changes 'TypeProfileLevel' via WhiteBox during execution, but >>>> 'TypeProfileLevel' isn't supposed to be changed and there's the asserts >>>> based on that. the test w/ '-Xcomp and -XX:-TieredCompilation' triggers >>>> one of these asserts. >>>> >>>> Fix: >>>> - as a flag to change, the test uses 'VerifyGCStartAt' instead of >>>> 'TypeProfileLevel'. 'VerifyGCStartAt' is safe to change during >>>> execution >>>> - removed 'System.out.println' which was left by accident >>>> >>>> jbs: https://bugs.openjdk.java.net/browse/JDK-8044575 >>>> testing: failing tests locally w/ different flags combinations > From bengt.rutisson at oracle.com Wed Jun 11 10:33:01 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 11 Jun 2014 12:33:01 +0200 Subject: RFR (S): JDK-8046518: G1: Double calls to register_concurrent_cycle_end() Message-ID: <5398305D.7070800@oracle.com> Hi all, Can I have a review for this change? http://cr.openjdk.java.net/~brutisso/8046518/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8046518 Background: When we abort a concurrent cycle due to a Full GC in G1 we call ConcurrentMark::abort(). That will set _has_aborted flag and then call register_concurrent_cycle_end(). The concurrent marking thread will see the _has_aborted flag in its ConcurrentMarkThread::run() method, abort the execution and then call register_concurrent_cycle_end(). Currently this works since the code inside register_concurrent_cycle_end() is guarded by _concurrent_cycle_started which it then resets. So, the double calls will not necessarily result in too much extra work being done. But one of the things that register_concurrent_cycle_end() does is to call report_gc_end() on the concurrent GC tracer. That prevents further use of it for this GC. This means that inside the ConcurrentMarkThread::run() method we can not rely on the tracer. Removing the call to register_concurrent_cycle_end() in ConcurrentMark::abort() and relying on the call in ConcurrentMarkThread::run() seems to be a reasonable approach. Thanks, Bengt -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.karlsson at oracle.com Wed Jun 11 11:10:43 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 11 Jun 2014 13:10:43 +0200 Subject: RFR (S): JDK-8046518: G1: Double calls to register_concurrent_cycle_end() In-Reply-To: <5398305D.7070800@oracle.com> References: <5398305D.7070800@oracle.com> Message-ID: <53983933.2020204@oracle.com> On 2014-06-11 12:33, Bengt Rutisson wrote: > > Hi all, > > Can I have a review for this change? > > http://cr.openjdk.java.net/~brutisso/8046518/webrev.00/ > > https://bugs.openjdk.java.net/browse/JDK-8046518 > > Background: > When we abort a concurrent cycle due to a Full GC in G1 we call > ConcurrentMark::abort(). That will set _has_aborted flag and then call > register_concurrent_cycle_end(). > > The concurrent marking thread will see the _has_aborted flag in its > ConcurrentMarkThread::run() method, abort the execution and then call > register_concurrent_cycle_end(). > > Currently this works since the code inside > register_concurrent_cycle_end() is guarded by > _concurrent_cycle_started which it then resets. 
> So, the double calls will not necessarily result in too much extra work
> being done. But one of the things that register_concurrent_cycle_end()
> does is to call report_gc_end() on the concurrent GC tracer. That
> prevents further use of it for this GC. This means that inside the
> ConcurrentMarkThread::run() method we can not rely on the tracer.
>
> Removing the call to register_concurrent_cycle_end() in
> ConcurrentMark::abort() and relying on the call in
> ConcurrentMarkThread::run() seems to be a reasonable approach.

The double call was deliberately put there to make sure that we end the tracing of the concurrent GC before starting to trace the Full GC. Why do you need to change this? I guess it has to do with your other GCId changes?

thanks,
StefanK

> Thanks,
> Bengt

From jesper.wilhelmsson at oracle.com Wed Jun 11 13:19:49 2014
From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson)
Date: Wed, 11 Jun 2014 15:19:49 +0200
Subject: RFR: 8041946 - CMM Testing: 8u40 an allocated humongous object at the end of the heap should not prevents shrinking the heap
In-Reply-To: <5395C541.1000207@oracle.com>
References: <5395C541.1000207@oracle.com>
Message-ID: <53985775.2090609@oracle.com>

Hi Andrey,

As it is used, the constant MINIMAL_HEAP_SIZE does not define the minimal heap size but the minimal young gen size. Would you consider calling it MINIMAL_YOUNG_SIZE instead?

Besides that it looks ok.
/Jesper

Andrey Zakharov skrev 9/6/14 16:31:
> Hi, everyone!
> Please, review this test for new feature in G1 - sorted free list which make
> possible shrinking of the defragmented heap.
> To proper testing of free list sorting we need to defragment memory with small
> young and humongous objects.
> This is test scenario:
> - make enough space for new objects to prevent it going old.
> - allocate bunch of small objects, and a bit of humongous several times
> (ssssHssssHssssHssssHssssHssssHssssHssssH)
> - free almost all of allocated stuff. Check that heap shrinks after GC.
> (-----------H)
>
> Webrev: http://cr.openjdk.java.net/~fzhinkin/azakharov/8041946/webrev.00/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8041946
>
> I have tested it along all major platforms and it works fine. There is lag on
> Solaris MXBeans about memory usage, so I need sleep by 1s.
> It will be very nicely if somebody advice me about method which "flush" memory
> usage info to remove this sleep.
> Thanks.

From bengt.rutisson at oracle.com Wed Jun 11 14:22:37 2014
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Wed, 11 Jun 2014 16:22:37 +0200
Subject: RFR (S): JDK-8046518: G1: Double calls to register_concurrent_cycle_end()
In-Reply-To: <53983933.2020204@oracle.com>
References: <5398305D.7070800@oracle.com> <53983933.2020204@oracle.com>
Message-ID: <5398662D.5000600@oracle.com>

Hi Stefan,

Thanks for looking at this!

On 6/11/14 1:10 PM, Stefan Karlsson wrote:
>
> On 2014-06-11 12:33, Bengt Rutisson wrote:
>>
>> Hi all,
>>
>> Can I have a review for this change?
>>
>> http://cr.openjdk.java.net/~brutisso/8046518/webrev.00/
>>
>> https://bugs.openjdk.java.net/browse/JDK-8046518
>>
>> Background:
>> When we abort a concurrent cycle due to a Full GC in G1 we call
>> ConcurrentMark::abort(). That will set _has_aborted flag and then
>> call register_concurrent_cycle_end().
>>
>> The concurrent marking thread will see the _has_aborted flag in its
>> ConcurrentMarkThread::run() method, abort the execution and then
>> call register_concurrent_cycle_end().
>>
>> Currently this works since the code inside
>> register_concurrent_cycle_end() is guarded by
>> _concurrent_cycle_started which it then resets. So, the double calls
>> will not necessarily result in too much extra work being done. But
>> one of the things that register_concurrent_cycle_end() does is to
>> call report_gc_end() on the concurrent GC tracer. That prevents
>> further use of it for this GC. This means that inside the
>> ConcurrentMarkThread::run() method we can not rely on the tracer.
>>
>> Removing the call to register_concurrent_cycle_end() in
>> ConcurrentMark::abort() and relying on the call in
>> ConcurrentMarkThread::run() seems to be a reasonable approach.
>
> The double call was deliberately put there to make sure that we end
> the tracing of the concurrent GC before starting to trace the Full GC.

I figured there was a reason. I just couldn't remember. We would get overlapping GC events without this extra call. Thanks for pointing that out!

> Why do you need to change this? I guess it has to do with your other
> GCId changes?

Right. It is for the GCId change. The problem is that calling register_concurrent_cycle_end() will reset the GCId to be -1. When we get to the logging, which is done in ConcurrentMarkThread::run(), I want to add the GCId to this log entry:

if (cm()->has_aborted()) {
  if (G1Log::fine()) {
    gclog_or_tty->gclog_stamp(g1h->gc_tracer_cm()->gc_id());
    gclog_or_tty->print_cr("[GC concurrent-mark-abort]");
  }
}

But with the current code the GCId is always -1 here.

I guess one workaround I can do is to in abort() store the last aborted GC id and use that for logging. It just seems a bit fragile that we reset the concurrent gc tracer while we still have the concurrent mark running.

Bengt

> thanks,
> StefanK
>
>> Thanks,
>> Bengt

From andrey.x.zakharov at oracle.com Wed Jun 11 15:39:16 2014
From: andrey.x.zakharov at oracle.com (Andrey Zakharov)
Date: Wed, 11 Jun 2014 19:39:16 +0400
Subject: RFR: 8041946 - CMM Testing: 8u40 an allocated humongous object at the end of the heap should not prevents shrinking the heap
In-Reply-To: <53985775.2090609@oracle.com>
References: <5395C541.1000207@oracle.com> <53985775.2090609@oracle.com>
Message-ID: <53987824.8010305@oracle.com>

Hi, Jesper. Thanks for pointing that out.
Here is the updated webrev:

http://cr.openjdk.java.net/~fzhinkin/azakharov/8041946/webrev.01/

Changed the const name from MINIMAL_HEAP_SIZE to MINIMAL_YOUNG_SIZE.
Tested locally only, as these are very minor changes.

Thanks.

On 11.06.2014 17:19, Jesper Wilhelmsson wrote:
> Hi Andrey,
>
> As it is used, the constant MINIMAL_HEAP_SIZE does not define the
> minimal heap size but the minimal young gen size. Would you consider
> calling it MINIMAL_YOUNG_SIZE instead?
>
> Besides that it looks ok.
> /Jesper
>
> Andrey Zakharov skrev 9/6/14 16:31:
>> Hi, everyone!
>> Please, review this test for new feature in G1 - sorted free list
>> which make
>> possible shrinking of the defragmented heap.
>> To proper testing of free list sorting we need to defragment memory
>> with small
>> young and humongous objects.
>> This is test scenario:
>> - make enough space for new objects to prevent it going old.
>> - allocate bunch of small objects, and a bit of humongous several >> times >> (ssssHssssHssssHssssHssssHssssHssssHssssH) >> - free almost all of allocated stuff. Check that heap shrinks after >> GC. >> (-----------H) >> >> Webrev: >> http://cr.openjdk.java.net/~fzhinkin/azakharov/8041946/webrev.00/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8041946 >> >> I have tested it along all major platforms and it works fine. There >> is lag on >> Solaris MXBeans about memory usage, so I need sleep by 1s. >> It will be very nicely if somebody advice me about method which >> "flush" memory >> usage info to remove this sleep. >> Thanks. >> >> >> >> >> From jesper.wilhelmsson at oracle.com Wed Jun 11 18:49:51 2014 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 11 Jun 2014 20:49:51 +0200 Subject: RFR: 8041946 - CMM Testing: 8u40 an allocated humongous object at the end of the heap should not prevents shrinking the heap In-Reply-To: <53987824.8010305@oracle.com> References: <5395C541.1000207@oracle.com> <53985775.2090609@oracle.com> <53987824.8010305@oracle.com> Message-ID: <5398A4CF.9040408@oracle.com> Looks good! /Jesper Andrey Zakharov skrev 11/6/14 17:39: > Hi, Jesper. Thanks for point. > Here is updated webrev: > > http://cr.openjdk.java.net/~fzhinkin/azakharov/8041946/webrev.01/ > > Changed const name from MINIMAL_HEAP_SIZE to MINIMAL_YOUNG_SIZE > Tested locally as very minor changes > > Thanks. > > > > On 11.06.2014 17:19, Jesper Wilhelmsson wrote: >> Hi Andrey, >> >> As it is used, the constant MINIMAL_HEAP_SIZE does not define the minimal heap >> size but the minimal young gen size. Would you consider calling it >> MINIMAL_YOUNG_SIZE instead? >> >> Besides that it looks ok. >> /Jesper >> >> Andrey Zakharov skrev 9/6/14 16:31: >>> Hi, everyone! >>> Please, review this test for new feature in G1 - sorted free list which make >>> possible shrinking of the defragmented heap. >>> To proper testing of free list sorting we need to defragment memory with small >>> young and humongous objects. >>> This is test scenario: >>> - make enough space for new objects to prevent it going old. >>> - allocate bunch of small objects, and a bit of humongous several times >>> (ssssHssssHssssHssssHssssHssssHssssHssssH) >>> - free almost all of allocated stuff. Check that heap shrinks after GC. >>> (-----------H) >>> >>> Webrev: http://cr.openjdk.java.net/~fzhinkin/azakharov/8041946/webrev.00/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8041946 >>> >>> I have tested it along all major platforms and it works fine. There is lag on >>> Solaris MXBeans about memory usage, so I need sleep by 1s. >>> It will be very nicely if somebody advice me about method which "flush" memory >>> usage info to remove this sleep. >>> Thanks. >>> >>> >>> >>> >>> > From bengt.rutisson at oracle.com Thu Jun 12 08:35:37 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 12 Jun 2014 10:35:37 +0200 Subject: RFR (S): JDK-8046518: G1: Double calls to register_concurrent_cycle_end() In-Reply-To: <5398662D.5000600@oracle.com> References: <5398305D.7070800@oracle.com> <53983933.2020204@oracle.com> <5398662D.5000600@oracle.com> Message-ID: <53996659.9000703@oracle.com> Hi all, I'm withdrawing this review request. I closed the bug as will not fix. Bengt On 2014-06-11 16:22, Bengt Rutisson wrote: > > Hi Stefan, > > Thanks for looking at this! > > On 6/11/14 1:10 PM, Stefan Karlsson wrote: >> >> On 2014-06-11 12:33, Bengt Rutisson wrote: >>> >>> Hi all, >>> >>> Can I have a review for this change? 
>>> >>> http://cr.openjdk.java.net/~brutisso/8046518/webrev.00/ >>> >>> https://bugs.openjdk.java.net/browse/JDK-8046518 >>> >>> Background: >>> When we abort a concurrent cycle due to a Full GC in G1 we call >>> ConcurrentMark::abort(). That will set _has_aborted flag and then >>> call register_concurrent_cycle_end(). >>> >>> The concurrent marking thread will see the _has_aborted flag in its >>> ConcurrentMarkThread::run() method, abort the execution and then >>> call register_concurrent_cycle_end(). >>> >>> Currently this works since the code inside >>> register_concurrent_cycle_end() is guarded by >>> _concurrent_cycle_started which it then resets. So, the double calls >>> will not necessarily result in too much extra work being done. But >>> one of the things that register_concurrent_cycle_end() does is to >>> call report_gc_end() on the concurrent GC tracer. That prevents >>> further use of it for this GC. This means that inside the >>> ConcurrentMarkThread::run() method we can not rely on the tracer. >>> >>> Removing the call to register_concurrent_cycle_end() in >>> ConcurrentMark::abort() and relying on the call in >>> ConcurrentMarkThread::run() seems to be a reasonable approach. >> >> The double call was deliberately put there to make sure that we end >> the tracing of the concurrent GC before starting to trace teh Full GC. > > I figured there was a reason. I just couldn't remember. We would get > overlapping GC events without this extra call. Thanks for pointing > that out! > >> Why do you need to change this? I guess it has to do with your other >> GCId changes? > > Right. It is for the GCId change. The problem is that calling > register_concurrent_cycle_end() will reset the GCId to be -1. When we > get to the logging, which is done in ConcurrentMarkThread::run(), I > want to add the GCId to this log entry: > > if (cm()->has_aborted()) { > if (G1Log::fine()) { > gclog_or_tty->gclog_stamp(g1h->gc_tracer_cm()->gc_id()); > gclog_or_tty->print_cr("[GC concurrent-mark-abort]"); > } > } > > But with the current code the GCId is always -1 here. > > I guess one workaround I can do is to in abort() store the last > aborted GC id and use that for logging. It just seems a bit fragile > that we reset the concurrent gc tracer while we still have the > concurrent mark running. > > Bengt > > >> >> thanks, >> StefanK >> >>> >>> Thanks, >>> Bengt >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.karlsson at oracle.com Thu Jun 12 08:47:35 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 12 Jun 2014 10:47:35 +0200 Subject: RFR: 8046670: Make CMS metadata aware closures applicable for other collectors Message-ID: <53996927.1000102@oracle.com> Hi all, Please, review this patch to make the metadata-tracing oop closures used by CMS available to other collectors. This patch is needed by the G1 Class Unloading work. http://cr.openjdk.java.net/~stefank/8046670/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8046670 thanks, StefanK From per.liden at oracle.com Thu Jun 12 10:09:46 2014 From: per.liden at oracle.com (Per Liden) Date: Thu, 12 Jun 2014 12:09:46 +0200 Subject: RFR(s): 8044796: G1: Enabled G1CollectedHeap::stop() Message-ID: <53997C6A.2010209@oracle.com> Hi, Here's another (hopefully last) attempt at fixing issue with stopping G1's concurrent threads at VM shutdown. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8044796
Webrev: http://cr.openjdk.java.net/~pliden/8044796/webrev.0/

The previous attempt tried to abort any ongoing concurrent mark to speed up the shutdown phase. This turned out to be a bad idea as it opened up another race, which could result in threads getting stuck again. So, this time I just wait for concurrent mark to complete before terminating. We've talked internally here about some alternatives to force an abort, but it seems all alternatives complicate the code way too much and introduce new states which are hard to verify and it just isn't worth it.

What worries me a bit is that the problems potentially introduced by a change like this are very hard to detect as they tend to be race conditions and show up only now and then. The previous fix had gone through a fair bit of testing without showing any problems. This new fix has gone through 5 iterations of GC nightlies (Aurora adhoc submissions), 3 iterations of gc-test-suite and passed all JTReg G1 tests.

About the fix. Since I no longer try to abort concurrent work the stop() function became just a call to stop_conc_gc_threads(). Since stop_conc_gc_threads() isn't used anywhere else I simply moved its contents to stop() and removed stop_conc_gc_threads().

Thanks!
/Per

From stefan.karlsson at oracle.com Thu Jun 12 10:40:42 2014
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 12 Jun 2014 12:40:42 +0200
Subject: RFR: 8046670: Make CMS metadata aware closures applicable for other collectors
In-Reply-To: <53996927.1000102@oracle.com>
References: <53996927.1000102@oracle.com>
Message-ID: <539983AA.50604@oracle.com>

On 2014-06-12 10:47, Stefan Karlsson wrote:
> Hi all,
>
> Please, review this patch to make the metadata-tracing oop closures
> used by CMS available to other collectors. This patch is needed by the
> G1 Class Unloading work.
>
> http://cr.openjdk.java.net/~stefank/8046670/webrev.00/

New patch:
http://cr.openjdk.java.net/~stefank/8046670/webrev.01/

The old patch didn't include the new iterator.inline.hpp file. I've added the file and made sure that we include it where needed. I've verified that this builds without precompiled header.

I've also verified that we unload classes when running Kitchensink with CMS.

thanks,
StefanK

> https://bugs.openjdk.java.net/browse/JDK-8046670
>
> thanks,
> StefanK

From bengt.rutisson at oracle.com Thu Jun 12 11:24:57 2014
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Thu, 12 Jun 2014 13:24:57 +0200
Subject: RFR(s): 8044796: G1: Enabled G1CollectedHeap::stop()
In-Reply-To: <53997C6A.2010209@oracle.com>
References: <53997C6A.2010209@oracle.com>
Message-ID: <53998E09.6020107@oracle.com>

Hi Per,

Thanks for doing such thorough testing!

As far as I can tell this looks good.

Bengt

On 2014-06-12 12:09, Per Liden wrote:
> Hi,
>
> Here's another (hopefully last) attempt at fixing issue with stopping
> G1's concurrent threads at VM shutdown.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8044796
> Webrev: http://cr.openjdk.java.net/~pliden/8044796/webrev.0/
>
> The previous attempt tried to abort any ongoing concurrent mark to
> speed up the shutdown phase. This turned out to be a bad idea as it
> opened up another race, which could result in threads getting stuck
> again. So, this time I just wait for concurrent mark to complete
> before terminating.
We've talked internally here about some > alternatives to force an abort, but it seems all alternatives > complicates the code way too much and introduces new states which is > hard to verify and it just isn't worth it. > > What worries me a bit is that the problems potentially introduced by a > change like this are very hard to detect as they tend to be race > conditions and show up only now and then. The previous fix had gone > through a fair bit of testing without showing any problems. This new > fix has gone thought 5 iterations of GC nightlies (Aurora adhoc > submissions), 3 iterations of gc-test-suite and passed all JTReg G1 > tests. > > About the fix. Since I no longer try to abort concurrent work the > stop() function became just a call to stop_conc_gc_threads(). Since > stop_conc_gc_threads() isn't used anywhere else I simply moved its > contents to stop() and removed stop_conc_gc_threads(). > > Thanks! > /Per From per.liden at oracle.com Thu Jun 12 11:34:30 2014 From: per.liden at oracle.com (Per Liden) Date: Thu, 12 Jun 2014 13:34:30 +0200 Subject: RFR(s): 8044796: G1: Enabled G1CollectedHeap::stop() In-Reply-To: <53998E09.6020107@oracle.com> References: <53997C6A.2010209@oracle.com> <53998E09.6020107@oracle.com> Message-ID: <53999046.6060809@oracle.com> Thanks for reviewing Bengt! /Per On 06/12/2014 01:24 PM, Bengt Rutisson wrote: > > Hi Per, > > Thanks for doing such thorough testing! > > As far as I can tell this looks good. > > Bengt > > > On 2014-06-12 12:09, Per Liden wrote: >> Hi, >> >> Here's another (hopefully last) attempt at fixing issue with stopping >> G1's concurrent threads at VM shutdown. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8044796 >> Webrev: http://cr.openjdk.java.net/~pliden/8044796/webrev.0/ >> >> The previous attempt tried to abort any ongoing concurrent mark to >> speed up the shutdown phase. This turned out to be a bad idea as it >> opened up another race, which could result in threads getting stuck >> again. So, this time I just wait for concurrent mark to complete >> before terminating. We've talked internally here about some >> alternatives to force an abort, but it seems all alternatives >> complicates the code way too much and introduces new states which is >> hard to verify and it just isn't worth it. >> >> What worries me a bit is that the problems potentially introduced by a >> change like this are very hard to detect as they tend to be race >> conditions and show up only now and then. The previous fix had gone >> through a fair bit of testing without showing any problems. This new >> fix has gone thought 5 iterations of GC nightlies (Aurora adhoc >> submissions), 3 iterations of gc-test-suite and passed all JTReg G1 >> tests. >> >> About the fix. Since I no longer try to abort concurrent work the >> stop() function became just a call to stop_conc_gc_threads(). Since >> stop_conc_gc_threads() isn't used anywhere else I simply moved its >> contents to stop() and removed stop_conc_gc_threads(). >> >> Thanks! 
>> /Per

From graham at vast.com Fri Jun 13 04:16:48 2014
From: graham at vast.com (graham sanderson)
Date: Thu, 12 Jun 2014 23:16:48 -0500
Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled
Message-ID: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com>

Hi, I hope this is the right list for this question:

I was investigating abortable preclean timeouts in our app (and associated long remark pause) so had a look at the old jdk6 code I had on my box, wondered about recording eden chunks during certain eden slow allocation paths (I wasn't sure if TLAB allocation is just a CAS bump), and saw what looked perfect in the latest code, so was excited to install 1.7.0_60-b19.

I wanted to ask what you consider the stability of these two options to be (I'm pretty sure at least the first one is new in this release).

I have just installed locally on my mac, and am aware of http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809 which I could reproduce, however I wasn't able to reproduce it without -XX:-UseCMSCompactAtFullCollection (is this your understanding too?)

We are running our application with 8 gig young generation (6.4g eden), on boxes with 32 cores... so parallelism is good for short pauses.

we already have

-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled

we have seen a few long(ish) initial marks, so

-XX:+CMSParallelInitialMarkEnabled sounds good

as for

-XX:+CMSEdenChunksRecordAlways

my question is: what constitutes a slow path such that an eden chunk is potentially recorded? TLAB allocation, or more horrific things; basically (and I'll test our app with -XX:+CMSPrintEdenSurvivorChunks) is it likely that I'll actually get fewer samples using -XX:+CMSEdenChunksRecordAlways in a highly multithreaded app than I would with sampling, or put another way, what sort of app allocation patterns if any might avoid the slow path altogether and might leave me with just one chunk?

Thanks,

Graham

P.S. less relevant I think, but our old generation is 16g
P.P.S. I suspect the abortable preclean timeouts mostly happen after a burst of very high allocation rate followed by an almost complete lull... this is one of the patterns that can happen in our application

From erik.helin at oracle.com Fri Jun 13 08:53:49 2014
From: erik.helin at oracle.com (Erik Helin)
Date: Fri, 13 Jun 2014 10:53:49 +0200
Subject: RFR: 8046670: Make CMS metadata aware closures applicable for other collectors
In-Reply-To: <539983AA.50604@oracle.com>
References: <53996927.1000102@oracle.com> <539983AA.50604@oracle.com>
Message-ID: <2138756.9fqg2igiCv@ehelin-desktop>

Hi Stefan,

looks good, reviewed.

Thanks,
Erik

On Thursday 12 June 2014 12.40.42 Stefan Karlsson wrote:
> On 2014-06-12 10:47, Stefan Karlsson wrote:
> > Hi all,
> >
> > Please, review this patch to make the metadata-tracing oop closures
> > used by CMS available to other collectors. This patch is needed by the
> > G1 Class Unloading work.
> >
> > http://cr.openjdk.java.net/~stefank/8046670/webrev.00/
>
> New patch:
> http://cr.openjdk.java.net/~stefank/8046670/webrev.01/
>
> The old patch didn't include the new iterator.inline.hpp file. I've
> added the file and made sure that we include it where needed. I've
> verified that this builds without precompiled header.
> > I've also verified that we unload classes when running Kitchensink with CMS. > > thanks, > StefanK > > > https://bugs.openjdk.java.net/browse/JDK-8046670 > > > > thanks, > > StefanK From stefan.johansson at oracle.com Fri Jun 13 10:34:50 2014 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Fri, 13 Jun 2014 12:34:50 +0200 Subject: RFR(s): 8044796: G1: Enabled G1CollectedHeap::stop() In-Reply-To: <53997C6A.2010209@oracle.com> References: <53997C6A.2010209@oracle.com> Message-ID: <539AD3CA.1070403@oracle.com> Hi Per, The change looks good. Hopefully there are no more rare corner cases to trip over and if there are I think it's good to get the change in to find them. StefanJ On 2014-06-12 12:09, Per Liden wrote: > Hi, > > Here's another (hopefully last) attempt at fixing issue with stopping > G1's concurrent threads at VM shutdown. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8044796 > Webrev: http://cr.openjdk.java.net/~pliden/8044796/webrev.0/ > > The previous attempt tried to abort any ongoing concurrent mark to > speed up the shutdown phase. This turned out to be a bad idea as it > opened up another race, which could result in threads getting stuck > again. So, this time I just wait for concurrent mark to complete > before terminating. We've talked internally here about some > alternatives to force an abort, but it seems all alternatives > complicates the code way too much and introduces new states which is > hard to verify and it just isn't worth it. > > What worries me a bit is that the problems potentially introduced by a > change like this are very hard to detect as they tend to be race > conditions and show up only now and then. The previous fix had gone > through a fair bit of testing without showing any problems. This new > fix has gone thought 5 iterations of GC nightlies (Aurora adhoc > submissions), 3 iterations of gc-test-suite and passed all JTReg G1 > tests. > > About the fix. Since I no longer try to abort concurrent work the > stop() function became just a call to stop_conc_gc_threads(). Since > stop_conc_gc_threads() isn't used anywhere else I simply moved its > contents to stop() and removed stop_conc_gc_threads(). > > Thanks! > /Per From per.liden at oracle.com Fri Jun 13 11:32:39 2014 From: per.liden at oracle.com (Per Liden) Date: Fri, 13 Jun 2014 13:32:39 +0200 Subject: RFR(s): 8044796: G1: Enabled G1CollectedHeap::stop() In-Reply-To: <539AD3CA.1070403@oracle.com> References: <53997C6A.2010209@oracle.com> <539AD3CA.1070403@oracle.com> Message-ID: <539AE157.2000809@oracle.com> Thanks Stefan! /Per On 06/13/2014 12:34 PM, Stefan Johansson wrote: > Hi Per, > > The change looks good. Hopefully there are no more rare corner cases to > trip over and if there are I think it's good to get the change in to > find them. > > StefanJ > > On 2014-06-12 12:09, Per Liden wrote: >> Hi, >> >> Here's another (hopefully last) attempt at fixing issue with stopping >> G1's concurrent threads at VM shutdown. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8044796 >> Webrev: http://cr.openjdk.java.net/~pliden/8044796/webrev.0/ >> >> The previous attempt tried to abort any ongoing concurrent mark to >> speed up the shutdown phase. This turned out to be a bad idea as it >> opened up another race, which could result in threads getting stuck >> again. So, this time I just wait for concurrent mark to complete >> before terminating. 
>> We've talked internally here about some alternatives to force an abort,
>> but it seems all alternatives complicate the code way too much and
>> introduce new states which are hard to verify and it just isn't worth it.
>>
>> What worries me a bit is that the problems potentially introduced by a
>> change like this are very hard to detect as they tend to be race
>> conditions and show up only now and then. The previous fix had gone
>> through a fair bit of testing without showing any problems. This new
>> fix has gone through 5 iterations of GC nightlies (Aurora adhoc
>> submissions), 3 iterations of gc-test-suite and passed all JTReg G1
>> tests.
>>
>> About the fix. Since I no longer try to abort concurrent work the
>> stop() function became just a call to stop_conc_gc_threads(). Since
>> stop_conc_gc_threads() isn't used anywhere else I simply moved its
>> contents to stop() and removed stop_conc_gc_threads().
>>
>> Thanks!
>> /Per

From graham at vast.com Fri Jun 13 15:48:42 2014
From: graham at vast.com (graham sanderson)
Date: Fri, 13 Jun 2014 10:48:42 -0500
Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled
Message-ID: <22EB562F-F475-4946-911B-0475E02A8837@vast.com>

Apologies, wrong mailing list - resent over in hotspot-gc-use (hopefully this email attaches itself to the right thread)

From jon.masamitsu at oracle.com Mon Jun 16 18:27:43 2014
From: jon.masamitsu at oracle.com (Jon Masamitsu)
Date: Mon, 16 Jun 2014 11:27:43 -0700
Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled
In-Reply-To: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com>
References: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com>
Message-ID: <539F371F.10008@oracle.com>

On 06/12/2014 09:16 PM, graham sanderson wrote:
> Hi, I hope this is the right list for this question:
>
> I was investigating abortable preclean timeouts in our app (and
> associated long remark pause) so had a look at the old jdk6 code I had
> on my box, wondered about recording eden chunks during certain eden
> slow allocation paths (I wasn't sure if TLAB allocation is just a CAS
> bump), and saw what looked perfect in the latest code, so was excited
> to install 1.7.0_60-b19.
>
> I wanted to ask what you consider the stability of these two options
> to be (I'm pretty sure at least the first one is new in this release).
>
> I have just installed locally on my mac, and am aware of
> http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809 which I
> could reproduce, however I wasn't able to reproduce it without
> -XX:-UseCMSCompactAtFullCollection (is this your understanding too?)

Yes.

> We are running our application with 8 gig young generation (6.4g
> eden), on boxes with 32 cores... so parallelism is good for short pauses.
>
> we already have
>
> -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled
>
> we have seen a few long(ish) initial marks, so
>
> -XX:+CMSParallelInitialMarkEnabled sounds good
>
> as for
>
> -XX:+CMSEdenChunksRecordAlways
>
> my question is: what constitutes a slow path such that an eden chunk is
> potentially recorded? TLAB allocation, or more horrific things;
> basically (and I'll test our app with -XX:+CMSPrintEdenSurvivorChunks)
> is it likely that I'll actually get fewer samples using
> -XX:+CMSEdenChunksRecordAlways in a highly multithreaded app than I
> would with sampling, or put another way,
what sort of app allocation > patterns if any might avoid the slow path altogether and might leave > me with just one chunk? Fast path allocation is done from TLAB's. If you have to get a new TLAB, the call to get the new TLAB comes from compiled code but the call is into the JVM and that is the slow path where the sampling is done. Jon > > Thanks, > > Graham > > P.S. less relevant I think, but our old generation is 16g > P.P.S. I suspect the abortable preclean timeouts mostly happen after a > burst of very high allocation rate followed by an almost complete > lull? this is one of the patterns that can happen in our application -------------- next part -------------- An HTML attachment was scrubbed... URL: From graham at vast.com Mon Jun 16 18:54:14 2014 From: graham at vast.com (graham sanderson) Date: Mon, 16 Jun 2014 13:54:14 -0500 Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled In-Reply-To: <539F371F.10008@oracle.com> References: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com> <539F371F.10008@oracle.com> Message-ID: <2E44F2B4-D697-495C-A88F-816F6B33D913@vast.com> Thanks Jon; that?s exactly what i was hoping On Jun 16, 2014, at 1:27 PM, Jon Masamitsu wrote: > > On 06/12/2014 09:16 PM, graham sanderson wrote: >> Hi, I hope this is the right list for this question: >> >> I was investigating abortable preclean timeouts in our app (and associated long remark pause) so had a look at the old jdk6 code I had on my box, wondered about recording eden chunks during certain eden slow allocation paths (I wasn?t sure if TLAB allocation is just a CAS bump), and saw what looked perfect in the latest code, so was excited to install 1.7.0_60-b19 >> >> I wanted to ask what you consider the stability of these two options to be (I?m pretty sure at least the first one is new in this release) >> >> I have just installed locally on my mac, and am aware of http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809 which I could reproduce, however I wasn?t able to reproduce it without -XX:-UseCMSCompactAtFullCollection (is this >> your understanding too?) > > Yes. > >> >> We are running our application with 8 gig young generation (6.4g eden), on boxes with 32 cores? so parallelism is good for short pauses >> >> we already have >> >> -XX:+UseParNewGC >> -XX:+UseConcMarkSweepGC >> -XX:+CMSParallelRemarkEnabled >> >> we have seen a few long(isn) initial marks, so >> >> -XX:+CMSParallelInitialMarkEnabled sounds good >> >> as for >> >> -XX:+CMSEdenChunksRecordAlways >> >> my question is: what constitutes a slow path such an eden chunk is potentially recorded? TLAB allocation, or more horrific things; basically (and I?ll test our app with -XX:+CMSPrintEdenSurvivorChunks) is it likely that I?ll actually get less samples using -XX:+CMSEdenChunksRecordAlways in a highly multithread app than I would with sampling, or put another way? what sort of app allocation patterns if any might avoid the slow path altogether and might leave me with just one chunk? > > Fast path allocation is done from TLAB's. If you have to get > a new TLAB, the call to get the new TLAB comes from compiled > code but the call is into the JVM and that is the slow path where > the sampling is done. > > Jon > >> >> Thanks, >> >> Graham >> >> P.S. less relevant I think, but our old generation is 16g >> P.P.S. I suspect the abortable preclean timeouts mostly happen after a burst of very high allocation rate followed by an almost complete lull? 
this is one of the patterns that can happen in our application > From sbergman at redhat.com Tue Jun 17 07:15:16 2014 From: sbergman at redhat.com (Stephan Bergmann) Date: Tue, 17 Jun 2014 09:15:16 +0200 Subject: History of finalizer execution and gc progress? Message-ID: <539FEB04.90704@redhat.com> Hi all, Does anybody recollect historical details of how execution of (potentially long-running) finalizers impacted overall gc progress? From the behavior of a small test program run on OpenJDK 8, it looks like recent JVMs at least offload all finalizer calls to a single dedicated thread, so that a blocking finalizer blocks finalization (and thus reclamation) of other garbage objects with explicit finalizers, but reclamation of other garbage proceeds unhindered.
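A minimal sketch of the kind of small test program described here (class names, sizes and counts are invented for illustration):

    // One object whose finalize() blocks forever, plus a steady stream of
    // finalizable and plain garbage. On a recent JVM the plain garbage is
    // still reclaimed, while finalizable objects queue up behind the
    // blocked finalizer thread.
    public class BlockingFinalizerTest {
        static class Blocking {
            @Override protected void finalize() throws Throwable {
                new java.util.concurrent.CountDownLatch(1).await(); // never returns
            }
        }
        static class WellBehaved {
            @Override protected void finalize() throws Throwable { /* quick */ }
        }
        public static void main(String[] args) throws Exception {
            new Blocking(); // becomes garbage; will block the finalizer thread
            while (true) {
                for (int i = 0; i < 100000; i++) {
                    new WellBehaved();             // piles up behind the blocked finalizer
                    byte[] plain = new byte[1024]; // plain garbage, reclaimed as usual
                }
                System.gc();
                System.out.println("free: " + Runtime.getRuntime().freeMemory());
                Thread.sleep(1000);
            }
        }
    }

But how was the behavior in the past?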
Was it so that in older JVMs (still in use around 2005) execution of a blocking finalizer could block reclamation of /all/ garbage, even of those objects that did not have explicit finalizers? > > (I'm asking because in LibreOffice we have a dedicated thread to which we offload the actual work done by certain objects' finalize methods, introduced around 2005 to work around memory starvation in case one of those finalizers took too long. But I can't remember whether that was because no garbage at all was reclaimed in such a scenario---and we could drop our additional thread again today---, or because it blocked finalization of unrelated objects with explicit finalizers---in which case we would need to keep our additional thread.) > > Stephan From per.liden at oracle.com Tue Jun 17 12:12:08 2014 From: per.liden at oracle.com (Per Liden) Date: Tue, 17 Jun 2014 14:12:08 +0200 Subject: RFR(s): 8046231: G1: Code root location ... from nmethod ... not in strong code roots for region Message-ID: <53A03098.2070207@oracle.com> Could I please have this fix reviewed. Summary: nmethods are only registered with the heap if nmethod::detect_scavenge_root_oops() returns true. However, in case the nmethod only contains oops to humongous objects detect_scavenge_root_oops() will return false and the nmethod will not be registered. This will later cause heap verification to fail. There are several ways in which this can be fixed. One alternative is to adjust the verification to ignore humongous oops (since these objects will never move). Another alternative is to just register the method regardless of what detect_scavenge_root_oops() says. Since we might want to allow humongous objects to move in the future this is the proposed fix. Bug: https://bugs.openjdk.java.net/browse/JDK-8046231 Webrev: http://cr.openjdk.java.net/~pliden/8046231/webrev.0/ Testing: * gc-test-suite * manual ad-hoc testing Thanks! /Per From sbergman at redhat.com Tue Jun 17 12:47:59 2014 From: sbergman at redhat.com (Stephan Bergmann) Date: Tue, 17 Jun 2014 14:47:59 +0200 Subject: History of finalizer execution and gc progress? In-Reply-To: <1E2D3150-AF19-488C-B904-35D025A91F0D@kodewerk.com> References: <539FEB04.90704@redhat.com> <1E2D3150-AF19-488C-B904-35D025A91F0D@kodewerk.com> Message-ID: <53A038FF.5010001@redhat.com> On 06/17/2014 11:00 AM, Kirk Pepperdine wrote: > finalization uses a helper thread to do the actual work so there should be no direct impact. However, finalization needs to complete before a collection can finally reclaim the memory used by the object. Until then the object will need to be processed as an object waiting for finalization. Sure, but that doesn't address my question, whether in ca. 2005 JVMs one "malicious" blocking finalizer invocation could have blocked /all/ garbage reclamation (incl. of objects without explicit finalizers). > If your finalize method kills the helper thread, I think you have to wait for a GC cycle to end for a new finalization sequence to be triggered and that could be disruptive. No killing of any JVM's helper threads involved. > That said, your (albeit brief) description of the problem suggests that you can manage object clean up on your own which implies that you don?t need finalization. No. Let me rephrase: Assume we continuously create and let become unreachable again three kinds of objects. 
A objects don't have explicit finalizers; B objects have explicit finalizers (that are "well behaved" and execute quickly); C objects have explicit finalizers that are "malicious" and can take arbitrarily long to execute. Now, the question is whether it would have been common behavior for a ca. 2005 JVM that a very long-running finalizer execution for a C object would have prevented timely reclamation of A objects (in addition to reclamation of B objects and other C objects). Stephan > On Jun 17, 2014, at 9:15 AM, Stephan Bergmann wrote: >> Does anybody recollect historical details of how execution of (potentially long-running) finalizers impacted overall gc progress? >> >> From the behavior of a small test program run on OpenJDK 8, it looks like recent JVMs at least offload all finalizer calls to a single dedicated thread, so that a blocking finalizer blocks finalization (and thus reclamation) of other garbage objects with explicit finalizers, but reclamation of other garbage proceeds unhindered. >> >> But how was the behavior in the past? Was it so that in older JVMs (still in use around 2005) execution of a blocking finalizer could block reclamation of /all/ garbage, even of those objects that did not have explicit finalizers? >> >> (I'm asking because in LibreOffice we have a dedicated thread to which we offload the actual work done by certain objects' finalize methods, introduced around 2005 to work around memory starvation in case one of those finalizers took too long. But I can't remember whether that was because no garbage at all was reclaimed in such a scenario---and we could drop our additional thread again today---, or because it blocked finalization of unrelated objects with explicit finalizers---in which case we would need to keep our additional thread.) From kirk at kodewerk.com Tue Jun 17 13:05:10 2014 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Tue, 17 Jun 2014 15:05:10 +0200 Subject: History of finalizer execution and gc progress? In-Reply-To: <53A038FF.5010001@redhat.com> References: <539FEB04.90704@redhat.com> <1E2D3150-AF19-488C-B904-35D025A91F0D@kodewerk.com> <53A038FF.5010001@redhat.com> Message-ID: <5A9D4815-07DE-4799-8E13-C295924B0D78@kodewerk.com> Hi Stephan, >> finalization uses a helper thread to do the actual work so there should be no direct impact. However, finalization needs to complete before a collection can finally reclaim the memory used by the object. Until then the object will need to be processed as an object waiting for finalization. > > Sure, but that doesn't address my question, whether in ca. 2005 JVMs one "malicious" blocking finalizer invocation could have blocked /all/ garbage reclamation (incl. of objects without explicit finalizers). To clarify my answer, no, not that I?ve seen in the code or have experienced. > >> If your finalize method kills the helper thread, I think you have to wait for a GC cycle to end for a new finalization sequence to be triggered and that could be disruptive. > > No killing of any JVM's helper threads involved. > >> That said, your (albeit brief) description of the problem suggests that you can manage object clean up on your own which implies that you don?t need finalization. > > No. Let me rephrase: Assume we continuously create and let become unreachable again three kinds of objects. A objects don't have explicit finalizers; B objects have explicit finalizers (that are "well behaved" and execute quickly); C objects have explicit finalizers that are "malicious" and can take arbitrarily long to execute. 
> > Now, the question is whether it would have been common behavior for a ca. 2005 JVM that a very long-running finalizer execution for a C object would have prevented timely reclamation of A objects (in addition to reclamation of B objects and other C objects). I might wrap C in a PhantomReference and deal with them on my own if: 1) I was unable to deal with them as a closable resource (IOWs, I didn't have a handle on the complete lifecycle for whatever reason), or 2) it was disruptive to having B finalized. Regards, Kirk From jon.masamitsu at oracle.com Tue Jun 17 20:29:20 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 17 Jun 2014 13:29:20 -0700 Subject: RFR(s): 8046231: G1: Code root location ... from nmethod ... not in strong code roots for region In-Reply-To: <53A03098.2070207@oracle.com> References: <53A03098.2070207@oracle.com> Message-ID: <53A0A520.8010203@oracle.com> On 6/17/2014 5:12 AM, Per Liden wrote: > Could I please have this fix reviewed. > > Summary: nmethods are only registered with the heap if > nmethod::detect_scavenge_root_oops() returns true. However, in case > the nmethod only contains oops to humongous objects > detect_scavenge_root_oops() will return false and the nmethod will not > be registered. This will later cause heap verification to fail. > > There are several ways in which this can be fixed. One alternative is > to adjust the verification to ignore humongous oops (since these > objects will never move). Another alternative is to just register the > method regardless of what detect_scavenge_root_oops() says. Since we > might want to allow humongous objects to move in the future this is > the proposed fix. Per, Do you have any measurements on how many more nmethods get registered with this approach (registering an nmethod regardless of the return from detect_scavenge_root_oops())? Jon > > Bug: https://bugs.openjdk.java.net/browse/JDK-8046231 > Webrev: http://cr.openjdk.java.net/~pliden/8046231/webrev.0/ > > Testing: > * gc-test-suite > * manual ad-hoc testing > > Thanks! > /Per > From andrey.x.zakharov at oracle.com Wed Jun 18 12:31:14 2014 From: andrey.x.zakharov at oracle.com (Andrey Zakharov) Date: Wed, 18 Jun 2014 16:31:14 +0400 Subject: RFR: 8026847 [TESTBUG] gc/g1/TestSummarizeRSetStats* tests launch 32bit jvm with UseCompressedOops Message-ID: <53A18692.20507@oracle.com> Hi, all. The "UseCompressedOops" option is being used in the gc/g1/TestSummarizeRSetStats* tests, but it isn't needed for those tests. I have also asked Thomas Schatzl about this option and he confirmed it is useless there. So here is a simple patch - just removing it. webrev: http://cr.openjdk.java.net/~fzhinkin/azakharov/8026847/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8026847 I have tested it locally for 32-bit and 64-bit JDKs and also in Aurora (batch 514909.ute.hs_jtreg.accept.full). Please, review it. Thanks. From per.liden at oracle.com Wed Jun 18 12:35:16 2014 From: per.liden at oracle.com (Per Liden) Date: Wed, 18 Jun 2014 14:35:16 +0200 Subject: RFR(s): 8046231: G1: Code root location ... from nmethod ... not in strong code roots for region In-Reply-To: <53A0A520.8010203@oracle.com> References: <53A03098.2070207@oracle.com> <53A0A520.8010203@oracle.com> Message-ID: <53A18784.20705@oracle.com> Jon, On 06/17/2014 10:29 PM, Jon Masamitsu wrote: > > On 6/17/2014 5:12 AM, Per Liden wrote: >> Could I please have this fix reviewed. >> >> Summary: nmethods are only registered with the heap if >> nmethod::detect_scavenge_root_oops() returns true.
However, in case >> the nmethod only contains oops to humongous objects >> detect_scavenge_root_oops() will return false and the nmethod will not >> be registered. This will later cause heap verification to fail. >> >> There are several ways in which this can be fixed. One alternative is >> to adjust the verification to ignore humongous oops (since these >> objects will never move). Another alternative is to just register the >> method regardless of what detect_scavenge_root_oops() says. Since we >> might want to allow humongous objects to move in the future this is >> the proposed fix. > > Per, > > Do you have any measurements on how many more nmethods get registered > with this approach (registering an nmethod regardless return from > detect_scavenge_root_oops()? I don't have any numbers, but I'm fairly confident that it's a small number. The only nmethods that weren't registered before this change were methods in classes loaded by the BootClassLoader, which only had humongous oops in them. All methods loaded by a SystemClassLoader would have been registered anyway. /Per > > Jon > >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8046231 >> Webrev: http://cr.openjdk.java.net/~pliden/8046231/webrev.0/ >> >> Testing: >> * gc-test-suite >> * manual ad-hoc testing >> >> Thanks! >> /Per >> > From jresch at cleversafe.com Wed Jun 18 19:20:36 2014 From: jresch at cleversafe.com (Jason Resch) Date: Wed, 18 Jun 2014 14:20:36 -0500 Subject: Reference Processing in G1 remark phase vs. throughput collector Message-ID: <53A1E684.1020000@cleversafe.com> Hello, We've recently been experimenting with the G1 collector for our application, and we noticed something odd with reference processing times in the G1. It is not clear to us if this is expected or indicative of a bug, but I thought I would mention it to this list to see if there is a reasonable explanation for this result. We are seeing that during the remark phase when non-strong references are processed, it takes around 20 times longer than the throughput collector spends processing the same number of references. As an example, here is some output for references processing times we observed: 2014-05-23T19:58:12.805+0000: 11446.605: [GC remark 11446.618: [GC ref-proc11446.618: [SoftReference, 0 refs, 0.0040400 secs]11446.622: [WeakReference, 11131810 refs, 8.7176900 secs]11455.340: [FinalReference, 2273593 refs, 2.0022000 secs]11457.342: [PhantomReference, 297950 refs, 0.3004680 secs]11457.643: [JNI Weak Reference, 0.0000040 secs], 13.7534950 secs], 13.8035420 secs] We see the G1 spent 8.7 seconds were spent processing 11 million weak references 2014-05-30T05:57:24.002+0000: 32724.998: [Full GC32726.138: [SoftReference, 154 refs, 0.0050380 secs]32726.143: [WeakReference, 7713339 refs, 0.3449380 secs]32726.488: [FinalReference, 1966941 refs, 0.1005860 secs]32726.588: [PhantomReference, 650797 refs, 0.0631680 secs]32726.652: [JNI Weak Reference, 0.0000060 secs] [PSYoungGen: 1012137K->0K(14784384K)] [ParOldGen: 16010001K->5894387K(16384000K)] 17022139K->5894387K(31168384K) [PSPermGen: 39256K->39256K(39552K)], 4.3463290 secs] [Times: user=98.05 sys=0.00, real=4.35 secs] While the throughput collector spent 0.34 seconds processing 7.7 million weak references In summary, the G1 collector processed weak references at a rate of 1.27 million per second, while the throughput collector processed them at 22.36 million references per second. 
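A reproducer sketch for this kind of measurement (counts and heap sizes are placeholders; run it once with -XX:+UseG1GC and once with -XX:+UseParallelGC, adding -XX:+PrintGCDetails -XX:+PrintReferenceGC and a suitably large -Xmx to get per-type reference timings like the log lines quoted above):

    import java.lang.ref.WeakReference;
    import java.util.ArrayList;
    import java.util.List;

    // Each round creates millions of WeakReferences whose referents are
    // already unreachable, so every collection has a large
    // reference-processing phase to time.
    public class WeakRefProcessingBench {
        public static void main(String[] args) {
            int count = 10000000;
            for (int round = 0; round < 20; round++) {
                // keep the Reference objects themselves alive for this round
                List<WeakReference<Object>> refs =
                        new ArrayList<WeakReference<Object>>(count);
                for (int i = 0; i < count; i++) {
                    // referent dies immediately, so the ref gets discovered
                    refs.add(new WeakReference<Object>(new Object()));
                }
                System.gc(); // discovers and clears ~count weak refs this cycle
                System.out.println("round " + round + ": " + refs.size() + " refs");
            }
        }
    }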
Is there a fundamental design reason that explains why the G1 collector should be so much slower in this regard, or might there be ways to improve upon it? Jason -------------- next part -------------- An HTML attachment was scrubbed... URL: From Peter.B.Kessler at Oracle.COM Wed Jun 18 22:35:47 2014 From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler) Date: Wed, 18 Jun 2014 15:35:47 -0700 Subject: History of finalizer execution and gc progress? In-Reply-To: <539FEB04.90704@redhat.com> References: <539FEB04.90704@redhat.com> Message-ID: <53A21443.5060607@Oracle.COM> As far back as I can remember (I never worked on the implementation of the "classic" JVM), the JVM discovers objects that should have their finalize() method called and puts them on a queue to be handled by Java library code. That means the JVM can continue collecting garbage even if that queue isn't being drained. Also as far back as I can remember, the Java library code calls the finalize() method from a thread dedicated to that. Cf. the static block at the end of [1] and [2], to cite only openly-available sources. As to "still in use around 2005", it would help to have a JDK version number. One of the things I used to do to dissuade people from using finalize() methods was to create and drop an object with a finalize() method that blocked, because that *would* block any subsequent calls from the Java library code to finalize() methods. Some people worked around that (or in real life :-) by calling Runtime.runFinalization() (usually by calling System.runFinalization()), which spins up an additional thread to drain the queue of discovered objects with non-trivial finalize() methods. Runtime.runFinalization() has been around since the beginning[3], though of course the specification is a little vague. You might be able to disable the collection of unreachable objects by misusing calls to the JNI functions GetPrimitiveArrayCritical and friends.[4] As long as you have all the infrastructure of your own thread, etc., I would recommend that you switch your uses of finalize() to WeakReferences (or PhantomReferences) and your own threads to drain your own queues. Then you would own all your code and wouldn't have to worry about interactions with other object types. ... peter -------- [1]http://hg.openjdk.java.net/jdk6/jdk6/jdk/file/a68f89bda2cf/src/share/classes/java/lang/ref/Finalizer.java [2]http://hg.openjdk.java.net/jdk9/jdk9/jdk/file/27561aede285/src/share/classes/java/lang/ref/Finalizer.java [3]http://titanium.cs.berkeley.edu/doc/java-langspec-1.0/javalang.doc15.html#6892 [4] http://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/functions.html#GetPrimitiveArrayCritical On 06/17/14 00:15, Stephan Bergmann wrote: > Hi all, > > Does anybody recollect historical details of how execution of (potentially long-running) finalizers impacted overall gc progress? > > From the behavior of a small test program run on OpenJDK 8, it looks like recent JVMs at least offload all finalizer calls to a single dedicated thread, so that a blocking finalizer blocks finalization (and thus reclamation) of other garbage objects with explicit finalizers, but reclamation of other garbage proceeds unhindered. > > But how was the behavior in the past? Was it so that in older JVMs (still in use around 2005) execution of a blocking finalizer could block reclamation of /all/ garbage, even of those objects that did not have explicit finalizers? 
> > (I'm asking because in LibreOffice we have a dedicated thread to which we offload the actual work done by certain objects' finalize methods, introduced around 2005 to work around memory starvation in case one of those finalizers took too long. But I can't remember whether that was because no garbage at all was reclaimed in such a scenario---and we could drop our additional thread again today---, or because it blocked finalization of unrelated objects with explicit finalizers---in which case we would need to keep our additional thread.) > > Stephan From graham at vast.com Thu Jun 19 03:19:48 2014 From: graham at vast.com (graham sanderson) Date: Wed, 18 Jun 2014 22:19:48 -0500 Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled In-Reply-To: <2E44F2B4-D697-495C-A88F-816F6B33D913@vast.com> References: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com> <539F371F.10008@oracle.com> <2E44F2B4-D697-495C-A88F-816F6B33D913@vast.com> Message-ID: The options are working great and as expected (faster initial mark, and no long pauses after abortable preclean timeout). One weird thing though which I?m curious about: I?m showing some data for six JVMs (calling them nodes - they are on separate machines) all with : Linux version 2.6.32-431.3.1.el6.x86_64 (mockbuild at c6b10.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Jan 3 21:39:27 UTC 2014 JDK 1.7.0_60-b19 16 gig old gen 8 gig new (6.4 gig eden) -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled 256 gig RAM 16 cores (sandy bridge) Nodes 4-6 also have -XX:+CMSEdenChunksRecordAlways -XX:+CMSParallelInitialMarkEnabled There are some application level config differences (which limit amount of certain objects kept in memory before flushing to disk) - 1&4 have the same app config, 2&5 have the same app config, 3&6 have the same app config This first dump, shows two days worth of total times application threads were stopped via grepping logs for Total time for which application threads were stopped and summing the values. worst case 4 minutes over the day is not too bad, so this isn?t a big issue 2014-06-17 : 1 154.623 2014-06-17 : 2 90.3006 2014-06-17 : 3 75.3602 2014-06-17 : 4 180.618 2014-06-17 : 5 107.668 2014-06-17 : 6 99.7783 ------- 2014-06-18 : 1 190.741 2014-06-18 : 2 82.8865 2014-06-18 : 3 90.0098 2014-06-18 : 4 239.702 2014-06-18 : 5 149.332 2014-06-18 : 6 138.03 Notably however if you look via JMX/visualGC, the total GC time is actually lower on nodes 4 to 6 than the equivalent nodes 1 to 3. Now I know that biased lock revocation and other things cause safe point, so I figure something other than GC must be the cause? so I just did a count of log lines with Total time for which application threads were stopped and got this: 2014-06-17 : 1 19282 2014-06-17 : 2 6784 2014-06-17 : 3 1275 2014-06-17 : 4 26356 2014-06-17 : 5 14491 2014-06-17 : 6 8402 ------- 2014-06-18 : 1 20943 2014-06-18 : 2 1134 2014-06-18 : 3 1129 2014-06-18 : 4 30289 2014-06-18 : 5 16508 2014-06-18 : 6 11459 I can?t cycle these nodes right now (to try each new parameter individually), but am curious whether you can think of why adding these parameters would have such a large effect on the number of safe point stops - e.g. 1129 vs 11459 for otherwise identically configured nodes with very similar workload. Note the ratio is highest on nodes 2 vs node 5 which spill the least into the old generation (so certainly fewer CMS cycles, and also fewer young gen collections) if that sparks any ideas. Thanks, Graham. 
P.S. It is entirely possible I don?t know exactly what Total time for which application threads were stopped refers to in all cases (I?m assuming it is a safe point stop) On Jun 16, 2014, at 1:54 PM, graham sanderson wrote: > Thanks Jon; that?s exactly what i was hoping > > On Jun 16, 2014, at 1:27 PM, Jon Masamitsu wrote: > >> >> On 06/12/2014 09:16 PM, graham sanderson wrote: >>> Hi, I hope this is the right list for this question: >>> >>> I was investigating abortable preclean timeouts in our app (and associated long remark pause) so had a look at the old jdk6 code I had on my box, wondered about recording eden chunks during certain eden slow allocation paths (I wasn?t sure if TLAB allocation is just a CAS bump), and saw what looked perfect in the latest code, so was excited to install 1.7.0_60-b19 >>> >>> I wanted to ask what you consider the stability of these two options to be (I?m pretty sure at least the first one is new in this release) >>> >>> I have just installed locally on my mac, and am aware of http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809 which I could reproduce, however I wasn?t able to reproduce it without -XX:-UseCMSCompactAtFullCollection (is this >>> your understanding too?) >> >> Yes. >> >>> >>> We are running our application with 8 gig young generation (6.4g eden), on boxes with 32 cores? so parallelism is good for short pauses >>> >>> we already have >>> >>> -XX:+UseParNewGC >>> -XX:+UseConcMarkSweepGC >>> -XX:+CMSParallelRemarkEnabled >>> >>> we have seen a few long(isn) initial marks, so >>> >>> -XX:+CMSParallelInitialMarkEnabled sounds good >>> >>> as for >>> >>> -XX:+CMSEdenChunksRecordAlways >>> >>> my question is: what constitutes a slow path such an eden chunk is potentially recorded? TLAB allocation, or more horrific things; basically (and I?ll test our app with -XX:+CMSPrintEdenSurvivorChunks) is it likely that I?ll actually get less samples using -XX:+CMSEdenChunksRecordAlways in a highly multithread app than I would with sampling, or put another way? what sort of app allocation patterns if any might avoid the slow path altogether and might leave me with just one chunk? >> >> Fast path allocation is done from TLAB's. If you have to get >> a new TLAB, the call to get the new TLAB comes from compiled >> code but the call is into the JVM and that is the slow path where >> the sampling is done. >> >> Jon >> >>> >>> Thanks, >>> >>> Graham >>> >>> P.S. less relevant I think, but our old generation is 16g >>> P.P.S. I suspect the abortable preclean timeouts mostly happen after a burst of very high allocation rate followed by an almost complete lull? this is one of the patterns that can happen in our application >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1574 bytes Desc: not available URL: From stefan.karlsson at oracle.com Thu Jun 19 07:16:38 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 19 Jun 2014 09:16:38 +0200 Subject: RFR: Remove unused _copy_metadata_obj_cl in G1CopyingKeepAliveClosure Message-ID: <53A28E56.5040504@oracle.com> Please, review this small patch to remove the unused G1CopyingKeepAliveClosure::_copy_metadata_obj_cl. 
http://cr.openjdk.java.net/~stefank/8047323/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8047323 thanks, StefanK From mikael.gerdin at oracle.com Thu Jun 19 07:47:53 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 19 Jun 2014 09:47:53 +0200 Subject: RFR: 8046670: Make CMS metadata aware closures applicable for other collectors In-Reply-To: <539983AA.50604@oracle.com> References: <53996927.1000102@oracle.com> <539983AA.50604@oracle.com> Message-ID: <3791251.MY5aGxyT38@mgerdin03> Hi Stefan, On Thursday 12 June 2014 12.40.42 Stefan Karlsson wrote: > On 2014-06-12 10:47, Stefan Karlsson wrote: > > Hi all, > > > > Please, review this patch to make the metadata-tracing oop closures > > used by CMS available to other collectors. This patch is needed by the > > G1 Class Unloading work. > > > > http://cr.openjdk.java.net/~stefank/8046670/webrev.00/ > > New patch: > http://cr.openjdk.java.net/~stefank/8046670/webrev.01/ > The change looks good. /Mikael > The old patch didn't include the new iterator.inline.hpp file. I've > added the file and made sure that we include it where needed. I've > verified that this builds without precompiled header. > > I've also verified that we unload classes when running Kitchensink with CMS. > > thanks, > StefanK > > > https://bugs.openjdk.java.net/browse/JDK-8046670 > > > > thanks, > > StefanK From mikael.gerdin at oracle.com Thu Jun 19 07:48:43 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 19 Jun 2014 09:48:43 +0200 Subject: RFR: Remove unused _copy_metadata_obj_cl in G1CopyingKeepAliveClosure In-Reply-To: <53A28E56.5040504@oracle.com> References: <53A28E56.5040504@oracle.com> Message-ID: <3272310.UF6rd0vlGd@mgerdin03> Stefan, On Thursday 19 June 2014 09.16.38 Stefan Karlsson wrote: > Please, review this small patch to remove the unused > G1CopyingKeepAliveClosure::_copy_metadata_obj_cl. > > http://cr.openjdk.java.net/~stefank/8047323/webrev.00/ Looks good. /Mikael > https://bugs.openjdk.java.net/browse/JDK-8047323 > > thanks, > StefanK From thomas.schatzl at oracle.com Thu Jun 19 08:12:34 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 19 Jun 2014 10:12:34 +0200 Subject: RFR: Remove unused _copy_metadata_obj_cl in G1CopyingKeepAliveClosure In-Reply-To: <53A28E56.5040504@oracle.com> References: <53A28E56.5040504@oracle.com> Message-ID: <1403165554.2621.1.camel@cirrus> Hi, On Thu, 2014-06-19 at 09:16 +0200, Stefan Karlsson wrote: > Please, review this small patch to remove the unused > G1CopyingKeepAliveClosure::_copy_metadata_obj_cl. > > http://cr.openjdk.java.net/~stefank/8047323/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8047323 I think the change should also remove the metadata_obj_cl from the constructor as it is obsolete too. Thanks, Thomas From dmitry.fazunenko at oracle.com Thu Jun 19 08:23:37 2014 From: dmitry.fazunenko at oracle.com (Dmitry Fazunenko) Date: Thu, 19 Jun 2014 12:23:37 +0400 Subject: RFR: 8026847 [TESTBUG] gc/g1/TestSummarizeRSetStats* tests launch 32bit jvm with UseCompressedOops In-Reply-To: <53A18692.20507@oracle.com> References: <53A18692.20507@oracle.com> Message-ID: <53A29E09.3000002@oracle.com> Looks good. Thanks, Dima On 18.06.2014 16:31, Andrey Zakharov wrote: > Hi, all. > "UseCompressedOops" options is being used In > gc/g1/TestSummarizeRSetStats* tests. > But it doesn't needed for those tests. Also I have asked Thomas > Schatzl about this options and he confirmed useless. > So here is simple patch - just removing. 
> > webrev: > http://cr.openjdk.java.net/~fzhinkin/azakharov/8026847/webrev.00/ > bug: > https://bugs.openjdk.java.net/browse/JDK-8026847 > > I have tested it locally for 32 and 64bits JDK and also in Aurora > (batch 514909.ute.hs_jtreg.accept.full). > Please, review it. > Thanks. From stefan.karlsson at oracle.com Thu Jun 19 08:26:02 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 19 Jun 2014 10:26:02 +0200 Subject: RFR: Remove unused _copy_metadata_obj_cl in G1CopyingKeepAliveClosure In-Reply-To: <1403165554.2621.1.camel@cirrus> References: <53A28E56.5040504@oracle.com> <1403165554.2621.1.camel@cirrus> Message-ID: <53A29E9A.5030309@oracle.com> On 2014-06-19 10:12, Thomas Schatzl wrote: > Hi, > > On Thu, 2014-06-19 at 09:16 +0200, Stefan Karlsson wrote: >> Please, review this small patch to remove the unused >> G1CopyingKeepAliveClosure::_copy_metadata_obj_cl. >> >> http://cr.openjdk.java.net/~stefank/8047323/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8047323 > > I think the change should also remove the metadata_obj_cl from the > constructor as it is obsolete too. Updated patch: http://cr.openjdk.java.net/~stefank/8047323/webrev.01 thanks for catching this, StefanK > > Thanks, > Thomas > From thomas.schatzl at oracle.com Thu Jun 19 09:01:40 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 19 Jun 2014 11:01:40 +0200 Subject: RFR: Remove unused _copy_metadata_obj_cl in G1CopyingKeepAliveClosure In-Reply-To: <53A29E9A.5030309@oracle.com> References: <53A28E56.5040504@oracle.com> <1403165554.2621.1.camel@cirrus> <53A29E9A.5030309@oracle.com> Message-ID: <1403168500.2621.3.camel@cirrus> Hi Stefan, On Thu, 2014-06-19 at 10:26 +0200, Stefan Karlsson wrote: > On 2014-06-19 10:12, Thomas Schatzl wrote: > > Hi, > > > > On Thu, 2014-06-19 at 09:16 +0200, Stefan Karlsson wrote: > >> Please, review this small patch to remove the unused > >> G1CopyingKeepAliveClosure::_copy_metadata_obj_cl. > >> > >> http://cr.openjdk.java.net/~stefank/8047323/webrev.00/ > >> https://bugs.openjdk.java.net/browse/JDK-8047323 > > > > I think the change should also remove the metadata_obj_cl from the > > constructor as it is obsolete too. > > Updated patch: > http://cr.openjdk.java.net/~stefank/8047323/webrev.01 Looks good. Thanks, Thomas From stefan.karlsson at oracle.com Thu Jun 19 09:20:38 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 19 Jun 2014 11:20:38 +0200 Subject: RFR: Remove unused _copy_metadata_obj_cl in G1CopyingKeepAliveClosure In-Reply-To: <1403168500.2621.3.camel@cirrus> References: <53A28E56.5040504@oracle.com> <1403165554.2621.1.camel@cirrus> <53A29E9A.5030309@oracle.com> <1403168500.2621.3.camel@cirrus> Message-ID: <53A2AB66.3080701@oracle.com> On 2014-06-19 11:01, Thomas Schatzl wrote: > Hi Stefan, > > On Thu, 2014-06-19 at 10:26 +0200, Stefan Karlsson wrote: >> On 2014-06-19 10:12, Thomas Schatzl wrote: >>> Hi, >>> >>> On Thu, 2014-06-19 at 09:16 +0200, Stefan Karlsson wrote: >>>> Please, review this small patch to remove the unused >>>> G1CopyingKeepAliveClosure::_copy_metadata_obj_cl. >>>> >>>> http://cr.openjdk.java.net/~stefank/8047323/webrev.00/ >>>> https://bugs.openjdk.java.net/browse/JDK-8047323 >>> I think the change should also remove the metadata_obj_cl from the >>> constructor as it is obsolete too. >> Updated patch: >> http://cr.openjdk.java.net/~stefank/8047323/webrev.01 > Looks good. Thanks. 
StefanK > > Thanks, > Thomas > > From stefan.karlsson at oracle.com Thu Jun 19 12:45:13 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 19 Jun 2014 14:45:13 +0200 Subject: RFR: 8047326: Add a version of CompiledIC_at that doesn't create a new RelocIterator Message-ID: <53A2DB59.9050605@oracle.com> Hi all, I have a patch that we have been using in the G1 Class Unloading project to lower the remark times. This changes Compiler code, so I would like to get feedback from the Compiler team. http://cr.openjdk.java.net/~stefank/8047362/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8047362 The patch builds upon the patch in: http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-June/014358.html Summary from the bug report: --- Creation of RelocIterators show up high in profiles of the remark phase, in the G1 Class Unloading project. There's a pattern in the nmethod/codecache code to create a RelocIterator and then materialize a CompiledIC: RelocIterator iter(this, low_boundary); while(iter.next()) { if (iter.type() == relocInfo::virtual_call_type) { CompiledIC *ic = CompiledIC_at(iter.reloc()); CompiledIC_at is implemented as: new CompiledIC(call_site->code(), nativeCall_at(call_site->addr())); And one of the first thing CompiledIC::CompiledIC(const nmethod* nm, NativeCall* call) does is to create a new RelocIterator: ... address ic_call = call->instruction_address(); ... RelocIterator iter(nm, ic_call, ic_call+1); bool ret = iter.next(); assert(ret == true, "relocInfo must exist at this address"); assert(iter.addr() == ic_call, "must find ic_call"); I would like to propose that we pass down the RelocIterator that we already have, instead of creating a new. --- I've previously received feedback that this seems like reasonable thing to do, but that the parameter to the new CompileIC_at should take a const RelocIterator* instead of RelocIterator*. I couldn't do that without changing a significant amount of Compiler code, so I have left it out for now. Any opinions on how to handle that? To give an idea of the performance difference, I temporarily added the following code: void CodeCache::iterate_through_CIs(int style) { int count; FOR_ALL_ALIVE_NMETHODS(nm) { RelocIterator iter(nm); while(iter.next()) { if (iter.type() == relocInfo::virtual_call_type || iter.type() == relocInfo::opt_virtual_call_type) { if (style > 0) { CompiledIC *ic = style == 1 ? CompiledIC_at(&iter) : CompiledIC_at(iter.reloc()); if (ic->ic_destination() == (address)0xdeadb000) { gclog_or_tty->print_cr("ShouldNotReachHere"); } } } } } } and then measured how long time it took to execute iterate_through_CIs(style) 1000 times with style == {0, 1, 2}. The results are: iterate_through_CIs(0): 1.210833 s // No CompiledICs created iterate_through_CIs(1): 1.976557 s // New style iterate_through_CIs(2): 9.924209 s // Old style Testing: A similar version has been used and thoroughly been tested together with the other G1 Class Unloading changes. This exact version has so far only been tested with Kitchensink and SpecJVM2008 compiler.compiler. What test lists would be appropriate to test this with? thanks, StefanK From andreas.sjoberg at oracle.com Thu Jun 19 13:27:23 2014 From: andreas.sjoberg at oracle.com (=?ISO-8859-1?Q?Andreas_Sj=F6berg?=) Date: Thu, 19 Jun 2014 15:27:23 +0200 Subject: RFR(S): JDK-8047330: Remove unrolled card loops in G1 SparsePRTEntry Message-ID: <53A2E53B.3050508@oracle.com> Hi all, can I please have reviews for this patch that removes the unrolled for-loops in sparsePRT.cpp. 
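As an aside on why the hand-unrolled loops can go: a JIT will typically unroll a simple counted loop on its own, so the straightforward form below usually performs as well as the hand-unrolled one while being easier to keep correct (a generic Java illustration, not the sparsePRT.cpp code itself):

    // Generic illustration of the trade-off; not HotSpot code.
    class CopyLoops {
        static void copySimple(int[] from, int[] to, int n) {
            for (int i = 0; i < n; i++) {
                to[i] = from[i]; // the JIT can unroll/vectorize this by itself
            }
        }

        static void copyUnrolled(int[] from, int[] to, int n) {
            int i = 0;
            for (; i + 4 <= n; i += 4) { // manual unroll by 4
                to[i]     = from[i];
                to[i + 1] = from[i + 1];
                to[i + 2] = from[i + 2];
                to[i + 3] = from[i + 3];
            }
            for (; i < n; i++) { // remainder
                to[i] = from[i];
            }
        }
    }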
I ran some performance benchmarks and could not see any benefits in keeping the unrolled for loops. SPECjbb2013 shows a 3.48% increase on Linux x64 actually. Webrev: http://cr.openjdk.java.net/~jwilhelm/8047330/webrev/ Testing: jprt, specjbb2005, specjvm2008, specjbb2013 Thanks, Andreas From stefan.karlsson at oracle.com Thu Jun 19 15:36:44 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 19 Jun 2014 17:36:44 +0200 Subject: RFR: 8047326: Add a version of CompiledIC_at that doesn't create a new RelocIterator In-Reply-To: <53A2DB59.9050605@oracle.com> References: <53A2DB59.9050605@oracle.com> Message-ID: <53A3038C.9020004@oracle.com> This was meant for the hotspot-dev list. BCC:ing hotspot-gc-dev. On 2014-06-19 14:45, Stefan Karlsson wrote: > Hi all, > > I have a patch that we have been using in the G1 Class Unloading > project to lower the remark times. This changes Compiler code, so I > would like to get feedback from the Compiler team. > > http://cr.openjdk.java.net/~stefank/8047362/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8047362 > > The patch builds upon the patch in: > http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-June/014358.html > > > Summary from the bug report: > --- > Creation of RelocIterators show up high in profiles of the remark > phase, in the G1 Class Unloading project. > > There's a pattern in the nmethod/codecache code to create a > RelocIterator and then materialize a CompiledIC: > > RelocIterator iter(this, low_boundary); > while(iter.next()) { > if (iter.type() == relocInfo::virtual_call_type) { > CompiledIC *ic = CompiledIC_at(iter.reloc()); > > CompiledIC_at is implemented as: > new CompiledIC(call_site->code(), nativeCall_at(call_site->addr())); > > And one of the first thing CompiledIC::CompiledIC(const nmethod* nm, > NativeCall* call) does is to create a new RelocIterator: > ... > address ic_call = call->instruction_address(); > ... > RelocIterator iter(nm, ic_call, ic_call+1); > bool ret = iter.next(); > assert(ret == true, "relocInfo must exist at this address"); > assert(iter.addr() == ic_call, "must find ic_call"); > > I would like to propose that we pass down the RelocIterator that we > already have, instead of creating a new. > --- > > > I've previously received feedback that this seems like reasonable > thing to do, but that the parameter to the new CompileIC_at should > take a const RelocIterator* instead of RelocIterator*. I couldn't do > that without changing a significant amount of Compiler code, so I have > left it out for now. Any opinions on how to handle that? > > > To give an idea of the performance difference, I temporarily added the > following code: > void CodeCache::iterate_through_CIs(int style) { > int count; > FOR_ALL_ALIVE_NMETHODS(nm) { > RelocIterator iter(nm); > while(iter.next()) { > if (iter.type() == relocInfo::virtual_call_type || > iter.type() == relocInfo::opt_virtual_call_type) { > if (style > 0) { > CompiledIC *ic = style == 1 ? CompiledIC_at(&iter) : > CompiledIC_at(iter.reloc()); > if (ic->ic_destination() == (address)0xdeadb000) { > gclog_or_tty->print_cr("ShouldNotReachHere"); > } > } > } > } > } > } > > and then measured how long time it took to execute > iterate_through_CIs(style) 1000 times with style == {0, 1, 2}. 
> > The results are: > iterate_through_CIs(0): 1.210833 s // No CompiledICs created > iterate_through_CIs(1): 1.976557 s // New style > iterate_through_CIs(2): 9.924209 s // Old style > > > Testing: > A similar version has been used and thoroughly been tested together > with the other G1 Class Unloading changes. This exact version has so > far only been tested with Kitchensink and SpecJVM2008 > compiler.compiler. What test lists would be appropriate to test this > with? > > > thanks, > StefanK > From jon.masamitsu at oracle.com Thu Jun 19 17:23:17 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 19 Jun 2014 10:23:17 -0700 Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled In-Reply-To: References: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com> <539F371F.10008@oracle.com> <2E44F2B4-D697-495C-A88F-816F6B33D913@vast.com> Message-ID: <53A31C85.3040600@oracle.com> Graham, I don't have any guesses about what is causing the difference in the number of safepoints. As you note the messages you are counting are not specifically GC pauses. The differences you are seeing are huge (1129 vs 11459, which, by the way, I don't see those numbers in the tables; did you mean 31129 vs 611459?). If those numbers are really GC related I would expect some dramatic effect on the application behavior. Sometimes better GC means applications run faster so maybe more work is getting done in nodes 4-6. The scale of the difference is surprising though. Jon On 6/18/2014 8:19 PM, graham sanderson wrote: > The options are working great and as expected (faster initial mark, > and no long pauses after abortable preclean timeout). One weird thing > though which I?m curious about: I?m showing some data for six JVMs > (calling them nodes - they are on separate machines) > > all with : > > Linux version 2.6.32-431.3.1.el6.x86_64 > (mockbuild at c6b10.bsys.dev.centos.org > ) (gcc version 4.4.7 > 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Jan 3 21:39:27 UTC 2014 > JDK 1.7.0_60-b19 > 16 gig old gen > 8 gig new (6.4 gig eden) > -XX:+UseParNewGC > -XX:+UseConcMarkSweepGC > -XX:+CMSParallelRemarkEnabled > 256 gig RAM > 16 cores (sandy bridge) > > Nodes 4-6 also have > > -XX:+CMSEdenChunksRecordAlways > -XX:+CMSParallelInitialMarkEnabled > > There are some application level config differences (which limit > amount of certain objects kept in memory before flushing to disk) - > 1&4 have the same app config, 2&5 have the same app config, 3&6 have > the same app config > > This first dump, shows two days worth of total times application > threads were stopped via grepping logs for Total time for which > application threads were stopped and summing the values. worst case 4 > minutes over the day is not too bad, so this isn?t a big issue > > 2014-06-17 : 1 154.623 > 2014-06-17 : 2 90.3006 > 2014-06-17 : 3 75.3602 > 2014-06-17 : 4 180.618 > 2014-06-17 : 5 107.668 > 2014-06-17 : 6 99.7783 > ------- > 2014-06-18 : 1 190.741 > 2014-06-18 : 2 82.8865 > 2014-06-18 : 3 90.0098 > 2014-06-18 : 4 239.702 > 2014-06-18 : 5 149.332 > 2014-06-18 : 6 138.03 > > Notably however if you look via JMX/visualGC, the total GC time is > actually lower on nodes 4 to 6 than the equivalent nodes 1 to 3. Now I > know that biased lock revocation and other things cause safe point, so > I figure something other than GC must be the cause? 
so I just did a > count of log lines with Total time for which application threads were > stopped and got this: > > 2014-06-17 : 1 19282 > 2014-06-17 : 2 6784 > 2014-06-17 : 3 1275 > 2014-06-17 : 4 26356 > 2014-06-17 : 5 14491 > 2014-06-17 : 6 8402 > ------- > 2014-06-18 : 1 20943 > 2014-06-18 : 2 1134 > 2014-06-18 : 3 1129 > 2014-06-18 : 4 30289 > 2014-06-18 : 5 16508 > 2014-06-18 : 6 11459 > > I can?t cycle these nodes right now (to try each new parameter > individually), but am curious whether you can think of why adding > these parameters would have such a large effect on the number of safe > point stops - e.g. 1129 vs 11459 for otherwise identically configured > nodes with very similar workload. Note the ratio is highest on nodes 2 > vs node 5 which spill the least into the old generation (so certainly > fewer CMS cycles, and also fewer young gen collections) if that sparks > any ideas. > > Thanks, > > Graham. > > P.S. It is entirely possible I don?t know exactly what Total time for > which application threads were stopped refers to in all cases (I?m > assuming it is a safe point stop) > > On Jun 16, 2014, at 1:54 PM, graham sanderson > wrote: > >> Thanks Jon; that?s exactly what i was hoping >> >> On Jun 16, 2014, at 1:27 PM, Jon Masamitsu > > wrote: >> >>> >>> On 06/12/2014 09:16 PM, graham sanderson wrote: >>>> Hi, I hope this is the right list for this question: >>>> >>>> I was investigating abortable preclean timeouts in our app (and >>>> associated long remark pause) so had a look at the old jdk6 code I >>>> had on my box, wondered about recording eden chunks during certain >>>> eden slow allocation paths (I wasn?t sure if TLAB allocation is >>>> just a CAS bump), and saw what looked perfect in the latest code, >>>> so was excited to install 1.7.0_60-b19 >>>> >>>> I wanted to ask what you consider the stability of these two >>>> options to be (I?m pretty sure at least the first one is new in >>>> this release) >>>> >>>> I have just installed locally on my mac, and am aware of >>>> http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809 which I >>>> could reproduce, however I wasn?t able to reproduce it without >>>> -XX:-UseCMSCompactAtFullCollection (is this your understanding too?) >>> >>> Yes. >>> >>>> >>>> We are running our application with 8 gig young generation (6.4g >>>> eden), on boxes with 32 cores? so parallelism is good for short pauses >>>> >>>> we already have >>>> >>>> -XX:+UseParNewGC >>>> -XX:+UseConcMarkSweepGC >>>> -XX:+CMSParallelRemarkEnabled >>>> >>>> we have seen a few long(isn) initial marks, so >>>> >>>> -XX:+CMSParallelInitialMarkEnabled sounds good >>>> >>>> as for >>>> >>>> -XX:+CMSEdenChunksRecordAlways >>>> >>>> my question is: what constitutes a slow path such an eden chunk is >>>> potentially recorded? TLAB allocation, or more horrific things; >>>> basically (and I?ll test our app >>>> with -XX:+CMSPrintEdenSurvivorChunks) is it likely that I?ll >>>> actually get less samples using -XX:+CMSEdenChunksRecordAlways in a >>>> highly multithread app than I would with sampling, or put another >>>> way? what sort of app allocation patterns if any might avoid the >>>> slow path altogether and might leave me with just one chunk? >>> >>> Fast path allocation is done from TLAB's. If you have to get >>> a new TLAB, the call to get the new TLAB comes from compiled >>> code but the call is into the JVM and that is the slow path where >>> the sampling is done. >>> >>> Jon >>> >>>> >>>> Thanks, >>>> >>>> Graham >>>> >>>> P.S. 
less relevant I think, but our old generation is 16g >>>> P.P.S. I suspect the abortable preclean timeouts mostly happen >>>> after a burst of very high allocation rate followed by an almost >>>> complete lull? this is one of the patterns that can happen in our >>>> application >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.schatzl at oracle.com Thu Jun 19 17:31:05 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 19 Jun 2014 19:31:05 +0200 Subject: RFR: 8026847 [TESTBUG] gc/g1/TestSummarizeRSetStats* tests launch 32bit jvm with UseCompressedOops In-Reply-To: <53A18692.20507@oracle.com> References: <53A18692.20507@oracle.com> Message-ID: <1403199065.2621.4.camel@cirrus> Hi, On Wed, 2014-06-18 at 16:31 +0400, Andrey Zakharov wrote: > Hi, all. > "UseCompressedOops" options is being used In > gc/g1/TestSummarizeRSetStats* tests. > But it doesn't needed for those tests. Also I have asked Thomas Schatzl > about this options and he confirmed useless. > So here is simple patch - just removing. > > webrev: > http://cr.openjdk.java.net/~fzhinkin/azakharov/8026847/webrev.00/ > bug: > https://bugs.openjdk.java.net/browse/JDK-8026847 Looks good. Thomas From graham at vast.com Thu Jun 19 19:22:08 2014 From: graham at vast.com (graham sanderson) Date: Thu, 19 Jun 2014 14:22:08 -0500 Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled In-Reply-To: <53A31C85.3040600@oracle.com> References: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com> <539F371F.10008@oracle.com> <2E44F2B4-D697-495C-A88F-816F6B33D913@vast.com> <53A31C85.3040600@oracle.com> Message-ID: <80970B7A-B6AE-4125-AA99-4D846F3EF2DB@vast.com> Ok, thanks? The 3 before the 1129 and the 6 before the 11459 are the node numbers I?ll dig around in the source for any way of finding out what the cause of safe points are (I?m not aware of a better -XX: option) ? frankly I?ll probably just report this in the user hotspot-gc-use thread (it isn?t causing any real issues) and see if other people report it too. Thanks, Graham On Jun 19, 2014, at 12:23 PM, Jon Masamitsu wrote: > Graham, > > I don't have any guesses about what is causing the difference in > the number of safepoints. As you note the messages you are > counting are not specifically GC pauses. The differences you > are seeing are huge (1129 vs 11459, which, by the way, I don't > see those numbers in the tables; did you mean 31129 vs 611459?). > If those numbers are really GC related I would expect some > dramatic effect on the application behavior. > > Sometimes better GC means applications run faster so > maybe more work is getting done in nodes 4-6. > The scale of the difference is surprising though. > > Jon > > On 6/18/2014 8:19 PM, graham sanderson wrote: >> The options are working great and as expected (faster initial mark, and no long pauses after abortable preclean timeout). 
One weird thing though which I?m curious about: I?m showing some data for six JVMs (calling them nodes - they are on separate machines) >> >> all with : >> >> Linux version 2.6.32-431.3.1.el6.x86_64 (mockbuild at c6b10.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Jan 3 21:39:27 UTC 2014 >> JDK 1.7.0_60-b19 >> 16 gig old gen >> 8 gig new (6.4 gig eden) >> -XX:+UseParNewGC >> -XX:+UseConcMarkSweepGC >> -XX:+CMSParallelRemarkEnabled >> 256 gig RAM >> 16 cores (sandy bridge) >> >> Nodes 4-6 also have >> >> -XX:+CMSEdenChunksRecordAlways >> -XX:+CMSParallelInitialMarkEnabled >> >> There are some application level config differences (which limit amount of certain objects kept in memory before flushing to disk) - 1&4 have the same app config, 2&5 have the same app config, 3&6 have the same app config >> >> This first dump, shows two days worth of total times application threads were stopped via grepping logs for Total time for which application threads were stopped and summing the values. worst case 4 minutes over the day is not too bad, so this isn?t a big issue >> >> 2014-06-17 : 1 154.623 >> 2014-06-17 : 2 90.3006 >> 2014-06-17 : 3 75.3602 >> 2014-06-17 : 4 180.618 >> 2014-06-17 : 5 107.668 >> 2014-06-17 : 6 99.7783 >> ------- >> 2014-06-18 : 1 190.741 >> 2014-06-18 : 2 82.8865 >> 2014-06-18 : 3 90.0098 >> 2014-06-18 : 4 239.702 >> 2014-06-18 : 5 149.332 >> 2014-06-18 : 6 138.03 >> >> Notably however if you look via JMX/visualGC, the total GC time is actually lower on nodes 4 to 6 than the equivalent nodes 1 to 3. Now I know that biased lock revocation and other things cause safe point, so I figure something other than GC must be the cause? so I just did a count of log lines with Total time for which application threads were stopped and got this: >> >> 2014-06-17 : 1 19282 >> 2014-06-17 : 2 6784 >> 2014-06-17 : 3 1275 >> 2014-06-17 : 4 26356 >> 2014-06-17 : 5 14491 >> 2014-06-17 : 6 8402 >> ------- >> 2014-06-18 : 1 20943 >> 2014-06-18 : 2 1134 >> 2014-06-18 : 3 1129 >> 2014-06-18 : 4 30289 >> 2014-06-18 : 5 16508 >> 2014-06-18 : 6 11459 >> >> I can?t cycle these nodes right now (to try each new parameter individually), but am curious whether you can think of why adding these parameters would have such a large effect on the number of safe point stops - e.g. 1129 vs 11459 for otherwise identically configured nodes with very similar workload. Note the ratio is highest on nodes 2 vs node 5 which spill the least into the old generation (so certainly fewer CMS cycles, and also fewer young gen collections) if that sparks any ideas. >> >> Thanks, >> >> Graham. >> >> P.S. 
It is entirely possible I don't know exactly what Total time for which application threads were stopped refers to in all cases (I'm assuming it is a safe point stop) >> >> On Jun 16, 2014, at 1:54 PM, graham sanderson wrote: >> >>> Thanks Jon; that's exactly what I was hoping >>> >>> On Jun 16, 2014, at 1:27 PM, Jon Masamitsu wrote: >>> >>>> >>>> On 06/12/2014 09:16 PM, graham sanderson wrote: >>>>> Hi, I hope this is the right list for this question: >>>>> >>>>> I was investigating abortable preclean timeouts in our app (and associated long remark pause) so had a look at the old jdk6 code I had on my box, wondered about recording eden chunks during certain eden slow allocation paths (I wasn't sure if TLAB allocation is just a CAS bump), and saw what looked perfect in the latest code, so was excited to install 1.7.0_60-b19 >>>>> >>>>> I wanted to ask what you consider the stability of these two options to be (I'm pretty sure at least the first one is new in this release) >>>>> >>>>> I have just installed locally on my mac, and am aware of http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809 which I could reproduce, however I wasn't able to reproduce it without -XX:-UseCMSCompactAtFullCollection >>>>> (is this your understanding too?) >>>> >>>> Yes. >>>> >>>>> >>>>> We are running our application with 8 gig young generation (6.4g eden), on boxes with 32 cores... so parallelism is good for short pauses >>>>> >>>>> we already have >>>>> >>>>> -XX:+UseParNewGC >>>>> -XX:+UseConcMarkSweepGC >>>>> -XX:+CMSParallelRemarkEnabled >>>>> >>>>> we have seen a few long(ish) initial marks, so >>>>> >>>>> -XX:+CMSParallelInitialMarkEnabled sounds good >>>>> >>>>> as for >>>>> >>>>> -XX:+CMSEdenChunksRecordAlways >>>>> >>>>> my question is: what constitutes a slow path such that an eden chunk is potentially recorded? TLAB allocation, or more horrific things; basically (and I'll test our app with -XX:+CMSPrintEdenSurvivorChunks) is it likely that I'll actually get fewer samples using -XX:+CMSEdenChunksRecordAlways in a highly multithreaded app than I would with sampling, or put another way... what sort of app allocation patterns if any might avoid the slow path altogether and might leave me with just one chunk? >>>> >>>> Fast path allocation is done from TLAB's. If you have to get >>>> a new TLAB, the call to get the new TLAB comes from compiled >>>> code but the call is into the JVM and that is the slow path where >>>> the sampling is done. >>>> >>>> Jon >>>> >>>>> >>>>> Thanks, >>>>> >>>>> Graham >>>>> >>>>> P.S. less relevant I think, but our old generation is 16g >>>>> P.P.S. I suspect the abortable preclean timeouts mostly happen after a burst of very high allocation rate followed by an almost complete lull... this is one of the patterns that can happen in our application >>>> >>> >> >
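[An illustrative aside, not part of the original thread: the grep-and-sum bookkeeping Graham describes above can be reproduced with a small standalone tool. The sketch below assumes the log line format produced by -XX:+PrintGCApplicationStoppedTime, i.e. "Total time for which application threads were stopped: 0.0123450 seconds"; the tool name stopsum is made up.]

// stopsum.cpp - sum and count safepoint stop times from a GC log.
// A minimal sketch; build: g++ -o stopsum stopsum.cpp  run: ./stopsum < gc.log
#include <cstdio>
#include <iostream>
#include <string>

int main() {
  const std::string marker =
      "Total time for which application threads were stopped: ";
  std::string line;
  double total_seconds = 0.0;
  long stops = 0;
  while (std::getline(std::cin, line)) {
    std::string::size_type pos = line.find(marker);
    if (pos == std::string::npos) {
      continue;  // not a stopped-time line
    }
    // The duration in seconds immediately follows the marker text.
    double seconds = 0.0;
    if (std::sscanf(line.c_str() + pos + marker.size(), "%lf", &seconds) == 1) {
      total_seconds += seconds;
      ++stops;  // one safepoint stop per matching line
    }
  }
  std::cout << stops << " stops, " << total_seconds
            << " seconds stopped in total" << std::endl;
  return 0;
}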
From graham at vast.com Fri Jun 20 14:39:14 2014 From: graham at vast.com (graham sanderson) Date: Fri, 20 Jun 2014 09:39:14 -0500 Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled In-Reply-To: <80970B7A-B6AE-4125-AA99-4D846F3EF2DB@vast.com> References: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com> <539F371F.10008@oracle.com> <2E44F2B4-D697-495C-A88F-816F6B33D913@vast.com> <53A31C85.3040600@oracle.com> <80970B7A-B6AE-4125-AA99-4D846F3EF2DB@vast.com> Message-ID: <29C39CCF-DBFD-4554-8072-70BF1FDE098E@vast.com> All is well in the universe (except for some mild stupidity on my part). The anomalous safepoints are caused by VisualVM it would seem, which I had been using on and off to watch the GC visually, and just happened to have left it connected to certain nodes for coincidental lengths of time that seemed to produce a somewhat correlated pattern. I don't have a dev JVM so I can't do -XX:+TraceSafepoint, but whatever it is doing, it is doing it once every 1 or 2 seconds, even if you turn all monitoring off (but remain connected). On Jun 19, 2014, at 2:22 PM, graham sanderson wrote: > Ok, thanks... > > The 3 before the 1129 and the 6 before the 11459 are the node numbers. > > I'll dig around in the source for any way of finding out what the cause of safe points is (I'm not aware of a better -XX: option)... frankly I'll probably just report this in the user hotspot-gc-use thread (it isn't causing any real issues) and see if other people report it too. > > Thanks, > > Graham > > On Jun 19, 2014, at 12:23 PM, Jon Masamitsu wrote: > >> Graham, >> >> I don't have any guesses about what is causing the difference in >> the number of safepoints. As you note the messages you are >> counting are not specifically GC pauses. The differences you >> are seeing are huge (1129 vs 11459, which, by the way, I don't >> see those numbers in the tables; did you mean 31129 vs 611459?). >> If those numbers are really GC related I would expect some >> dramatic effect on the application behavior. >> >> Sometimes better GC means applications run faster so >> maybe more work is getting done in nodes 4-6. >> The scale of the difference is surprising though. >> >> Jon >> >> On 6/18/2014 8:19 PM, graham sanderson wrote: >>> The options are working great and as expected (faster initial mark, and no long pauses after abortable preclean timeout). One weird thing though which I'm curious about: I'm showing some data for six JVMs (calling them nodes - they are on separate machines) >>> >>> all with: >>> >>> Linux version 2.6.32-431.3.1.el6.x86_64 (mockbuild at c6b10.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Jan 3 21:39:27 UTC 2014 >>> JDK 1.7.0_60-b19 >>> 16 gig old gen >>> 8 gig new (6.4 gig eden) >>> -XX:+UseParNewGC >>> -XX:+UseConcMarkSweepGC >>> -XX:+CMSParallelRemarkEnabled >>> 256 gig RAM >>> 16 cores (sandy bridge) >>> >>> Nodes 4-6 also have >>> >>> -XX:+CMSEdenChunksRecordAlways >>> -XX:+CMSParallelInitialMarkEnabled >>> >>> There are some application level config differences (which limit the amount of certain objects kept in memory before flushing to disk) - 1&4 have the same app config, 2&5 have the same app config, 3&6 have the same app config >>> >>> This first dump shows two days worth of total times application threads were stopped via grepping logs for Total time for which application threads were stopped and summing the values.
Worst case 4 minutes over the day is not too bad, so this isn't a big issue >>> >>> 2014-06-17 : 1 154.623 >>> 2014-06-17 : 2 90.3006 >>> 2014-06-17 : 3 75.3602 >>> 2014-06-17 : 4 180.618 >>> 2014-06-17 : 5 107.668 >>> 2014-06-17 : 6 99.7783 >>> ------- >>> 2014-06-18 : 1 190.741 >>> 2014-06-18 : 2 82.8865 >>> 2014-06-18 : 3 90.0098 >>> 2014-06-18 : 4 239.702 >>> 2014-06-18 : 5 149.332 >>> 2014-06-18 : 6 138.03 >>> >>> Notably however if you look via JMX/visualGC, the total GC time is actually lower on nodes 4 to 6 than the equivalent nodes 1 to 3. Now I know that biased lock revocation and other things cause safe points, so I figure something other than GC must be the cause... so I just did a count of log lines with Total time for which application threads were stopped and got this: >>> >>> 2014-06-17 : 1 19282 >>> 2014-06-17 : 2 6784 >>> 2014-06-17 : 3 1275 >>> 2014-06-17 : 4 26356 >>> 2014-06-17 : 5 14491 >>> 2014-06-17 : 6 8402 >>> ------- >>> 2014-06-18 : 1 20943 >>> 2014-06-18 : 2 1134 >>> 2014-06-18 : 3 1129 >>> 2014-06-18 : 4 30289 >>> 2014-06-18 : 5 16508 >>> 2014-06-18 : 6 11459 >>> >>> I can't cycle these nodes right now (to try each new parameter individually), but am curious whether you can think of why adding these parameters would have such a large effect on the number of safe point stops - e.g. 1129 vs 11459 for otherwise identically configured nodes with very similar workload. Note the ratio is highest on nodes 2 vs node 5 which spill the least into the old generation (so certainly fewer CMS cycles, and also fewer young gen collections) if that sparks any ideas. >>> >>> Thanks, >>> >>> Graham. >>> >>> P.S. It is entirely possible I don't know exactly what Total time for which application threads were stopped refers to in all cases (I'm assuming it is a safe point stop) >>> >>> On Jun 16, 2014, at 1:54 PM, graham sanderson wrote: >>> >>>> Thanks Jon; that's exactly what I was hoping >>>> >>>> On Jun 16, 2014, at 1:27 PM, Jon Masamitsu wrote: >>>> >>>>> >>>>> On 06/12/2014 09:16 PM, graham sanderson wrote: >>>>>> Hi, I hope this is the right list for this question: >>>>>> >>>>>> I was investigating abortable preclean timeouts in our app (and associated long remark pause) so had a look at the old jdk6 code I had on my box, wondered about recording eden chunks during certain eden slow allocation paths (I wasn't sure if TLAB allocation is just a CAS bump), and saw what looked perfect in the latest code, so was excited to install 1.7.0_60-b19 >>>>>> >>>>>> I wanted to ask what you consider the stability of these two options to be (I'm pretty sure at least the first one is new in this release) >>>>>> >>>>>> I have just installed locally on my mac, and am aware of http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809 which I could reproduce, however I wasn't able to reproduce it without -XX:-UseCMSCompactAtFullCollection >>>>>> (is this your understanding too?) >>>>> >>>>> Yes. >>>>> >>>>>> >>>>>> We are running our application with 8 gig young generation (6.4g eden), on boxes with 32 cores... so parallelism is good for short pauses >>>>>> >>>>>> we already have >>>>>> >>>>>> -XX:+UseParNewGC >>>>>> -XX:+UseConcMarkSweepGC >>>>>> -XX:+CMSParallelRemarkEnabled >>>>>> >>>>>> we have seen a few long(ish) initial marks, so >>>>>> >>>>>> -XX:+CMSParallelInitialMarkEnabled sounds good >>>>>> >>>>>> as for >>>>>> >>>>>> -XX:+CMSEdenChunksRecordAlways >>>>>> >>>>>> my question is: what constitutes a slow path such that an eden chunk is potentially recorded?
TLAB allocation, or more horrific things; basically (and I'll test our app with -XX:+CMSPrintEdenSurvivorChunks) is it likely that I'll actually get fewer samples using -XX:+CMSEdenChunksRecordAlways in a highly multithreaded app than I would with sampling, or put another way... what sort of app allocation patterns if any might avoid the slow path altogether and might leave me with just one chunk? >>>>> >>>>> Fast path allocation is done from TLAB's. If you have to get >>>>> a new TLAB, the call to get the new TLAB comes from compiled >>>>> code but the call is into the JVM and that is the slow path where >>>>> the sampling is done. >>>>> >>>>> Jon >>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Graham >>>>>> >>>>>> P.S. less relevant I think, but our old generation is 16g >>>>>> P.P.S. I suspect the abortable preclean timeouts mostly happen after a burst of very high allocation rate followed by an almost complete lull... this is one of the patterns that can happen in our application >>>>> >>>> >>> >> > From thomas.schatzl at oracle.com Mon Jun 23 10:05:29 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 23 Jun 2014 12:05:29 +0200 Subject: RFR(s): 8046231: G1: Code root location ... from nmethod ... not in strong code roots for region In-Reply-To: <53A03098.2070207@oracle.com> References: <53A03098.2070207@oracle.com> Message-ID: <1403517929.2753.22.camel@cirrus> Hi Per, On Tue, 2014-06-17 at 14:12 +0200, Per Liden wrote: > Could I please have this fix reviewed. > > Summary: nmethods are only registered with the heap if > nmethod::detect_scavenge_root_oops() returns true. However, in case the > nmethod only contains oops to humongous objects > detect_scavenge_root_oops() will return false and the nmethod will not > be registered.
This will later cause heap verification to fail. >> >> There are several ways in which this can be fixed. One alternative is to >> adjust the verification to ignore humongous oops (since these objects >> will never move). Another alternative is to just register the method >> regardless of what detect_scavenge_root_oops() says. Since we might want >> to allow humongous objects to move in the future this is the proposed fix. > > I agree that this is the better solution. > >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8046231 >> Webrev: http://cr.openjdk.java.net/~pliden/8046231/webrev.0/ >> >> Testing: >> * gc-test-suite >> * manual ad-hoc testing > > looks good. > > Thanks, > Thomas > From erik.helin at oracle.com Mon Jun 23 12:39:11 2014 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 23 Jun 2014 14:39:11 +0200 Subject: RFR(S): JDK-8047330: Remove unrolled card loops in G1 SparsePRTEntry In-Reply-To: <53A2E53B.3050508@oracle.com> References: <53A2E53B.3050508@oracle.com> Message-ID: <1846572.Plu0fjq2zP@ehelin-laptop> On Thursday 19 June 2014 15:27:23 PM Andreas Sjöberg wrote: > Hi all, > > can I please have reviews for this patch that removes the unrolled > for-loops in sparsePRT.cpp. > > I ran some performance benchmarks and could not see any benefits in > keeping the unrolled for loops. SPECjbb2013 shows a 3.48% increase on > Linux x64 actually. > > Webrev: http://cr.openjdk.java.net/~jwilhelm/8047330/webrev/ Looks good, Reviewed! Thanks, Erik > Testing: jprt, specjbb2005, specjvm2008, specjbb2013 > > Thanks, > Andreas From erik.helin at oracle.com Mon Jun 23 13:47:02 2014 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 23 Jun 2014 15:47:02 +0200 Subject: RFR(s): 8046231: G1: Code root location ... from nmethod ... not in strong code roots for region In-Reply-To: <53A03098.2070207@oracle.com> References: <53A03098.2070207@oracle.com> Message-ID: <2481990.7gl4ni0RtK@ehelin-laptop> On Tuesday 17 June 2014 14:12:08 PM Per Liden wrote: > Could I please have this fix reviewed. > > Summary: nmethods are only registered with the heap if > nmethod::detect_scavenge_root_oops() returns true. However, in case the > nmethod only contains oops to humongous objects > detect_scavenge_root_oops() will return false and the nmethod will not > be registered. This will later cause heap verification to fail. > > There are several ways in which this can be fixed. One alternative is to > adjust the verification to ignore humongous oops (since these objects > will never move). Another alternative is to just register the method > regardless of what detect_scavenge_root_oops() says. Since we might want > to allow humongous objects to move in the future this is the proposed fix. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8046231 > Webrev: http://cr.openjdk.java.net/~pliden/8046231/webrev.0/ Looks good, Reviewed. Thanks, Erik > Testing: > * gc-test-suite > * manual ad-hoc testing > > Thanks! > /Per From per.liden at oracle.com Mon Jun 23 14:18:14 2014 From: per.liden at oracle.com (Per Liden) Date: Mon, 23 Jun 2014 16:18:14 +0200 Subject: RFR(s): 8046231: G1: Code root location ... from nmethod ... not in strong code roots for region In-Reply-To: <2481990.7gl4ni0RtK@ehelin-laptop> References: <53A03098.2070207@oracle.com> <2481990.7gl4ni0RtK@ehelin-laptop> Message-ID: <53A83726.90809@oracle.com> Thanks Erik! /Per On 06/23/2014 03:47 PM, Erik Helin wrote: > On Tuesday 17 June 2014 14:12:08 PM Per Liden wrote: >> Could I please have this fix reviewed.
>> >> Summary: nmethods are only registered with the heap if >> nmethod::detect_scavenge_root_oops() returns true. However, in case the >> nmethod only contains oops to humongous objects >> detect_scavenge_root_oops() will return false and the nmethod will not >> be registered. This will later cause heap verification to fail. >> >> There are several ways in which this can be fixed. One alternative is to >> adjust the verification to ignore humongous oops (since these objects >> will never move). Another alternative is to just register the method >> regardless of what detect_scavenge_root_oops() says. Since we might want >> to allow humongous objects to move in the future this is the proposed fix. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8046231 >> Webrev: http://cr.openjdk.java.net/~pliden/8046231/webrev.0/ > > Looks good, Reviewed. > > Thanks, > Erik > >> Testing: >> * gc-test-suite >> * manual ad-hoc testing >> >> Thanks! >> /Per > From mikael.gerdin at oracle.com Mon Jun 23 14:25:40 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 23 Jun 2014 16:25:40 +0200 Subject: RFR: 8047819: G1 HeapRegionDCTOC does not need to inherit ContiguousSpaceDCTOC Message-ID: <87729984.51aIKtkShi@mgerdin03> Hi! As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] G1 needs to stop using the special version of DirtyCardToOopClosure. This also makes the code easier to follow since G1 never actually relies on the functionality from Filtering_DCTOC and ContiguousSpaceDCTOC. This change is part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 which are needed to refactor the HeapRegion class and its superclasses in order to simplify the G1 class unloading change which is coming. Bug: https://bugs.openjdk.java.net/browse/JDK-8047819 Webrev: http://cr.openjdk.java.net/~mgerdin/8047819/webrev/ [1] https://bugs.openjdk.java.net/browse/JDK-8047818 Thanks /Mikael From mikael.gerdin at oracle.com Mon Jun 23 14:25:53 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 23 Jun 2014 16:25:53 +0200 Subject: RFR: 8047820: G1 Block offset table does not need to support generic Space classes Message-ID: <11148293.8zVS3laxSo@mgerdin03> Hi! As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] G1's block offset table needs to be modified to work with Space subclasses which are not subclasses of ContiguousSpace. Just change the code to have knowledge of G1OffsetTableContigSpace. This change is part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 which are needed to refactor the HeapRegion class and its superclasses in order to simplify the G1 class unloading change which is coming. Bug: https://bugs.openjdk.java.net/browse/JDK-8047820 Webrev: http://cr.openjdk.java.net/~mgerdin/8047820/webrev/ [1] https://bugs.openjdk.java.net/browse/JDK-8047818 Thanks /Mikael From mikael.gerdin at oracle.com Mon Jun 23 14:26:00 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 23 Jun 2014 16:26 +0200 Subject: RFR: 8047821: G1 Does not use the save_marks functionality as intended Message-ID: <1878024.dB1lCmA3nF@mgerdin03> Hi! As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] and as a general cleanup we should rename the save_marks and set_saved_marks methods on HeapRegion. They are not used with oops_since_saved_marks_iterate and cause more confusion than anything.
This change is part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 which are needed to refactor the HeapRegion class and its superclasses in order to simplify the G1 class unloading change which is coming. Bug: https://bugs.openjdk.java.net/browse/JDK-8047821 Webrev: http://cr.openjdk.java.net/~mgerdin/8047821/webrev/ [1] https://bugs.openjdk.java.net/browse/JDK-8047818 Thanks /Mikael From mikael.gerdin at oracle.com Mon Jun 23 14:26:03 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 23 Jun 2014 16:26:03 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces Message-ID: <17336712.znP3JIk1Pt@mgerdin03> Hi! When G1 is modified to unload classes without doing full collections the old HeapRegions can contain unparseable objects. This makes ContiguousSpace unsuitable as a base class for HeapRegion since it assumes that all objects below _top are parseable. Modify G1OffsetTableContigSpace to implement allocation with a separate _top and reimplement some Space pure virtuals to make object iteration work as expected. This change is the last part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 which are needed to refactor the HeapRegion class and its superclasses in order to simplify the G1 class unloading change which is coming. This change depends on the 19, 20 and 21 changes. Bug: https://bugs.openjdk.java.net/browse/JDK-8047818 Webrev: http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ Notes: The moving of set_offset_range is due to an introduced circular dependency between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp Thanks /Mikael From mikael.gerdin at oracle.com Mon Jun 23 14:51:54 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 23 Jun 2014 16:51:54 +0200 Subject: RFR(S): JDK-8047330: Remove unrolled card loops in G1 SparsePRTEntry In-Reply-To: <53A2E53B.3050508@oracle.com> References: <53A2E53B.3050508@oracle.com> Message-ID: <2772002.dt1otWjIrg@mgerdin03> Hi Andreas, On Thursday 19 June 2014 15.27.23 Andreas Sjöberg wrote: > Hi all, > > can I please have reviews for this patch that removes the unrolled > for-loops in sparsePRT.cpp. > > I ran some performance benchmarks and could not see any benefits in > keeping the unrolled for loops. SPECjbb2013 shows a 3.48% increase on > Linux x64 actually. > > Webrev: http://cr.openjdk.java.net/~jwilhelm/8047330/webrev/ It looks like you can remove the define as well: 36 #define UNROLL_CARD_LOOPS 1 UnrollFactor should also be useless now, but it seems like it's being used to align up the number of cards. I suggest you leave UnrollFactor for a second cleanup. /Mikael > > Testing: jprt, specjbb2005, specjvm2008, specjbb2013 > > Thanks, > Andreas From markus.gronlund at oracle.com Mon Jun 23 15:26:51 2014 From: markus.gronlund at oracle.com (=?iso-8859-1?B?TWFya3VzIEdy9m5sdW5k?=) Date: Mon, 23 Jun 2014 08:26:51 -0700 (PDT) Subject: FW: RFR(S): 8047812: Ensure ClassLoaderDataGraph::classes_unloading_do only delivers klasses from CLDs with non-reclaimed class loader oops Message-ID: <2bf3b050-e3cf-43c2-ac64-61ebe0320061@default> Sending this to the Hotspot-GC-dev group as well. /Markus From: Markus Grönlund Sent: den 23 juni 2014 17:03 To: hotspot-runtime-dev; serviceability-dev Subject: RFR(S): 8047812: Ensure ClassLoaderDataGraph::classes_unloading_do only delivers klasses from CLDs with non-reclaimed class loader oops Greetings, Kindly asking for reviews for the following change:
Bug: https://bugs.openjdk.java.net/browse/JDK-8047812 Webrev: http://cr.openjdk.java.net/~mgronlun/8047812/webrev01 Description: The "8038212: Method::is_valid_method() check has performance regression impact for stackwalking" - changeset introduced a change in how the ClassLoaderDataGraph::_unloading list of ClassLoaderData's is purged. This change to the purging of the CLD's works the same as before for most GC's, but when using CMS GC, SystemDictionary::do_unloading() is called twice with no explicit purge call in between. On the second call (post-sweep), we can now get stale class loader oops delivered as part of the Klass closure callbacks from the _unloading list. Again, this is because there is no explicit purge call in between these two entries to SystemDictionary::do_unloading() - and being CMS and concurrent, it is very hard to accommodate a timely and proper purge call here. The first do_unloading call comes after CMS concurrent marking, and the second comes from a Full GC triggered while sweeping the CMS heap. This fix ensures that the unloading purge mechanism works correctly also for the CMS collector, in that only CLDs with non-reclaimed class loader oops will deliver klasses from the _unloading list. In addition, this will ensure a single "logical" pass is achieved when iterating the unloading list in-between purges (avoiding the processing of the same data twice). This fix is precipitated by nightly testing failures with CMS after the introduction of "8038212: Method::is_valid_method() check has performance regression impact for stackwalking" - for example "nsk/sysdict/vm/stress/jck12a//sysdictj12a008" which is crashing because of following up stale klass loader oops from the ClassLoaderDataGraph::_unloading list. Thanks Markus From stefan.karlsson at oracle.com Mon Jun 23 15:45:47 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 23 Jun 2014 17:45:47 +0200 Subject: FW: RFR(S): 8047812: Ensure ClassLoaderDataGraph::classes_unloading_do only delivers klasses from CLDs with non-reclaimed class loader oops In-Reply-To: <2bf3b050-e3cf-43c2-ac64-61ebe0320061@default> References: <2bf3b050-e3cf-43c2-ac64-61ebe0320061@default> Message-ID: <53A84BAB.2040709@oracle.com> Markus, You need to include all three mailing lists in the same mail, or else the mail threads will diverge. thanks, StefanK On 2014-06-23 17:26, Markus Grönlund wrote: > > Sending this to the Hotspot-GC-dev group as well. > > /Markus > > *From:* Markus Grönlund > *Sent:* den 23 juni 2014 17:03 > *To:* hotspot-runtime-dev; serviceability-dev > *Subject:* RFR(S): 8047812: Ensure > ClassLoaderDataGraph::classes_unloading_do only delivers klasses from > CLDs with non-reclaimed class loader oops > > Greetings, > > Kindly asking for reviews for the following change: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8047812 > > Webrev: http://cr.openjdk.java.net/~mgronlun/8047812/webrev01 > > > Description: > > The "8038212: Method::is_valid_method() check has performance regression > impact for stackwalking" - changeset introduced a change in how the > ClassLoaderDataGraph::_unloading list of ClassLoaderData's is purged. > > This change to the purging of the CLD's works the same as before for > most GC's, but when using CMS GC, SystemDictionary::do_unloading() is > called twice with no explicit purge call in between.
On the second > call (post-sweep), we can now get stale class loader oops delivered as > part of the Klass closure callbacks from the _unloading list. Again, > this is because there is no explicit purge call in between these two > entries to SystemDictionary::do_unloading() - and being CMS and > concurrent, it is very hard to accommodate a timely and proper purge > call here. > > The first do_unloading call comes after CMS concurrent marking, and > the second comes from a Full GC triggered while sweeping the CMS heap. > > This fix ensures that the unloading purge mechanism works correctly also > for the CMS collector, in that only CLDs with non-reclaimed class > loader oops will deliver klasses from the _unloading list. In > addition, this will ensure a single "logical" pass is achieved when > iterating the unloading list in-between purges (avoiding the > processing of the same data twice). > > This fix is precipitated by nightly testing failures with CMS after > the introduction of 8038212: Method::is_valid_method() check has > performance regression > impact for stackwalking" - for example > "nsk/sysdict/vm/stress/jck12a//sysdictj12a008" which is crashing > because of following up stale klass loader oops from the > ClassLoaderDataGraph::_unloading list. > > Thanks > > Markus > From stefan.karlsson at oracle.com Tue Jun 24 09:05:08 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 24 Jun 2014 11:05:08 +0200 Subject: RFR: 8047819: G1 HeapRegionDCTOC does not need to inherit ContiguousSpaceDCTOC In-Reply-To: <87729984.51aIKtkShi@mgerdin03> References: <87729984.51aIKtkShi@mgerdin03> Message-ID: <53A93F44.6050907@oracle.com> On 2014-06-23 16:25, Mikael Gerdin wrote: > Hi! > > As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] > G1 needs to stop using the special version of DirtyCardToOopClosure. > This also makes the code easier to follow since G1 never actually > relies on the functionality from Filtering_DCTOC and ContiguousSpaceDCTOC. > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 > which are needed to refactor the HeapRegion class and its superclasses > in order to simplify the G1 class unloading change which is coming. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047819 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047819/webrev/ Looks good. StefanK > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > > Thanks > /Mikael From stefan.karlsson at oracle.com Tue Jun 24 09:23:36 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 24 Jun 2014 11:23:36 +0200 Subject: RFR: 8047820: G1 Block offset table does not need to support generic Space classes In-Reply-To: <11148293.8zVS3laxSo@mgerdin03> References: <11148293.8zVS3laxSo@mgerdin03> Message-ID: <53A94398.2080301@oracle.com> On 2014-06-23 16:25, Mikael Gerdin wrote: > Hi! > > As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] > G1's block offset table needs to be modified to work with Space subclasses > which are not subclasses of ContiguousSpace. Just change the code to have > knowledge of G1OffsetTableContigSpace. > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 > which are needed to refactor the HeapRegion class and its superclasses > in order to simplify the G1 class unloading change which is coming.
> > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047820 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/ http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.cpp.udiff.html http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.inline.hpp.udiff.html I talked to Mikael and we decided to do the changes from obj->size() to block_size() in a later change. thanks, StefanK > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > > Thanks > /Mikael From stefan.karlsson at oracle.com Tue Jun 24 10:25:25 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 24 Jun 2014 12:25:25 +0200 Subject: RFR: 8047821: G1 Does not use the save_marks functionality as intended In-Reply-To: <1878024.dB1lCmA3nF@mgerdin03> References: <1878024.dB1lCmA3nF@mgerdin03> Message-ID: <53A95215.9020108@oracle.com> On 2014-06-23 16:26, Mikael Gerdin wrote: > Hi! > > As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] > and as a general cleanup we should rename the save_marks and set_saved_marks > methods on HeapRegion. They are not used with oops_since_saved_marks_iterate > and cause more confusion than anything. > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 > which are needed to refactor the HeapRegion class and its superclasses > in order to simplify the G1 class unloading change which is coming. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047821 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047821/webrev/ Looks good, but it would be nice if you could remove these as well: http://cr.openjdk.java.net/~mgerdin/8047821/webrev/src/share/vm/gc_implementation/g1/heapRegion.hpp.frames.html 583 // Apply "cl->do_oop" to (the addresses of) all reference fields in objects 584 // allocated in the current region before the last call to "save_mark". 585 void oop_before_save_marks_iterate(ExtendedOopClosure* cl); and 205 // Requires that the region "mr" be dense with objects, and begin and end 206 // with an object. 207 void oops_in_mr_iterate(MemRegion mr, ExtendedOopClosure* cl); and 396 void HeapRegion::oops_in_mr_iterate(MemRegion mr, ExtendedOopClosure* cl) { 397 HeapWord* p = mr.start(); 398 HeapWord* e = mr.end(); 399 oop obj; 400 while (p < e) { 401 obj = oop(p); 402 p += obj->oop_iterate(cl); 403 } 404 assert(p == e, "bad memregion: doesn't end on obj boundary"); 405 } thanks, StefanK > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > > Thanks > /Mikael From stefan.karlsson at oracle.com Tue Jun 24 11:06:46 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 24 Jun 2014 13:06:46 +0200 Subject: RFR: 8047821: G1 Does not use the save_marks functionality as intended In-Reply-To: <53A95215.9020108@oracle.com> References: <1878024.dB1lCmA3nF@mgerdin03> <53A95215.9020108@oracle.com> Message-ID: <53A95BC6.60401@oracle.com> On 2014-06-24 12:25, Stefan Karlsson wrote: > > On 2014-06-23 16:26, Mikael Gerdin wrote: >> Hi! >> >> As part of a larger effort to detach G1's HeapRegion from >> ContiguousSpace[1] >> and as a general cleanup we should rename the save_marks and >> set_saved_marks >> methods on HeapRegion. They are not used with >> oops_since_saved_marks_iterate >> and cause more confusion than anything. 
>> >> This change is part of a set of 4 changes: 8047818, 8047819, 8047820, >> 8047821 >> which are needed to refactor the HeapRegion class and its superclasses >> in order to simplify the G1 class unloading change which is coming. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8047821 >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8047821/webrev/ > > Looks good, but it would be nice if you could remove these as well: > > http://cr.openjdk.java.net/~mgerdin/8047821/webrev/src/share/vm/gc_implementation/g1/heapRegion.hpp.frames.html > This should also be removed: 572 // Allows logical separation between objects allocated before and after. 573 void save_marks(); StefanK > > 583 // Apply "cl->do_oop" to (the addresses of) all reference > fields in objects > 584 // allocated in the current region before the last call to > "save_mark". > 585 void oop_before_save_marks_iterate(ExtendedOopClosure* cl); > > and > > 205 // Requires that the region "mr" be dense with objects, and > begin and end > 206 // with an object. > 207 void oops_in_mr_iterate(MemRegion mr, ExtendedOopClosure* cl); > > and > > 396 void HeapRegion::oops_in_mr_iterate(MemRegion mr, > ExtendedOopClosure* cl) { > 397 HeapWord* p = mr.start(); > 398 HeapWord* e = mr.end(); > 399 oop obj; > 400 while (p < e) { > 401 obj = oop(p); > 402 p += obj->oop_iterate(cl); > 403 } > 404 assert(p == e, "bad memregion: doesn't end on obj boundary"); > 405 } > > > thanks, > StefanK > >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8047818 >> >> Thanks >> /Mikael > From mikael.gerdin at oracle.com Tue Jun 24 11:33:01 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 24 Jun 2014 13:33:01 +0200 Subject: RFR: 8047821: G1 Does not use the save_marks functionality as intended In-Reply-To: <53A95BC6.60401@oracle.com> References: <1878024.dB1lCmA3nF@mgerdin03> <53A95215.9020108@oracle.com> <53A95BC6.60401@oracle.com> Message-ID: <4395417.SDKTkeWyTO@mgerdin03> On Tuesday 24 June 2014 13.06.46 Stefan Karlsson wrote: > On 2014-06-24 12:25, Stefan Karlsson wrote: > > On 2014-06-23 16:26, Mikael Gerdin wrote: > >> Hi! > >> > >> As part of a larger effort to detach G1's HeapRegion from > >> ContiguousSpace[1] > >> and as a general cleanup we should rename the save_marks and > >> set_saved_marks > >> methods on HeapRegion. They are not used with > >> oops_since_saved_marks_iterate > >> and cause more confusion than anything. > >> > >> This change is part of a set of 4 changes: 8047818, 8047819, 8047820, > >> 8047821 > >> which are needed to refactor the HeapRegion class and its superclasses > >> in order to simplify the G1 class unloading change which is coming. > >> > >> Bug: > >> https://bugs.openjdk.java.net/browse/JDK-8047821 > >> Webrev: > >> http://cr.openjdk.java.net/~mgerdin/8047821/webrev/ > > > > Looks good, but it would be nice if you could remove these as well: > > > > http://cr.openjdk.java.net/~mgerdin/8047821/webrev/src/share/vm/gc_impleme > > ntation/g1/heapRegion.hpp.frames.html > This should also be removed: > > 572 // Allows logical separation between objects allocated before and > after. 573 void save_marks(); Will do. Thanks /Mikael > > StefanK > > > 583 // Apply "cl->do_oop" to (the addresses of) all reference > > > > fields in objects > > > > 584 // allocated in the current region before the last call to > > > > "save_mark". 
> > > > 585 void oop_before_save_marks_iterate(ExtendedOopClosure* cl); > > > > and > > > > 205 // Requires that the region "mr" be dense with objects, and > > > > begin and end > > > > 206 // with an object. > > 207 void oops_in_mr_iterate(MemRegion mr, ExtendedOopClosure* cl); > > > > and > > > > 396 void HeapRegion::oops_in_mr_iterate(MemRegion mr, > > > > ExtendedOopClosure* cl) { > > > > 397 HeapWord* p = mr.start(); > > 398 HeapWord* e = mr.end(); > > 399 oop obj; > > 400 while (p < e) { > > 401 obj = oop(p); > > 402 p += obj->oop_iterate(cl); > > 403 } > > 404 assert(p == e, "bad memregion: doesn't end on obj boundary"); > > 405 } > > > > thanks, > > StefanK > > > >> [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > >> > >> Thanks > >> /Mikael From mikael.gerdin at oracle.com Tue Jun 24 12:10:47 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 24 Jun 2014 14:10:47 +0200 Subject: RFR: 8047821: G1 Does not use the save_marks functionality as intended In-Reply-To: <1878024.dB1lCmA3nF@mgerdin03> References: <1878024.dB1lCmA3nF@mgerdin03> Message-ID: <2070242.8rk8uRrq1v@mgerdin03> Hi! On Monday 23 June 2014 16.26.00 Mikael Gerdin wrote: > Hi! > > As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] > and as a general cleanup we should rename the save_marks and > set_saved_marks methods on HeapRegion. They are not used with > oops_since_saved_marks_iterate and cause more confusion than anything. > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, > 8047821 which are needed to refactor the HeapRegion class and its > superclasses in order to simplify the G1 class unloading change which is > coming. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047821 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047821/webrev/ Stefan discovered some more dead code in HeapRegion, here are a new set of webrevs: http://cr.openjdk.java.net/~mgerdin/8047821/webrev.0_to_1/ http://cr.openjdk.java.net/~mgerdin/8047821/webrev.1/ /Mikael > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > > Thanks > /Mikael From mikael.gerdin at oracle.com Tue Jun 24 12:11:59 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 24 Jun 2014 14:11:59 +0200 Subject: RFR: 8047820: G1 Block offset table does not need to support generic Space classes In-Reply-To: <53A94398.2080301@oracle.com> References: <11148293.8zVS3laxSo@mgerdin03> <53A94398.2080301@oracle.com> Message-ID: <1942487.fCe74F8AUE@mgerdin03> On Tuesday 24 June 2014 11.23.36 Stefan Karlsson wrote: > On 2014-06-23 16:25, Mikael Gerdin wrote: > > Hi! > > > > As part of a larger effort to detach G1's HeapRegion from > > ContiguousSpace[1] G1's block offset table needs to be modified to work > > with Space subclasses which are not subclasses of ContiguousSpace. Just > > change the code to have knowledge of G1OffsetTableContigSpace. > > > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, > > 8047821 which are needed to refactor the HeapRegion class and its > > superclasses in order to simplify the G1 class unloading change which is > > coming. 
> > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8047820 > > Webrev: > > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/ > > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.cpp.udiff.html > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.inline.hpp.udiff.html > > I talked to Mikael and we decided to do the changes from obj->size() to > block_size() in a later change. And here's the webrev reflecting that change. http://cr.openjdk.java.net/~mgerdin/8047820/webrev.1/ /Mikael > > thanks, > StefanK > > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > > > > Thanks > > /Mikael From thomas.schatzl at oracle.com Tue Jun 24 12:20:30 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 24 Jun 2014 14:20:30 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <17336712.znP3JIk1Pt@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> Message-ID: <1403612430.2662.14.camel@cirrus> Hi, On Mon, 2014-06-23 at 16:26 +0200, Mikael Gerdin wrote: > Hi! > > When G1 is modified to unload classes without doing full collections the old > HeapRegions can contain unparseable objects. This makes ContiguousSpace > unsuitable as a base class for HeapRegion since it assumes that all objects > below _top are parseable. > > Modify G1OffsetTableContigSpace to implement allocation with a separate _top > and reimplement some Space pure virtuals to make object iteration work as > expected. > > This change is the last part of a set of 4 changes: 8047818, 8047819, 8047820, > 8047821 which are needed to refactor the HeapRegion class and its superclasses > in order to simplify the G1 class unloading change which is coming. > This change depends on the 19, 20 and 21 changes. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047818 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > > Notes: > The moving of set_offset_range is due to an introduced circular dependency > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp a few minor nits: - in G1OffsetTableContigSpace::cas_allocate_inner(), the method should access _top directly per coding guidelines - just a note: _top should be declared volatile as it is used in the CAS, although the code is correct. However there is already an issue for that https://bugs.openjdk.java.net/browse/JDK-8033552, so I suggest postponing this. - extra newline after G1OffsetTableContigSpace::allocate_inner() - extra newline after G1BlockOffsetSharedArray::set_offset_array() Thanks, Thomas From thomas.schatzl at oracle.com Tue Jun 24 12:25:13 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 24 Jun 2014 14:25:13 +0200 Subject: RFR: 8047819: G1 HeapRegionDCTOC does not need to inherit ContiguousSpaceDCTOC In-Reply-To: <87729984.51aIKtkShi@mgerdin03> References: <87729984.51aIKtkShi@mgerdin03> Message-ID: <1403612713.2662.15.camel@cirrus> Hi, On Mon, 2014-06-23 at 16:25 +0200, Mikael Gerdin wrote: > Hi! > > As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] > G1 needs to stop using the special version of DirtyCardToOopClosure. > This also makes the code easier to follow since G1 never actually > relies on the functionality from Filtering_DCTOC and ContiguousSpaceDCTOC.
> > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 > which are needed to refactor the HeapRegion class and its superclasses > in order to simplify the G1 class unloading change which is coming. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047819 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047819/webrev/ > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 looks good. Thomas From stefan.karlsson at oracle.com Tue Jun 24 13:32:44 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 24 Jun 2014 15:32:44 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <17336712.znP3JIk1Pt@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> Message-ID: <53A97DFC.2040809@oracle.com> On 2014-06-23 16:26, Mikael Gerdin wrote: > Hi! > > When G1 is modified to unload classes without doing full collections the old > HeapRegions can contain unparseable objects. This makes ContiguousSpace > unsuitable as a base class for HeapRegion since it assumes that all objects > below _top are parseable. > > Modify G1OffsetTableContigSpace to implement allocation with a separate _top > and reimplement some Space pure virtuals to make object iteration work as > expected. > > This change is the last part of a set of 4 changes: 8047818, 8047819, 8047820, > 8047821 which are needed to refactor the HeapRegion class and its superclasses > in order to simplify the G1 class unloading change which is coming. > This change depends on the 19, 20 and 21 changes. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047818 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ http://cr.openjdk.java.net/~mgerdin/8047818/webrev/src/share/vm/gc_implementation/g1/heapRegion.hpp.udiff.html + inline HeapWord* cas_allocate_inner(size_t size); + inline HeapWord* allocate_inner(size_t size); Could you move these declarations to a point after the variable declarations? + inline void set_top(HeapWord* value) { _top = value; } No need for the inline keyword here. void G1OffsetTableContigSpace::clear(bool mangle_space) { - ContiguousSpace::clear(mangle_space); + set_top(bottom()); + CompactibleSpace::clear(mangle_space); ContiguousSpace::clear calls void set_saved_mark() { _saved_mark_word = top(); } I think it might be worth being a bit defensive and doing the same from G1OffsetTableContigSpace::clear. +void G1OffsetTableContigSpace::object_iterate(ObjectClosure* blk) { + HeapWord* p = bottom(); + if (!block_is_obj(p)) { + p += block_size(p); + } + while (p < top()) { + blk->do_object(oop(p)); + p += block_size(p); + } Shouldn't blk->do_object(oop(p)) be guarded by: if (!block_is_obj(p)) G1OffsetTableContigSpace:: G1OffsetTableContigSpace(G1BlockOffsetSharedArray* sharedOffsetArray, MemRegion mr) : + _top(bottom()), _offsets(sharedOffsetArray, mr), _par_alloc_lock(Mutex::leaf, "OffsetTableContigSpace par alloc lock", true), _gc_time_stamp(0) { _offsets.set_space(this); // false ==> we'll do the clearing if there's clearing to be done. - ContiguousSpace::initialize(mr, false, SpaceDecorator::Mangle); + CompactibleSpace::initialize(mr, false, SpaceDecorator::Mangle); bottom() is used before _bottom has been initialized to the correct value. As the code stands we set _top to NULL. _bottom is initialized through this call chain: CompactibleSpace::initialize. Space::initialize set_bottom And the SA agent needs to be updated. 
=) thanks, StefanK > > Notes: > The moving of set_offset_range is due to an introduced circular dependency > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > > Thanks > /Mikael > From stefan.karlsson at oracle.com Tue Jun 24 13:39:29 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 24 Jun 2014 15:39:29 +0200 Subject: RFR: 8047820: G1 Block offset table does not need to support generic Space classes In-Reply-To: <1942487.fCe74F8AUE@mgerdin03> References: <11148293.8zVS3laxSo@mgerdin03> <53A94398.2080301@oracle.com> <1942487.fCe74F8AUE@mgerdin03> Message-ID: <53A97F91.40204@oracle.com> On 2014-06-24 14:11, Mikael Gerdin wrote: > On Tuesday 24 June 2014 11.23.36 Stefan Karlsson wrote: >> On 2014-06-23 16:25, Mikael Gerdin wrote: >>> Hi! >>> >>> As part of a larger effort to detach G1's HeapRegion from >>> ContiguousSpace[1] G1's block offset table needs to be modified to work >>> with Space subclasses which are not subclasses of ContiguousSpace. Just >>> change the code to have knowledge of G1OffsetTableContigSpace. >>> >>> This change is part of a set of 4 changes: 8047818, 8047819, 8047820, >>> 8047821 which are needed to refactor the HeapRegion class and its >>> superclasses in order to simplify the G1 class unloading change which is >>> coming. >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8047820 >>> Webrev: >>> http://cr.openjdk.java.net/~mgerdin/8047820/webrev/ >> http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.cpp.udiff.html >> http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.inline.hpp.udiff.html >> >> I talked to Mikael and we decided to do the changes from obj->size() to >> block_size() in a later change. > And here's the webrev reflecting that change. > > http://cr.openjdk.java.net/~mgerdin/8047820/webrev.1/ Looks good. thanks, StefanK > > /Mikael > >> thanks, >> StefanK >> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8047818 >>> >>> Thanks >>> /Mikael From stefan.karlsson at oracle.com Tue Jun 24 13:40:08 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 24 Jun 2014 15:40:08 +0200 Subject: RFR: 8047821: G1 Does not use the save_marks functionality as intended In-Reply-To: <2070242.8rk8uRrq1v@mgerdin03> References: <1878024.dB1lCmA3nF@mgerdin03> <2070242.8rk8uRrq1v@mgerdin03> Message-ID: <53A97FB8.4050809@oracle.com> On 2014-06-24 14:10, Mikael Gerdin wrote: > Hi! > > On Monday 23 June 2014 16.26.00 Mikael Gerdin wrote: >> Hi! >> >> As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] >> and as a general cleanup we should rename the save_marks and >> set_saved_marks methods on HeapRegion. They are not used with >> oops_since_saved_marks_iterate and cause more confusion than anything. >> >> This change is part of a set of 4 changes: 8047818, 8047819, 8047820, >> 8047821 which are needed to refactor the HeapRegion class and its >> superclasses in order to simplify the G1 class unloading change which is >> coming. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8047821 >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8047821/webrev/ > Stefan discovered some more dead code in HeapRegion, here are a new set of > webrevs: > > http://cr.openjdk.java.net/~mgerdin/8047821/webrev.0_to_1/ Looks good.
thanks, StefanK > http://cr.openjdk.java.net/~mgerdin/8047821/webrev.1/ > > /Mikael > >> [1] https://bugs.openjdk.java.net/browse/JDK-8047818 >> >> Thanks >> /Mikael From thomas.schatzl at oracle.com Tue Jun 24 14:06:40 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 24 Jun 2014 16:06:40 +0200 Subject: RFR: 8047820: G1 Block offset table does not need to support generic Space classes In-Reply-To: <1942487.fCe74F8AUE@mgerdin03> References: <11148293.8zVS3laxSo@mgerdin03> <53A94398.2080301@oracle.com> <1942487.fCe74F8AUE@mgerdin03> Message-ID: <1403618800.2662.17.camel@cirrus> Hi all, On Tue, 2014-06-24 at 14:11 +0200, Mikael Gerdin wrote: > On Tuesday 24 June 2014 11.23.36 Stefan Karlsson wrote: > > On 2014-06-23 16:25, Mikael Gerdin wrote: > > > Hi! > > > > > > As part of a larger effort to detach G1's HeapRegion from > > > ContiguousSpace[1] G1's block offset table needs to be modified to work > > > with Space subclasses which are not subclasses of ContiguousSpace. Just > > > change the code to have knowledge of G1OffsetTableContigSpace. > > > > > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, > > > 8047821 which are needed to refactor the HeapRegion class and its > > > superclasses in order to simplify the G1 class unloading change which is > > > coming. > > > > > > Bug: > > > https://bugs.openjdk.java.net/browse/JDK-8047820 > > > Webrev: > > > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/ > > > > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.cpp.udiff.html > > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.inline.hpp.udiff.html > > > > I talked to Mikael and we decided to do the changes from obj->size() to > > block_size() in a later change. > > And here's the webrev reflecting that change. > > http://cr.openjdk.java.net/~mgerdin/8047820/webrev.1/ > Looks good, Thomas From jon.masamitsu at oracle.com Tue Jun 24 14:26:52 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 24 Jun 2014 07:26:52 -0700 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <17336712.znP3JIk1Pt@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> Message-ID: <53A98AAC.6060408@oracle.com> Mikael, Did you consider creating a base class for ContiguousSpace and G1OffsetTableContigSpace that has a _top but does not assume parsability? Could allocate_inner() have been called allocate_impl() as it is in ContiguousSpace? I don't know what the "inner" in the name is telling me. I've just started on the review so more to come. Jon On 06/23/2014 07:26 AM, Mikael Gerdin wrote: > Hi! > > When G1 is modified to unload classes without doing full collections the old > HeapRegions can contain unparseable objects. This makes ContiguousSpace > unsuitable as a base class for HeapRegion since it assumes that all objects > below _top are parseable. > > Modify G1OffsetTableContigSpace to implement allocation with a separate _top > and reimplement some Space pure virtuals to make object iteration work as > expected. > > This change is the last part of a set of 4 changes: 8047818, 8047819, 8047820, > 8047821 which are needed to refactor the HeapRegion class and its superclasses > in order to simplify the G1 class unloading change which is coming. > This change depends on the 19, 20 and 21 changes.
> > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047818 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > > Notes: > The moving of set_offset_range is due to an introduced circular dependency > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > > Thanks > /Mikael > From mikael.gerdin at oracle.com Tue Jun 24 14:43:41 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 24 Jun 2014 16:43:41 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <53A98AAC.6060408@oracle.com> References: <17336712.znP3JIk1Pt@mgerdin03> <53A98AAC.6060408@oracle.com> Message-ID: <2594075.LDRc6zFiFv@mgerdin03> Jon, On Tuesday 24 June 2014 07.26.52 Jon Masamitsu wrote: > Mikael, > > Did you consider creating a base class for ContiguousSpace and > G1OffsetTableContigSpace that has a _top but does not assume > parsability? I did consider it but I thought that the added complexity of having even more levels of inheritance were not worth the benefit of sharing the _top field. Especially since G1 attempts to hack around the semantics around the _top field with respect to concurrent access. See the disjunction I removed from allocate_impl and when G1 calls cas_allocate and allocate. > > Could allocate_inner() have been called allocate_impl() as it is > in ContiguousSpace? I don't know what the "inner" in the name > is telling me. "inner" tries to signal that this function is internal and wrapped by other methods providing the external API. I don't have a particular naming preference here, if the other reviewers are fine with "impl" or don't have a preference I'm fine with changing it. > > I've just started on the review so more to come. Great! Thanks /Mikael > > Jon > > On 06/23/2014 07:26 AM, Mikael Gerdin wrote: > > Hi! > > > > When G1 is modified to unload classes without doing full collections the > > old HeapRegions can contain unparseable objects. This makes > > ContiguousSpace unsuitable as a base class for HeapRegion since it > > assumes that all objects below _top are parseable. > > > > Modify G1OffsetTableContigSpace to implement allocation with a separate > > _top and reimplement some Space pure virtuals to make object iteration > > work as expected. > > > > This change is the last part of a set of 4 changes: 8047818, 8047819, > > 8047820, 8047821 which are needed to refactor the HeapRegion class and > > its superclasses in order to simplify the G1 class unloading change which > > is coming. This change depends on the 19, 20 and 21 changes. > > > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8047818 > > Webrev: > > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > > > > Notes: > > The moving of set_offset_range is due to an introduced circular dependency > > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > > > > Thanks > > /Mikael From mikael.gerdin at oracle.com Tue Jun 24 15:39:33 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 24 Jun 2014 17:39:33 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <1403612430.2662.14.camel@cirrus> References: <17336712.znP3JIk1Pt@mgerdin03> <1403612430.2662.14.camel@cirrus> Message-ID: <6630007.I6e4KLTY5F@mgerdin03> On Tuesday 24 June 2014 14.20.30 Thomas Schatzl wrote: > Hi, > > On Mon, 2014-06-23 at 16:26 +0200, Mikael Gerdin wrote: > > Hi! > > > > When G1 is modified to unload classes without doing full collections the > > old HeapRegions can contain unparseable objects. 
This makes > > ContiguousSpace unsuitable as a base class for HeapRegion since it > > assumes that all objects below _top are parseable. > > > > Modify G1OffsetTableContigSpace to implement allocation with a separate > > _top and reimplement some Space pure virtuals to make object iteration > > work as expected. > > > > This change is the last part of a set of 4 changes: 8047818, 8047819, > > 8047820, 8047821 which are needed to refactor the HeapRegion class and > > its superclasses in order to simplify the G1 class unloading change which > > is coming. This change depends on the 19, 20 and 21 changes. > > > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8047818 > > Webrev: > > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > > > > Notes: > > The moving of set_offset_range is due to an introduced circular dependency > > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > > a few minor nits: > > - in G1OffsetTableContigSpace::cas_allocate_inner(), the method should > access _top directly per coding guidelines I interpret this as a request to change to HeapWord* obj = _top; Should I change other uses of top() as well? I could only find https://wiki.openjdk.java.net/display/HotSpot/StyleGuide#StyleGuide-Accessors as a reference here. Do you interpret that as "only use public accessors if outside the class"? There have been a few requests for renaming and changing the *allocate functions to be either exact copies of the ones in ContiguousSpace or even to break *allocate and _top into a separate class, how do you feel about this? > - just a note: _top should be declared volatile as it is used in the > CAS, although the code is correct. However there is already an issue for > that https://bugs.openjdk.java.net/browse/JDK-8033552, so I suggest > postponing this. Ok. > - extra newline after G1OffsetTableContigSpace::allocate_inner() > - extra newline after G1BlockOffsetSharedArray::set_offset_array() I'll remove these. /Mikael > > Thanks, > Thomas From jon.masamitsu at oracle.com Tue Jun 24 16:34:57 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 24 Jun 2014 09:34:57 -0700 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <2594075.LDRc6zFiFv@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> <53A98AAC.6060408@oracle.com> <2594075.LDRc6zFiFv@mgerdin03> Message-ID: <53A9A8B1.2080301@oracle.com> On 06/24/2014 07:43 AM, Mikael Gerdin wrote: > Jon, > > On Tuesday 24 June 2014 07.26.52 Jon Masamitsu wrote: >> Mikael, >> >> Did you consider creating a base class for ContiguousSpace and >> G1OffsetTableContigSpace that has a _top but does not assume >> parsability? > I did consider it but I thought that the added complexity of having even more > levels of inheritance were not worth the benefit of sharing the _top field. > Especially since G1 attempts to hack around the semantics around the _top > field with respect to concurrent access. See the disjunction I removed from > allocate_impl and when G1 calls cas_allocate and allocate. Ok. That's enough of a reason. > >> Could allocate_inner() have been called allocate_impl() as it is >> in ContiguousSpace? I don't know what the "inner" in the name >> is telling me. > "inner" tries to signal that this function is internal and wrapped by other > methods providing the external API. I don't have a particular naming > preference here, if the other reviewers are fine with "impl" or don't have a > preference I'm fine with changing it. 
If no one objects, the "impl" has more of a meaning to me. http://cr.openjdk.java.net/~mgerdin/8047818/webrev/src/share/vm/gc_implementation/g1/heapRegion.inline.hpp.frames.html You use the if-then {return ...} return NULL style. 52 inline HeapWord* G1OffsetTableContigSpace::allocate_inner(size_t size) { 53 HeapWord* obj = top(); 54 if (pointer_delta(end(), obj) >= size) { 55 HeapWord* new_top = obj + size; 56 assert(is_aligned(obj) && is_aligned(new_top), "checking alignment"); 57 set_top(new_top); 58 return obj; 59 } 60 return NULL; 61 } The ContiguousSpace uses the if-then {return ...} else {return NULL} style. Any reason not to use the same style? block_is_obj() seems more like an is_in() method. 89 inline bool 90 HeapRegion::block_is_obj(const HeapWord* p) const { 91 return p < top(); 92 } Could you add a specification for this block_size()? 94 inline size_t 95 HeapRegion::block_size(const HeapWord *addr) const { 96 const HeapWord* current_top = top(); 97 if (addr < current_top) { 98 return oop(addr)->size(); 99 } else { 100 assert(addr == current_top, "just checking"); 101 return pointer_delta(end(), addr); 102 } 103 } Jon > >> I've just started on the review so more to come. > Great! > > Thanks > /Mikael > >> Jon >> >> On 06/23/2014 07:26 AM, Mikael Gerdin wrote: >>> Hi! >>> >>> When G1 is modified to unload classes without doing full collections the >>> old HeapRegions can contain unparseable objects. This makes >>> ContiguousSpace unsuitable as a base class for HeapRegion since it >>> assumes that all objects below _top are parseable. >>> >>> Modify G1OffsetTableContigSpace to implement allocation with a separate >>> _top and reimplement some Space pure virtuals to make object iteration >>> work as expected. >>> >>> This change is the last part of a set of 4 changes: 8047818, 8047819, >>> 8047820, 8047821 which are needed to refactor the HeapRegion class and >>> its superclasses in order to simplify the G1 class unloading change which >>> is coming. This change depends on the 19, 20 and 21 changes. >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8047818 >>> Webrev: >>> http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ >>> >>> Notes: >>> The moving of set_offset_range is due to an introduced circular dependency >>> between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp >>> >>> Thanks >>> /Mikael From mikael.gerdin at oracle.com Wed Jun 25 06:56:56 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 25 Jun 2014 08:56:56 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <53A9A8B1.2080301@oracle.com> References: <17336712.znP3JIk1Pt@mgerdin03> <2594075.LDRc6zFiFv@mgerdin03> <53A9A8B1.2080301@oracle.com> Message-ID: <3844327.z641kvG9QC@mgerdin03> Jon, On Tuesday 24 June 2014 09.34.57 Jon Masamitsu wrote: > On 06/24/2014 07:43 AM, Mikael Gerdin wrote: > > Jon, > > > > On Tuesday 24 June 2014 07.26.52 Jon Masamitsu wrote: > >> Mikael, > >> > >> Did you consider creating a base class for ContiguousSpace and > >> G1OffsetTableContigSpace that has a _top but does not assume > >> parsability? > > > > I did consider it but I thought that the added complexity of having even > > more levels of inheritance were not worth the benefit of sharing the _top > > field. Especially since G1 attempts to hack around the semantics around > > the _top field with respect to concurrent access. See the disjunction I > > removed from allocate_impl and when G1 calls cas_allocate and allocate. > > Ok. That's enough of a reason. Thanks. 
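For reference while we are discussing it, the shared CAS path would look roughly like the ContiguousSpace version, i.e. something like the sketch below (not the exact webrev code, and the method name follows the "impl" suggestion):

inline HeapWord* G1OffsetTableContigSpace::par_allocate_impl(size_t size, HeapWord* const end_value) {
  do {
    HeapWord* obj = top();
    if (pointer_delta(end_value, obj) >= size) {
      HeapWord* new_top = obj + size;
      // Racy bump of _top: only one thread's CAS can move obj to new_top.
      HeapWord* result = (HeapWord*)Atomic::cmpxchg_ptr(new_top, top_addr(), obj);
      // result is the old value of _top: if it equals obj the exchange
      // succeeded and [obj, new_top) is ours, otherwise another thread won
      // the race and we retry with the updated top.
      if (result == obj) {
        assert(is_aligned(obj) && is_aligned(new_top), "checking alignment");
        return obj;
      }
    } else {
      return NULL; // not enough room left in the space
    }
  } while (true);
}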
I'm planning on filing an RFE for unifying the allocation code between G1 and ContiguousSpace somehow. > >> Could allocate_inner() have been called allocate_impl() as it is > >> in ContiguousSpace? I don't know what the "inner" in the name > >> is telling me. > > "inner" tries to signal that this function is internal and wrapped by > > other > > methods providing the external API. I don't have a particular naming > > preference here, if the other reviewers are fine with "impl" or don't have > > a preference I'm fine with changing it. > > If no one objects, the "impl" has more of a meaning to me. Ok, naming it "impl" also makes it closer to the ContiguousSpace version. > > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/src/share/vm/gc_implementation/g1/heapRegion.inline.hpp.frames.html > > You use the if-then {return ...} return NULL style. > > 52 inline HeapWord* G1OffsetTableContigSpace::allocate_inner(size_t size) > { 53 HeapWord* obj = top(); > 54 if (pointer_delta(end(), obj) >= size) { > 55 HeapWord* new_top = obj + size; > 56 assert(is_aligned(obj) && is_aligned(new_top), "checking > alignment"); 57 set_top(new_top); > 58 return obj; > 59 } > 60 return NULL; > 61 } > > The ContiguousSpace uses the if-then {return ...} else {return NULL} style. > > Any reason not to use the same style? No good reason. I'll keep the new ones in the same style. > > block_is_obj() seems more like an is_in() method. > > 89 inline bool > 90 HeapRegion::block_is_obj(const HeapWord* p) const { > 91 return p < top(); > 92 } > It's actually the same as ContiguousSpace::block_is_obj. When G1 class unloading is integrated the implementation of HeapRegion::block_is_obj will change. > > Could you add a specification for this block_size()? > > 94 inline size_t > 95 HeapRegion::block_size(const HeapWord *addr) const { > 96 const HeapWord* current_top = top(); > 97 if (addr < current_top) { > 98 return oop(addr)->size(); > 99 } else { > 100 assert(addr == current_top, "just checking"); > 101 return pointer_delta(end(), addr); > 102 } > 103 } Similar to block_is_obj this is currently the same as the ContiguousSpace variant. When G1 class unloading is integrated the implementation will change. The other declarations of block_size are not that clearly specified, can you elaborate on what kind of specification you are looking for? /Mikael > > Jon > >> I've just started on the review so more to come. > Great! > > Thanks > /Mikael > >> Jon >> >> On 06/23/2014 07:26 AM, Mikael Gerdin wrote: >>> Hi! >>> >>> When G1 is modified to unload classes without doing full collections the >>> old HeapRegions can contain unparseable objects. This makes >>> ContiguousSpace unsuitable as a base class for HeapRegion since it >>> assumes that all objects below _top are parseable. >>> >>> Modify G1OffsetTableContigSpace to implement allocation with a separate >>> _top and reimplement some Space pure virtuals to make object iteration >>> work as expected. >>> >>> This change is the last part of a set of 4 changes: 8047818, 8047819, >>> 8047820, 8047821 which are needed to refactor the HeapRegion class and >>> its superclasses in order to simplify the G1 class unloading change which >>> is coming. This change depends on the 19, 20 and 21 changes.
> >>> > >>> Bug: > >>> https://bugs.openjdk.java.net/browse/JDK-8047818 > >>> Webrev: > >>> http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > >>> > >>> Notes: > >>> The moving of set_offset_range is due to an introduced circular > >>> dependency > >>> between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > >>> > >>> Thanks > >>> /Mikael From andreas.sjoberg at oracle.com Wed Jun 25 07:02:26 2014 From: andreas.sjoberg at oracle.com (=?ISO-8859-1?Q?Andreas_Sj=F6berg?=) Date: Wed, 25 Jun 2014 09:02:26 +0200 Subject: RFR(S): JDK-8047330: Remove unrolled card loops in G1 SparsePRTEntry In-Reply-To: <2772002.dt1otWjIrg@mgerdin03> References: <53A2E53B.3050508@oracle.com> <2772002.dt1otWjIrg@mgerdin03> Message-ID: <53AA7402.1040608@oracle.com> Hi! Following Mikael's review and some offline comments from Thomas I've made these changes in addition to removing the unrolled card loops: * removed the now unused define * added braces for the for-loop in SparsePRTEntry::init * changed the implementation of copy_cards to use memcpy New webrev: http://cr.openjdk.java.net/~jwilhelm/8047330/webrev.02/ Thanks On 06/23/2014 04:51 PM, Mikael Gerdin wrote: > Hi Andreas, > > On Thursday 19 June 2014 15.27.23 Andreas Sj?berg wrote: >> Hi all, >> >> can I please have reviews for this patch that removes the unrolled >> for-loops in sparsePRT.cpp. >> >> I ran some performance benchmarks and could not see any benefits in >> keeping the unrolled for loops. SPECjbb2013 shows a 3.48% increase on >> Linux x64 actually. >> >> Webrev: http://cr.openjdk.java.net/~jwilhelm/8047330/webrev/ > > It looks like you can remove the define as well: > 36 #define UNROLL_CARD_LOOPS 1 > > UnrollFactor should also be useless now, but it seems like it's being used to > align up the number of cards. I suggest you leave UnrollFactor for a second > cleanup. > > /Mikael > >> >> Testing: jprt, specjbb2005, specjvm2008, specjbb2013 >> >> Thanks, >> Andreas > From mikael.gerdin at oracle.com Wed Jun 25 11:25:01 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 25 Jun 2014 13:25:01 +0200 Subject: RFR: 8047820: G1 Block offset table does not need to support generic Space classes In-Reply-To: <11148293.8zVS3laxSo@mgerdin03> References: <11148293.8zVS3laxSo@mgerdin03> Message-ID: <1643242.qRoacBket5@mgerdin03> Hi! On Monday 23 June 2014 16.25.53 Mikael Gerdin wrote: > Hi! > > As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] > G1's block offset table needs to be modified to work with Space subclasses > which are not subclasses of ContiguousSpace. Just change the code to have > knowledge of G1OffsetTableContigSpace. > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, > 8047821 which are needed to refactor the HeapRegion class and its > superclasses in order to simplify the G1 class unloading change which is > coming. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047820 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/ I discovered that I accidentally put the set_offset_array change in the 8047818 webrev. It is actually needed in this change to make the JVM compile without precompiled headers. 
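To make the precompiled headers point concrete, the effect is the usual one (a contrived illustration with made-up file names, not the actual G1 headers):

// a.hpp
inline int helper() { return 42; }

// b.cpp
#include "precompiled.hpp" // happens to pull in a.hpp transitively
// #include "a.hpp"        // the include b.cpp actually needs -- forgotten
int use_helper() { return helper(); } // builds with PCH, fails without it

which is why the missing set_offset_array piece only shows up in non-PCH builds.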
Here's the new full webrev: http://cr.openjdk.java.net/~mgerdin/8047820/webrev.2/ Incremental webrev: http://cr.openjdk.java.net/~mgerdin/8047820/webrev.1_to_2/ Thanks /Mikael > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > > Thanks > /Mikael From thomas.schatzl at oracle.com Wed Jun 25 11:28:09 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 25 Jun 2014 13:28:09 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <6630007.I6e4KLTY5F@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> <1403612430.2662.14.camel@cirrus> <6630007.I6e4KLTY5F@mgerdin03> Message-ID: <1403695689.2769.12.camel@cirrus> Hi, On Tue, 2014-06-24 at 17:39 +0200, Mikael Gerdin wrote: > On Tuesday 24 June 2014 14.20.30 Thomas Schatzl wrote: > > Hi, > > > > On Mon, 2014-06-23 at 16:26 +0200, Mikael Gerdin wrote: > > > Hi! > > > > > > When G1 is modified to unload classes without doing full collections the > > > old HeapRegions can contain unparseable objects. This makes > > > ContiguousSpace unsuitable as a base class for HeapRegion since it > > > assumes that all objects below _top are parseable. > > > > > > Modify G1OffsetTableContigSpace to implement allocation with a separate > > > _top and reimplement some Space pure virtuals to make object iteration > > > work as expected. > > > > > > This change is the last part of a set of 4 changes: 8047818, 8047819, > > > 8047820, 8047821 which are needed to refactor the HeapRegion class and > > > its superclasses in order to simplify the G1 class unloading change which > > > is coming. This change depends on the 19, 20 and 21 changes. > > > > > > Bug: > > > https://bugs.openjdk.java.net/browse/JDK-8047818 > > > Webrev: > > > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > > > > > > Notes: > > > The moving of set_offset_range is due to an introduced circular dependency > > > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > > a few minor nits: > > - in G1OffsetTableContigSpace::cas_allocate_inner(), the method should > > access _top directly per coding guidelines > I interpret this as a request to change to > HeapWord* obj = _top; > Should I change other uses of top() as well? > > I could only find > https://wiki.openjdk.java.net/display/HotSpot/StyleGuide#StyleGuide-Accessors as a reference here. Do you interpret that as "only use public > > accessors if outside the class"? I remember having been made aware that we are supposed to use members directly within a class and its descendants a (small) few times when I started here - because I otherwise tend to add accessors except for very simple ones, mainly for private variables. I may have misunderstood something too. Looking through the code, this might be (again) G1 code specific where it's done (relatively) frequently in code that is not almost copy&paste from CMS. I remember some changes that also actively removed by-accessor accesses within a given class hierarchy (e.g. collector policy). So in the end I may have misunderstood something, and as I did not really search for something "written down and generally accepted" at that time I am grateful to be corrected. I like this way better too. So keep it as is.
Thanks, Thomas From mikael.gerdin at oracle.com Wed Jun 25 11:32:23 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 25 Jun 2014 13:32:23 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <1403695689.2769.12.camel@cirrus> References: <17336712.znP3JIk1Pt@mgerdin03> <6630007.I6e4KLTY5F@mgerdin03> <1403695689.2769.12.camel@cirrus> Message-ID: <4341532.UT0yUrG55d@mgerdin03> Hi Thomas, On Wednesday 25 June 2014 13.28.09 Thomas Schatzl wrote: > Hi, > > On Tue, 2014-06-24 at 17:39 +0200, Mikael Gerdin wrote: > > On Tuesday 24 June 2014 14.20.30 Thomas Schatzl wrote: > > > Hi, > > > > > > On Mon, 2014-06-23 at 16:26 +0200, Mikael Gerdin wrote: > > > > Hi! > > > > > > > > When G1 is modified to unload classes without doing full collections > > > > the > > > > old HeapRegions can contain unparseable objects. This makes > > > > ContiguousSpace unsuitable as a base class for HeapRegion since it > > > > assumes that all objects below _top are parseable. > > > > > > > > Modify G1OffsetTableContigSpace to implement allocation with a > > > > separate > > > > _top and reimplement some Space pure virtuals to make object iteration > > > > work as expected. > > > > > > > > This change is the last part of a set of 4 changes: 8047818, 8047819, > > > > 8047820, 8047821 which are needed to refactor the HeapRegion class and > > > > its superclasses in order to simplify the G1 class unloading change > > > > which > > > > is coming. This change depends on the 19, 20 and 21 changes. > > > > > > > > Bug: > > > > https://bugs.openjdk.java.net/browse/JDK-8047818 > > > > Webrev: > > > > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > > > > > > > > Notes: > > > > The moving of set_offset_range is due to an introduced circular > > > > dependency > > > > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > > > a few minor nits: > > > - in G1OffsetTableContigSpace::cas_allocate_inner(), the method should > > > access _top directly per coding guidelines > > I interpret this as a request to change to > > HeapWord* obj = _top; > > Should I change other uses of top() as well? > > I could only find > > https://wiki.openjdk.java.net/display/HotSpot/StyleGuide#StyleGuide-Accessors as a reference here. Do you interpret that as "only use public > > accessors if outside the class"? > I remember having been made aware that we are supposed to use members > directly within a class and its descendants a (small) few times when I > started here - because I otherwise tend to add accessors except for very > simple ones, mainly for private variables. I may have misunderstood > something too. > > Looking through the code, this might be (again) G1 code specific where it's > done (relatively) frequently in code that is not almost copy&paste from CMS. > > I remember some changes that also actively removed by-accessor accesses > within a given class hierarchy (e.g. collector policy). > > So in the end I may have misunderstood something, and as I did not really > search for something "written down and generally accepted" at that time I am > grateful to be corrected. I like this way better too. > > So keep it as is. Ok, I see your point. I agree that adding accessors for all members just for the sake of it is generally not that useful. In this case I was sort of mimicking the code in ContiguousSpace.
Per Jon and Stefan's requests I've copied the ContiguousSpace versions of allocate_impl and par_allocate_impl straight off instead of having slightly different own versions, so in the end I added a top_addr() as well since ContiguousSpace has one. A new webrev is coming. /Mikael > > Thanks, > Thomas From mikael.gerdin at oracle.com Wed Jun 25 11:50:57 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 25 Jun 2014 13:50:57 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <17336712.znP3JIk1Pt@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> Message-ID: <2049395.4GI85LboNd@mgerdin03> Hi! On Monday 23 June 2014 16.26.03 Mikael Gerdin wrote: > Hi! > > When G1 is modified to unload classes without doing full collections the old > HeapRegions can contain unparseable objects. This makes ContiguousSpace > unsuitable as a base class for HeapRegion since it assumes that all objects > below _top are parseable. > > Modify G1OffsetTableContigSpace to implement allocation with a separate _top > and reimplement some Space pure virtuals to make object iteration work as > expected. > > This change is the last part of a set of 4 changes: 8047818, 8047819, > 8047820, 8047821 which are needed to refactor the HeapRegion class and its > superclasses in order to simplify the G1 class unloading change which is > coming. This change depends on the 19, 20 and 21 changes. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047818 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ Based on review comments from Jon, Stefan and Thomas (thanks!) here's a second version of this webrev. A quick summary of the incremental changes: * SA Support * taking {par_,}allocate_impl from ContiguousSpace * fix for building without precompiled headers * setting _saved_mark_word in clear() * initialization order problem with _top vs _bottom * object_iterate block_is_obj check * added a short specification for block_is_obj and block_size Note that the set_offset_array change was moved to the 8047820 webrev since it's needed to get that change to compile without precompiled headers. Full webrev: http://cr.openjdk.java.net/~mgerdin/8047818/webrev.1/ Incremental webrev: http://cr.openjdk.java.net/~mgerdin/8047818/webrev.0_to_1/ /Mikael > > Notes: > The moving of set_offset_range is due to an introduced circular dependency > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > > Thanks > /Mikael From stefan.karlsson at oracle.com Wed Jun 25 11:58:06 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 25 Jun 2014 13:58:06 +0200 Subject: RFR: 8047820: G1 Block offset table does not need to support generic Space classes In-Reply-To: <1643242.qRoacBket5@mgerdin03> References: <11148293.8zVS3laxSo@mgerdin03> <1643242.qRoacBket5@mgerdin03> Message-ID: <53AAB94E.90609@oracle.com> On 2014-06-25 13:25, Mikael Gerdin wrote: > Hi! > > On Monday 23 June 2014 16.25.53 Mikael Gerdin wrote: >> Hi! >> >> As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] >> G1's block offset table needs to be modified to work with Space subclasses >> which are not subclasses of ContiguousSpace. Just change the code to have >> knowledge of G1OffsetTableContigSpace. >> >> This change is part of a set of 4 changes: 8047818, 8047819, 8047820, >> 8047821 which are needed to refactor the HeapRegion class and its >> superclasses in order to simplify the G1 class unloading change which is >> coming. 
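(A note on the "initialization order problem with _top vs _bottom" bullet in the summary above, since it is a classic C++ trap: members are initialized in declaration order, not in initializer-list order. A minimal illustration with made-up names, not the actual G1 declarations:)

class SpaceSketch {
  HeapWord* _top;     // declared before _bottom...
  HeapWord* _bottom;
 public:
  SpaceSketch(HeapWord* bottom)
    : _bottom(bottom),
      _top(_bottom) {} // ...so _top is initialized first and reads
                       // _bottom before _bottom has been set
};
// Declaring _bottom before _top (or assigning _top in the constructor
// body) fixes it; gcc warns about such mismatches with -Wreorder.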
>> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8047820 >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8047820/webrev/ > I discovered that I accidentally put the set_offset_array change in the > 8047818 webrev. It is actually needed in this change to make the JVM compile > without precompiled headers. > > Here's the new full webrev: > http://cr.openjdk.java.net/~mgerdin/8047820/webrev.2/ > > Incremental webrev: > http://cr.openjdk.java.net/~mgerdin/8047820/webrev.1_to_2/ Looks good. StefanK > > Thanks > /Mikael > >> [1] https://bugs.openjdk.java.net/browse/JDK-8047818 >> >> Thanks >> /Mikael From jon.masamitsu at oracle.com Wed Jun 25 13:50:06 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 25 Jun 2014 06:50:06 -0700 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <2049395.4GI85LboNd@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> <2049395.4GI85LboNd@mgerdin03> Message-ID: <53AAD38E.4040507@oracle.com> On 6/25/2014 4:50 AM, Mikael Gerdin wrote: > Hi! > > On Monday 23 June 2014 16.26.03 Mikael Gerdin wrote: >> Hi! >> >> When G1 is modified to unload classes without doing full collections the old >> HeapRegions can contain unparseable objects. This makes ContiguousSpace >> unsuitable as a base class for HeapRegion since it assumes that all objects >> below _top are parseable. >> >> Modify G1OffsetTableContigSpace to implement allocation with a separate _top >> and reimplement some Space pure virtuals to make object iteration work as >> expected. >> >> This change is the last part of a set of 4 changes: 8047818, 8047819, >> 8047820, 8047821 which are needed to refactor the HeapRegion class and its >> superclasses in order to simplify the G1 class unloading change which is >> coming. This change depends on the 19, 20 and 21 changes. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8047818 >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > Based on review comments from Jon, Stefan and Thomas (thanks!) here's a second > version of this webrev. > > A quick summary of the incremental changes: > > * SA Support > * taking {par_,}allocate_impl from ContiguousSpace > * fix for building without precompiled headers > * setting _saved_mark_word in clear() > * initialization order problem with _top vs _bottom > * object_iterate block_is_obj check > * added a short specification for block_is_obj and block_size > > Note that the set_offset_array change was moved to the 8047820 webrev since > it's needed to get that change to compile without precompiled headers. > > Full webrev: > http://cr.openjdk.java.net/~mgerdin/8047818/webrev.1/ > > Incremental webrev: > http://cr.openjdk.java.net/~mgerdin/8047818/webrev.0_to_1/ Looks good. Thanks for the changes. Reviewed. Jon > > /Mikael > >> Notes: >> The moving of set_offset_range is due to an introduced circular dependency >> between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp >> >> Thanks >> /Mikael From jon.masamitsu at oracle.com Wed Jun 25 14:04:13 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 25 Jun 2014 07:04:13 -0700 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <3844327.z641kvG9QC@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> <2594075.LDRc6zFiFv@mgerdin03> <53A9A8B1.2080301@oracle.com> <3844327.z641kvG9QC@mgerdin03> Message-ID: <53AAD6DD.7@oracle.com> On 6/24/2014 11:56 PM, Mikael Gerdin wrote: > [...] 
> Similar to block_is_obj this is currently the same as the ContiguousSpace > variant. When G1 class unloading is integrated the implementation will change. > > The other declarations of block_size are not that clearly specified, can you > elaborate on what kind of specification you are looking for? The specification you put in was fine. I just wanted it clearer about the action when p >= top (in which case, to me, "block_size" is not a particularly descriptive name, never has been). Thanks. Jon > > /Mikael > >> Jon >> >>>> I've just started on the review so more to come. >>> Great! >>> >>> Thanks >>> /Mikael >>> >>>> Jon >>>> >>>> On 06/23/2014 07:26 AM, Mikael Gerdin wrote: >>>>> Hi! >>>>> >>>>> When G1 is modified to unload classes without doing full collections the >>>>> old HeapRegions can contain unparseable objects. This makes >>>>> ContiguousSpace unsuitable as a base class for HeapRegion since it >>>>> assumes that all objects below _top are parseable. >>>>> >>>>> Modify G1OffsetTableContigSpace to implement allocation with a separate >>>>> _top and reimplement some Space pure virtuals to make object iteration >>>>> work as expected. >>>>> >>>>> This change is the last part of a set of 4 changes: 8047818, 8047819, >>>>> 8047820, 8047821 which are needed to refactor the HeapRegion class and >>>>> its superclasses in order to simplify the G1 class unloading change >>>>> which >>>>> is coming. This change depends on the 19, 20 and 21 changes. >>>>> >>>>> Bug: >>>>> https://bugs.openjdk.java.net/browse/JDK-8047818 >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ >>>>> >>>>> Notes: >>>>> The moving of set_offset_range is due to an introduced circular >>>>> dependency >>>>> between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp >>>>> >>>>> Thanks >>>>> /Mikael From stefan.karlsson at oracle.com Wed Jun 25 14:05:40 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 25 Jun 2014 16:05:40 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <2049395.4GI85LboNd@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> <2049395.4GI85LboNd@mgerdin03> Message-ID: <53AAD734.7090103@oracle.com> On 2014-06-25 13:50, Mikael Gerdin wrote: > Hi! > > On Monday 23 June 2014 16.26.03 Mikael Gerdin wrote: >> Hi! >> >> When G1 is modified to unload classes without doing full collections the old >> HeapRegions can contain unparseable objects. This makes ContiguousSpace >> unsuitable as a base class for HeapRegion since it assumes that all objects >> below _top are parseable. >> >> Modify G1OffsetTableContigSpace to implement allocation with a separate _top >> and reimplement some Space pure virtuals to make object iteration work as >> expected. >> >> This change is the last part of a set of 4 changes: 8047818, 8047819, >> 8047820, 8047821 which are needed to refactor the HeapRegion class and its >> superclasses in order to simplify the G1 class unloading change which is >> coming. This change depends on the 19, 20 and 21 changes. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8047818 >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > Based on review comments from Jon, Stefan and Thomas (thanks!) here's a second > version of this webrev. 
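(Spelling out the block_size() contract Jon asked about earlier in the thread, paraphrased from the code quoted there; a suggested comment, not necessarily the wording that went into the webrev:)

// A region is covered by "blocks": every address in [bottom(), end()) is
// in exactly one block. For addr < top() the block is the object starting
// at addr and the result is oop(addr)->size(). For addr == top() the block
// is the single unallocated tail [top(), end()) and the result is
// pointer_delta(end(), addr). Calling this with addr > top() is an error,
// caught by the assert.
inline size_t HeapRegion::block_size(const HeapWord* addr) const;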
> > A quick summary of the incremental changes: > > * SA Support > > * taking {par_,}allocate_impl from ContiguousSpace > > * fix for building without precompiled headers > > * setting _saved_mark_word in clear() > > * initialization order problem with _top vs _bottom > > * object_iterate block_is_obj check > > * added a short specification for block_is_obj and block_size > > > > Note that the set_offset_array change was moved to the 8047820 webrev since > > it's needed to get that change to compile without precompiled headers. > > > > Full webrev: > > http://cr.openjdk.java.net/~mgerdin/8047818/webrev.1/ > > > > Incremental webrev: > > http://cr.openjdk.java.net/~mgerdin/8047818/webrev.0_to_1/ Looks good. StefanK > > /Mikael > >> Notes: >> The moving of set_offset_range is due to an introduced circular dependency >> between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp >> >> Thanks >> /Mikael From erik.helin at oracle.com Wed Jun 25 15:01:58 2014 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 25 Jun 2014 17:01:58 +0200 Subject: RFR: 8047821: G1 Does not use the save_marks functionality as intended In-Reply-To: <2070242.8rk8uRrq1v@mgerdin03> References: <1878024.dB1lCmA3nF@mgerdin03> <2070242.8rk8uRrq1v@mgerdin03> Message-ID: <2377352.slVLfJONC9@ehelin-laptop> On Tuesday 24 June 2014 14:10:47 PM Mikael Gerdin wrote: > Hi! > > On Monday 23 June 2014 16.26.00 Mikael Gerdin wrote: > > Hi! > > > > As part of a larger effort to detach G1's HeapRegion from > > ContiguousSpace[1] and as a general cleanup we should rename the > > save_marks and > > set_saved_marks methods on HeapRegion. They are not used with > > oops_since_saved_marks_iterate and cause more confusion than anything. > > > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, > > 8047821 which are needed to refactor the HeapRegion class and its > > superclasses in order to simplify the G1 class unloading change which is > > coming. > > > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8047821 > > Webrev: > > http://cr.openjdk.java.net/~mgerdin/8047821/webrev/ > > Stefan discovered some more dead code in HeapRegion, here are a new set of > webrevs: > > http://cr.openjdk.java.net/~mgerdin/8047821/webrev.0_to_1/ > http://cr.openjdk.java.net/~mgerdin/8047821/webrev.1/ Looks good, Reviewed! Thanks, Erik > > /Mikael > > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > > > > Thanks > > /Mikael From thomas.schatzl at oracle.com Thu Jun 26 07:16:53 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 26 Jun 2014 09:16:53 +0200 Subject: Ping to re-review JDK-8035400, JDK-8035401 and JDK-8040977 Message-ID: <1403767013.2656.11.camel@cirrus> Hi all, can I get re-reviews for the following issues (Bengt, Mikael?) that have been lingering for at least a month so that I can complete them? They have actually been blocking me from getting reviews for a few more changesets for some time now.
JDK-8035400: Move G1ParScanThreadState into its own files Most recent email in review thread: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-May/010133.html Diff to recent changes: http://cr.openjdk.java.net/~tschatzl/8035400/webrev.1_to_2/ Latest changes (complete): http://cr.openjdk.java.net/~tschatzl/8035400/webrev.2/ JDK-8035401: Fix visibility of G1ParScanThreadState members Most recent email in review thread: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-May/010134.html Diff to recent changes: http://cr.openjdk.java.net/~tschatzl/8035401/webrev.1_to_2/ Latest changes (complete): http://cr.openjdk.java.net/~tschatzl/8035401/webrev.2/ JDK-8040977: G1 crashes when run with -XX:-G1DeferredRSUpdate Most recent email in review thread: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-May/010132.html Latest changes (complete): http://cr.openjdk.java.net/~tschatzl/8040977/webrev.1/ Thanks, Thomas From bengt.rutisson at oracle.com Thu Jun 26 11:28:49 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 26 Jun 2014 13:28:49 +0200 Subject: RFR (XS): JDK-8040977: G1 crashes when run with -XX:-G1DeferredRSUpdate In-Reply-To: <1401187156.2682.12.camel@cirrus> References: <1398171391.3002.24.camel@cirrus> <5360D563.1070303@oracle.com> <1401187156.2682.12.camel@cirrus> Message-ID: <53AC03F1.1090309@oracle.com> Hi Thomas, Sorry for the very late reply. I think the dependency between G1ParScanClosure and G1ParScanThreadState is still very awkward, but I think your change is a step in the right direction. Thanks for fixing this. Reviewed. Bengt On 2014-05-27 12:39, Thomas Schatzl wrote: > Hi Bengt, > > thanks for the review. > > On Wed, 2014-04-30 at 12:50 +0200, Bengt Rutisson wrote: >> Hi Thomas, >> >> On 2014-04-22 14:56, Thomas Schatzl wrote: >>> Hi all, >>> >>> can I have reviews for this change? It fixes the wrong order of >>> declaration of members of G1ParScanThreadState that causes crashes when >>> G1DeferredRSUpdate is disabled. >>> >>> The change is based on the changes for 8035400 and 8035401 posted recently. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8040977 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8040977/webrev/ >> I realize that this fixes the code but I would really appreciate a more >> stable way of handling the dependencies. >> >> As it is now we end up calling methods on a G1ParScanThreadState >> instance while we are setting it up. This seems broken to me and will >> probably lead to similar initialization order issues again. Best would >> be to not pass "this" to the constructor of G1ParScanClosure and instead >> manage the circular dependency between G1ParScanClosure and >> G1ParScanThreadState more explicitly after they have both been properly >> set up. >> >> Second best would be to at least pass the worker id/queue num as a >> separate parameter to avoid having to call methods on an uninitialized >> object. > I fixed this by implementing the former idea. Also added some > > New webrev at > http://cr.openjdk.java.net/~tschatzl/8040977/webrev.1/ > > (Sorry, I already had merged the changes before making a diff webrev - > however, most changes in the VM code have been redone anyway. The test > case stayed the same). > > Thanks, > Thomas > From mikael.gerdin at oracle.com Thu Jun 26 11:33:28 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 26 Jun 2014 13:33:28 +0200 Subject: RFR: 8048214: Linker error when compiling G1SATBCardTableModRefBS after include order changes Message-ID: <7134937.oekIdTSvyk@mgerdin03> Hi all!
A small build issue occurs with the change for 8047818 due to some strange include order effects. The symptom is that a template function in G1SATBCardTableModRefBS is not instantiated when compiling on Windows and the link of jvm.dll fails. Since 8047818 is already reviewed and is a change we want to keep separate I'd like to push the fix for this issue before 8047818 instead of folding it into that change. My suggested fix is to move the implementations of the callers of the template function into the cpp file as well. They override virtual functions so they should not have been inlined in the first place (since we always call through a base class pointer to the BarrierSet). Webrev: http://cr.openjdk.java.net/~mgerdin/8048214/webrev Bug: https://bugs.openjdk.java.net/browse/JDK-8048214 Thanks /Mikael From bengt.rutisson at oracle.com Thu Jun 26 11:31:08 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 26 Jun 2014 13:31:08 +0200 Subject: RFR (M/L): JDK-8035400: Move G1ParScanThreadState into its own files In-Reply-To: <1401187161.2682.13.camel@cirrus> References: <1397829156.2717.24.camel@cirrus> <536A1685.7080704@oracle.com> <1401187161.2682.13.camel@cirrus> Message-ID: <53AC047C.4090101@oracle.com> Hi Thomas, I think this looks good now. Thanks, Bengt On 2014-05-27 12:39, Thomas Schatzl wrote: > Hi Bengt, > > thanks for the review and sorry for the long delay... > > On Wed, 2014-05-07 at 13:18 +0200, Bengt Rutisson wrote: >> Hi Thomas, >> >> On 2014-04-18 15:52, Thomas Schatzl wrote: >>> Hi all, >>> >>> can I have reviews for the above change? It moves G1ParScanThreadState >>> into G1ParScanThreadState*pp files. >>> >>> The only changes are limited to: >>> - adding a "#pragma warning( disable:4355 ) // 'this' : used in base >>> member initializer list" to shut visual C up about the problem (which >>> should be cleaned up at some point - I found an issue that slipped >>> through because of that, JDK-8040977) >> As I commented in the review of JDK-8040977 I would prefer to make the >> change to not pass this as a parameter to the constructor. That would >> also remove the need for disabling the warning. Maybe in that case base >> this review on top of the fix for JDK-8040977 rather than the other way >> around? > I do not see an advantage either way. Since this would require me to make > significant changes to all patches, I would prefer keeping the order > this way if you do not mind. > > In the latest JDK-8040977 I removed the need for the pragma as > requested. > >>> - added necessary include file references; I hope the AIX guys can >>> compile that change to avoid troubles. It compiles fine with all Oracle >>> supported archs. >> You also moved the definition of the destructor of G1ParScanThreadState >> from the hpp file to the cpp file. Makes sense, but was not strictly >> needed for this change, right? > Fixed that. This has been an oversight when separating out the changes. > >>> There will be another CR for fixing up visibility and cleaning up stuff >>> a little. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8035400 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8035400/webrev/ >> It is a bit hard to review moved code. But except for the comment >> regarding JDK-8040977 above I think it looks good. >> >> I think you can clean up the includes a bit more if you have time.
Seems >> like these includes in g1CollectedHeap.cpp are for example not needed >> anymore: >> >> #include "oops/oop.inline.hpp" >> #include "oops/oop.pcgc.inline.hpp" > I tried to clean up the includes a little more. However you cannot move > these particular includes because they are still needed for evacuation > failure handling. > > I also rebased the change on the current hotspot jdk9 gc repo. > > Diff webrev at > http://cr.openjdk.java.net/~tschatzl/8035400/webrev.1_to_2/ > > Complete webrev at > http://cr.openjdk.java.net/~tschatzl/8035400/webrev.2/ > > Thanks, > Thomas > From stefan.karlsson at oracle.com Thu Jun 26 11:28:45 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 26 Jun 2014 13:28:45 +0200 Subject: RFR: 8048214: Linker error when compiling G1SATBCardTableModRefBS after include order changes In-Reply-To: <7134937.oekIdTSvyk@mgerdin03> References: <7134937.oekIdTSvyk@mgerdin03> Message-ID: <53AC03ED.9070808@oracle.com> On 2014-06-26 13:33, Mikael Gerdin wrote: > Hi all! > > A small build issue occurs with the change for 8047818 due to some strange > include order effects. > The symptom is that a template function in G1SATBCardTableModRefBS is not > instantiated when compiling on Windows and the link of jvm.dll fails. > > Since 8047818 is already reviewed and is a change we want to keep separate I'd > like to push the fix for this issue before 8047818 instead of folding it into > that change. > > My suggested fix is to move the implementations of the callers of the template > function into the cpp file as well. They override virtual functions so they > should not have been inlined in the first place (since we always call through > a base class pointer to the BarrierSet). > > Webrev: http://cr.openjdk.java.net/~mgerdin/8048214/webrev Looks good. StefanK > Bug: https://bugs.openjdk.java.net/browse/JDK-8048214 > > Thanks > /Mikael From bengt.rutisson at oracle.com Thu Jun 26 11:39:24 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 26 Jun 2014 13:39:24 +0200 Subject: RFR: 8048214: Linker error when compiling G1SATBCardTableModRefBS after include order changes In-Reply-To: <53AC03ED.9070808@oracle.com> References: <7134937.oekIdTSvyk@mgerdin03> <53AC03ED.9070808@oracle.com> Message-ID: <53AC066C.4000909@oracle.com> On 2014-06-26 13:28, Stefan Karlsson wrote: > > On 2014-06-26 13:33, Mikael Gerdin wrote: >> Hi all! >> >> A small build issue occurs with the change for 8047818 due to some >> strange >> include order effects. >> The symptom is that a template function in G1SATBCardTableModRefBS is >> not >> instantiated when compiling on Windows and the link of jvm.dll fails. >> >> Since 8047818 is already reviewed and is a change we want to keep >> separate I'd >> like to push the fix for this issue before 8047818 instead of folding >> it into >> that change. >> >> My suggested fix is to move the implementations of the callers of the >> template >> function into the cpp file as well. They override virtual functions >> so they >> should not have been inlined in the first place (since we always call >> through >> a base class pointer to the BarrierSet). >> >> Webrev: http://cr.openjdk.java.net/~mgerdin/8048214/webrev > > Looks good. 
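(For the archives, a boiled-down version of the 8048214 failure mode, with generic names rather than the actual G1SATBCardTableModRefBS code:)

// header (sketch)
struct BarrierSetSketch {
  virtual void write_ref(int* p) = 0;
};
struct G1BarrierSketch : public BarrierSetSketch {
  template <class T> void write_ref_work(T* p); // defined only in the .cpp
  virtual void write_ref(int* p) { write_ref_work(p); } // inline caller in the header
};

// .cpp
template <class T> void G1BarrierSketch::write_ref_work(T* p) { /* ... */ }

// Nothing in the .cpp itself calls write_ref_work<int>, so whether that
// instantiation is ever emitted depends on which other translation unit
// happens to compile the inlined write_ref(), i.e. on include order.
// Moving the write_ref() body into the .cpp puts the call next to the
// template definition, and since write_ref() is virtual and always invoked
// through the base class pointer, keeping it inline in the header bought
// nothing anyway.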
+1 Bengt > > StefanK > >> Bug: https://bugs.openjdk.java.net/browse/JDK-8048214 >> >> Thanks >> /Mikael > From thomas.schatzl at oracle.com Thu Jun 26 11:41:59 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 26 Jun 2014 13:41:59 +0200 Subject: RFR: 8048214: Linker error when compiling G1SATBCardTableModRefBS after include order changes In-Reply-To: <7134937.oekIdTSvyk@mgerdin03> References: <7134937.oekIdTSvyk@mgerdin03> Message-ID: <1403782919.2656.33.camel@cirrus> Hi, On Thu, 2014-06-26 at 13:33 +0200, Mikael Gerdin wrote: > Hi all! > > A small build issue occurs with the change for 8047818 due to some strange > include order effects. > The symptom is that a template function in G1SATBCardTableModRefBS is not > instantiated when compiling on Windows and the link of jvm.dll fails. > > Since 8047818 is already reviewed and is a change we want to keep separate I'd > like to push the fix for this issue before 8047818 instead of folding it into > that change. > > My suggested fix is to move the implementations of the callers of the template > function into the cpp file as well. They override virtual functions so they > should not have been inlined in the first place (since we always call through > a base class pointer to the BarrierSet). > > Webrev: http://cr.openjdk.java.net/~mgerdin/8048214/webrev > Bug: https://bugs.openjdk.java.net/browse/JDK-8048214 Looks good. Thomas From bengt.rutisson at oracle.com Thu Jun 26 11:43:17 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 26 Jun 2014 13:43:17 +0200 Subject: RFR (M): JDK-8035401: Fix visibility of G1ParScanThreadState members In-Reply-To: <1401187165.2682.14.camel@cirrus> References: <1397827791.2717.23.camel@cirrus> <5356CC15.7070507@oracle.com> <1398198974.2532.32.camel@cirrus> <5356DFB3.7050400@oracle.com> <5360E01E.7020906@oracle.com> <1401187165.2682.14.camel@cirrus> Message-ID: <53AC0755.3050300@oracle.com> Hi Thomas, Looks good. One question though. In g1ParScanThreadState.hpp you have: 131 inline HeapWord* allocate(GCAllocPurpose purpose, size_t word_sz); 132 inline HeapWord* allocate_slow(GCAllocPurpose purpose, size_t word_sz); 133 inline void undo_allocation(GCAllocPurpose purpose, HeapWord* obj, size_t word_sz); But the methods are implemented in g1ParScanThreadState.cpp. Shouldn't the implementation be placed in g1ParScanThreadState.inline.hpp? Thanks, Bengt On 2014-05-27 12:39, Thomas Schatzl wrote: > Hi Bengt, > > thanks for the review. > > On Wed, 2014-04-30 at 13:35 +0200, Bengt Rutisson wrote: >> Hi Thomas, >> >> Over all this looks good to me too. >> >> One question for g1ParScanThreadState.cpp. You have marked the >> deal_with_reference() methods as "inline" even though they are in the >> same cpp file. Does that have any effect? > I moved them to the inline files. > >> 394 >> 395 template <class T> inline void >> G1ParScanThreadState::deal_with_reference(T* ref_to_scan) { >> 396 if (!has_partial_array_mask(ref_to_scan)) { >> 397 // Note: we can use "raw" versions of "region_containing" because >> 398 // "obj_to_scan" is definitely in the heap, and is not in a >> 399 // humongous region.
>> 400 HeapRegion* r = _g1h->heap_region_containing_raw(ref_to_scan); >> 401 do_oop_evac(ref_to_scan, r); >> 402 } else { >> 403 do_oop_partial_array((oop*)ref_to_scan); >> 404 } >> 405 } >> 406 >> 407 inline void G1ParScanThreadState::deal_with_reference(StarTask ref) { >> 408 assert(verify_task(ref), "sanity"); >> 409 if (ref.is_narrow()) { >> 410 deal_with_reference((narrowOop*)ref); >> 411 } else { >> 412 deal_with_reference((oop*)ref); >> 413 } >> 414 } >> >> Also, I think that you have to declare methods that should be inlined >> before the place where they are being used on some platforms (Solaris). >> In this case I think it means that they should be declared before >> steal_and_trim_queue(). > Moved them to the inline file. > >> Personally I also find the new deal_with_reference(StarTask ref) a >> little confusing. With that method and the two methods generated by >> deal_with_reference(T* ref_to_scan) I get kind of unsure which method >> that will be executed by a call like: >> >> 156 StarTask stolen_task; >> 157 while (task_queues->steal(queue_num(), hash_seed(), stolen_task)) { >> 158 assert(verify_task(stolen_task), "sanity"); >> 159 deal_with_reference(stolen_task); >> >> All three deal_with_reference() methods are potential matches. I assume >> the compiler prefers the deal_with_reference(StarTask ref) but it makes >> me unsure when I read the code. > Changed to dispatch_reference(). > >> One minor nit: >> >> g1ParScanThreadState.hpp >> You have changed the indentation of private/protected/public keywords to >> have one space indentation. That's fine as I think that is the standard, >> but since the whole file used no space indentation I would also have >> been fine with leaving that. However now the last "public" keyword is >> still having no space before it. Can you indent that too? >> >> 218 public: > Fixed. Also removed superfluous newlines at the end of files. > > Also re-checked again for performance regressions, none found. > > Diff to last revision > http://cr.openjdk.java.net/~tschatzl/8035401/webrev.1_to_2/ > > Full diff: > http://cr.openjdk.java.net/~tschatzl/8035401/webrev.2/ > > (based on 8035400) > > Thanks, > Thomas > > >> Thanks, >> Bengt >> >> >> On 2014-04-22 23:31, Jon Masamitsu wrote: >>> On 4/22/14 1:36 PM, Thomas Schatzl wrote: >>>> Hi Jon, >>>> >>>> On Tue, 2014-04-22 at 13:07 -0700, Jon Masamitsu wrote: >>>>> Thomas, >>>>> >>>>> What I see in these changes are >>>>> >>>>> 1) no semantic changes >>>> No. >>>> >>>>> 2) some methods in .hpp files moved to .cpp files >>>> Yes, because they were only referenced by the cpp file, so I thought it >>>> would be good to move them there. They will be inlined as needed anyway >>>> (and I think for some of them they were never inlined due to their >>>> size). >>>> >>>> I will do some more runs with the inline's added again. >>>> >>>>> 3) creation of steal_and_trim_queue() with definition in >>>>> a .cpp file (I may have missed additional such new >>>>> methods) >>>> There are none except queue_is_empty(), see below. >>>> >>>>> 4) change in visibility as the CR says >>>> That's the main change. >>>> >>>>> 5) no performance regressions as stated in your RFR >>>> No. Checked the results for the usual benchmarks (specjvm2008, >>>> specjbb05/2013) again right now, and there are no significant >>>> differences in the scores (on x64 and sparc), and for specjbb05/2013 the >>>> average gc pause time, and the object copy time (assuming that this is >>>> the part that will be affected most) stay the same as in the baseline. 
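With the rename the dispatching entry point now reads as below, same body as in the code quoted above, only the name changed:

inline void G1ParScanThreadState::dispatch_reference(StarTask ref) {
  assert(verify_task(ref), "sanity");
  if (ref.is_narrow()) {
    deal_with_reference((narrowOop*)ref); // StarTask was tagged as a narrowOop*
  } else {
    deal_with_reference((oop*)ref);       // plain oop*
  }
}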
>>>> >>>>> If that's what constitutes the change, looks good. >>>> Thanks. >>>> >>>>> Reviewed. >>>>> >>>>> If there is something more significant that I have >>>>> overlooked, please point me at it and I'll look again. >>>> There is not. Sorry, I should have pointed out the changes in more >>>> detail instead of you making guesses. >>>> >>>> Additional minor changes: >>>> >>>> - G1ParScanThreadState accesses members directly instead of using >>>> getters (e.g. _refs instead of refs()). >>>> >>>> - fixed some newlines in method declarations, removing newlines >>>> >>>> - removed refs() to avoid direct access from outside, and adding a new >>>> method queue_is_empty() (only used in asserts as refs()->is_empty(), and >>>> I did not want to expose refs() just for the asserts). >>> All looks good. >>> >>> Reviewed. >>> >>> Jon >>> >>>> Thanks, >>>> Thomas >>>> >>>>> On 4/18/14 6:29 AM, Thomas Schatzl wrote: >>>>>> Hi all, >>>>>> >>>>>> can I have reviews for this change? After moving >>>>>> G1ParScanThreadState, >>>>>> this change cleans up visibility, making a whole lot of stuff private. >>>>>> >>>>>> CR: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8035401 >>>>>> >>>>>> Webrev: >>>>>> http://cr.openjdk.java.net/~tschatzl/8035401/webrev/ >>>>>> >>>>>> Testing: >>>>>> perf testing indicated no changes, jprt >>>>>> >>>>>> Thanks, >>>>>> Thomas >>>>>> >>>>>> > From mikael.gerdin at oracle.com Thu Jun 26 11:58:59 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 26 Jun 2014 13:58:59 +0200 Subject: RFR (M/L): JDK-8035400: Move G1ParScanThreadState into its own files In-Reply-To: <53AC047C.4090101@oracle.com> References: <1397829156.2717.24.camel@cirrus> <1401187161.2682.13.camel@cirrus> <53AC047C.4090101@oracle.com> Message-ID: <4767727.8Tv2L2B5Jo@mgerdin03> Hi, On Thursday 26 June 2014 13.31.08 Bengt Rutisson wrote: > Hi Thomas, > > I think this looks good now. +1 Looks good /Mikael > > Thanks, > Bengt > > On 2014-05-27 12:39, Thomas Schatzl wrote: > > Hi Bengt, > > > > thanks for the review and sorry for the long delay... > > > > On Wed, 2014-05-07 at 13:18 +0200, Bengt Rutisson wrote: > >> Hi Thomas, > >> > >> On 2014-04-18 15:52, Thomas Schatzl wrote: > >>> Hi all, > >>> > >>> can I have reviews for the above change? It moves > >>> G1ParScanThreadState > >>> > >>> into G1ParScanThreadState*pp files. > >>> > >>> The only changes are limited to: > >>> - adding a "#pragma warning( disable:4355 ) // 'this' : used in base > >>> > >>> member initializer list" to shut visual C up about the problem (which > >>> should be cleaned up at some point - I found an issue that slipped > >>> through because of that, JDK-8040977) > >> > >> As I commented in the review of JDK-8040977 I would prefer to make the > >> change to not pass this as a parameter to the constructor. That would > >> also remove the need for disabling the warning. Maybe in that case base > >> this review on top of the fix for JDK-8040977 rather than the other way > >> around? > > > > I do not see an advantage either way. Since this would require me make > > significant changes to all patches, I would prefer keeping the order > > this way if you do not mind. > > > > In the latest JDK-8040977 I removed the need for the pragma as > > requested. > > > >>> - added necessary include file references; I hope the AIX guys can > >>> > >>> compile that change to avoid troubles. It compiles fine with all Oracle > >>> supported archs. 
> >> > >> You also moved the definition of the destructor of G1ParScanThreadState > >> from the hpp file to the cpp file. Makes sense, but was not strictly > >> needed for this change, right? > > > > Fixed that. This has been an oversight when separating out the changes. > > > >>> There will be another CR for fixing up visibility and cleaning up stuff > >>> a little. > >>> > >>> CR: > >>> https://bugs.openjdk.java.net/browse/JDK-8035400 > >>> > >>> Webrev: > >>> http://cr.openjdk.java.net/~tschatzl/8035400/webrev/ > >> > >> It is a bit hard to review moved code. But except for the comment > >> regarding JDK-8040977 above I think it looks good. > >> > >> I think you can clean up the includes a bit more if you have time. Seems > >> like these includes in g1CollectedHeap.cpp are for example not needed > >> anymore: > >> > >> #include "oops/oop.inline.hpp" > >> #include "oops/oop.pcgc.inline.hpp" > > > > I tried to clean up the includes a little more. However you cannot move > > these particular includes because they are still needed for evacuation > > failure handling. > > > > I also rebased the change on the current hotspot jdk9 gc repo. > > > > Diff webrev at > > http://cr.openjdk.java.net/~tschatzl/8035400/webrev.1_to_2/ > > > > Complete webrev at > > http://cr.openjdk.java.net/~tschatzl/8035400/webrev.2/ > > > > Thanks, > > > > Thomas From thomas.schatzl at oracle.com Thu Jun 26 11:59:52 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 26 Jun 2014 13:59:52 +0200 Subject: RFR (M/L): JDK-8035400: Move G1ParScanThreadState into its own files In-Reply-To: <4767727.8Tv2L2B5Jo@mgerdin03> References: <1397829156.2717.24.camel@cirrus> <1401187161.2682.13.camel@cirrus> <53AC047C.4090101@oracle.com> <4767727.8Tv2L2B5Jo@mgerdin03> Message-ID: <1403783992.2656.37.camel@cirrus> Hi Bengt, Mikael, On Thu, 2014-06-26 at 13:58 +0200, Mikael Gerdin wrote: > Hi, > > On Thursday 26 June 2014 13.31.08 Bengt Rutisson wrote: > > Hi Thomas, > > > > I think this looks good now. > > +1 > Looks good > /Mikael thanks for the reviews. Thomas From bengt.rutisson at oracle.com Thu Jun 26 12:07:20 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 26 Jun 2014 14:07:20 +0200 Subject: RFR: Backport of: JDK-8043607: Add a GC id as a log decoration similar to PrintGCTimeStamps Message-ID: <53AC0CF8.5090903@oracle.com> Hi all, Can I have a couple of reviews for this backport? The fix for JDK-8043607 applied cleanly to the 8 update repository, but as I mentioned in the original review I would like to change the default value for the logging flag when I backport this. Here's the webrev for the backport to 8u: http://cr.openjdk.java.net/~brutisso/8043607/webrev.8u.00/ The only change compared to what was pushed to JDK 9 is that in globals.hpp the default value for PrintGCID is different. JDK8: http://cr.openjdk.java.net/~brutisso/8043607/webrev.8u.00/src/share/vm/runtime/globals.hpp.udiff.html JDK9: http://hg.openjdk.java.net/jdk9/hs-gc/hotspot/rev/dabee7bb3a8f#l29.7 Thanks, Bengt From mikael.gerdin at oracle.com Thu Jun 26 12:13:25 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 26 Jun 2014 14:13:25 +0200 Subject: RFR (M): JDK-8035401: Fix visibility of G1ParScanThreadState members In-Reply-To: <53AC0755.3050300@oracle.com> References: <1397827791.2717.23.camel@cirrus> <1401187165.2682.14.camel@cirrus> <53AC0755.3050300@oracle.com> Message-ID: <4783916.GRNXiRhIx2@mgerdin03> Bengt, On Thursday 26 June 2014 13.43.17 Bengt Rutisson wrote: > Hi Thomas, > > Looks good. 
One question though. In g1ParScanThreadState.hpp you have:
>
>     131   inline HeapWord* allocate(GCAllocPurpose purpose, size_t word_sz);
>     132   inline HeapWord* allocate_slow(GCAllocPurpose purpose, size_t word_sz);
>     133   inline void undo_allocation(GCAllocPurpose purpose, HeapWord* obj, size_t word_sz);
>
> But the methods are implemented in g1ParScanThreadState.cpp. Shouldn't
> the implementation be placed in g1ParScanThreadState.inline.hpp?

These are now private methods and are only ever called from the .cpp file,
so they don't need to be visible outside it. The inline keyword is sort of
a hint to the compiler to inline them into the caller; I'm not sure it's
strictly needed, though.

I think the change in webrev.2 looks good.

/Mikael

>
> Thanks,
> Bengt
>
> On 2014-05-27 12:39, Thomas Schatzl wrote:
> > Hi Bengt,
> >
> > thanks for the review.
> >
> > On Wed, 2014-04-30 at 13:35 +0200, Bengt Rutisson wrote:
> >> Hi Thomas,
> >>
> >> Over all this looks good to me too.
> >>
> >> One question for g1ParScanThreadState.cpp. You have marked the
> >> deal_with_reference() methods as "inline" even though they are in the
> >> same cpp file. Does that have any effect?
> >
> > I moved them to the inline files.
> >
> >>     394
> >>     395 template <class T> inline void
> >>     G1ParScanThreadState::deal_with_reference(T* ref_to_scan) {
> >>     396   if (!has_partial_array_mask(ref_to_scan)) {
> >>     397     // Note: we can use "raw" versions of "region_containing" because
> >>     398     // "obj_to_scan" is definitely in the heap, and is not in a
> >>     399     // humongous region.
> >>     400     HeapRegion* r = _g1h->heap_region_containing_raw(ref_to_scan);
> >>     401     do_oop_evac(ref_to_scan, r);
> >>     402   } else {
> >>     403     do_oop_partial_array((oop*)ref_to_scan);
> >>     404   }
> >>     405 }
> >>     406
> >>     407 inline void G1ParScanThreadState::deal_with_reference(StarTask ref) {
> >>     408   assert(verify_task(ref), "sanity");
> >>     409   if (ref.is_narrow()) {
> >>     410     deal_with_reference((narrowOop*)ref);
> >>     411   } else {
> >>     412     deal_with_reference((oop*)ref);
> >>     413   }
> >>     414 }
> >>
> >> Also, I think that you have to declare methods that should be inlined
> >> before the place where they are being used on some platforms (Solaris).
> >> In this case I think it means that they should be declared before
> >> steal_and_trim_queue().
> >
> > Moved them to the inline file.
> >
> >> Personally I also find the new deal_with_reference(StarTask ref) a
> >> little confusing. With that method and the two methods generated by
> >> deal_with_reference(T* ref_to_scan) I get kind of unsure which method
> >> will be executed by a call like:
> >>     156   StarTask stolen_task;
> >>     157   while (task_queues->steal(queue_num(), hash_seed(), stolen_task)) {
> >>     158     assert(verify_task(stolen_task), "sanity");
> >>     159     deal_with_reference(stolen_task);
> >>
> >> All three deal_with_reference() methods are potential matches. I assume
> >> the compiler prefers deal_with_reference(StarTask ref), but it makes
> >> me unsure when I read the code.
> >
> > Changed to dispatch_reference().
> >
> >> One minor nit:
> >>
> >> g1ParScanThreadState.hpp
> >> You have changed the indentation of private/protected/public keywords to
> >> have one space indentation. That's fine as I think that is the standard,
> >> but since the whole file used no space indentation I would also have
> >> been fine with leaving that. However, the last "public" keyword still
> >> has no space before it. Can you indent that too?
> >
> >>     218  public:
> > Fixed.
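The pattern under discussion, in miniature - a sketch with made-up names, not code from the webrev: an inline member function may be declared in the header and defined in the .cpp, as long as every call site lives in that same translation unit.

    // g1Foo.hpp (sketch)
    class G1Foo {
     public:
      G1Foo() : _count(0) {}
      void tick();
     private:
      inline size_t bump(size_t n);   // defined in g1Foo.cpp; only called there
      size_t _count;
    };

    // g1Foo.cpp (sketch)
    inline size_t G1Foo::bump(size_t n) {
      _count += n;          // legal: the single TU that calls bump() also
      return _count;        // sees its definition
    }

    void G1Foo::tick() {
      bump(1);
    }

If a caller in another translation unit ever appears, the definition has to move to the .inline.hpp; dropping the inline keyword altogether, as Bengt suggests below, sidesteps the question and lets the compiler decide.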
Also removed superfluous newlines at the end of files. > > > > Also re-checked again for performance regressions, none found. > > > > Diff to last revision > > http://cr.openjdk.java.net/~tschatzl/8035401/webrev.1_to_2/ > > > > Full diff: > > http://cr.openjdk.java.net/~tschatzl/8035401/webrev.2/ > > > > (based on 8035400) > > > > Thanks, > > Thomas > > > >> Thanks, > >> Bengt > >> > >> On 2014-04-22 23:31, Jon Masamitsu wrote: > >>> On 4/22/14 1:36 PM, Thomas Schatzl wrote: > >>>> Hi Jon, > >>>> > >>>> On Tue, 2014-04-22 at 13:07 -0700, Jon Masamitsu wrote: > >>>>> Thomas, > >>>>> > >>>>> What I see in these changes are > >>>>> > >>>>> 1) no semantic changes > >>>> > >>>> No. > >>>> > >>>>> 2) some methods in .hpp files moved to .cpp files > >>>> > >>>> Yes, because they were only referenced by the cpp file, so I thought it > >>>> would be good to move them there. They will be inlined as needed anyway > >>>> (and I think for some of them they were never inlined due to their > >>>> size). > >>>> > >>>> I will do some more runs with the inline's added again. > >>>> > >>>>> 3) creation of steal_and_trim_queue() with definition in > >>>>> a .cpp file (I may have missed additional such new > >>>>> methods) > >>>> > >>>> There are none except queue_is_empty(), see below. > >>>> > >>>>> 4) change in visibility as the CR says > >>>> > >>>> That's the main change. > >>>> > >>>>> 5) no performance regressions as stated in your RFR > >>>> > >>>> No. Checked the results for the usual benchmarks (specjvm2008, > >>>> specjbb05/2013) again right now, and there are no significant > >>>> differences in the scores (on x64 and sparc), and for specjbb05/2013 > >>>> the > >>>> average gc pause time, and the object copy time (assuming that this is > >>>> the part that will be affected most) stay the same as in the baseline. > >>>> > >>>>> If that's what constitutes the change, looks good. > >>>> > >>>> Thanks. > >>>> > >>>>> Reviewed. > >>>>> > >>>>> If there is something more significant that I have > >>>>> overlooked, please point me at it and I'll look again. > >>>> > >>>> There is not. Sorry, I should have pointed out the changes in more > >>>> detail instead of you making guesses. > >>>> > >>>> Additional minor changes: > >>>> > >>>> - G1ParScanThreadState accesses members directly instead of using > >>>> getters (e.g. _refs instead of refs()). > >>>> > >>>> - fixed some newlines in method declarations, removing newlines > >>>> > >>>> - removed refs() to avoid direct access from outside, and adding a new > >>>> method queue_is_empty() (only used in asserts as refs()->is_empty(), > >>>> and > >>>> I did not want to expose refs() just for the asserts). > >>> > >>> All looks good. > >>> > >>> Reviewed. > >>> > >>> Jon > >>> > >>>> Thanks, > >>>> > >>>> Thomas > >>>>> > >>>>> On 4/18/14 6:29 AM, Thomas Schatzl wrote: > >>>>>> Hi all, > >>>>>> > >>>>>> can I have reviews for this change? After moving > >>>>>> > >>>>>> G1ParScanThreadState, > >>>>>> this change cleans up visibility, making a whole lot of stuff > >>>>>> private. 
> >>>>>> CR:
> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8035401
> >>>>>>
> >>>>>> Webrev:
> >>>>>> http://cr.openjdk.java.net/~tschatzl/8035401/webrev/
> >>>>>>
> >>>>>> Testing:
> >>>>>> perf testing indicated no changes, jprt
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Thomas

From thomas.schatzl at oracle.com  Thu Jun 26 12:19:38 2014
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 26 Jun 2014 14:19:38 +0200
Subject: RFR (M): JDK-8035401: Fix visibility of G1ParScanThreadState members
In-Reply-To: <53AC0755.3050300@oracle.com>
References: <1397827791.2717.23.camel@cirrus> <5356CC15.7070507@oracle.com> <1398198974.2532.32.camel@cirrus> <5356DFB3.7050400@oracle.com> <5360E01E.7020906@oracle.com> <1401187165.2682.14.camel@cirrus> <53AC0755.3050300@oracle.com>
Message-ID: <1403785178.2656.40.camel@cirrus>

Hi Bengt,

  thanks for looking at this...

On Thu, 2014-06-26 at 13:43 +0200, Bengt Rutisson wrote:
> Hi Thomas,
>
> Looks good. One question though. In g1ParScanThreadState.hpp you have:
>
>     131   inline HeapWord* allocate(GCAllocPurpose purpose, size_t word_sz);
>     132   inline HeapWord* allocate_slow(GCAllocPurpose purpose, size_t word_sz);
>     133   inline void undo_allocation(GCAllocPurpose purpose, HeapWord* obj, size_t word_sz);
>
> But the methods are implemented in g1ParScanThreadState.cpp. Shouldn't
> the implementation be placed in g1ParScanThreadState.inline.hpp?
>
if an inlined method is only used in one place, at least G1 code often
places that method into the cpp file directly. It is not required to
make them publicly visible.

I do not have a preference either. What do you think?

Thomas

From bengt.rutisson at oracle.com  Thu Jun 26 12:34:24 2014
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Thu, 26 Jun 2014 14:34:24 +0200
Subject: RFR (M): JDK-8035401: Fix visibility of G1ParScanThreadState members
In-Reply-To: <1403785178.2656.40.camel@cirrus>
References: <1397827791.2717.23.camel@cirrus> <5356CC15.7070507@oracle.com> <1398198974.2532.32.camel@cirrus> <5356DFB3.7050400@oracle.com> <5360E01E.7020906@oracle.com> <1401187165.2682.14.camel@cirrus> <53AC0755.3050300@oracle.com> <1403785178.2656.40.camel@cirrus>
Message-ID: <53AC1350.4070903@oracle.com>

On 2014-06-26 14:19, Thomas Schatzl wrote:
> Hi Bengt,
>
> thanks for looking at this...
>
> On Thu, 2014-06-26 at 13:43 +0200, Bengt Rutisson wrote:
>> Hi Thomas,
>>
>> Looks good. One question though. In g1ParScanThreadState.hpp you have:
>>
>>     131   inline HeapWord* allocate(GCAllocPurpose purpose, size_t word_sz);
>>     132   inline HeapWord* allocate_slow(GCAllocPurpose purpose, size_t word_sz);
>>     133   inline void undo_allocation(GCAllocPurpose purpose, HeapWord* obj, size_t word_sz);
>>
>> But the methods are implemented in g1ParScanThreadState.cpp. Shouldn't
>> the implementation be placed in g1ParScanThreadState.inline.hpp?
>>
> if an inlined method is only used in one place, at least G1 code often
> places that method into the cpp file directly. It is not required to
> make them publicly visible.
>
> I do not have a preference either. What do you think?

I think I would prefer to remove the inline keyword in that case. The
compiler probably does a better job deciding whether or not it is a good
idea to inline these methods.

Either way is fine with me. Reviewed.
:) Bengt > > Thomas > > From mikael.gerdin at oracle.com Thu Jun 26 12:39:19 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 26 Jun 2014 14:39:19 +0200 Subject: RFR (XS): JDK-8040977: G1 crashes when run with -XX:-G1DeferredRSUpdate In-Reply-To: <53AC03F1.1090309@oracle.com> References: <1398171391.3002.24.camel@cirrus> <1401187156.2682.12.camel@cirrus> <53AC03F1.1090309@oracle.com> Message-ID: <6725819.L3vCke9nVL@mgerdin03> Thomas, On Thursday 26 June 2014 13.28.49 Bengt Rutisson wrote: > Hi Thomas, > > Sorry for the very late reply. > > I think the dependency between G1ParScanClosure is still very awkward, > but I think your change is a step in the right direction. Thanks for > fixing this. I agree with Bengt. The change seems reasonable. /Mikael > > Reviewed. > Bengt > > On 2014-05-27 12:39, Thomas Schatzl wrote: > > Hi Bengt, > > > > thanks for the review. > > > > On Wed, 2014-04-30 at 12:50 +0200, Bengt Rutisson wrote: > >> Hi Thomas, > >> > >> On 2014-04-22 14:56, Thomas Schatzl wrote: > >>> Hi all, > >>> > >>> can I have reviews for this change? It fixes wrong order of > >>> > >>> declaration of members of G1ParScanThreadState that causes crashes when > >>> G1DeferredRSUpdate is disabled. > >>> > >>> The change is based on the changes for 8035400 and8035401 posted > >>> recently. > >>> > >>> CR: > >>> https://bugs.openjdk.java.net/browse/JDK-8040977 > >>> > >>> Webrev: > >>> http://cr.openjdk.java.net/~tschatzl/8040977/webrev/ > >> > >> I realize that this fixes the code but I would really appreciate a more > >> stable way of handling the dependencies. > >> > >> As it it now we end up calling methods on a G1ParScanThreadState > >> instance while we are setting it up. This seems broken to me and will > >> probably lead to similar initialization order issues again. Best would > >> be to not pass "this" to the constructor of G1ParScanClosure and instead > >> manage the circular dependency between G1ParScanClosure and > >> G1ParScanThreadState more explicitly after they have both been properly > >> set up. > >> > >> Second best would be to at least pass the worker id/queue num as a > >> separate parameter to avoid having to call methods on an uninitialized > >> object. > > > > I fixed this implementing the former idea. Also added some > > > > New webrev at > > http://cr.openjdk.java.net/~tschatzl/8040977/webrev.1/ > > > > (Sorry, I already had merged the changes before making a diff webrev - > > however, most changes in the VM code have been redone anyway. The test > > case stayed the same). > > > > Thanks, > > > > Thomas From mikael.gerdin at oracle.com Thu Jun 26 14:16:36 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 26 Jun 2014 16:16:36 +0200 Subject: RFR: 8047326: Add a version of CompiledIC_at that doesn't create a new RelocIterator In-Reply-To: <53A3038C.9020004@oracle.com> References: <53A2DB59.9050605@oracle.com> <53A3038C.9020004@oracle.com> Message-ID: <5113836.YgHIQ7lM1y@mgerdin03> Hi, On Thursday 19 June 2014 17.36.44 Stefan Karlsson wrote: > This was meant for the hotspot-dev list. BCC:ing hotspot-gc-dev. > > On 2014-06-19 14:45, Stefan Karlsson wrote: > > Hi all, > > > > I have a patch that we have been using in the G1 Class Unloading > > project to lower the remark times. This changes Compiler code, so I > > would like to get feedback from the Compiler team. > > > > http://cr.openjdk.java.net/~stefank/8047362/webrev.00/ The change looks good. 
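To make the pattern concrete before the quoted summary below, a sketch of the before/after call shape (names taken from the quoted code; abridged, not the actual webrev):

    // Old shape: CompiledIC_at(reloc) internally builds a second
    // RelocIterator per inline cache, just to re-find the call site.
    RelocIterator iter(nm);
    while (iter.next()) {
      if (iter.type() == relocInfo::virtual_call_type) {
        CompiledIC* ic = CompiledIC_at(iter.reloc());   // hidden re-iteration
        // ... inspect or clean ic ...
      }
    }

    // New shape: hand the live iterator over; no second walk of the
    // relocation info is needed.
    RelocIterator iter2(nm);
    while (iter2.next()) {
      if (iter2.type() == relocInfo::virtual_call_type) {
        CompiledIC* ic = CompiledIC_at(&iter2);
        // ... inspect or clean ic ...
      }
    }

The timing numbers quoted further down (1.98 s for the new style against 9.92 s for the old, over the same walk) suggest most of the old cost was exactly that hidden re-iteration.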
I had an offline discussion with Stefan about this and we think that it
would actually suffice to pass down the Relocation* since it appears to
contain all the information needed to create the CompiledIC objects.
However, in the interest of moving forward with changes built on top of
this, we will look at that for a future cleanup.

/Mikael

> > https://bugs.openjdk.java.net/browse/JDK-8047362
> >
> > The patch builds upon the patch in:
> > http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-June/014358.html
> >
> > Summary from the bug report:
> > ---
> > Creation of RelocIterators shows up high in profiles of the remark
> > phase, in the G1 Class Unloading project.
> >
> > There's a pattern in the nmethod/codecache code to create a
> > RelocIterator and then materialize a CompiledIC:
> >
> >   RelocIterator iter(this, low_boundary);
> >   while(iter.next()) {
> >     if (iter.type() == relocInfo::virtual_call_type) {
> >       CompiledIC *ic = CompiledIC_at(iter.reloc());
> >
> > CompiledIC_at is implemented as:
> >   new CompiledIC(call_site->code(), nativeCall_at(call_site->addr()));
> >
> > And one of the first things CompiledIC::CompiledIC(const nmethod* nm,
> > NativeCall* call) does is to create a new RelocIterator:
> >   ...
> >   address ic_call = call->instruction_address();
> >   ...
> >   RelocIterator iter(nm, ic_call, ic_call+1);
> >   bool ret = iter.next();
> >   assert(ret == true, "relocInfo must exist at this address");
> >   assert(iter.addr() == ic_call, "must find ic_call");
> >
> > I would like to propose that we pass down the RelocIterator that we
> > already have, instead of creating a new one.
> > ---
> >
> > I've previously received feedback that this seems like a reasonable
> > thing to do, but that the parameter to the new CompiledIC_at should
> > take a const RelocIterator* instead of RelocIterator*. I couldn't do
> > that without changing a significant amount of Compiler code, so I have
> > left it out for now. Any opinions on how to handle that?
> >
> > To give an idea of the performance difference, I temporarily added the
> > following code:
> >
> >   void CodeCache::iterate_through_CIs(int style) {
> >     int count;
> >     FOR_ALL_ALIVE_NMETHODS(nm) {
> >       RelocIterator iter(nm);
> >       while(iter.next()) {
> >         if (iter.type() == relocInfo::virtual_call_type ||
> >             iter.type() == relocInfo::opt_virtual_call_type) {
> >           if (style > 0) {
> >             CompiledIC *ic = style == 1 ? CompiledIC_at(&iter)
> >                                         : CompiledIC_at(iter.reloc());
> >             if (ic->ic_destination() == (address)0xdeadb000) {
> >               gclog_or_tty->print_cr("ShouldNotReachHere");
> >             }
> >           }
> >         }
> >       }
> >     }
> >   }
> >
> > and then measured how long it took to execute
> > iterate_through_CIs(style) 1000 times with style == {0, 1, 2}.
> >
> > The results are:
> >   iterate_through_CIs(0): 1.210833 s  // No CompiledICs created
> >   iterate_through_CIs(1): 1.976557 s  // New style
> >   iterate_through_CIs(2): 9.924209 s  // Old style
> >
> > Testing:
> > A similar version has been used and thoroughly tested together
> > with the other G1 Class Unloading changes. This exact version has so
> > far only been tested with Kitchensink and SpecJVM2008
> > compiler.compiler. What test lists would be appropriate to test this
> > with?
> > > > > > thanks, > > StefanK From stefan.karlsson at oracle.com Thu Jun 26 14:10:05 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 26 Jun 2014 16:10:05 +0200 Subject: RFR: 8047326: Add a version of CompiledIC_at that doesn't create a new RelocIterator In-Reply-To: <5113836.YgHIQ7lM1y@mgerdin03> References: <53A2DB59.9050605@oracle.com> <53A3038C.9020004@oracle.com> <5113836.YgHIQ7lM1y@mgerdin03> Message-ID: <53AC29BD.4070004@oracle.com> On 2014-06-26 16:16, Mikael Gerdin wrote: > Hi, > > On Thursday 19 June 2014 17.36.44 Stefan Karlsson wrote: >> This was meant for the hotspot-dev list. BCC:ing hotspot-gc-dev. >> >> On 2014-06-19 14:45, Stefan Karlsson wrote: >>> Hi all, >>> >>> I have a patch that we have been using in the G1 Class Unloading >>> project to lower the remark times. This changes Compiler code, so I >>> would like to get feedback from the Compiler team. >>> >>> http://cr.openjdk.java.net/~stefank/8047362/webrev.00/ > The change looks good. > > I had an offline discussion with Steafan about this and we think that it would > actually suffice to pass down the Relocation* since it appears to contain all > the information needed to create the CompiledIC objects. > However in the interest of moving forward with changes built on top of this we > will look at that for a future cleanup. Thanks. StefanK > > /Mikael > >>> https://bugs.openjdk.java.net/browse/JDK-8047362 >>> >>> The patch builds upon the patch in: >>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-June/014358.html >>> >>> >>> Summary from the bug report: >>> --- >>> Creation of RelocIterators show up high in profiles of the remark >>> phase, in the G1 Class Unloading project. >>> >>> There's a pattern in the nmethod/codecache code to create a >>> >>> RelocIterator and then materialize a CompiledIC: >>> RelocIterator iter(this, low_boundary); >>> while(iter.next()) { >>> >>> if (iter.type() == relocInfo::virtual_call_type) { >>> >>> CompiledIC *ic = CompiledIC_at(iter.reloc()); >>> >>> CompiledIC_at is implemented as: >>> new CompiledIC(call_site->code(), nativeCall_at(call_site->addr())); >>> >>> And one of the first thing CompiledIC::CompiledIC(const nmethod* nm, >>> NativeCall* call) does is to create a new RelocIterator: >>> ... >>> address ic_call = call->instruction_address(); >>> ... >>> >>> RelocIterator iter(nm, ic_call, ic_call+1); >>> bool ret = iter.next(); >>> assert(ret == true, "relocInfo must exist at this address"); >>> assert(iter.addr() == ic_call, "must find ic_call"); >>> >>> I would like to propose that we pass down the RelocIterator that we >>> already have, instead of creating a new. >>> --- >>> >>> >>> I've previously received feedback that this seems like reasonable >>> thing to do, but that the parameter to the new CompileIC_at should >>> take a const RelocIterator* instead of RelocIterator*. I couldn't do >>> that without changing a significant amount of Compiler code, so I have >>> left it out for now. Any opinions on how to handle that? >>> >>> >>> To give an idea of the performance difference, I temporarily added the >>> following code: >>> void CodeCache::iterate_through_CIs(int style) { >>> >>> int count; >>> FOR_ALL_ALIVE_NMETHODS(nm) { >>> >>> RelocIterator iter(nm); >>> while(iter.next()) { >>> >>> if (iter.type() == relocInfo::virtual_call_type || >>> >>> iter.type() == relocInfo::opt_virtual_call_type) { >>> >>> if (style > 0) { >>> >>> CompiledIC *ic = style == 1 ? 
CompiledIC_at(&iter) : >>> CompiledIC_at(iter.reloc()); >>> >>> if (ic->ic_destination() == (address)0xdeadb000) { >>> >>> gclog_or_tty->print_cr("ShouldNotReachHere"); >>> >>> } >>> >>> } >>> >>> } >>> >>> } >>> >>> } >>> >>> } >>> >>> and then measured how long time it took to execute >>> iterate_through_CIs(style) 1000 times with style == {0, 1, 2}. >>> >>> The results are: >>> iterate_through_CIs(0): 1.210833 s // No CompiledICs created >>> iterate_through_CIs(1): 1.976557 s // New style >>> iterate_through_CIs(2): 9.924209 s // Old style >>> >>> Testing: >>> A similar version has been used and thoroughly been tested together >>> >>> with the other G1 Class Unloading changes. This exact version has so >>> far only been tested with Kitchensink and SpecJVM2008 >>> compiler.compiler. What test lists would be appropriate to test this >>> with? >>> >>> >>> thanks, >>> StefanK From mikael.gerdin at oracle.com Thu Jun 26 14:18:58 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 26 Jun 2014 16:18:58 +0200 Subject: RFR: 8047326: Add a version of CompiledIC_at that doesn't create a new RelocIterator In-Reply-To: <5113836.YgHIQ7lM1y@mgerdin03> References: <53A2DB59.9050605@oracle.com> <53A3038C.9020004@oracle.com> <5113836.YgHIQ7lM1y@mgerdin03> Message-ID: <1898495.XQJiaMVt6f@mgerdin03> I replied to the wrong list, sorry. Forwarding my review to hotspot-dev. /Mikael On Thursday 26 June 2014 16.16.36 Mikael Gerdin wrote: > Hi, > > On Thursday 19 June 2014 17.36.44 Stefan Karlsson wrote: > > This was meant for the hotspot-dev list. BCC:ing hotspot-gc-dev. > > > > On 2014-06-19 14:45, Stefan Karlsson wrote: > > > Hi all, > > > > > > I have a patch that we have been using in the G1 Class Unloading > > > project to lower the remark times. This changes Compiler code, so I > > > would like to get feedback from the Compiler team. > > > > > > http://cr.openjdk.java.net/~stefank/8047362/webrev.00/ > > The change looks good. > > I had an offline discussion with Steafan about this and we think that it > would actually suffice to pass down the Relocation* since it appears to > contain all the information needed to create the CompiledIC objects. > However in the interest of moving forward with changes built on top of this > we will look at that for a future cleanup. > > /Mikael > > > > https://bugs.openjdk.java.net/browse/JDK-8047362 > > > > > > The patch builds upon the patch in: > > > http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-June/014358.html > > > > > > > > > Summary from the bug report: > > > --- > > > Creation of RelocIterators show up high in profiles of the remark > > > phase, in the G1 Class Unloading project. > > > > > > There's a pattern in the nmethod/codecache code to create a > > > > > > RelocIterator and then materialize a CompiledIC: > > > RelocIterator iter(this, low_boundary); > > > while(iter.next()) { > > > > > > if (iter.type() == relocInfo::virtual_call_type) { > > > > > > CompiledIC *ic = CompiledIC_at(iter.reloc()); > > > > > > CompiledIC_at is implemented as: > > > new CompiledIC(call_site->code(), nativeCall_at(call_site->addr())); > > > > > > And one of the first thing CompiledIC::CompiledIC(const nmethod* nm, > > > NativeCall* call) does is to create a new RelocIterator: > > > ... > > > address ic_call = call->instruction_address(); > > > ... 
> > > > > > RelocIterator iter(nm, ic_call, ic_call+1); > > > bool ret = iter.next(); > > > assert(ret == true, "relocInfo must exist at this address"); > > > assert(iter.addr() == ic_call, "must find ic_call"); > > > > > > I would like to propose that we pass down the RelocIterator that we > > > already have, instead of creating a new. > > > --- > > > > > > > > > I've previously received feedback that this seems like reasonable > > > thing to do, but that the parameter to the new CompileIC_at should > > > take a const RelocIterator* instead of RelocIterator*. I couldn't do > > > that without changing a significant amount of Compiler code, so I have > > > left it out for now. Any opinions on how to handle that? > > > > > > > > > To give an idea of the performance difference, I temporarily added the > > > following code: > > > void CodeCache::iterate_through_CIs(int style) { > > > > > > int count; > > > FOR_ALL_ALIVE_NMETHODS(nm) { > > > > > > RelocIterator iter(nm); > > > while(iter.next()) { > > > > > > if (iter.type() == relocInfo::virtual_call_type || > > > > > > iter.type() == relocInfo::opt_virtual_call_type) { > > > > > > if (style > 0) { > > > > > > CompiledIC *ic = style == 1 ? CompiledIC_at(&iter) : > > > CompiledIC_at(iter.reloc()); > > > > > > if (ic->ic_destination() == (address)0xdeadb000) { > > > > > > gclog_or_tty->print_cr("ShouldNotReachHere"); > > > > > > } > > > > > > } > > > > > > } > > > > > > } > > > > > > } > > > > > > } > > > > > > and then measured how long time it took to execute > > > iterate_through_CIs(style) 1000 times with style == {0, 1, 2}. > > > > > > The results are: > > > iterate_through_CIs(0): 1.210833 s // No CompiledICs created > > > iterate_through_CIs(1): 1.976557 s // New style > > > iterate_through_CIs(2): 9.924209 s // Old style > > > > > > Testing: > > > A similar version has been used and thoroughly been tested together > > > > > > with the other G1 Class Unloading changes. This exact version has so > > > far only been tested with Kitchensink and SpecJVM2008 > > > compiler.compiler. What test lists would be appropriate to test this > > > with? > > > > > > > > > thanks, > > > StefanK From andreas.sjoberg at oracle.com Thu Jun 26 14:24:23 2014 From: andreas.sjoberg at oracle.com (=?ISO-8859-1?Q?Andreas_Sj=F6berg?=) Date: Thu, 26 Jun 2014 16:24:23 +0200 Subject: RFR JDK-8047328: Change typedef CardIdx_t from int to uint16_t Message-ID: <53AC2D17.60507@oracle.com> Hi all, could I please have reviews for this patch that changes the typedef CardIdx_t from int to uint16_t. The motivation behind this patch is to reduce the memory footprint caused by the G1 remembered sets. This adds a _next_null field to the SparsePRTEntry class which keeps track of where the next possible insert could be. The other modifications are to make use of the fact that we know exactly how many cards are contained in the SparsePRTEntry, and no longer have to compare against a designated NullEntry value. webrev: http://cr.openjdk.java.net/~jwilhelm/8047328/webrev/ Thanks, Andreas From thomas.schatzl at oracle.com Thu Jun 26 14:38:49 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 26 Jun 2014 16:38:49 +0200 Subject: RFR(S): JDK-8047330: Remove unrolled card loops in G1 SparsePRTEntry In-Reply-To: <53AA7402.1040608@oracle.com> References: <53A2E53B.3050508@oracle.com> <2772002.dt1otWjIrg@mgerdin03> <53AA7402.1040608@oracle.com> Message-ID: <1403793529.2656.50.camel@cirrus> Hi, On Wed, 2014-06-25 at 09:02 +0200, Andreas Sj?berg wrote: > Hi! 
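To make the two SparsePRTEntry patches in this and the previous mail concrete, a rough sketch of the entry shape they touch. The field names follow the RFRs; the card count, types and method bodies are illustrative assumptions, not the webrev code:

    #include <stdint.h>
    #include <string.h>     // memcpy

    typedef uint16_t CardIdx_t;   // was int; a card index within one region
                                  // fits in 16 bits at the 512-byte card size

    class SparsePRTEntry {
     public:
      enum { CardsPerEntry = 4 };       // assumed value, for illustration

      void init() { _next_null = 0; }   // no per-slot NullEntry sentinel needed

      void add_card(CardIdx_t c) {
        if (_next_null < CardsPerEntry) _cards[_next_null++] = c;
      }

      // 8047330-style copy: one memcpy instead of a hand-unrolled card loop.
      void copy_cards(SparsePRTEntry* e) const {
        memcpy(e->_cards, _cards, _next_null * sizeof(CardIdx_t));
        e->_next_null = _next_null;
      }

     private:
      CardIdx_t _cards[CardsPerEntry];
      int _next_null;   // index of the next free slot; slots below it hold cards
    };

_next_null is what lets the code know exactly how many cards an entry contains, and halving the width of CardIdx_t is where the remembered-set footprint reduction comes from.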
>
> Following Mikael's review and some offline comments from Thomas I've
> made these changes in addition to removing the unrolled card loops:
>
> * removed the now unused define
> * added braces for the for-loop in SparsePRTEntry::init
> * changed the implementation of copy_cards to use memcpy
>
> New webrev: http://cr.openjdk.java.net/~jwilhelm/8047330/webrev.02/

looks okay to me.

Thomas

From jon.masamitsu at oracle.com  Thu Jun 26 17:46:55 2014
From: jon.masamitsu at oracle.com (Jon Masamitsu)
Date: Thu, 26 Jun 2014 10:46:55 -0700
Subject: Request for Review (vs) - 8034056: assert(_heap_alignment >= _space_alignment) failed: heap_alignment less than space_alignment
Message-ID: <53AC5C8F.4040203@oracle.com>

8034056: assert(_heap_alignment >= _space_alignment) failed: heap_alignment less than space_alignment

In the calculation of the heap alignment there was an exception for the
UseParallelGC collector for the case of large pages. The fix was to
remove that exception.

The fix was tested on a machine that exhibited the failure (thanks, Stefan J).

http://cr.openjdk.java.net/~jmasa/8034056/webrev.00/

https://bugs.openjdk.java.net/browse/JDK-8034056

Thanks.

Jon

From erik.helin at oracle.com  Fri Jun 27 10:55:55 2014
From: erik.helin at oracle.com (Erik Helin)
Date: Fri, 27 Jun 2014 12:55:55 +0200
Subject: FW: RFR(S): 8047812: Ensure ClassLoaderDataGraph::classes_unloading_do only delivers klasses from CLDs with non-reclaimed class loader oops
In-Reply-To: <2bf3b050-e3cf-43c2-ac64-61ebe0320061@default>
References: <2bf3b050-e3cf-43c2-ac64-61ebe0320061@default>
Message-ID: <18720451.n6H3skUL5s@ehelin-laptop>

Looks good, Reviewed.

Thanks,
Erik

On Monday 23 June 2014 08:26:51 AM Markus Grönlund wrote:
> Sending this to the Hotspot-GC-dev group as well.
>
> /Markus
>
> From: Markus Grönlund
> Sent: den 23 juni 2014 17:03
> To: hotspot-runtime-dev; serviceability-dev
> Subject: RFR(S): 8047812: Ensure ClassLoaderDataGraph::classes_unloading_do
> only delivers klasses from CLDs with non-reclaimed class loader oops
>
> Greetings,
>
> Kindly asking for reviews for the following change:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8047812
>
> Webrev: http://cr.openjdk.java.net/~mgronlun/8047812/webrev01
>
> Description:
>
> The "8038212: Method::is_valid_method() check has performance regression
> impact for stackwalking" changeset introduced a change in how the
> ClassLoaderDataGraph::_unloading list of ClassLoaderData's is purged.
>
> This change to the purging of the CLDs works the same as before for most
> GCs, but when using the CMS GC, SystemDictionary::do_unloading() is called
> twice with no explicit purge call in between. On the second call
> (post-sweep), we can now get stale class loader oops delivered as part of
> the Klass closure callbacks from the _unloading list. Again, this is
> because there is no explicit purge call in between these two entries to
> SystemDictionary::do_unloading() - and being CMS and concurrent, it is very
> hard to accommodate a timely and proper purge call here.
>
> The first do_unloading call comes after CMS concurrent marking, and the
> second comes from a Full GC triggered while sweeping the CMS heap.
>
> This fix ensures that the unloading purge mechanism works correctly also
> for the CMS collector, in that only CLDs with non-reclaimed class loader
> oops will deliver klasses from the _unloading list.
In addition, this will > ensure a single "logical" pass is achieved when iterating the unloading > list in-between purges (avoiding the processing of the same data twice). > > This fix is precipitated by nightly testing failures with CMS after the > introduction of 8038212: Method::is_valid_method() check has performance > regression impact for stackwalking" - for example > "nsk/sysdict/vm/stress/jck12a//sysdictj12a008" which is crashing because of > following up stale klass loader oop's from the > ClassLoaderDataGraph::_unloading list. > > > > Thanks > > Markus From tprintezis at twitter.com Fri Jun 27 13:00:11 2014 From: tprintezis at twitter.com (Tony Printezis) Date: Fri, 27 Jun 2014 09:00:11 -0400 Subject: The GCLocker blues... Message-ID: <53AD6ADB.10301@twitter.com> Hi all, (trying again from my Twitter address; moderator: feel free to disregard the original I accidentally sent from my personal address) We have recently noticed an interesting problem which seems to happen quite frequently under certain circumstances. Immediately after a young GC, a second one happens which seems unnecessary given that it starts with an empty or almost empty eden. Here's an example: {Heap before GC invocations=2 (full 0): par new generation total 471872K, used 433003K [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000) eden space 419456K, 100% used [0x00000007bae00000, 0x00000007d47a0000, 0x00000007d47a0000) from space 52416K, 25% used [0x00000007d47a0000, 0x00000007d54dacb0, 0x00000007d7ad0000) to space 52416K, 0% used [0x00000007d7ad0000, 0x00000007d7ad0000, 0x00000007dae00000) tenured generation total 524288K, used 0K [0x00000007dae00000, 0x00000007fae00000, 0x00000007fae00000) the space 524288K, 0% used [0x00000007dae00000, 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) compacting perm gen total 21248K, used 2549K [0x00000007fae00000, 0x00000007fc2c0000, 0x0000000800000000) the space 21248K, 12% used [0x00000007fae00000, 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) No shared spaces configured. 1.119: [GC (Allocation Failure)[ParNew: 433003K->15843K(471872K), 0.0103090 secs] 433003K->15843K(996160K), 0.0103320 secs] [Times: user=0.03 sys=0.00, real=0.01 secs] Heap after GC invocations=3 (full 0): par new generation total 471872K, used 15843K [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000) eden space 419456K, 0% used [0x00000007bae00000, 0x00000007bae00000, 0x00000007d47a0000) from space 52416K, 30% used [0x00000007d7ad0000, 0x00000007d8a48c88, 0x00000007dae00000) to space 52416K, 0% used [0x00000007d47a0000, 0x00000007d47a0000, 0x00000007d7ad0000) tenured generation total 524288K, used 0K [0x00000007dae00000, 0x00000007fae00000, 0x00000007fae00000) the space 524288K, 0% used [0x00000007dae00000, 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) compacting perm gen total 21248K, used 2549K [0x00000007fae00000, 0x00000007fc2c0000, 0x0000000800000000) the space 21248K, 12% used [0x00000007fae00000, 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) No shared spaces configured. 
} {Heap before GC invocations=3 (full 0): par new generation total 471872K, used 24002K [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000) eden space 419456K, 1% used [0x00000007bae00000, 0x00000007bb5f7c50, 0x00000007d47a0000) from space 52416K, 30% used [0x00000007d7ad0000, 0x00000007d8a48c88, 0x00000007dae00000) to space 52416K, 0% used [0x00000007d47a0000, 0x00000007d47a0000, 0x00000007d7ad0000) tenured generation total 524288K, used 0K [0x00000007dae00000, 0x00000007fae00000, 0x00000007fae00000) the space 524288K, 0% used [0x00000007dae00000, 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) compacting perm gen total 21248K, used 2549K [0x00000007fae00000, 0x00000007fc2c0000, 0x0000000800000000) the space 21248K, 12% used [0x00000007fae00000, 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) No shared spaces configured. 1.130: [GC (GCLocker Initiated GC)[ParNew: 24002K->12748K(471872K), 0.0123930 secs] 24002K->12748K(996160K), 0.0124130 secs] [Times: user=0.04 sys=0.01, real=0.01 secs] Heap after GC invocations=4 (full 0): par new generation total 471872K, used 12748K [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000) eden space 419456K, 0% used [0x00000007bae00000, 0x00000007bae00000, 0x00000007d47a0000) from space 52416K, 24% used [0x00000007d47a0000, 0x00000007d5413320, 0x00000007d7ad0000) to space 52416K, 0% used [0x00000007d7ad0000, 0x00000007d7ad0000, 0x00000007dae00000) tenured generation total 524288K, used 0K [0x00000007dae00000, 0x00000007fae00000, 0x00000007fae00000) the space 524288K, 0% used [0x00000007dae00000, 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) compacting perm gen total 21248K, used 2549K [0x00000007fae00000, 0x00000007fc2c0000, 0x0000000800000000) the space 21248K, 12% used [0x00000007fae00000, 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) No shared spaces configured. } Notice that: * The timestamp of the second GC (1.130) is almost equal to the timestamp of the first GC plus the duration of the first GC (1.119 + 0.0103320 = 1.1293). In this test young GCs normally happen at a frequency of one every 100ms-110ms or so. * The eden at the start of the second GC is almost empty (1% occupancy). We've also seen it very often with a completely empty eden. * (the big hint) The second GC is GClocker-initiated. This happens most often with ParNew (in some cases, more than 30% of the GCs are those unnecessary ones) but also happens with ParallelGC too but less frequently (maybe 1%-1.5% of the GCs are those unnecessary ones). I was unable to reproduce it with G1. I can reproduce it with with latest JDK 7, JDK 8, and also the latest hotspot-gc/hotspot workspace. Are you guys looking into this (and is there a CR?)? I have a small test I can reproduce it with and a diagnosis / proposed fix(es) if you're interested. Tony -- Tony Printezis | JVM/GC Engineer / VM Team | Twitter @TonyPrintezis tprintezis at twitter.com From jon.masamitsu at oracle.com Fri Jun 27 15:25:49 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Fri, 27 Jun 2014 08:25:49 -0700 Subject: The GCLocker blues... In-Reply-To: <53AD6ADB.10301@twitter.com> References: <53AD6ADB.10301@twitter.com> Message-ID: <53AD8CFD.4080903@oracle.com> Tony, I don't recall talk within the GC group about this type of problem. I didn't find a CR that relates to that behavior. If there is one, I don't think it is on anyone's radar. Can I infer that the problem does not occur in jdk6? Any theories on what's going on? 
Jon On 6/27/2014 6:00 AM, Tony Printezis wrote: > Hi all, > > (trying again from my Twitter address; moderator: feel free to > disregard the original I accidentally sent from my personal address) > > We have recently noticed an interesting problem which seems to happen > quite frequently under certain circumstances. Immediately after a > young GC, a second one happens which seems unnecessary given that it > starts with an empty or almost empty eden. Here's an example: > > {Heap before GC invocations=2 (full 0): > par new generation total 471872K, used 433003K [0x00000007bae00000, > 0x00000007dae00000, 0x00000007dae00000) > eden space 419456K, 100% used [0x00000007bae00000, > 0x00000007d47a0000, 0x00000007d47a0000) > from space 52416K, 25% used [0x00000007d47a0000, > 0x00000007d54dacb0, 0x00000007d7ad0000) > to space 52416K, 0% used [0x00000007d7ad0000, > 0x00000007d7ad0000, 0x00000007dae00000) > tenured generation total 524288K, used 0K [0x00000007dae00000, > 0x00000007fae00000, 0x00000007fae00000) > the space 524288K, 0% used [0x00000007dae00000, > 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) > compacting perm gen total 21248K, used 2549K [0x00000007fae00000, > 0x00000007fc2c0000, 0x0000000800000000) > the space 21248K, 12% used [0x00000007fae00000, > 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) > No shared spaces configured. > 1.119: [GC (Allocation Failure)[ParNew: 433003K->15843K(471872K), > 0.0103090 secs] 433003K->15843K(996160K), 0.0103320 secs] [Times: > user=0.03 sys=0.00, real=0.01 secs] > Heap after GC invocations=3 (full 0): > par new generation total 471872K, used 15843K [0x00000007bae00000, > 0x00000007dae00000, 0x00000007dae00000) > eden space 419456K, 0% used [0x00000007bae00000, > 0x00000007bae00000, 0x00000007d47a0000) > from space 52416K, 30% used [0x00000007d7ad0000, > 0x00000007d8a48c88, 0x00000007dae00000) > to space 52416K, 0% used [0x00000007d47a0000, > 0x00000007d47a0000, 0x00000007d7ad0000) > tenured generation total 524288K, used 0K [0x00000007dae00000, > 0x00000007fae00000, 0x00000007fae00000) > the space 524288K, 0% used [0x00000007dae00000, > 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) > compacting perm gen total 21248K, used 2549K [0x00000007fae00000, > 0x00000007fc2c0000, 0x0000000800000000) > the space 21248K, 12% used [0x00000007fae00000, > 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) > No shared spaces configured. > } > {Heap before GC invocations=3 (full 0): > par new generation total 471872K, used 24002K [0x00000007bae00000, > 0x00000007dae00000, 0x00000007dae00000) > eden space 419456K, 1% used [0x00000007bae00000, > 0x00000007bb5f7c50, 0x00000007d47a0000) > from space 52416K, 30% used [0x00000007d7ad0000, > 0x00000007d8a48c88, 0x00000007dae00000) > to space 52416K, 0% used [0x00000007d47a0000, > 0x00000007d47a0000, 0x00000007d7ad0000) > tenured generation total 524288K, used 0K [0x00000007dae00000, > 0x00000007fae00000, 0x00000007fae00000) > the space 524288K, 0% used [0x00000007dae00000, > 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) > compacting perm gen total 21248K, used 2549K [0x00000007fae00000, > 0x00000007fc2c0000, 0x0000000800000000) > the space 21248K, 12% used [0x00000007fae00000, > 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) > No shared spaces configured. 
> 1.130: [GC (GCLocker Initiated GC)[ParNew: 24002K->12748K(471872K), > 0.0123930 secs] 24002K->12748K(996160K), 0.0124130 secs] [Times: > user=0.04 sys=0.01, real=0.01 secs] > Heap after GC invocations=4 (full 0): > par new generation total 471872K, used 12748K [0x00000007bae00000, > 0x00000007dae00000, 0x00000007dae00000) > eden space 419456K, 0% used [0x00000007bae00000, > 0x00000007bae00000, 0x00000007d47a0000) > from space 52416K, 24% used [0x00000007d47a0000, > 0x00000007d5413320, 0x00000007d7ad0000) > to space 52416K, 0% used [0x00000007d7ad0000, > 0x00000007d7ad0000, 0x00000007dae00000) > tenured generation total 524288K, used 0K [0x00000007dae00000, > 0x00000007fae00000, 0x00000007fae00000) > the space 524288K, 0% used [0x00000007dae00000, > 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) > compacting perm gen total 21248K, used 2549K [0x00000007fae00000, > 0x00000007fc2c0000, 0x0000000800000000) > the space 21248K, 12% used [0x00000007fae00000, > 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) > No shared spaces configured. > } > > Notice that: > > * The timestamp of the second GC (1.130) is almost equal to the > timestamp of the first GC plus the duration of the first GC (1.119 + > 0.0103320 = 1.1293). In this test young GCs normally happen at a > frequency of one every 100ms-110ms or so. > * The eden at the start of the second GC is almost empty (1% > occupancy). We've also seen it very often with a completely empty eden. > * (the big hint) The second GC is GClocker-initiated. > > This happens most often with ParNew (in some cases, more than 30% of > the GCs are those unnecessary ones) but also happens with ParallelGC > too but less frequently (maybe 1%-1.5% of the GCs are those > unnecessary ones). I was unable to reproduce it with G1. > > I can reproduce it with with latest JDK 7, JDK 8, and also the latest > hotspot-gc/hotspot workspace. > > Are you guys looking into this (and is there a CR?)? I have a small > test I can reproduce it with and a diagnosis / proposed fix(es) if > you're interested. > > Tony > From tprintezis at twitter.com Fri Jun 27 16:41:38 2014 From: tprintezis at twitter.com (Tony Printezis) Date: Fri, 27 Jun 2014 12:41:38 -0400 Subject: The GCLocker blues... In-Reply-To: <53AD8CFD.4080903@oracle.com> References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> Message-ID: <53AD9EC2.6080803@twitter.com> Hi Jon, Great to hear from you! :-) I haven't actually tried running the test with JDK 6 (I could if it'd be helpful...). Yes, I know exactly what's going on. There's a race between one thread in jni_unlock() scheduling the GCLocker-initiated young GC (let's call it GC-L) and another thread also scheduling a young GC (let's call it GC-A) because it couldn't allocate due to the eden being full. Under certain circumstances, GC-A can happen first, with GC-L being scheduled and going ahead as soon as GC-A finishes. I'll open a CR and add a more detailed analysis to it. Tony On 6/27/14, 11:25 AM, Jon Masamitsu wrote: > Tony, > > I don't recall talk within the GC group about this type of > problem. I didn't find a CR that relates to that behavior. > If there is one, I don't think it is on anyone's radar. > > Can I infer that the problem does not occur in jdk6? > > Any theories on what's going on? 
> > Jon > > > On 6/27/2014 6:00 AM, Tony Printezis wrote: >> Hi all, >> >> (trying again from my Twitter address; moderator: feel free to >> disregard the original I accidentally sent from my personal address) >> >> We have recently noticed an interesting problem which seems to happen >> quite frequently under certain circumstances. Immediately after a >> young GC, a second one happens which seems unnecessary given that it >> starts with an empty or almost empty eden. Here's an example: >> >> {Heap before GC invocations=2 (full 0): >> par new generation total 471872K, used 433003K >> [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 100% used [0x00000007bae00000, >> 0x00000007d47a0000, 0x00000007d47a0000) >> from space 52416K, 25% used [0x00000007d47a0000, >> 0x00000007d54dacb0, 0x00000007d7ad0000) >> to space 52416K, 0% used [0x00000007d7ad0000, >> 0x00000007d7ad0000, 0x00000007dae00000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. >> 1.119: [GC (Allocation Failure)[ParNew: 433003K->15843K(471872K), >> 0.0103090 secs] 433003K->15843K(996160K), 0.0103320 secs] [Times: >> user=0.03 sys=0.00, real=0.01 secs] >> Heap after GC invocations=3 (full 0): >> par new generation total 471872K, used 15843K [0x00000007bae00000, >> 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 0% used [0x00000007bae00000, >> 0x00000007bae00000, 0x00000007d47a0000) >> from space 52416K, 30% used [0x00000007d7ad0000, >> 0x00000007d8a48c88, 0x00000007dae00000) >> to space 52416K, 0% used [0x00000007d47a0000, >> 0x00000007d47a0000, 0x00000007d7ad0000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. >> } >> {Heap before GC invocations=3 (full 0): >> par new generation total 471872K, used 24002K [0x00000007bae00000, >> 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 1% used [0x00000007bae00000, >> 0x00000007bb5f7c50, 0x00000007d47a0000) >> from space 52416K, 30% used [0x00000007d7ad0000, >> 0x00000007d8a48c88, 0x00000007dae00000) >> to space 52416K, 0% used [0x00000007d47a0000, >> 0x00000007d47a0000, 0x00000007d7ad0000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. 
>> 1.130: [GC (GCLocker Initiated GC)[ParNew: 24002K->12748K(471872K), >> 0.0123930 secs] 24002K->12748K(996160K), 0.0124130 secs] [Times: >> user=0.04 sys=0.01, real=0.01 secs] >> Heap after GC invocations=4 (full 0): >> par new generation total 471872K, used 12748K [0x00000007bae00000, >> 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 0% used [0x00000007bae00000, >> 0x00000007bae00000, 0x00000007d47a0000) >> from space 52416K, 24% used [0x00000007d47a0000, >> 0x00000007d5413320, 0x00000007d7ad0000) >> to space 52416K, 0% used [0x00000007d7ad0000, >> 0x00000007d7ad0000, 0x00000007dae00000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. >> } >> >> Notice that: >> >> * The timestamp of the second GC (1.130) is almost equal to the >> timestamp of the first GC plus the duration of the first GC (1.119 + >> 0.0103320 = 1.1293). In this test young GCs normally happen at a >> frequency of one every 100ms-110ms or so. >> * The eden at the start of the second GC is almost empty (1% >> occupancy). We've also seen it very often with a completely empty eden. >> * (the big hint) The second GC is GClocker-initiated. >> >> This happens most often with ParNew (in some cases, more than 30% of >> the GCs are those unnecessary ones) but also happens with ParallelGC >> too but less frequently (maybe 1%-1.5% of the GCs are those >> unnecessary ones). I was unable to reproduce it with G1. >> >> I can reproduce it with with latest JDK 7, JDK 8, and also the latest >> hotspot-gc/hotspot workspace. >> >> Are you guys looking into this (and is there a CR?)? I have a small >> test I can reproduce it with and a diagnosis / proposed fix(es) if >> you're interested. >> >> Tony >> > -- Tony Printezis | JVM/GC Engineer / VM Team | Twitter @TonyPrintezis tprintezis at twitter.com From tprintezis at twitter.com Fri Jun 27 16:45:52 2014 From: tprintezis at twitter.com (Tony Printezis) Date: Fri, 27 Jun 2014 12:45:52 -0400 Subject: The GCLocker blues... In-Reply-To: <53AD8CFD.4080903@oracle.com> References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> Message-ID: <53AD9FC0.1010302@twitter.com> Jon, https://bugs.openjdk.java.net/browse/JDK-8048556 Tony On 6/27/14, 11:25 AM, Jon Masamitsu wrote: > Tony, > > I don't recall talk within the GC group about this type of > problem. I didn't find a CR that relates to that behavior. > If there is one, I don't think it is on anyone's radar. > > Can I infer that the problem does not occur in jdk6? > > Any theories on what's going on? > > Jon > > > On 6/27/2014 6:00 AM, Tony Printezis wrote: >> Hi all, >> >> (trying again from my Twitter address; moderator: feel free to >> disregard the original I accidentally sent from my personal address) >> >> We have recently noticed an interesting problem which seems to happen >> quite frequently under certain circumstances. Immediately after a >> young GC, a second one happens which seems unnecessary given that it >> starts with an empty or almost empty eden. 
Here's an example: >> >> {Heap before GC invocations=2 (full 0): >> par new generation total 471872K, used 433003K >> [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 100% used [0x00000007bae00000, >> 0x00000007d47a0000, 0x00000007d47a0000) >> from space 52416K, 25% used [0x00000007d47a0000, >> 0x00000007d54dacb0, 0x00000007d7ad0000) >> to space 52416K, 0% used [0x00000007d7ad0000, >> 0x00000007d7ad0000, 0x00000007dae00000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. >> 1.119: [GC (Allocation Failure)[ParNew: 433003K->15843K(471872K), >> 0.0103090 secs] 433003K->15843K(996160K), 0.0103320 secs] [Times: >> user=0.03 sys=0.00, real=0.01 secs] >> Heap after GC invocations=3 (full 0): >> par new generation total 471872K, used 15843K [0x00000007bae00000, >> 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 0% used [0x00000007bae00000, >> 0x00000007bae00000, 0x00000007d47a0000) >> from space 52416K, 30% used [0x00000007d7ad0000, >> 0x00000007d8a48c88, 0x00000007dae00000) >> to space 52416K, 0% used [0x00000007d47a0000, >> 0x00000007d47a0000, 0x00000007d7ad0000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. >> } >> {Heap before GC invocations=3 (full 0): >> par new generation total 471872K, used 24002K [0x00000007bae00000, >> 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 1% used [0x00000007bae00000, >> 0x00000007bb5f7c50, 0x00000007d47a0000) >> from space 52416K, 30% used [0x00000007d7ad0000, >> 0x00000007d8a48c88, 0x00000007dae00000) >> to space 52416K, 0% used [0x00000007d47a0000, >> 0x00000007d47a0000, 0x00000007d7ad0000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. 
>> 1.130: [GC (GCLocker Initiated GC)[ParNew: 24002K->12748K(471872K), >> 0.0123930 secs] 24002K->12748K(996160K), 0.0124130 secs] [Times: >> user=0.04 sys=0.01, real=0.01 secs] >> Heap after GC invocations=4 (full 0): >> par new generation total 471872K, used 12748K [0x00000007bae00000, >> 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 0% used [0x00000007bae00000, >> 0x00000007bae00000, 0x00000007d47a0000) >> from space 52416K, 24% used [0x00000007d47a0000, >> 0x00000007d5413320, 0x00000007d7ad0000) >> to space 52416K, 0% used [0x00000007d7ad0000, >> 0x00000007d7ad0000, 0x00000007dae00000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. >> } >> >> Notice that: >> >> * The timestamp of the second GC (1.130) is almost equal to the >> timestamp of the first GC plus the duration of the first GC (1.119 + >> 0.0103320 = 1.1293). In this test young GCs normally happen at a >> frequency of one every 100ms-110ms or so. >> * The eden at the start of the second GC is almost empty (1% >> occupancy). We've also seen it very often with a completely empty eden. >> * (the big hint) The second GC is GClocker-initiated. >> >> This happens most often with ParNew (in some cases, more than 30% of >> the GCs are those unnecessary ones) but also happens with ParallelGC >> too but less frequently (maybe 1%-1.5% of the GCs are those >> unnecessary ones). I was unable to reproduce it with G1. >> >> I can reproduce it with with latest JDK 7, JDK 8, and also the latest >> hotspot-gc/hotspot workspace. >> >> Are you guys looking into this (and is there a CR?)? I have a small >> test I can reproduce it with and a diagnosis / proposed fix(es) if >> you're interested. >> >> Tony >> > -- Tony Printezis | JVM/GC Engineer / VM Team | Twitter @TonyPrintezis tprintezis at twitter.com From monica.b at servergy.com Fri Jun 27 17:24:25 2014 From: monica.b at servergy.com (Monica Beckwith) Date: Fri, 27 Jun 2014 12:24:25 -0500 Subject: The GCLocker blues... In-Reply-To: <53AD9FC0.1010302@twitter.com> References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> <53AD9FC0.1010302@twitter.com> Message-ID: <53ADA8C9.8040104@servergy.com> Hi Tony/ Jon - AFAIK, this was observed and fixed a while back for G1. I will see if I can find the CR# for G1. -Monica On 6/27/14, 11:45 AM, Tony Printezis wrote: > Jon, > > https://bugs.openjdk.java.net/browse/JDK-8048556 > > Tony > > On 6/27/14, 11:25 AM, Jon Masamitsu wrote: >> Tony, >> >> I don't recall talk within the GC group about this type of >> problem. I didn't find a CR that relates to that behavior. >> If there is one, I don't think it is on anyone's radar. >> >> Can I infer that the problem does not occur in jdk6? >> >> Any theories on what's going on? >> >> Jon >> >> >> On 6/27/2014 6:00 AM, Tony Printezis wrote: >>> Hi all, >>> >>> (trying again from my Twitter address; moderator: feel free to >>> disregard the original I accidentally sent from my personal address) >>> >>> We have recently noticed an interesting problem which seems to >>> happen quite frequently under certain circumstances. 
>>> Immediately after a young GC, a second one happens which seems unnecessary given that it starts with an empty or almost empty eden. Here's an example:
>>>
>>> [GC log, analysis, and reproduction notes snipped; quoted in full in the message above]
>>>
>>> Tony

From Peter.B.Kessler at Oracle.COM  Fri Jun 27 18:35:18 2014
From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler)
Date: Fri, 27 Jun 2014 11:35:18 -0700
Subject: The GCLocker blues...
In-Reply-To: <53AD9EC2.6080803@twitter.com>
References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> <53AD9EC2.6080803@twitter.com>
Message-ID: <53ADB966.7090805@Oracle.COM>

I thought there was code somewhere to address this in the case that multiple threads requested collections (for whatever reason, not just GC-locker).

Something like: when the collection is requested, record the time (or an epoch number like the collection count), and then, when the requested collection is processed, compare that time (or epoch number) to the time (or epoch number) of the last collection. If there's been a collection since the request, assume the requested collection is redundant.

There it is: line 88 of

http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/b67a3f81b630/src/share/vm/gc_implementation/shared/vmGCOperations.cpp

VM_GC_Operation::skip_operation(). With a comment describing why it's there. Maybe it's not detailed enough to prevent an extra collection in your situation?

			... peter

On 06/27/14 09:41, Tony Printezis wrote:
> Hi Jon,
>
> Great to hear from you! :-)
>
> I haven't actually tried running the test with JDK 6 (I could if it'd be helpful...).
>
> Yes, I know exactly what's going on. There's a race between one thread in jni_unlock() scheduling the GCLocker-initiated young GC (let's call it GC-L) and another thread also scheduling a young GC (let's call it GC-A) because it couldn't allocate due to the eden being full. Under certain circumstances, GC-A can happen first, with GC-L being scheduled and going ahead as soon as GC-A finishes.
>
> I'll open a CR and add a more detailed analysis to it.
>
> Tony
>
> On 6/27/14, 11:25 AM, Jon Masamitsu wrote:
>> [Jon's message and Tony's original report snipped; quoted in full in the messages above]
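[For readers without the sources at hand, the check Peter points to compares
the collection counts captured when the VM operation was created against the
heap's current counts. Roughly; this is a simplified paraphrase of the jdk8u
code at that link, not a verbatim copy:

    // vmGCOperations.cpp (simplified): skip the requested collection if
    // some other collection has already completed since it was requested.
    bool VM_GC_Operation::skip_operation() const {
      bool skip = (_gc_count_before != Universe::heap()->total_collections());
      if (_full && skip) {
        skip = (_full_gc_count_before !=
                Universe::heap()->total_full_collections());
      }
      return skip;
    }

The counts (_gc_count_before and _full_gc_count_before) are stamped into the
operation by the requesting thread; the comparison happens in the operation's
prologue, before any safepoint work is done.]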
From jon.masamitsu at oracle.com  Fri Jun 27 19:18:20 2014
From: jon.masamitsu at oracle.com (Jon Masamitsu)
Date: Fri, 27 Jun 2014 12:18:20 -0700
Subject: The GCLocker blues...
In-Reply-To: <53ADB966.7090805@Oracle.COM>
References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> <53AD9EC2.6080803@twitter.com> <53ADB966.7090805@Oracle.COM>
Message-ID: <53ADC37C.6070102@oracle.com>

Peter,

Some of the cases that Tony points out have edens with small amounts of used space. That would indicate that a GC finished, the mutators were restarted, and then the GC needed by the GC-locker was requested.

Jon

On 6/27/2014 11:35 AM, Peter B. Kessler wrote:
> [Peter's message and the quoted exchange snipped; quoted in full in the messages above]

From Peter.B.Kessler at Oracle.COM  Fri Jun 27 21:04:32 2014
From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler)
Date: Fri, 27 Jun 2014 14:04:32 -0700
Subject: The GCLocker blues...
In-Reply-To: <53ADC37C.6070102@oracle.com>
References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> <53AD9EC2.6080803@twitter.com> <53ADB966.7090805@Oracle.COM> <53ADC37C.6070102@oracle.com>
Message-ID: <53ADDC60.6040702@Oracle.COM>

Right. VM_GC_Operation::skip_operation() could be made smarter. The current version just looks at the count of collections. There's lots of data available about why a collection was requested, and again when it gets to the prologue. E.g., young generation occupancy. There's all the time in the world to make a good decision, relative to the time for the collection (if one happens) or the savings (if a collection is avoided).

			... peter

On 06/27/14 12:18, Jon Masamitsu wrote:
> [Jon's message and the quoted exchange snipped; quoted in full in the messages above]
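[A sketch of the kind of additional check Peter is suggesting; purely
illustrative, not existing HotSpot code, and it treats the heap-wide used()
and capacity() as stand-ins for per-generation numbers:

    // Hypothetical guard: treat a GCLocker-initiated young GC as redundant
    // if the heap is still nearly empty, i.e. a collection almost
    // certainly just ran.
    bool gclocker_gc_looks_redundant(double threshold /* e.g. 0.05 */) {
      CollectedHeap* heap = Universe::heap();
      return (double)heap->used() < threshold * (double)heap->capacity();
    }

The hard part, as the rest of the thread shows, is that occupancy alone
cannot distinguish "a GC just ran" from "the application simply has not
allocated much yet", which is why the fix Tony sketches below is based on
GC counts instead.]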
From tprintezis at twitter.com  Mon Jun 30 13:28:21 2014
From: tprintezis at twitter.com (Tony Printezis)
Date: Mon, 30 Jun 2014 09:28:21 -0400
Subject: The GCLocker blues...
In-Reply-To: <53ADDC60.6040702@Oracle.COM>
References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> <53AD9EC2.6080803@twitter.com> <53ADB966.7090805@Oracle.COM> <53ADC37C.6070102@oracle.com> <53ADDC60.6040702@Oracle.COM>
Message-ID: <53B165F5.5090900@twitter.com>

Peter,

Yes, each GC VM op has the current GC counts, and if by the time it runs the GC counts are out-of-date, the op bails out in the prologue without a safepoint. However, as I described on the CR, this doesn't work here. By the time the thread gets the Heap_lock and reads the GC counts, another GC has already happened. So, when the GC VM op runs, the GC counts are up-to-date.

The GC count protocol works for threads that are trying to do allocations, given that those decisions are serialized using the Heap_lock. However, the thread in jni_unlock() is not part of that protocol, hence it is oblivious to those decisions.

Tony

On 6/27/14, 5:04 PM, Peter B. Kessler wrote:
> [Peter's message and the quoted exchange snipped; quoted in full in the messages above]

--
Tony Printezis | JVM/GC Engineer / VM Team | Twitter
@TonyPrintezis
tprintezis at twitter.com
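[The allocation-side protocol Tony describes looks roughly like this in the
collector policy code; simplified and from memory, so treat the exact names
and signatures as approximate:

    // Allocation-failure path (simplified): the GC count is read under the
    // Heap_lock, so a collection that races ahead of this request is
    // visible to VM_GC_Operation::skip_operation().
    HeapWord* allocate_with_gc_sketch(size_t size, bool is_tlab) {
      unsigned int gc_count_before;
      {
        MutexLocker ml(Heap_lock);
        gc_count_before = Universe::heap()->total_collections();
      }
      VM_GenCollectForAllocation op(size, is_tlab, gc_count_before);
      VMThread::execute(&op);
      return op.result();
    }

The jni_unlock() path requests its collection without going through this
Heap_lock/count handshake, which is the hole Tony is pointing at.]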
From tprintezis at twitter.com  Mon Jun 30 13:32:18 2014
From: tprintezis at twitter.com (Tony Printezis)
Date: Mon, 30 Jun 2014 09:32:18 -0400
Subject: The GCLocker blues...
In-Reply-To: <53ADDC60.6040702@Oracle.COM>
References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> <53AD9EC2.6080803@twitter.com> <53ADB966.7090805@Oracle.COM> <53ADC37C.6070102@oracle.com> <53ADDC60.6040702@Oracle.COM>
Message-ID: <53B166E2.40706@twitter.com>

PS: You're right that skip_operation() is not performance critical; however, I'm not sure it's possible to make it smarter in this case. The GC VM op that's about to do the GCLocker-initiated GC doesn't have enough information to know whether another GC happened meanwhile or not. The only way to do that is to attach the correct GC counts to the VM op (see the CR for this and another suggested fix).

On 6/27/14, 5:04 PM, Peter B. Kessler wrote:
> [Peter's message and the quoted exchange snipped; quoted in full in the messages above]

--
Tony Printezis | JVM/GC Engineer / VM Team | Twitter
@TonyPrintezis
tprintezis at twitter.com
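[A sketch of the direction Tony is suggesting; hypothetical code, not the
actual patch (see JDK-8048556 for the real proposal): the counts would be
captured at the moment the GCLocker decides a GC is needed, rather than
whenever the VM op finally reaches the prologue:

    // Hypothetical variant of the GCLocker unlock path: stamp the op with
    // the count observed when the stalled allocation set needs_gc(), not
    // the count at the moment the op runs.
    void gclocker_request_young_gc_sketch(unsigned int gc_count_at_stall) {
      VM_GenCollectForAllocation op(0 /* word_size: no allocation */,
                                    false /* is_tlab */,
                                    gc_count_at_stall);
      VMThread::execute(&op);
      // If an allocation-failure GC (Tony's GC-A) completes between the
      // stall and this op running, total_collections() has moved past
      // gc_count_at_stall and skip_operation() will skip GC-L.
    }

]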
From serkanozal86 at hotmail.com  Sat Jun 28 12:21:15 2014
From: serkanozal86 at hotmail.com (serkan özal)
Date: Sat, 28 Jun 2014 15:21:15 +0300
Subject: Compressed-OOPs on JVM
Message-ID:

Hi all,

As you know, sometimes, although compressed oops are used, if the Java heap size is < 4Gb and the heap can be placed in the low virtual address space (below 4Gb), then compressed oops can be used without encoding/decoding. (https://wikis.oracle.com/display/HotSpotInternals/CompressedOops)

In a 64-bit JVM with compressed oops enabled and with minimum heap size 1G and maximum heap size 1G, object references are 4 bytes. In this case, a compressed oop is a real native address. But in a 64-bit JVM with compressed oops enabled and with minimum heap size 4G and maximum heap size 8G, object references are also 4 bytes; in this case, though, a compressed oop needs to be encoded/decoded (by 3-bit shifting) before getting the real native address.

In both cases compressed oops are enabled, but how can I detect whether compressed oops are used as native addresses or whether they need to be encoded/decoded? If they are encoded/decoded, what is the value of the bit shift?

Thanks in advance.

--

Serkan ÖZAL
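[For the archives, the decoding in question is simple arithmetic; a C++
sketch of the compressed-oops modes HotSpot picks between (the shift of 3
follows from the default 8-byte object alignment):

    #include <cstdint>

    // narrow : the 4-byte compressed reference stored in an object field
    // base   : the heap base address (0 for unscaled and zero-based modes)
    // shift  : 0 for unscaled mode, 3 (log2 of 8-byte alignment) otherwise
    inline uintptr_t decode_oop(uint32_t narrow, uintptr_t base, unsigned shift) {
      return base + (static_cast<uintptr_t>(narrow) << shift);
    }

    // Mode selection, roughly:
    //   heap ends below 4GB  -> "unscaled":   base == 0, shift == 0
    //   heap ends below 32GB -> "zero-based": base == 0, shift == 3
    //   otherwise            -> "heap-based": base != 0, shift == 3

One way to see which mode a given JVM picked (on the JDK 7/8 builds of this
era, if memory serves) is the diagnostic flag combination
-XX:+UnlockDiagnosticVMOptions -XX:+PrintCompressedOopsMode, which prints
the chosen mode, base, and shift at startup.]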