From per.liden at oracle.com Mon Jun 2 08:38:44 2014 From: per.liden at oracle.com (Per Liden) Date: Mon, 02 Jun 2014 10:38:44 +0200 Subject: RFR(S): 8040807: G1: Enable G1CollectedHeap::stop() In-Reply-To: <538473BB.9070300@oracle.com> References: <537B43BC.2090004@oracle.com> <537B6A8A.5030607@oracle.com> <537DF73D.6080204@oracle.com> <538473BB.9070300@oracle.com> Message-ID: <538C3814.9000906@oracle.com> Ping! /Per On 05/27/2014 01:15 PM, Per Liden wrote: > Hi, > > I did some additional testing and eyeballing of this fix and noticed > that it would be a good idea to also tell concurrent mark to abort, > otherwise we will always wait until concurrent mark has finished, which > is unnecessary (and could potentially take some time if the live set is > large). So, I added a call to _cm->set_has_aborted() to abort any > ongoing concurrent mark. > > Updated webrev: > http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ > > Diff against previous webrev: > http://cr.openjdk.java.net/~pliden/8040807/webrev.diff_0vs1/ > > Testing: > Wrote a simple test to provoke a concurrent mark followed by an > immediate exit. With the first version of the patch, we would always > wait until concurrent mark completes. Now it will instead show a > concurrent-mark-abort, which happens much earlier. > > /Per > > On 05/22/2014 03:10 PM, Per Liden wrote: >> Thanks Jon! >> >> /Per >> >> On 2014-05-20 16:45, Jon Masamitsu wrote: >>> Looks good. >>> >>> Reviewed. >>> >>> Jon >>> >>> On 05/20/2014 04:59 AM, Per Liden wrote: >>>> Looking for a couple of reviews of this patch. >>>> >>>> Summary: This patch re-enables the controlled stopping of G1's >>>> concurrent threads at VM shutdown. This could potentially cause hangs >>>> during VM shutdown because the G1 marking threads could get stuck in >>>> various places and fail to terminate. JDK-8040803 and JDK-8040804 >>>> fixed these issues, so this is the final step to re-enable the actual >>>> stopping of those threads. This patch also moves the call to >>>> CollectedHeap::stop() a few lines down to group the GC related stuff >>>> together. It also adjusts/removes some comments that are no longer >>>> correct. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8040807 >>>> Webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.0/ >>>> >>>> Testing: >>>> - GC nightlies. 5 tests in this suite used to timeout because of the >>>> issue with hanging threads. They now pass. >>>> - JPRT >>>> >>>> Thanks! >>>> /Per >>> >> > From serkanozal86 at hotmail.com Sun Jun 1 18:42:56 2014 From: serkanozal86 at hotmail.com (Serkan Özal) Date: Sun, 1 Jun 2014 21:42:56 +0300 Subject: FW: Hiding Class Definitions from Compacting At GC Cycle In-Reply-To: References: Message-ID: Hi all, I am not sure whether this group is the right target for this mail, but I don't know a better one to ask :) I am currently working on an OffHeap solution and I have a problem with the "Compact" phase of GC. As I see it, at the "Compact" phase the location of classes may be changed. I tried class pinning with JNI via the "NewGlobalRef" method, but it doesn't prevent compacting. As I understand, it only hides the object from being garbage collected. In brief, is there any way to prevent compacting of a specific class definition (or object) at a GC cycle? Is there any bit, offset or field (such as mark_oop) in the object header that fully prevents a specific object or class from being compacted by GC? Thanks in advance.
-- Serkan ÖZAL -------------- next part -------------- An HTML attachment was scrubbed... URL: From bengt.rutisson at oracle.com Mon Jun 2 14:30:34 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Mon, 02 Jun 2014 16:30:34 +0200 Subject: RFR (M): JDK-8043239: G1: Missing post barrier in processing of j.l.ref.Reference objects Message-ID: <538C8A8A.1080309@oracle.com> Hi all, Can I have a couple of reviews for this change? http://cr.openjdk.java.net/~brutisso/8043239/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8043239 As described in the bug report the reference processor was missing a write barrier call when manipulating the discovered list. This has always been the case but it was hidden because at the end of the reference processing we went through the complete discovered list and dirtied all the missed cards because we did an (unnecessary) write barrier when we set the next field to be a self pointer pointing back at the reference object itself. The write barrier for setting the next field was removed since it was not needed, but that revealed the current bug. After some discussions and prototyping we came to the conclusion that there may be more barriers missing and that it is difficult to get the dirtying done the way our verification code assumes. A simpler solution seems to be to free the reference processing of all barriers and instead just make sure that we dirty all the right cards in the last pass. The proposed fix thus re-introduces the post barrier when we iterate over the discovered list. This time it uses the discovered field for the barrier to be more explicit about what is going on. Testing: JPRT, Kitchensink, 5 days GC test suite SPECjbb2013 Ad-hoc aurora run Specific reproducer that illustrated the problem. The specific reproducer was really good to pinpoint the problem but is hard to turn into a JTreg test. Many thanks go to StefanK for helping out with creating the reproducer. Thanks, Bengt
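For readers following the barrier discussion in this thread: a post (write) barrier in a generational collector records which part of the heap was just mutated, typically by dirtying a byte in a card table, so that a later scan can find references the collector would otherwise miss. Below is a minimal, self-contained Java sketch of that general idea, and of why a store of NULL can safely skip the barrier. It illustrates the technique only; it is not HotSpot's implementation, and every name in it is invented.

// Illustrative sketch of a card-marking post barrier (not HotSpot code).
public class CardTableSketch {
    static final int CARD_SHIFT = 9;   // 512-byte cards, a common choice
    static final byte CLEAN = 0, DIRTY = 1;

    final byte[] cards;

    CardTableSketch(int heapBytes) {
        cards = new byte[(heapBytes >> CARD_SHIFT) + 1];
    }

    // Barriered store: dirty the card covering the written field so a
    // later remembered-set scan re-examines it for missed references.
    void writeWithPostBarrier(int fieldAddress, Object value) {
        // ... the actual reference store would happen here ...
        cards[fieldAddress >> CARD_SHIFT] = DIRTY;
    }

    // Raw store: no card is dirtied. This is safe when storing null,
    // because a null field cannot hold a reference the GC could miss.
    void writeRaw(int fieldAddress, Object value) {
        // ... the actual reference store would happen here ...
    }
}

The bug described above is the dual case: a real reference store on the discovered list was done without the barrier, so its card stayed clean and a later scan could miss it.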
From per.liden at oracle.com Mon Jun 2 14:57:42 2014 From: per.liden at oracle.com (Per Liden) Date: Mon, 02 Jun 2014 16:57:42 +0200 Subject: RFR (M): JDK-8043239: G1: Missing post barrier in processing of j.l.ref.Reference objects In-Reply-To: <538C8A8A.1080309@oracle.com> References: <538C8A8A.1080309@oracle.com> Message-ID: <538C90E6.5080309@oracle.com> Looks good to me Bengt. Even if this means we sometimes dirty too many cards this looks like a much less error-prone approach, which I like. /Per On 06/02/2014 04:30 PM, Bengt Rutisson wrote: > > Hi all, > > Can I have a couple of reviews for this change? > > http://cr.openjdk.java.net/~brutisso/8043239/webrev.00/ > > https://bugs.openjdk.java.net/browse/JDK-8043239 > > As described in the bug report the reference processor was missing a > write barrier call when manipulating the discovered list. This has > always been the case but it was hidden because at the end of the > reference processing we went through the complete discovered list and > dirtied all the missed cards because we did an (unnecessary) write > barrier when we set the next field to be a self pointer > pointing back at the reference object itself. > > The write barrier for setting the next field was removed since it was > not needed, but that revealed the current bug. After some discussions > and prototyping we came to the conclusion that there may be more > barriers missing and that it is difficult to get the dirtying done the > way our verification code assumes. A simpler solution seems to be to > free the reference processing of all barriers and instead just make sure > that we dirty all the right cards in the last pass. > > The proposed fix thus re-introduces the post barrier when we iterate > over the discovered list. This time it uses the discovered field for the > barrier to be more explicit about what is going on. > > Testing: > JPRT, > Kitchensink, 5 days > GC test suite > SPECjbb2013 > Ad-hoc aurora run > Specific reproducer that illustrated the problem. > > The specific reproducer was really good to pinpoint the problem but is > hard to turn into a JTreg test. Many thanks go to StefanK for helping > out with creating the reproducer. > > Thanks, > Bengt From bengt.rutisson at oracle.com Mon Jun 2 15:22:30 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Mon, 02 Jun 2014 17:22:30 +0200 Subject: RFR (M): JDK-8043239: G1: Missing post barrier in processing of j.l.ref.Reference objects In-Reply-To: <538C90E6.5080309@oracle.com> References: <538C8A8A.1080309@oracle.com> <538C90E6.5080309@oracle.com> Message-ID: <538C96B6.6020705@oracle.com> On 6/2/14 4:57 PM, Per Liden wrote: > Looks good to me Bengt. > > Even if this means we sometimes dirty too many cards this looks like a > much less error-prone approach, which I like. Thanks for the quick review, Per! Bengt > > /Per > > On 06/02/2014 04:30 PM, Bengt Rutisson wrote: >> >> Hi all, >> >> Can I have a couple of reviews for this change? >> >> http://cr.openjdk.java.net/~brutisso/8043239/webrev.00/ >> >> https://bugs.openjdk.java.net/browse/JDK-8043239 >> >> As described in the bug report the reference processor was missing a >> write barrier call when manipulating the discovered list. This has >> always been the case but it was hidden because at the end of the >> reference processing we went through the complete discovered list and >> dirtied all the missed cards because we did an (unnecessary) write >> barrier when we set the next field to be a self pointer >> pointing back at the reference object itself. >> >> The write barrier for setting the next field was removed since it was >> not needed, but that revealed the current bug. After some discussions >> and prototyping we came to the conclusion that there may be more >> barriers missing and that it is difficult to get the dirtying done the >> way our verification code assumes. A simpler solution seems to be to >> free the reference processing of all barriers and instead just make sure >> that we dirty all the right cards in the last pass. >> >> The proposed fix thus re-introduces the post barrier when we iterate >> over the discovered list. This time it uses the discovered field for the >> barrier to be more explicit about what is going on. >> >> Testing: >> JPRT, >> Kitchensink, 5 days >> GC test suite >> SPECjbb2013 >> Ad-hoc aurora run >> Specific reproducer that illustrated the problem. >> >> The specific reproducer was really good to pinpoint the problem but is >> hard to turn into a JTreg test. Many thanks go to StefanK for helping >> out with creating the reproducer.
>> >> Thanks, >> Bengt > From rednaxelafx at gmail.com Mon Jun 2 18:27:52 2014 From: rednaxelafx at gmail.com (Krystal Mok) Date: Mon, 2 Jun 2014 11:27:52 -0700 Subject: Re: FW: Hiding Class Definitions from Compacting At GC Cycle In-Reply-To: References: Message-ID: Hi Serkan, Taobao developed something called "GCIH" (GC-Invisible Heap), which is also an off-heap solution, that might be similar to what you're trying to do. I was a part of the effort when I worked there. The JVM part of the source code of a very early version of the GCIH is available here: http://jvm.taobao.org/images/4/49/Jvm_gcih.patch We record the high-water mark whenever we touch a PermGen object when moving objects into GCIH:

+ if (p->is_klass() || p->is_perm()) {
+   if (GCInvisibleHeap::_top_klass_addr < p) {
+     GCInvisibleHeap::_top_klass_addr = p;
+   }
+   return;
+ }

And then there were multiple ways to do things. In this version of GCIH we would traverse all objects in GCIH to fix up their klass pointers after the PermGen has been compacted. There was another version that would simply prevent the GC from compacting the part of PermGen below our high-water mark. I'm not sure how it evolved after I left, but the implementation is much more stable now, so they might have come up with a better way to do it. Nonetheless, all these solutions require customizing the JVM internals, which might not be the thing you want to do. If you target your off-heap solution only at Java 8 or above, and only at HotSpot, however, then you don't have to worry about metadata objects moving around (at least for now). That's because the PermGen is removed from GC and moved into a piece of native memory called "Metaspace", so they're not subject to GC compaction anymore. I should mention both JRockit and IBM J9 don't have a PermGen even before Java 8, and their metadata objects are not subject to compaction either. I guess you're trying to be JVM-implementation-agnostic here, but since not all JVMs do compaction, and not all JVMs support object pinning, there's no cross-JVM compatible API that allows you to prevent metadata objects from being compacted. - Kris On Sun, Jun 1, 2014 at 11:42 AM, Serkan Özal wrote: > Hi all, > > I am not sure whether this group is the right target for this mail, but I > don't know a better one to ask :) > > I am currently working on an OffHeap solution and I have a problem with the > "Compact" phase of GC. As I see it, at the "Compact" phase the location of classes may > be changed. I tried class pinning with JNI via the "NewGlobalRef" method but it > doesn't prevent compacting. As I understand, it only hides the object from > being garbage collected. > In brief, is there any way to prevent compacting of a specific class > definition (or object) at a GC cycle? Is there any bit, offset or field (such as > mark_oop) in the object header that fully prevents a specific object or class > from being compacted by GC? > > Thanks in advance. > > -- > > Serkan ÖZAL > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
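As a concrete footnote to Kris' answer: the small program below illustrates, on HotSpot, that a strong reference (which is what a JNI NewGlobalRef amounts to) keeps an object alive but does not keep it in place. It peeks at raw reference bits through sun.misc.Unsafe, so it is deliberately implementation-dependent, and it assumes a 64-bit HotSpot VM started with -XX:-UseCompressedOops so that a reference occupies a plain 64-bit word. This is a demonstration sketch only, not something to rely on in production code.

import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class AddressMoves {
    public static void main(String[] args) throws Exception {
        // Standard reflective access to the Unsafe singleton.
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        Object o = new Object();
        Object[] holder = new Object[] { o };
        long offset = unsafe.arrayBaseOffset(Object[].class);

        long before = unsafe.getLong(holder, offset); // raw "address" bits of o
        System.gc();                                  // may relocate o (e.g. copying/compaction)
        long after = unsafe.getLong(holder, offset);

        System.out.println("before GC: 0x" + Long.toHexString(before));
        System.out.println("after  GC: 0x" + Long.toHexString(after));
        // If the two values differ, the object was moved; the strong
        // reference kept it alive, but did not pin its location.
    }
}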
From jon.masamitsu at oracle.com Mon Jun 2 21:22:14 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 02 Jun 2014 14:22:14 -0700 Subject: RFR(S): 8040807: G1: Enable G1CollectedHeap::stop() In-Reply-To: <538C3814.9000906@oracle.com> References: <537B43BC.2090004@oracle.com> <537B6A8A.5030607@oracle.com> <537DF73D.6080204@oracle.com> <538473BB.9070300@oracle.com> <538C3814.9000906@oracle.com> Message-ID: <538CEB06.7050707@oracle.com> On 06/02/2014 01:38 AM, Per Liden wrote: > Ping! > > /Per > > On 05/27/2014 01:15 PM, Per Liden wrote: >> Hi, >> >> I did some additional testing and eyeballing of this fix and noticed >> that it would be a good idea to also tell concurrent mark to abort, >> otherwise we will always wait until concurrent mark has finished, which >> is unnecessary (and could potentially take some time if the live set is >> large). So, I added a call to _cm->set_has_aborted() to abort any >> ongoing concurrent mark. >> >> Updated webrev: >> http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ >> >> Diff against previous webrev: >> http://cr.openjdk.java.net/~pliden/8040807/webrev.diff_0vs1/ Looks good. Reviewed. Jon >> >> Testing: >> Wrote a simple test to provoke a concurrent mark followed by an >> immediate exit. With the first version of the patch, we would always >> wait until concurrent mark completes. Now it will instead show a >> concurrent-mark-abort, which happens much earlier. >> >> /Per >> >> On 05/22/2014 03:10 PM, Per Liden wrote: >>> Thanks Jon! >>> >>> /Per >>> >>> On 2014-05-20 16:45, Jon Masamitsu wrote: >>>> Looks good. >>>> >>>> Reviewed. >>>> >>>> Jon >>>> >>>> On 05/20/2014 04:59 AM, Per Liden wrote: >>>>> Looking for a couple of reviews of this patch. >>>>> >>>>> Summary: This patch re-enables the controlled stopping of G1's >>>>> concurrent threads at VM shutdown. This could potentially cause hangs >>>>> during VM shutdown because the G1 marking threads could get stuck in >>>>> various places and fail to terminate. JDK-8040803 and JDK-8040804 >>>>> fixed these issues, so this is the final step to re-enable the actual >>>>> stopping of those threads. This patch also moves the call to >>>>> CollectedHeap::stop() a few lines down to group the GC related stuff >>>>> together. It also adjusts/removes some comments that are no longer >>>>> correct. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8040807 >>>>> Webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.0/ >>>>> >>>>> Testing: >>>>> - GC nightlies. 5 tests in this suite used to timeout because of the >>>>> issue with hanging threads. They now pass. >>>>> - JPRT >>>>> >>>>> Thanks! >>>>> /Per >>>> >>> >> > From vladimir.kozlov at oracle.com Mon Jun 2 21:55:15 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 02 Jun 2014 14:55:15 -0700 Subject: RFR(XS) : 8044575 : testlibrary_tests/whitebox/vm_flags/UintxTest.java failed: assert(!res || TypeEntriesAtCall::arguments_profiling_enabled()) failed: no profiling of arguments In-Reply-To: <538CE366.90806@oracle.com> References: <538CE366.90806@oracle.com> Message-ID: <538CF2C3.8090107@oracle.com> Hi Igor, Looks good to me but I would ask GC group to comment on this change.
Thanks, Vladimir On 6/2/14 1:49 PM, Igor Ignatyev wrote: > webrev: http://cr.openjdk.java.net/~iignatyev/8044575/webrev.00/ > 4 lines changed: 0 ins; 2 del; 2 mod; > > Hi all, > > Please review patch: > > Problem: > the test changes 'TypeProfileLevel' via WhiteBox during execution, but > 'TypeProfileLevel' isn't supposed to be changed and there's the asserts > based on that. the test w/ '-Xcomp and -XX:-TieredCompilation' triggers > one of these asserts. > > Fix: > - as a flag to change, the test uses 'VerifyGCStartAt' instead of > 'TypeProfileLevel'. 'VerifyGCStartAt' is safe to change during execution > - removed 'System.out.println' which was left by accident > > jbs: https://bugs.openjdk.java.net/browse/JDK-8044575 > testing: failing tests locally w/ different flags combinations From jon.masamitsu at oracle.com Mon Jun 2 22:25:43 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 02 Jun 2014 15:25:43 -0700 Subject: RFR (M): JDK-8043239: G1: Missing post barrier in processing of j.l.ref.Reference objects In-Reply-To: <538C8A8A.1080309@oracle.com> References: <538C8A8A.1080309@oracle.com> Message-ID: <538CF9E7.6020301@oracle.com> Bengt, http://cr.openjdk.java.net/~brutisso/8043239/webrev.00/src/share/vm/memory/referenceProcessor.cpp.frames.html The change to always use "set_next_raw()" here 520 java_lang_ref_Reference::set_next_raw(_ref, NULL); 521 } else { 522 java_lang_ref_Reference::set_next(_ref, NULL); 523 } was always the correct thing to use? Does not have to do with extra / unneeded write barrier? Looks good. Reviewed. Jon On 06/02/2014 07:30 AM, Bengt Rutisson wrote: > > Hi all, > > Can I have a couple of reviews for this change? > > http://cr.openjdk.java.net/~brutisso/8043239/webrev.00/ > > https://bugs.openjdk.java.net/browse/JDK-8043239 > > As described in the bug report the reference processor was missing a > write barrier call when manipulating the discovered list. This has > always been the case but it was hidden because at the end of the > reference processing we went through the complete discovered list and > dirtied all the missed cards because we did an (unnecessary) write > barrier when we set the next field to point to be a self pointer > pointing back at the reference object itself. > > The write barrier for setting the next field was removed since it was > not needed, but that revealed the current bug. After some discussions > and prototyping we came to the conclusion that there may be more > barriers missing and that it is difficult to get the dirtying done the > way our verification code assumes. A simpler solution seems to be to > free the reference processing of all barriers and instead just make > sure that we dirty all the right cards in the last pass. > > The proposed fix thus re-introduces the post barrier when we iterate > over the discovered list. This time it uses the discovered field for > the barrier to be more explicit about what is going on. > > Testing: > JPRT, > Kitchensink, 5 days > GC test suite > SPECjbb2013 > Ad-hoc aurora run > Specific reproducer that illustrated the problem. > > The specific reproducer was really good to pinpoint the problem but is > hard to turn in to a JTreg test. Many thanks go to StefanK for helping > out with creating the reproducer. 
> > Thanks, > Bengt From thomas.schatzl at oracle.com Tue Jun 3 07:15:54 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 03 Jun 2014 09:15:54 +0200 Subject: RFR(S): 8040807: G1: Enable G1CollectedHeap::stop() In-Reply-To: <538473BB.9070300@oracle.com> References: <537B43BC.2090004@oracle.com> <537B6A8A.5030607@oracle.com> <537DF73D.6080204@oracle.com> <538473BB.9070300@oracle.com> Message-ID: <1401779754.2592.0.camel@cirrus> Hi, On Tue, 2014-05-27 at 13:15 +0200, Per Liden wrote: > Hi, > > I did some additional testing and eyeballing of this fix and noticed > that it would be a good idea to also tell concurrent mark to abort, > otherwise we will always wait until concurrent mark has finished, which > is unnecessary (and could potentially take some time if the live set is > large). So, I added a call to _cm->set_has_aborted() to abort any > ongoing concurrent mark. > > Updated webrev: > http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ > > Diff against previous webrev: > http://cr.openjdk.java.net/~pliden/8040807/webrev.diff_0vs1/ > > Testing: > Wrote a simple test to provoke an concurrent mark followed by an > immediate exit. With the first version of the patch, we would always > wait until concurrent mark completes. Now it will instead show an > concurrent-mark-abort, which happens much earlier. Looks okay. Thomas From bengt.rutisson at oracle.com Tue Jun 3 08:04:57 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 03 Jun 2014 10:04:57 +0200 Subject: RFR(S): 8040807: G1: Enable G1CollectedHeap::stop() In-Reply-To: <538C3814.9000906@oracle.com> References: <537B43BC.2090004@oracle.com> <537B6A8A.5030607@oracle.com> <537DF73D.6080204@oracle.com> <538473BB.9070300@oracle.com> <538C3814.9000906@oracle.com> Message-ID: <538D81A9.9000109@oracle.com> Hi Per, Looks good. Bengt On 2014-06-02 10:38, Per Liden wrote: > Ping! > > /Per > > On 05/27/2014 01:15 PM, Per Liden wrote: >> Hi, >> >> I did some additional testing and eyeballing of this fix and noticed >> that it would be a good idea to also tell concurrent mark to abort, >> otherwise we will always wait until concurrent mark has finished, which >> is unnecessary (and could potentially take some time if the live set is >> large). So, I added a call to _cm->set_has_aborted() to abort any >> ongoing concurrent mark. >> >> Updated webrev: >> http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ >> >> Diff against previous webrev: >> http://cr.openjdk.java.net/~pliden/8040807/webrev.diff_0vs1/ >> >> Testing: >> Wrote a simple test to provoke an concurrent mark followed by an >> immediate exit. With the first version of the patch, we would always >> wait until concurrent mark completes. Now it will instead show an >> concurrent-mark-abort, which happens much earlier. >> >> /Per >> >> On 05/22/2014 03:10 PM, Per Liden wrote: >>> Thanks Jon! >>> >>> /Per >>> >>> On 2014-05-20 16:45, Jon Masamitsu wrote: >>>> Looks good. >>>> >>>> Reviewed. >>>> >>>> Jon >>>> >>>> On 05/20/2014 04:59 AM, Per Liden wrote: >>>>> Looking for a couple of reviews in this patch. >>>>> >>>>> Summary: This patch re-enables the controlled stopping of G1's >>>>> concurrent threads at VM shutdown. This could potentially cause hangs >>>>> during VM shutdown because the G1 marking threads could get stuck in >>>>> various places and fail to terminate. JDK-8040803 and JDK-8040804 >>>>> fixed these issues, so this is the final step to re-enable the actual >>>>> stopping of those threads. 
This patch also moves the call to >>>>> CollectedHeap::stop() a few lines down to group the GC related stuff >>>>> together. It also adjusts/removes some comments that are no longer >>>>> correct. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8040807 >>>>> Webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.0/ >>>>> >>>>> Testing: >>>>> - GC nightlies. 5 tests in this suite used to timeout because of the >>>>> issue with hanging threads. They now pass. >>>>> - JPRT >>>>> >>>>> Thanks! >>>>> /Per >>>> >>> >> > From bengt.rutisson at oracle.com Tue Jun 3 08:37:47 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 03 Jun 2014 10:37:47 +0200 Subject: RFR (M): JDK-8043239: G1: Missing post barrier in processing of j.l.ref.Reference objects In-Reply-To: <538CF9E7.6020301@oracle.com> References: <538C8A8A.1080309@oracle.com> <538CF9E7.6020301@oracle.com> Message-ID: <538D895B.7090903@oracle.com> Hi Jon, Thanks for the review! On 2014-06-03 00:25, Jon Masamitsu wrote: > Bengt, > > http://cr.openjdk.java.net/~brutisso/8043239/webrev.00/src/share/vm/memory/referenceProcessor.cpp.frames.html > > > The change to always use "set_next_raw()" here > > 520 java_lang_ref_Reference::set_next_raw(_ref, NULL);
> 521 } else {
> 522 java_lang_ref_Reference::set_next(_ref, NULL);
> 523 }
> > was always the correct thing to use? Does not have to do with > extra / unneeded write barrier? Right. Since we are writing NULL we don't need a post barrier and for G1 it is important to avoid the post barrier because it will dirty cards in a way that will make the card table verification fail. > > Looks good. > > Reviewed. Thanks! I will push this change, but Thomas suggested moving one of the comments that I added to be more visible. Here's what he suggested: http://cr.openjdk.java.net/~brutisso/8043239/webrev.00-01.diff/ Since it is only a comment change I will go ahead and push this. Would be nice to get nightly testing as soon as possible to be able to backport to 8u20. I hope that is ok with you. Here is the full webrev of what I will push: http://cr.openjdk.java.net/~brutisso/8043239/webrev.01/ Thanks, Bengt > > Jon > > On 06/02/2014 07:30 AM, Bengt Rutisson wrote: >> >> Hi all, >> >> Can I have a couple of reviews for this change? >> >> http://cr.openjdk.java.net/~brutisso/8043239/webrev.00/ >> >> https://bugs.openjdk.java.net/browse/JDK-8043239 >> >> As described in the bug report the reference processor was missing a >> write barrier call when manipulating the discovered list. This has >> always been the case but it was hidden because at the end of the >> reference processing we went through the complete discovered list and >> dirtied all the missed cards because we did an (unnecessary) write >> barrier when we set the next field to be a self pointer >> pointing back at the reference object itself. >> >> The write barrier for setting the next field was removed since it was >> not needed, but that revealed the current bug. After some discussions >> and prototyping we came to the conclusion that there may be more >> barriers missing and that it is difficult to get the dirtying done >> the way our verification code assumes. A simpler solution seems to be >> to free the reference processing of all barriers and instead just >> make sure that we dirty all the right cards in the last pass. >> >> The proposed fix thus re-introduces the post barrier when we iterate >> over the discovered list. This time it uses the discovered field for >> the barrier to be more explicit about what is going on. >> >> Testing: >> JPRT, >> Kitchensink, 5 days >> GC test suite >> SPECjbb2013 >> Ad-hoc aurora run >> Specific reproducer that illustrated the problem. >> >> The specific reproducer was really good to pinpoint the problem but >> is hard to turn into a JTreg test. Many thanks go to StefanK for >> helping >> out with creating the reproducer. >> >> Thanks, >> Bengt
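To make Jon's question and Bengt's answer more concrete for readers without the webrev at hand: in this code path the next field of a java.lang.ref.Reference is only ever set to NULL or to the reference object itself, and neither store can create a pointer the collector could lose track of, which is why the raw (barrier-free) setter is always safe there. A toy model of that reasoning, with invented names and no claim to match HotSpot's actual code:

// Toy model of the null-store / self-store argument (not HotSpot code).
final class ToyReference {
    ToyReference next;

    // A post barrier exists to record stores that might create a
    // cross-generation (or cross-region) pointer the GC must later find.
    // Storing null creates no pointer at all, and a self pointer has the
    // same source and target object, so it can never be a pointer from
    // old space into young space. Neither store needs to dirty a card.
    void clearNext()    { next = null; } // raw store is fine
    void markInactive() { next = this; } // raw store is fine
}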
From per.liden at oracle.com Tue Jun 3 08:50:34 2014 From: per.liden at oracle.com (Per Liden) Date: Tue, 03 Jun 2014 10:50:34 +0200 Subject: RFR(S): 8040807: G1: Enable G1CollectedHeap::stop() In-Reply-To: <538D81A9.9000109@oracle.com> References: <537B43BC.2090004@oracle.com> <537B6A8A.5030607@oracle.com> <537DF73D.6080204@oracle.com> <538473BB.9070300@oracle.com> <538C3814.9000906@oracle.com> <538D81A9.9000109@oracle.com> Message-ID: <538D8C5A.6020909@oracle.com> Thanks Jon, Thomas and Bengt! /Per On 06/03/2014 10:04 AM, Bengt Rutisson wrote: > > Hi Per, > > Looks good. > > Bengt > > On 2014-06-02 10:38, Per Liden wrote: >> Ping! >> >> /Per >> >> On 05/27/2014 01:15 PM, Per Liden wrote: >>> Hi, >>> >>> I did some additional testing and eyeballing of this fix and noticed >>> that it would be a good idea to also tell concurrent mark to abort, >>> otherwise we will always wait until concurrent mark has finished, which >>> is unnecessary (and could potentially take some time if the live set is >>> large). So, I added a call to _cm->set_has_aborted() to abort any >>> ongoing concurrent mark. >>> >>> Updated webrev: >>> http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ >>> >>> Diff against previous webrev: >>> http://cr.openjdk.java.net/~pliden/8040807/webrev.diff_0vs1/ >>> >>> Testing: >>> Wrote a simple test to provoke a concurrent mark followed by an >>> immediate exit. With the first version of the patch, we would always >>> wait until concurrent mark completes. Now it will instead show a >>> concurrent-mark-abort, which happens much earlier. >>> >>> /Per >>> >>> On 05/22/2014 03:10 PM, Per Liden wrote: >>>> Thanks Jon! >>>> >>>> /Per >>>> >>>> On 2014-05-20 16:45, Jon Masamitsu wrote: >>>>> Looks good. >>>>> >>>>> Reviewed. >>>>> >>>>> Jon >>>>> >>>>> On 05/20/2014 04:59 AM, Per Liden wrote: >>>>>> Looking for a couple of reviews of this patch. >>>>>> >>>>>> Summary: This patch re-enables the controlled stopping of G1's >>>>>> concurrent threads at VM shutdown. This could potentially cause hangs >>>>>> during VM shutdown because the G1 marking threads could get stuck in >>>>>> various places and fail to terminate. JDK-8040803 and JDK-8040804 >>>>>> fixed these issues, so this is the final step to re-enable the actual >>>>>> stopping of those threads. This patch also moves the call to >>>>>> CollectedHeap::stop() a few lines down to group the GC related stuff >>>>>> together. It also adjusts/removes some comments that are no longer >>>>>> correct. >>>>>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8040807 >>>>>> Webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.0/ >>>>>> >>>>>> Testing: >>>>>> - GC nightlies. 5 tests in this suite used to timeout because of the >>>>>> issue with hanging threads. They now pass. >>>>>> - JPRT >>>>>> >>>>>> Thanks!
>>>>>> /Per >>>>> >>>> >>> >> > From harvey at actenum.com Tue Jun 3 18:41:30 2014 From: harvey at actenum.com (Peter Harvey) Date: Tue, 3 Jun 2014 12:41:30 -0600 Subject: G1 GC consuming all CPU time Message-ID: I have an algorithm (at bottom of email) which builds a graph of 'Node' objects with random connections between them. It then repeatedly processes a queue of those Nodes, adding new Nodes to the queue as it goes. This is a single-threaded algorithm that will never terminate. Our actual production code is much more complex, but I've trimmed it down as much as possible. On Windows 7 with JRE 7u60, enabling the G1 garbage collector will cause the JRE to consume all 8 cores of my CPU. No other garbage collector does this. You can see the differences in CPU load in the example output below. It's also worth noting that "-verbose:gc" with the G1 garbage collector prints nothing after my algorithm starts. Presumably the G1 garbage collector is doing something (concurrent mark?), but it's not printing anything about it. When run with VM args "-XX:+UseG1GC -verbose:gc" I get output like this (note the huge CPU load value which should not be this high for a single-threaded algorithm on an 8 core CPU):

[GC pause (young) 62M->62M(254M), 0.0394214 secs]
[GC pause (young) 73M->83M(508M), 0.0302781 secs]
[GC pause (young) 106M->111M(1016M), 0.0442273 secs]
[GC pause (young) 157M->161M(1625M), 0.0660902 secs]
[GC pause (young) 235M->240M(2112M), 0.0907231 secs]
[GC pause (young) 334M->337M(2502M), 0.1356917 secs]
[GC pause (young) 448M->450M(2814M), 0.1219090 secs]
[GC pause (young) 574M->577M(3064M), 0.1778062 secs]
[GC pause (young) 712M->715M(3264M), 0.1878443 secs]
CPU Load Is -1.0

Start
Stop
Sleep
CPU Load Is 0.9196154547182949

Start
Stop
Sleep
CPU Load Is 0.9150735995043818

...

When run with VM args "-XX:+UseParallelGC -verbose:gc" I get output like this:

[GC 65536K->64198K(249344K), 0.0628289 secs]
[GC 129734K->127974K(314880K), 0.1583369 secs]
[Full GC 127974K->127630K(451072K), 0.9675224 secs]
[GC 258702K->259102K(451072K), 0.3543645 secs]
[Full GC 259102K->258701K(732672K), 1.8085702 secs]
[GC 389773K->390181K(790528K), 0.3332060 secs]
[GC 579109K->579717K(803328K), 0.5126388 secs]
[Full GC 579717K->578698K(1300480K), 4.0647303 secs]
[GC 780426K->780842K(1567232K), 0.4364933 secs]
CPU Load Is -1.0

Start
Stop
Sleep
CPU Load Is 0.03137771539054431

Start
Stop
Sleep
CPU Load Is 0.032351299224373145

...

When run with VM args "-verbose:gc" I get output like this:

[GC 69312K->67824K(251136K), 0.1533803 secs]
[GC 137136K->135015K(251136K), 0.0970460 secs]
[GC 137245K(251136K), 0.0095245 secs]
[GC 204327K->204326K(274368K), 0.1056259 secs]
[GC 273638K->273636K(343680K), 0.1081515 secs]
[GC 342948K->342946K(412992K), 0.1181966 secs]
[GC 412258K->412257K(482304K), 0.1126966 secs]
[GC 481569K->481568K(551808K), 0.1156015 secs]
[GC 550880K->550878K(620928K), 0.1184089 secs]
[GC 620190K->620189K(690048K), 0.1209312 secs]
[GC 689501K->689499K(759552K), 0.1199338 secs]
[GC 758811K->758809K(828864K), 0.1162532 secs]
CPU Load Is -1.0

Start
Stop
Sleep
CPU Load Is 0.10791719146608299

Start
[GC 821213K(828864K), 0.1966807 secs]
Stop
Sleep
CPU Load Is 0.1540065314146181

Start
Stop
Sleep
[GC 821213K(1328240K), 0.1962688 secs]
CPU Load Is 0.08427292195744103

...

Why is the G1 garbage collector consuming so much CPU time? Is it stuck in the mark phase as I am modifying the graph structure? I'm not a subscriber to the list, so please CC me in any response. Thanks, Peter.
--

import java.lang.management.ManagementFactory;
import com.sun.management.OperatingSystemMXBean;
import java.util.Random;

@SuppressWarnings("restriction")
public class Node {
    private static OperatingSystemMXBean os = (OperatingSystemMXBean)
            ManagementFactory.getOperatingSystemMXBean();

    private Node next;

    private Node[] others = new Node[10];

    public static void main(String[] args) throws InterruptedException {

        // Build a graph of Nodes
        Node head = buildGraph();

        while (true) {
            // Print CPU load for this process
            System.out.println("CPU Load Is " + os.getProcessCpuLoad());
            System.out.println();

            // Modify the graph
            System.out.println("Start");
            head = modifyGraph(head);
            System.out.println("Stop");

            // Sleep, as otherwise we tend to DoS the host computer...
            System.out.println("Sleep");
            Thread.sleep(1000);
        }
    }

    private static Node buildGraph() {

        // Create a collection of Node objects
        Node[] array = new Node[10000000];
        for (int i = 0; i < array.length; i++) {
            array[i] = new Node();
        }

        // Each Node refers to 10 other random Nodes
        Random random = new Random(12);
        for (int i = 0; i < array.length; i++) {
            for (int j = 0; j < array[i].others.length; j++) {
                int k = random.nextInt(array.length);
                array[i].others[j] = array[k];
            }
        }

        // The first Node serves as the head of a queue
        return array[0];
    }

    private static Node modifyGraph(Node head) {

        // Perform a million iterations
        for (int i = 0; i < 1000000; i++) {

            // Pop a Node off the head of the queue
            Node node = head;
            head = node.next;
            node.next = null;

            // Add the other Nodes to the head of the queue
            for (Node other : node.others) {
                other.next = head;
                head = other;
            }
        }
        return head;
    }

}

-- *Actenum Corporation* Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From harvey at actenum.com Tue Jun 3 18:49:09 2014 From: harvey at actenum.com (Peter Harvey) Date: Tue, 3 Jun 2014 12:49:09 -0600 Subject: G1 GC consuming all CPU time In-Reply-To: References: Message-ID: Small correction. The last example of output was with "-XX:+UseConcMarkSweepGC -verbose:gc". On Tue, Jun 3, 2014 at 12:41 PM, Peter Harvey wrote: > I have an algorithm (at bottom of email) which builds a graph of 'Node' > objects with random connections between them. It then repeatedly processes > a queue of those Nodes, adding new Nodes to the queue as it goes. This is a > single-threaded algorithm that will never terminate. Our actual production > code is much more complex, but I've trimmed it down as much as possible. > > On Windows 7 with JRE 7u60, enabling the G1 garbage collector will cause > the JRE to consume all 8 cores of my CPU. No other garbage collector does > this. You can see the differences in CPU load in the example output below. > It's also worth noting that "-verbose:gc" with the G1 garbage collector > prints nothing after my algorithm starts. Presumably the G1 garbage > collector is doing something (concurrent mark?), but it's not printing > anything about it.
> > When run with VM args "-XX:+UseG1GC -verbose:gc" I get output like this > (note the huge CPU load value which should not be this high for a > single-threaded algorithm on an 8 core CPU): > > [GC pause (young) 62M->62M(254M), 0.0394214 secs] > [GC pause (young) 73M->83M(508M), 0.0302781 secs] > [GC pause (young) 106M->111M(1016M), 0.0442273 secs] > [GC pause (young) 157M->161M(1625M), 0.0660902 secs] > [GC pause (young) 235M->240M(2112M), 0.0907231 secs] > [GC pause (young) 334M->337M(2502M), 0.1356917 secs] > [GC pause (young) 448M->450M(2814M), 0.1219090 secs] > [GC pause (young) 574M->577M(3064M), 0.1778062 secs] > [GC pause (young) 712M->715M(3264M), 0.1878443 secs] > CPU Load Is -1.0 > > Start > Stop > Sleep > CPU Load Is 0.9196154547182949 > > Start > Stop > Sleep > CPU Load Is 0.9150735995043818 > > ... > > > > When run with VM args "-XX:+UseParallelGC -verbose:gc" I get output like > this: > > [GC 65536K->64198K(249344K), 0.0628289 secs] > [GC 129734K->127974K(314880K), 0.1583369 secs] > [Full GC 127974K->127630K(451072K), 0.9675224 secs] > [GC 258702K->259102K(451072K), 0.3543645 secs] > [Full GC 259102K->258701K(732672K), 1.8085702 secs] > [GC 389773K->390181K(790528K), 0.3332060 secs] > [GC 579109K->579717K(803328K), 0.5126388 secs] > [Full GC 579717K->578698K(1300480K), 4.0647303 secs] > [GC 780426K->780842K(1567232K), 0.4364933 secs] > CPU Load Is -1.0 > > Start > Stop > Sleep > CPU Load Is 0.03137771539054431 > > Start > Stop > Sleep > CPU Load Is 0.032351299224373145 > > ... > > > > When run with VM args "-verbose:gc" I get output like this: > > [GC 69312K->67824K(251136K), 0.1533803 secs] > [GC 137136K->135015K(251136K), 0.0970460 secs] > [GC 137245K(251136K), 0.0095245 secs] > [GC 204327K->204326K(274368K), 0.1056259 secs] > [GC 273638K->273636K(343680K), 0.1081515 secs] > [GC 342948K->342946K(412992K), 0.1181966 secs] > [GC 412258K->412257K(482304K), 0.1126966 secs] > [GC 481569K->481568K(551808K), 0.1156015 secs] > [GC 550880K->550878K(620928K), 0.1184089 secs] > [GC 620190K->620189K(690048K), 0.1209312 secs] > [GC 689501K->689499K(759552K), 0.1199338 secs] > [GC 758811K->758809K(828864K), 0.1162532 secs] > CPU Load Is -1.0 > > Start > Stop > Sleep > CPU Load Is 0.10791719146608299 > > Start > [GC 821213K(828864K), 0.1966807 secs] > Stop > Sleep > CPU Load Is 0.1540065314146181 > > Start > Stop > Sleep > [GC 821213K(1328240K), 0.1962688 secs] > CPU Load Is 0.08427292195744103 > > ... > > > > Why is the G1 garbage collector consuming so much CPU time? Is it stuck in > the mark phase as I am modifying the graph structure? > > I'm not a subscriber to the list, so please CC me in any response. > > Thanks, > Peter. > > -- > > import java.lang.management.ManagementFactory; > import com.sun.management.OperatingSystemMXBean; > import java.util.Random; > > @SuppressWarnings("restriction") > public class Node { > private static OperatingSystemMXBean os = (OperatingSystemMXBean) > ManagementFactory.getOperatingSystemMXBean(); > > private Node next; > > private Node[] others = new Node[10]; > > public static void main(String[] args) throws InterruptedException { > > // Build a graph of Nodes > Node head = buildGraph(); > > while (true) { > // Print CPU load for this process > System.out.println("CPU Load Is " + os.getProcessCpuLoad()); > System.out.println(); > > // Modify the graph > System.out.println("Start"); > head = modifyGraph(head); > System.out.println("Stop"); > > // Sleep, as otherwise we tend to DoS the host computer... 
> System.out.println("Sleep"); > Thread.sleep(1000); > } > } > > private static Node buildGraph() { > > // Create a collection of Node objects > Node[] array = new Node[10000000]; > for (int i = 0; i < array.length; i++) { > array[i] = new Node(); > } > > // Each Node refers to 10 other random Nodes > Random random = new Random(12); > for (int i = 0; i < array.length; i++) { > for (int j = 0; j < array[i].others.length; j++) { > int k = random.nextInt(array.length); > array[i].others[j] = array[k]; > } > } > > // The first Node serves as the head of a queue > return array[0]; > } > > private static Node modifyGraph(Node head) { > > // Perform a million iterations > for (int i = 0; i < 1000000; i++) { > > // Pop a Node off the head of the queue > Node node = head; > head = node.next; > node.next = null; > > // Add the other Nodes to the head of the queue > for (Node other : node.others) { > other.next = head; > head = other; > } > } > return head; > } > > } > > -- > *Actenum Corporation* > Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | > www.actenum.com > -- *Actenum Corporation* Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From yiyeguhu at gmail.com Tue Jun 3 21:13:45 2014 From: yiyeguhu at gmail.com (Tao Mao) Date: Tue, 3 Jun 2014 14:13:45 -0700 Subject: G1 GC consuming all CPU time In-Reply-To: References: Message-ID: Hi Peter, What was your actual question? Try -XX:ParallelGCThreads= if you want less CPU usage from GC. Thanks. Tao On Tue, Jun 3, 2014 at 11:49 AM, Peter Harvey wrote: > Small correction. The last example of output was with > "-XX:+UseConcMarkSweepGC -verbose:gc". > > > On Tue, Jun 3, 2014 at 12:41 PM, Peter Harvey wrote: > >> I have an algorithm (at bottom of email) which builds a graph of 'Node' >> objects with random connections between them. It then repeatedly processes >> a queue of those Nodes, adding new Nodes to the queue as it goes. This is a >> single-threaded algorithm that will never terminate. Our actual production >> code is much more complex, but I've trimmed it down as much as possible. >> >> On Windows 7 with JRE 7u60, enabling the G1 garbage collector will cause >> the JRE to consume all 8 cores of my CPU. No other garbage collector does >> this. You can see the differences in CPU load in the example output below. >> It's also worth nothing that "-verbose:gc" with the G1 garbage collector >> prints nothing after my algorithm starts. Presumably the G1 garbage >> collector is doing something (concurrent mark?), but it's not printing >> anything about it. >> >> When run with VM args "-XX:+UseG1GC -verbose:gc" I get output like this >> (note the huge CPU load value which should not be this high for a >> single-threaded algorithm on an 8 core CPU): >> >> [GC pause (young) 62M->62M(254M), 0.0394214 secs] >> [GC pause (young) 73M->83M(508M), 0.0302781 secs] >> [GC pause (young) 106M->111M(1016M), 0.0442273 secs] >> [GC pause (young) 157M->161M(1625M), 0.0660902 secs] >> [GC pause (young) 235M->240M(2112M), 0.0907231 secs] >> [GC pause (young) 334M->337M(2502M), 0.1356917 secs] >> [GC pause (young) 448M->450M(2814M), 0.1219090 secs] >> [GC pause (young) 574M->577M(3064M), 0.1778062 secs] >> [GC pause (young) 712M->715M(3264M), 0.1878443 secs] >> CPU Load Is -1.0 >> >> Start >> Stop >> Sleep >> CPU Load Is 0.9196154547182949 >> >> Start >> Stop >> Sleep >> CPU Load Is 0.9150735995043818 >> >> ... 
>> >> >> >> When run with VM args "-XX:+UseParallelGC -verbose:gc" I get output like >> this: >> >> [GC 65536K->64198K(249344K), 0.0628289 secs] >> [GC 129734K->127974K(314880K), 0.1583369 secs] >> [Full GC 127974K->127630K(451072K), 0.9675224 secs] >> [GC 258702K->259102K(451072K), 0.3543645 secs] >> [Full GC 259102K->258701K(732672K), 1.8085702 secs] >> [GC 389773K->390181K(790528K), 0.3332060 secs] >> [GC 579109K->579717K(803328K), 0.5126388 secs] >> [Full GC 579717K->578698K(1300480K), 4.0647303 secs] >> [GC 780426K->780842K(1567232K), 0.4364933 secs] >> CPU Load Is -1.0 >> >> Start >> Stop >> Sleep >> CPU Load Is 0.03137771539054431 >> >> Start >> Stop >> Sleep >> CPU Load Is 0.032351299224373145 >> >> ... >> >> >> >> When run with VM args "-verbose:gc" I get output like this: >> >> [GC 69312K->67824K(251136K), 0.1533803 secs] >> [GC 137136K->135015K(251136K), 0.0970460 secs] >> [GC 137245K(251136K), 0.0095245 secs] >> [GC 204327K->204326K(274368K), 0.1056259 secs] >> [GC 273638K->273636K(343680K), 0.1081515 secs] >> [GC 342948K->342946K(412992K), 0.1181966 secs] >> [GC 412258K->412257K(482304K), 0.1126966 secs] >> [GC 481569K->481568K(551808K), 0.1156015 secs] >> [GC 550880K->550878K(620928K), 0.1184089 secs] >> [GC 620190K->620189K(690048K), 0.1209312 secs] >> [GC 689501K->689499K(759552K), 0.1199338 secs] >> [GC 758811K->758809K(828864K), 0.1162532 secs] >> CPU Load Is -1.0 >> >> Start >> Stop >> Sleep >> CPU Load Is 0.10791719146608299 >> >> Start >> [GC 821213K(828864K), 0.1966807 secs] >> Stop >> Sleep >> CPU Load Is 0.1540065314146181 >> >> Start >> Stop >> Sleep >> [GC 821213K(1328240K), 0.1962688 secs] >> CPU Load Is 0.08427292195744103 >> >> ... >> >> >> >> Why is the G1 garbage collector consuming so much CPU time? Is it stuck >> in the mark phase as I am modifying the graph structure? >> >> I'm not a subscriber to the list, so please CC me in any response. >> >> Thanks, >> Peter. >> >> -- >> >> import java.lang.management.ManagementFactory; >> import com.sun.management.OperatingSystemMXBean; >> import java.util.Random; >> >> @SuppressWarnings("restriction") >> public class Node { >> private static OperatingSystemMXBean os = (OperatingSystemMXBean) >> ManagementFactory.getOperatingSystemMXBean(); >> >> private Node next; >> >> private Node[] others = new Node[10]; >> >> public static void main(String[] args) throws InterruptedException { >> >> // Build a graph of Nodes >> Node head = buildGraph(); >> >> while (true) { >> // Print CPU load for this process >> System.out.println("CPU Load Is " + os.getProcessCpuLoad()); >> System.out.println(); >> >> // Modify the graph >> System.out.println("Start"); >> head = modifyGraph(head); >> System.out.println("Stop"); >> >> // Sleep, as otherwise we tend to DoS the host computer... 
>> System.out.println("Sleep"); >> Thread.sleep(1000); >> } >> } >> >> private static Node buildGraph() { >> >> // Create a collection of Node objects >> Node[] array = new Node[10000000]; >> for (int i = 0; i < array.length; i++) { >> array[i] = new Node(); >> } >> >> // Each Node refers to 10 other random Nodes >> Random random = new Random(12); >> for (int i = 0; i < array.length; i++) { >> for (int j = 0; j < array[i].others.length; j++) { >> int k = random.nextInt(array.length); >> array[i].others[j] = array[k]; >> } >> } >> >> // The first Node serves as the head of a queue >> return array[0]; >> } >> >> private static Node modifyGraph(Node head) { >> >> // Perform a million iterations >> for (int i = 0; i < 1000000; i++) { >> >> // Pop a Node off the head of the queue >> Node node = head; >> head = node.next; >> node.next = null; >> >> // Add the other Nodes to the head of the queue >> for (Node other : node.others) { >> other.next = head; >> head = other; >> } >> } >> return head; >> } >> >> } >> >> -- >> *Actenum Corporation* >> Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | >> www.actenum.com >> > > > -- > *Actenum Corporation* > Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | > www.actenum.com > -------------- next part -------------- An HTML attachment was scrubbed... URL:
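As a concrete illustration of Tao's suggestion (the thread counts below are examples only, and the class name assumes the Node test program posted earlier in the thread):

java -XX:+UseG1GC -XX:ParallelGCThreads=4 -XX:ConcGCThreads=1 -verbose:gc Node

ParallelGCThreads caps the stop-the-world GC worker threads and ConcGCThreads caps the concurrent marking threads. If the CPU time turns out to be going to G1's remembered-set refinement rather than to marking, -XX:G1ConcRefinementThreads can be capped in the same way.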
From yiyeguhu at gmail.com Tue Jun 3 21:16:52 2014 From: yiyeguhu at gmail.com (Tao Mao) Date: Tue, 3 Jun 2014 14:16:52 -0700 Subject: G1 GC consuming all CPU time In-Reply-To: References: Message-ID: And, use -XX:+PrintGCDetails -XX:+PrintGCTimeStamps to get more log. Thanks. -Tao On Tue, Jun 3, 2014 at 2:13 PM, Tao Mao wrote: > Hi Peter, > > What was your actual question? Try -XX:ParallelGCThreads= if you > want less CPU usage from GC. > > Thanks. > Tao > > > On Tue, Jun 3, 2014 at 11:49 AM, Peter Harvey wrote: > >> Small correction. The last example of output was with >> "-XX:+UseConcMarkSweepGC -verbose:gc". >> >> >> On Tue, Jun 3, 2014 at 12:41 PM, Peter Harvey wrote: >> >>> I have an algorithm (at bottom of email) which builds a graph of 'Node' >>> objects with random connections between them. It then repeatedly processes >>> a queue of those Nodes, adding new Nodes to the queue as it goes. This is a >>> single-threaded algorithm that will never terminate. Our actual production >>> code is much more complex, but I've trimmed it down as much as possible. >>> >>> On Windows 7 with JRE 7u60, enabling the G1 garbage collector will cause >>> the JRE to consume all 8 cores of my CPU. No other garbage collector does >>> this. You can see the differences in CPU load in the example output below. >>> It's also worth noting that "-verbose:gc" with the G1 garbage collector >>> prints nothing after my algorithm starts. Presumably the G1 garbage >>> collector is doing something (concurrent mark?), but it's not printing >>> anything about it. >>> >>> When run with VM args "-XX:+UseG1GC -verbose:gc" I get output like this >>> (note the huge CPU load value which should not be this high for a >>> single-threaded algorithm on an 8 core CPU): >>> >>> [GC pause (young) 62M->62M(254M), 0.0394214 secs] >>> [GC pause (young) 73M->83M(508M), 0.0302781 secs] >>> [GC pause (young) 106M->111M(1016M), 0.0442273 secs] >>> [GC pause (young) 157M->161M(1625M), 0.0660902 secs] >>> [GC pause (young) 235M->240M(2112M), 0.0907231 secs] >>> [GC pause (young) 334M->337M(2502M), 0.1356917 secs] >>> [GC pause (young) 448M->450M(2814M), 0.1219090 secs] >>> [GC pause (young) 574M->577M(3064M), 0.1778062 secs] >>> [GC pause (young) 712M->715M(3264M), 0.1878443 secs] >>> CPU Load Is -1.0 >>> >>> Start >>> Stop >>> Sleep >>> CPU Load Is 0.9196154547182949 >>> >>> Start >>> Stop >>> Sleep >>> CPU Load Is 0.9150735995043818 >>> >>> ... >>> >>> >>> >>> When run with VM args "-XX:+UseParallelGC -verbose:gc" I get output like >>> this: >>> >>> [GC 65536K->64198K(249344K), 0.0628289 secs] >>> [GC 129734K->127974K(314880K), 0.1583369 secs] >>> [Full GC 127974K->127630K(451072K), 0.9675224 secs] >>> [GC 258702K->259102K(451072K), 0.3543645 secs] >>> [Full GC 259102K->258701K(732672K), 1.8085702 secs] >>> [GC 389773K->390181K(790528K), 0.3332060 secs] >>> [GC 579109K->579717K(803328K), 0.5126388 secs] >>> [Full GC 579717K->578698K(1300480K), 4.0647303 secs] >>> [GC 780426K->780842K(1567232K), 0.4364933 secs] >>> CPU Load Is -1.0 >>> >>> Start >>> Stop >>> Sleep >>> CPU Load Is 0.03137771539054431 >>> >>> Start >>> Stop >>> Sleep >>> CPU Load Is 0.032351299224373145 >>> >>> ... >>> >>> >>> >>> When run with VM args "-verbose:gc" I get output like this: >>> >>> [GC 69312K->67824K(251136K), 0.1533803 secs] >>> [GC 137136K->135015K(251136K), 0.0970460 secs] >>> [GC 137245K(251136K), 0.0095245 secs] >>> [GC 204327K->204326K(274368K), 0.1056259 secs] >>> [GC 273638K->273636K(343680K), 0.1081515 secs] >>> [GC 342948K->342946K(412992K), 0.1181966 secs] >>> [GC 412258K->412257K(482304K), 0.1126966 secs] >>> [GC 481569K->481568K(551808K), 0.1156015 secs] >>> [GC 550880K->550878K(620928K), 0.1184089 secs] >>> [GC 620190K->620189K(690048K), 0.1209312 secs] >>> [GC 689501K->689499K(759552K), 0.1199338 secs] >>> [GC 758811K->758809K(828864K), 0.1162532 secs] >>> CPU Load Is -1.0 >>> >>> Start >>> Stop >>> Sleep >>> CPU Load Is 0.10791719146608299 >>> >>> Start >>> [GC 821213K(828864K), 0.1966807 secs] >>> Stop >>> Sleep >>> CPU Load Is 0.1540065314146181 >>> >>> Start >>> Stop >>> Sleep >>> [GC 821213K(1328240K), 0.1962688 secs] >>> CPU Load Is 0.08427292195744103 >>> >>> ... >>> >>> >>> >>> Why is the G1 garbage collector consuming so much CPU time? Is it stuck >>> in the mark phase as I am modifying the graph structure? >>> >>> I'm not a subscriber to the list, so please CC me in any response. >>> >>> Thanks, >>> Peter.
>>> >>> -- >>> >>> import java.lang.management.ManagementFactory; >>> import com.sun.management.OperatingSystemMXBean; >>> import java.util.Random; >>> >>> @SuppressWarnings("restriction") >>> public class Node { >>> private static OperatingSystemMXBean os = (OperatingSystemMXBean) >>> ManagementFactory.getOperatingSystemMXBean(); >>> >>> private Node next; >>> >>> private Node[] others = new Node[10]; >>> >>> public static void main(String[] args) throws InterruptedException { >>> >>> // Build a graph of Nodes >>> Node head = buildGraph(); >>> >>> while (true) { >>> // Print CPU load for this process >>> System.out.println("CPU Load Is " + os.getProcessCpuLoad()); >>> System.out.println(); >>> >>> // Modify the graph >>> System.out.println("Start"); >>> head = modifyGraph(head); >>> System.out.println("Stop"); >>> >>> // Sleep, as otherwise we tend to DoS the host computer... >>> System.out.println("Sleep"); >>> Thread.sleep(1000); >>> } >>> } >>> >>> private static Node buildGraph() { >>> >>> // Create a collection of Node objects >>> Node[] array = new Node[10000000]; >>> for (int i = 0; i < array.length; i++) { >>> array[i] = new Node(); >>> } >>> >>> // Each Node refers to 10 other random Nodes >>> Random random = new Random(12); >>> for (int i = 0; i < array.length; i++) { >>> for (int j = 0; j < array[i].others.length; j++) { >>> int k = random.nextInt(array.length); >>> array[i].others[j] = array[k]; >>> } >>> } >>> >>> // The first Node serves as the head of a queue >>> return array[0]; >>> } >>> >>> private static Node modifyGraph(Node head) { >>> >>> // Perform a million iterations >>> for (int i = 0; i < 1000000; i++) { >>> >>> // Pop a Node off the head of the queue >>> Node node = head; >>> head = node.next; >>> node.next = null; >>> >>> // Add the other Nodes to the head of the queue >>> for (Node other : node.others) { >>> other.next = head; >>> head = other; >>> } >>> } >>> return head; >>> } >>> >>> } >>> >>> -- >>> *Actenum Corporation* >>> Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | >>> www.actenum.com >>> >> >> >> >> -- >> *Actenum Corporation* >> Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | >> www.actenum.com >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From harvey at actenum.com Tue Jun 3 21:43:03 2014 From: harvey at actenum.com (Peter Harvey) Date: Tue, 3 Jun 2014 15:43:03 -0600 Subject: G1 GC consuming all CPU time In-Reply-To: References: Message-ID: Thanks for the response. Here are the additional logs. 
0.094: [GC pause (young), 0.0347877 secs]
   [Parallel Time: 34.1 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 94.2, Avg: 104.4, Max: 126.4, Diff: 32.2]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 3.3, Max: 25.0, Diff: 25.0, Sum: 26.6]
      [Update RS (ms): Min: 0.0, Avg: 2.1, Max: 5.3, Diff: 5.3, Sum: 16.7]
         [Processed Buffers: Min: 0, Avg: 2.3, Max: 9, Diff: 9, Sum: 18]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 1.8, Avg: 18.3, Max: 29.9, Diff: 28.2, Sum: 146.4]
      [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.6]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 1.9, Avg: 23.8, Max: 34.1, Diff: 32.2, Sum: 190.4]
      [GC Worker End (ms): Min: 128.2, Avg: 128.3, Max: 128.3, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.0 ms]
   [Other: 0.6 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.3 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.0 ms]
   [Eden: 24.0M(24.0M)->0.0B(11.0M) Survivors: 0.0B->3072.0K Heap: 62.1M(254.0M)->62.2M(254.0M)]
   [Times: user=0.09 sys=0.03, real=0.04 secs]
0.131: [GC pause (young), 0.0295093 secs]
   [Parallel Time: 28.1 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 130.9, Avg: 135.5, Max: 158.7, Diff: 27.8]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: 1.2]
      [Update RS (ms): Min: 0.0, Avg: 11.4, Max: 27.5, Diff: 27.5, Sum: 90.8]
         [Processed Buffers: Min: 0, Avg: 23.8, Max: 42, Diff: 42, Sum: 190]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 0.0, Avg: 11.7, Max: 17.1, Diff: 17.1, Sum: 93.8]
      [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.7]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 0.2, Avg: 23.5, Max: 28.1, Diff: 27.8, Sum: 187.7]
      [GC Worker End (ms): Min: 159.0, Avg: 159.0, Max: 159.0, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.1 ms]
   [Other: 1.3 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.0 ms]
   [Eden: 11.0M(11.0M)->0.0B(23.0M) Survivors: 3072.0K->2048.0K Heap: 73.2M(254.0M)->82.7M(508.0M)]
   [Times: user=0.19 sys=0.00, real=0.03 secs]
0.166: [GC pause (young), 0.0385523 secs]
   [Parallel Time: 35.9 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 166.4, Avg: 169.8, Max: 192.4, Diff: 25.9]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: 1.9]
      [Update RS (ms): Min: 0.0, Avg: 10.9, Max: 31.9, Diff: 31.9, Sum: 87.2]
         [Processed Buffers: Min: 0, Avg: 14.6, Max: 26, Diff: 26, Sum: 117]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [Object Copy (ms): Min: 3.5, Avg: 21.4, Max: 27.0, Diff: 23.4, Sum: 171.1]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.4]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 10.0, Avg: 32.6, Max: 35.9, Diff: 25.9, Sum: 260.7]
      [GC Worker End (ms): Min: 202.3, Avg: 202.4, Max: 202.4, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.0 ms]
   [Other: 2.6 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.0 ms]
   [Eden: 23.0M(23.0M)->0.0B(46.0M) Survivors: 2048.0K->4096.0K Heap: 105.7M(508.0M)->110.1M(1016.0M)]
   [Times: user=0.19 sys=0.00, real=0.04 secs]
0.222: [GC pause (young), 0.0558720 secs]
   [Parallel Time: 53.0 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 222.0, Avg: 222.2, Max: 222.5, Diff: 0.5]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: 1.5]
      [Update RS (ms): Min: 7.7, Avg: 8.7, Max: 10.9, Diff: 3.2, Sum: 69.4]
         [Processed Buffers: Min: 7, Avg: 8.5, Max: 12, Diff: 5, Sum: 68]
      [Scan RS (ms): Min: 0.0, Avg: 0.3, Max: 0.6, Diff: 0.6, Sum: 2.3]
      [Object Copy (ms): Min: 41.7, Avg: 43.6, Max: 44.3, Diff: 2.7, Sum: 348.5]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [GC Worker Total (ms): Min: 52.4, Avg: 52.7, Max: 52.9, Diff: 0.5, Sum: 421.8]
      [GC Worker End (ms): Min: 274.9, Avg: 274.9, Max: 274.9, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.0 ms]
   [Other: 2.8 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.0 ms]
   [Eden: 46.0M(46.0M)->0.0B(74.0M) Survivors: 4096.0K->7168.0K Heap: 156.1M(1016.0M)->158.6M(1625.0M)]
   [Times: user=0.48 sys=0.01, real=0.06 secs]
0.328: [GC pause (young), 0.0853794 secs]
   [Parallel Time: 82.8 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 327.9, Avg: 330.8, Max: 351.1, Diff: 23.2]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 2.0]
      [Update RS (ms): Min: 0.0, Avg: 5.5, Max: 8.3, Diff: 8.3, Sum: 43.9]
         [Processed Buffers: Min: 0, Avg: 2.3, Max: 3, Diff: 3, Sum: 18]
      [Scan RS (ms): Min: 0.0, Avg: 2.2, Max: 3.3, Diff: 3.3, Sum: 17.4]
      [Object Copy (ms): Min: 59.5, Avg: 71.8, Max: 73.7, Diff: 14.2, Sum: 574.7]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [GC Worker Total (ms): Min: 59.5, Avg: 79.8, Max: 82.7, Diff: 23.2, Sum: 638.4]
      [GC Worker End (ms): Min: 410.6, Avg: 410.7, Max: 410.7, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.1 ms]
   [Other: 2.6 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.1 ms]
   [Eden: 74.0M(74.0M)->0.0B(94.0M) Survivors: 7168.0K->11.0M Heap: 232.6M(1625.0M)->237.6M(2112.0M)]
   [Times: user=0.59 sys=0.00, real=0.09 secs]
0.447: [GC pause (young), 0.1239103 secs]
   [Parallel Time: 121.5 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 447.5, Avg: 447.7, Max: 448.5, Diff: 0.9]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.9]
      [Update RS (ms): Min: 26.5, Avg: 28.2, Max: 28.7, Diff: 2.2, Sum: 225.7]
         [Processed Buffers: Min: 38, Avg: 39.8, Max: 44, Diff: 6, Sum: 318]
      [Scan RS (ms): Min: 0.3, Avg: 0.7, Max: 1.9, Diff: 1.6, Sum: 5.3]
      [Object Copy (ms): Min: 92.1, Avg: 92.2, Max: 92.3, Diff: 0.2, Sum: 737.5]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.3]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [GC Worker Total (ms): Min: 120.6, Avg: 121.4, Max: 121.5, Diff: 0.9, Sum: 970.8]
      [GC Worker End (ms): Min: 569.0, Avg: 569.0, Max: 569.0, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.1 ms]
   [Other: 2.3 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.1 ms]
   [Eden: 94.0M(94.0M)->0.0B(111.0M) Survivors: 11.0M->14.0M Heap: 331.6M(2112.0M)->334.6M(2502.0M)]
   [Times: user=0.80 sys=0.05, real=0.12 secs]
0.599: [GC pause (young), 0.1479438 secs]
   [Parallel Time: 145.7 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 599.4, Avg: 599.5, Max: 599.8, Diff: 0.4]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.9]
      [Update RS (ms): Min: 41.8, Avg: 43.0, Max: 44.0, Diff: 2.1, Sum: 343.6]
         [Processed Buffers: Min: 67, Avg: 70.9, Max: 73, Diff: 6, Sum: 567]
      [Scan RS (ms): Min: 0.0, Avg: 0.8, Max: 1.9, Diff: 1.9, Sum: 6.2]
      [Object Copy (ms): Min: 101.3, Avg: 101.6, Max: 101.7, Diff: 0.3, Sum: 812.6]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [GC Worker Total (ms): Min: 145.2, Avg: 145.6, Max: 145.6, Diff: 0.4, Sum: 1164.6]
      [GC Worker End (ms): Min: 745.1, Avg: 745.1, Max: 745.1, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.1 ms]
   [Other: 2.2 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.1 ms]
   [Eden: 111.0M(111.0M)->0.0B(124.0M) Survivors: 14.0M->16.0M Heap: 445.6M(2502.0M)->448.6M(2814.0M)]
   [Times: user=1.20 sys=0.05, real=0.15 secs]
0.787: [GC pause (young), 0.1625321 secs]
   [Parallel Time: 160.0 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 786.6, Avg: 786.7, Max: 786.9, Diff: 0.4]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.8]
      [Update RS (ms): Min: 46.4, Avg: 47.0, Max: 49.0, Diff: 2.5, Sum: 376.0]
         [Processed Buffers: Min: 75, Avg: 78.0, Max: 79, Diff: 4, Sum: 624]
      [Scan RS (ms): Min: 0.0, Avg: 0.9, Max: 1.5, Diff: 1.5, Sum: 7.4]
      [Object Copy (ms): Min: 110.6, Avg: 111.7, Max: 112.0, Diff: 1.4, Sum: 893.5]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.3]
      [GC Worker Total (ms): Min: 159.6, Avg: 159.9, Max: 160.0, Diff: 0.4, Sum: 1279.0]
      [GC Worker End (ms): Min: 946.5, Avg: 946.5, Max: 946.6, Diff: 0.1]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.1 ms]
   [Other: 2.4 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.2 ms]
   [Eden: 124.0M(124.0M)->0.0B(135.0M) Survivors: 16.0M->18.0M Heap: 572.6M(2814.0M)->576.6M(3064.0M)]
   [Times: user=1.37 sys=0.00, real=0.16 secs]
0.981: [GC pause (young), 0.2063055 secs]
   [Parallel Time: 204.1 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 980.8, Avg: 980.9, Max: 981.0, Diff: 0.2]
      [Ext Root Scanning (ms): Min: 0.1, Avg: 0.3, Max: 0.3, Diff: 0.2, Sum: 2.1]
      [Update RS (ms): Min: 55.9, Avg: 57.8, Max: 58.8, Diff: 2.9, Sum: 462.8]
         [Processed Buffers: Min: 100, Avg: 101.5, Max: 103, Diff: 3, Sum: 812]
      [Scan RS (ms): Min: 0.0, Avg: 1.0, Max: 3.1, Diff: 3.1, Sum: 8.3]
      [Object Copy (ms): Min: 144.7, Avg: 144.8, Max: 144.9, Diff: 0.1, Sum: 1158.3]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.3]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [GC Worker Total (ms): Min: 203.8, Avg: 204.0, Max: 204.0, Diff: 0.2, Sum: 1631.9]
      [GC Worker End (ms): Min: 1184.9, Avg: 1184.9, Max: 1184.9, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.1 ms]
   [Other: 2.1 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.1 ms]
   [Eden: 135.0M(135.0M)->0.0B(143.0M) Survivors: 18.0M->20.0M Heap: 711.6M(3064.0M)->714.6M(3264.0M)]
   [Times: user=1.40 sys=0.11, real=0.21 secs]
CPU Load Is -1.0

Start
Stop
Sleep
CPU Load Is 0.9166222455142531

Start
Stop
Sleep
CPU Load Is 0.907013989900451

Start
Stop
Sleep
CPU Load Is 0.9085635227776081

Start
Stop
Sleep
CPU Load Is 0.909945506396622

Note that all the logged GC activity occurs during the construction of my graph of Nodes, which is *before* my algorithm (modifyGraph) starts. There is no log of GC activity once the algorithm starts, but there is significant (100%) CPU usage.

My questions are:

- Why is the G1 garbage collector consuming so much CPU time? What is it doing?
- Why is the G1 garbage collector not logging anything? The only reason I even know it's the garbage collector consuming my CPU time is that (a) I only see this behaviour when the G1 collector is enabled and (b) the load on the CPU correlates with the value of -XX:ParallelGCThreads.
- Are there particular object-graph structures that the G1 garbage collector will struggle with? Should complex graphs be considered bad coding practice?
- How can I write my code to avoid this behaviour in the G1 garbage collector? For example, if all my Nodes are in an array, will this fix it?
- Should this be considered a bug in the G1 garbage collector? This is far beyond 'a small increase in CPU usage'.

Just to demonstrate the issue further, I timed my calls to modifyGraph() and trialled different GC parameters:

- -XX:+UseG1GC -XX:ParallelGCThreads=1 took 82.393 seconds and CPU load was 0.1247
- -XX:+UseG1GC -XX:ParallelGCThreads=4 took 19.829 seconds and CPU load was 0.5960
- -XX:+UseG1GC -XX:ParallelGCThreads=8 took 14.815 seconds and CPU load was 0.9184
- -XX:+UseConcMarkSweepGC took 0.322 seconds and CPU load was 0.1119, regardless of the setting of -XX:ParallelGCThreads

So using the CMS GC made my application 44x faster (0.322 seconds versus 14.815 seconds) and placed roughly 1/8th of the load on the CPU (0.1119 versus 0.9184).

If my code represents some kind of hypothetical worst case for the G1 garbage collector, I think it should be documented and/or fixed somehow.

Regards,
Peter.
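A minimal harness along these lines is enough to gather the timings listed above (a sketch only; the exact measurement code was not posted). It wraps the modifyGraph() call from the Node program quoted further down in this thread, and would be run once per JVM invocation with each set of GC flags:

    // Hypothetical timing harness (not from the original post): time a
    // single modifyGraph() pass and report the elapsed wall-clock seconds.
    private static Node timedModifyGraph(Node head) {
        long start = System.nanoTime();
        head = modifyGraph(head);
        long elapsedNs = System.nanoTime() - start;
        System.out.printf("modifyGraph took %.3f seconds%n", elapsedNs / 1e9);
        return head;
    }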
On Tue, Jun 3, 2014 at 3:16 PM, Tao Mao wrote:

> And, use -XX:+PrintGCDetails -XX:+PrintGCTimeStamps to get more log.
> Thanks. -Tao
>
> On Tue, Jun 3, 2014 at 2:13 PM, Tao Mao wrote:
>
>> Hi Peter,
>>
>> What was your actual question? Try -XX:ParallelGCThreads= if you
>> want less CPU usage from GC.
>>
>> Thanks.
>> Tao
>>
>> On Tue, Jun 3, 2014 at 11:49 AM, Peter Harvey wrote:
>>
>>> Small correction. The last example of output was with
>>> "-XX:+UseConcMarkSweepGC -verbose:gc".
>>>
>>> On Tue, Jun 3, 2014 at 12:41 PM, Peter Harvey wrote:
>>>
>>>> I have an algorithm (at bottom of email) which builds a graph of 'Node'
>>>> objects with random connections between them. It then repeatedly processes
>>>> a queue of those Nodes, adding new Nodes to the queue as it goes. This is a
>>>> single-threaded algorithm that will never terminate. Our actual production
>>>> code is much more complex, but I've trimmed it down as much as possible.
>>>>
>>>> On Windows 7 with JRE 7u60, enabling the G1 garbage collector will
>>>> cause the JRE to consume all 8 cores of my CPU. No other garbage collector
>>>> does this. You can see the differences in CPU load in the example output
>>>> below. It's also worth noting that "-verbose:gc" with the G1 garbage
>>>> collector prints nothing after my algorithm starts. Presumably the G1
>>>> garbage collector is doing something (concurrent mark?), but it's not
>>>> printing anything about it.
>>>>
>>>> When run with VM args "-XX:+UseG1GC -verbose:gc" I get output like this
>>>> (note the huge CPU load value which should not be this high for a
>>>> single-threaded algorithm on an 8 core CPU):
>>>>
>>>> [GC pause (young) 62M->62M(254M), 0.0394214 secs]
>>>> [GC pause (young) 73M->83M(508M), 0.0302781 secs]
>>>> [GC pause (young) 106M->111M(1016M), 0.0442273 secs]
>>>> [GC pause (young) 157M->161M(1625M), 0.0660902 secs]
>>>> [GC pause (young) 235M->240M(2112M), 0.0907231 secs]
>>>> [GC pause (young) 334M->337M(2502M), 0.1356917 secs]
>>>> [GC pause (young) 448M->450M(2814M), 0.1219090 secs]
>>>> [GC pause (young) 574M->577M(3064M), 0.1778062 secs]
>>>> [GC pause (young) 712M->715M(3264M), 0.1878443 secs]
>>>> CPU Load Is -1.0
>>>>
>>>> Start
>>>> Stop
>>>> Sleep
>>>> CPU Load Is 0.9196154547182949
>>>>
>>>> Start
>>>> Stop
>>>> Sleep
>>>> CPU Load Is 0.9150735995043818
>>>>
>>>> ...
>>>>
>>>> When run with VM args "-XX:+UseParallelGC -verbose:gc" I get output
>>>> like this:
>>>>
>>>> [GC 65536K->64198K(249344K), 0.0628289 secs]
>>>> [GC 129734K->127974K(314880K), 0.1583369 secs]
>>>> [Full GC 127974K->127630K(451072K), 0.9675224 secs]
>>>> [GC 258702K->259102K(451072K), 0.3543645 secs]
>>>> [Full GC 259102K->258701K(732672K), 1.8085702 secs]
>>>> [GC 389773K->390181K(790528K), 0.3332060 secs]
>>>> [GC 579109K->579717K(803328K), 0.5126388 secs]
>>>> [Full GC 579717K->578698K(1300480K), 4.0647303 secs]
>>>> [GC 780426K->780842K(1567232K), 0.4364933 secs]
>>>> CPU Load Is -1.0
>>>>
>>>> Start
>>>> Stop
>>>> Sleep
>>>> CPU Load Is 0.03137771539054431
>>>>
>>>> Start
>>>> Stop
>>>> Sleep
>>>> CPU Load Is 0.032351299224373145
>>>>
>>>> ...
>>>>
>>>> When run with VM args "-verbose:gc" I get output like this:
>>>>
>>>> [GC 69312K->67824K(251136K), 0.1533803 secs]
>>>> [GC 137136K->135015K(251136K), 0.0970460 secs]
>>>> [GC 137245K(251136K), 0.0095245 secs]
>>>> [GC 204327K->204326K(274368K), 0.1056259 secs]
>>>> [GC 273638K->273636K(343680K), 0.1081515 secs]
>>>> [GC 342948K->342946K(412992K), 0.1181966 secs]
>>>> [GC 412258K->412257K(482304K), 0.1126966 secs]
>>>> [GC 481569K->481568K(551808K), 0.1156015 secs]
>>>> [GC 550880K->550878K(620928K), 0.1184089 secs]
>>>> [GC 620190K->620189K(690048K), 0.1209312 secs]
>>>> [GC 689501K->689499K(759552K), 0.1199338 secs]
>>>> [GC 758811K->758809K(828864K), 0.1162532 secs]
>>>> CPU Load Is -1.0
>>>>
>>>> Start
>>>> Stop
>>>> Sleep
>>>> CPU Load Is 0.10791719146608299
>>>>
>>>> Start
>>>> [GC 821213K(828864K), 0.1966807 secs]
>>>> Stop
>>>> Sleep
>>>> CPU Load Is 0.1540065314146181
>>>>
>>>> Start
>>>> Stop
>>>> Sleep
>>>> [GC 821213K(1328240K), 0.1962688 secs]
>>>> CPU Load Is 0.08427292195744103
>>>>
>>>> ...
>>>>
>>>> Why is the G1 garbage collector consuming so much CPU time? Is it stuck
>>>> in the mark phase as I am modifying the graph structure?
>>>>
>>>> I'm not a subscriber to the list, so please CC me in any response.
>>>>
>>>> Thanks,
>>>> Peter.
>>>> --
>>>>
>>>> import java.lang.management.ManagementFactory;
>>>> import com.sun.management.OperatingSystemMXBean;
>>>> import java.util.Random;
>>>>
>>>> @SuppressWarnings("restriction")
>>>> public class Node {
>>>>     private static OperatingSystemMXBean os = (OperatingSystemMXBean)
>>>>             ManagementFactory.getOperatingSystemMXBean();
>>>>
>>>>     private Node next;
>>>>
>>>>     private Node[] others = new Node[10];
>>>>
>>>>     public static void main(String[] args) throws InterruptedException {
>>>>
>>>>         // Build a graph of Nodes
>>>>         Node head = buildGraph();
>>>>
>>>>         while (true) {
>>>>             // Print CPU load for this process
>>>>             System.out.println("CPU Load Is " + os.getProcessCpuLoad());
>>>>             System.out.println();
>>>>
>>>>             // Modify the graph
>>>>             System.out.println("Start");
>>>>             head = modifyGraph(head);
>>>>             System.out.println("Stop");
>>>>
>>>>             // Sleep, as otherwise we tend to DoS the host computer...
>>>>             System.out.println("Sleep");
>>>>             Thread.sleep(1000);
>>>>         }
>>>>     }
>>>>
>>>>     private static Node buildGraph() {
>>>>
>>>>         // Create a collection of Node objects
>>>>         Node[] array = new Node[10000000];
>>>>         for (int i = 0; i < array.length; i++) {
>>>>             array[i] = new Node();
>>>>         }
>>>>
>>>>         // Each Node refers to 10 other random Nodes
>>>>         Random random = new Random(12);
>>>>         for (int i = 0; i < array.length; i++) {
>>>>             for (int j = 0; j < array[i].others.length; j++) {
>>>>                 int k = random.nextInt(array.length);
>>>>                 array[i].others[j] = array[k];
>>>>             }
>>>>         }
>>>>
>>>>         // The first Node serves as the head of a queue
>>>>         return array[0];
>>>>     }
>>>>
>>>>     private static Node modifyGraph(Node head) {
>>>>
>>>>         // Perform a million iterations
>>>>         for (int i = 0; i < 1000000; i++) {
>>>>
>>>>             // Pop a Node off the head of the queue
>>>>             Node node = head;
>>>>             head = node.next;
>>>>             node.next = null;
>>>>
>>>>             // Add the other Nodes to the head of the queue
>>>>             for (Node other : node.others) {
>>>>                 other.next = head;
>>>>                 head = other;
>>>>             }
>>>>         }
>>>>         return head;
>>>>     }
>>>> }
>>>>
>>>> --
>>>> *Actenum Corporation*
>>>> Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com
>>>
>>> --
>>> *Actenum Corporation*
>>> Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com
>>

--
*Actenum Corporation*
Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com

From claes.redestad at oracle.com  Tue Jun  3 22:57:45 2014
From: claes.redestad at oracle.com (Claes Redestad)
Date: Wed, 04 Jun 2014 00:57:45 +0200
Subject: G1 GC consuming all CPU time
In-Reply-To: 
References: 
Message-ID: <538E52E9.2010904@oracle.com>

Hi,

guessing it's due to the concurrent GC threads tripping over themselves: the microbenchmark is creating one big linked structure that will occupy most of the old gen, and then you're doing intense pointer updates which will trigger scans and updates of remembered sets etc. I actually don't know half the details and am mostly just guessing. :-)
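To make the remembered-set cost concrete, here is a toy model of what G1's post-write barrier conceptually does on every reference store. This is a simplified illustration only, not HotSpot's actual code; the names, card size, and region size are all assumptions:

    import java.util.ArrayDeque;
    import java.util.Queue;

    // Toy model: why every cross-region pointer store feeds work to the
    // concurrent refinement threads. All constants here are illustrative.
    class PostWriteBarrierSketch {
        static final int CARD_SIZE = 512;             // bytes of heap per card
        static final byte DIRTY = 1;
        final byte[] cardTable = new byte[1 << 20];   // one byte per card
        final Queue<Integer> dirtyCardQueue = new ArrayDeque<>();

        // Conceptually runs after every reference store "holder.field = value".
        void postWriteBarrier(long holderAddr, long valueAddr) {
            if (valueAddr == 0) {
                return;                               // null store: nothing to track
            }
            if (regionOf(holderAddr) == regionOf(valueAddr)) {
                return;                               // same region: no remembered-set work
            }
            int card = (int) (holderAddr / CARD_SIZE);
            if (cardTable[card] != DIRTY) {
                cardTable[card] = DIRTY;
                dirtyCardQueue.add(card);             // refinement threads drain this queue
            }                                         // and update remembered sets
        }

        long regionOf(long addr) {
            return addr >>> 20;                       // pretend regions are 1 MB
        }
    }

With ten random cross-region references per Node, almost every store in modifyGraph() takes the slow path in a model like this, which would explain why the refinement threads have so much to do.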
Converted your micro to a JMH micro to ease with experimenting[1] (hope you don't mind) then verified the regression reproduces:

Parallel:
java -jar target/microbenchmarks.jar -wi 3 -i 10 -f 1 .*G1GraphBench.*
~1625 ops/ms

G1:
java -XX:+UseG1GC -jar target/microbenchmarks.jar -wi 3 -i 10 -f 1 .*G1GraphBench.*
~12 ops/ms

Testing my hunch, let's try forcing the concurrent refinement to use only one thread:

java -XX:+UseG1GC -XX:-G1UseAdaptiveConcRefinement -XX:G1ConcRefinementThreads=1 -jar target/microbenchmarks.jar -wi 3 -i 10 -f 1 .*G1GraphBench.*
~1550 ops/ms

I guess we have a winner! I won't hazard to try and answer your questions about how this should be resolved - perhaps the adaptive policy can detect this corner case and scale down the number of refinement threads when they start interfering with each other, or something.

/Claes

[1]
package org.sample;

import org.openjdk.jmh.annotations.GenerateMicroBenchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

import java.util.Random;

@State(Scope.Thread)
public class G1GraphBench {
    private static class Node {
        private Node next;
        private Node[] others = new Node[10];
    }

    Node head = buildGraph();

    private static Node buildGraph() {
        // Create a collection of Node objects
        Node[] array = new Node[10000000];
        for (int i = 0; i < array.length; i++) {
            array[i] = new Node();
        }

        // Each Node refers to 10 other random Nodes
        Random random = new Random(12);
        for (int i = 0; i < array.length; i++) {
            for (int j = 0; j < array[i].others.length; j++) {
                int k = random.nextInt(array.length);
                array[i].others[j] = array[k];
            }
        }

        // The first Node serves as the head of a queue
        return array[0];
    }

    @GenerateMicroBenchmark
    public Node nodeBench() {
        Node node = head;
        head = node.next;
        node.next = null;

        // Add the other Nodes to the head of the queue
        for (Node other : node.others) {
            other.next = head;
            head = other;
        }
        return head;
    }
}

On 2014-06-03 23:43, Peter Harvey wrote:
> Thanks for the response. Here are the additional logs.
> [Peter's message of June 3 quoted in full here - the GC logs, questions,
> benchmark timings, and the Node program, all shown earlier in the thread -
> snipped]
From claes.redestad at oracle.com  Tue Jun  3 23:28:49 2014
From: claes.redestad at oracle.com (claes.redestad)
Date: Wed, 04 Jun 2014 01:28:49 +0200
Subject: G1 GC consuming all CPU time
Message-ID: 

At least CPU load is down, suggesting it's no longer a concurrency issue.

One thing that comes to mind is that G1 emits costly write and read barriers that heavily penalize interpreted code, while JMH generally avoids that benchmarking trap. Try extracting the loop body in your test into a method to help the JIT along and see if that evens out the playing field?

/Claes.
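The suggested refactoring would look roughly like this, using the modifyGraph() code from earlier in the thread (a sketch of the experiment only; whether it actually helps depends on what the JIT does with it):

    // Loop body extracted into its own small, hot method so the JIT can
    // compile it early instead of interpreting one large loop in main's path.
    private static Node popAndPush(Node head) {
        Node node = head;
        head = node.next;
        node.next = null;
        for (Node other : node.others) {   // push the neighbours back on the queue
            other.next = head;
            head = other;
        }
        return head;
    }

    private static Node modifyGraph(Node head) {
        for (int i = 0; i < 1000000; i++) {
            head = popAndPush(head);
        }
        return head;
    }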
:-) Converted your micro to a JMH micro to ease with experimenting[1] (hope you don't mind) then verified the regression reproduces: Parallel: java -jar target/microbenchmarks.jar -wi 3 -i 10 -f 1 .*G1GraphBench.* ~1625 ops/ms G1: java -XX:+UseG1GC -jar target/microbenchmarks.jar -wi 3 -i 10 -f 1 .*G1GraphBench.* ~12 ops/ms Testing my hunch, let's try forcing the concurrent refinement to use only one thread: java -XX:+UseG1GC -XX:-G1UseAdaptiveConcRefinement -XX:G1ConcRefinementThreads=1 -jar target/microbenchmarks.jar -wi 3 -i 10 -f 1 .*G1GraphBench.* ~1550 ops/ms I guess we have a winner! I won't hazard to try and answer your questions about how this should be resolved - perhaps the adaptive policy can detect this corner case and scale down the number of refinement threads when they start interfering with each other, or something. /Claes [1] package org.sample; import org.openjdk.jmh.annotations.GenerateMicroBenchmark; import org.openjdk.jmh.annotations.Scope; import org.openjdk.jmh.annotations.State; import java.util.Random; @State(Scope.Thread) public class G1GraphBench { private static class Node { private Node next; private Node[] others = new Node[10]; } Node head = buildGraph(); private static Node buildGraph() { // Create a collection of Node objects Node[] array = new Node[10000000]; for (int i = 0; i < array.length; i++) { array[i] = new Node(); } // Each Node refers to 10 other random Nodes Random random = new Random(12); for (int i = 0; i < array.length; i++) { for (int j = 0; j < array[i].others.length; j++) { int k = random.nextInt(array.length); array[i].others[j] = array[k]; } } // The first Node serves as the head of a queue return array[0]; } @GenerateMicroBenchmark public Node nodeBench() { Node node = head; head = node.next; node.next = null; // Add the other Nodes to the head of the queue for (Node other : node.others) { other.next = head; head = other; } return head; } } On 2014-06-03 23:43, Peter Harvey wrote: Thanks for the response. Here are the additional logs. 
0.094: [GC pause (young), 0.0347877 secs] [Parallel Time: 34.1 ms, GC Workers: 8] [GC Worker Start (ms): Min: 94.2, Avg: 104.4, Max: 126.4, Diff: 32.2] [Ext Root Scanning (ms): Min: 0.0, Avg: 3.3, Max: 25.0, Diff: 25.0, Sum: 26.6] [Update RS (ms): Min: 0.0, Avg: 2.1, Max: 5.3, Diff: 5.3, Sum: 16.7] [Processed Buffers: Min: 0, Avg: 2.3, Max: 9, Diff: 9, Sum: 18] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 1.8, Avg: 18.3, Max: 29.9, Diff: 28.2, Sum: 146.4] [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.6] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [GC Worker Total (ms): Min: 1.9, Avg: 23.8, Max: 34.1, Diff: 32.2, Sum: 190.4] [GC Worker End (ms): Min: 128.2, Avg: 128.3, Max: 128.3, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.0 ms] [Other: 0.6 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.3 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.0 ms] [Eden: 24.0M(24.0M)->0.0B(11.0M) Survivors: 0.0B->3072.0K Heap: 62.1M(254.0M)->62.2M(254.0M)] [Times: user=0.09 sys=0.03, real=0.04 secs] 0.131: [GC pause (young), 0.0295093 secs] [Parallel Time: 28.1 ms, GC Workers: 8] [GC Worker Start (ms): Min: 130.9, Avg: 135.5, Max: 158.7, Diff: 27.8] [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: 1.2] [Update RS (ms): Min: 0.0, Avg: 11.4, Max: 27.5, Diff: 27.5, Sum: 90.8] [Processed Buffers: Min: 0, Avg: 23.8, Max: 42, Diff: 42, Sum: 190] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 0.0, Avg: 11.7, Max: 17.1, Diff: 17.1, Sum: 93.8] [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.7] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [GC Worker Total (ms): Min: 0.2, Avg: 23.5, Max: 28.1, Diff: 27.8, Sum: 187.7] [GC Worker End (ms): Min: 159.0, Avg: 159.0, Max: 159.0, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.1 ms] [Other: 1.3 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.0 ms] [Eden: 11.0M(11.0M)->0.0B(23.0M) Survivors: 3072.0K->2048.0K Heap: 73.2M(254.0M)->82.7M(508.0M)] [Times: user=0.19 sys=0.00, real=0.03 secs] 0.166: [GC pause (young), 0.0385523 secs] [Parallel Time: 35.9 ms, GC Workers: 8] [GC Worker Start (ms): Min: 166.4, Avg: 169.8, Max: 192.4, Diff: 25.9] [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: 1.9] [Update RS (ms): Min: 0.0, Avg: 10.9, Max: 31.9, Diff: 31.9, Sum: 87.2] [Processed Buffers: Min: 0, Avg: 14.6, Max: 26, Diff: 26, Sum: 117] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [Object Copy (ms): Min: 3.5, Avg: 21.4, Max: 27.0, Diff: 23.4, Sum: 171.1] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.4] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [GC Worker Total (ms): Min: 10.0, Avg: 32.6, Max: 35.9, Diff: 25.9, Sum: 260.7] [GC Worker End (ms): Min: 202.3, Avg: 202.4, Max: 202.4, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.0 ms] [Other: 2.6 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.0 ms] [Eden: 23.0M(23.0M)->0.0B(46.0M) Survivors: 2048.0K->4096.0K Heap: 105.7M(508.0M)->110.1M(1016.0M)] [Times: user=0.19 sys=0.00, real=0.04 secs] 0.222: [GC pause (young), 0.0558720 secs] [Parallel Time: 53.0 ms, GC Workers: 8] [GC Worker Start (ms): Min: 222.0, Avg: 222.2, Max: 222.5, Diff: 0.5] [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: 1.5] [Update RS (ms): Min: 7.7, Avg: 8.7, Max: 10.9, Diff: 3.2, Sum: 69.4] [Processed 
Buffers: Min: 7, Avg: 8.5, Max: 12, Diff: 5, Sum: 68] [Scan RS (ms): Min: 0.0, Avg: 0.3, Max: 0.6, Diff: 0.6, Sum: 2.3] [Object Copy (ms): Min: 41.7, Avg: 43.6, Max: 44.3, Diff: 2.7, Sum: 348.5] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [GC Worker Total (ms): Min: 52.4, Avg: 52.7, Max: 52.9, Diff: 0.5, Sum: 421.8] [GC Worker End (ms): Min: 274.9, Avg: 274.9, Max: 274.9, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.0 ms] [Other: 2.8 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.0 ms] [Eden: 46.0M(46.0M)->0.0B(74.0M) Survivors: 4096.0K->7168.0K Heap: 156.1M(1016.0M)->158.6M(1625.0M)] [Times: user=0.48 sys=0.01, real=0.06 secs] 0.328: [GC pause (young), 0.0853794 secs] [Parallel Time: 82.8 ms, GC Workers: 8] [GC Worker Start (ms): Min: 327.9, Avg: 330.8, Max: 351.1, Diff: 23.2] [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 2.0] [Update RS (ms): Min: 0.0, Avg: 5.5, Max: 8.3, Diff: 8.3, Sum: 43.9] [Processed Buffers: Min: 0, Avg: 2.3, Max: 3, Diff: 3, Sum: 18] [Scan RS (ms): Min: 0.0, Avg: 2.2, Max: 3.3, Diff: 3.3, Sum: 17.4] [Object Copy (ms): Min: 59.5, Avg: 71.8, Max: 73.7, Diff: 14.2, Sum: 574.7] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [GC Worker Total (ms): Min: 59.5, Avg: 79.8, Max: 82.7, Diff: 23.2, Sum: 638.4] [GC Worker End (ms): Min: 410.6, Avg: 410.7, Max: 410.7, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.1 ms] [Other: 2.6 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.1 ms] [Eden: 74.0M(74.0M)->0.0B(94.0M) Survivors: 7168.0K->11.0M Heap: 232.6M(1625.0M)->237.6M(2112.0M)] [Times: user=0.59 sys=0.00, real=0.09 secs] 0.447: [GC pause (young), 0.1239103 secs] [Parallel Time: 121.5 ms, GC Workers: 8] [GC Worker Start (ms): Min: 447.5, Avg: 447.7, Max: 448.5, Diff: 0.9] [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.9] [Update RS (ms): Min: 26.5, Avg: 28.2, Max: 28.7, Diff: 2.2, Sum: 225.7] [Processed Buffers: Min: 38, Avg: 39.8, Max: 44, Diff: 6, Sum: 318] [Scan RS (ms): Min: 0.3, Avg: 0.7, Max: 1.9, Diff: 1.6, Sum: 5.3] [Object Copy (ms): Min: 92.1, Avg: 92.2, Max: 92.3, Diff: 0.2, Sum: 737.5] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.3] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [GC Worker Total (ms): Min: 120.6, Avg: 121.4, Max: 121.5, Diff: 0.9, Sum: 970.8] [GC Worker End (ms): Min: 569.0, Avg: 569.0, Max: 569.0, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.1 ms] [Other: 2.3 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.1 ms] [Eden: 94.0M(94.0M)->0.0B(111.0M) Survivors: 11.0M->14.0M Heap: 331.6M(2112.0M)->334.6M(2502.0M)] [Times: user=0.80 sys=0.05, real=0.12 secs] 0.599: [GC pause (young), 0.1479438 secs] [Parallel Time: 145.7 ms, GC Workers: 8] [GC Worker Start (ms): Min: 599.4, Avg: 599.5, Max: 599.8, Diff: 0.4] [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.9] [Update RS (ms): Min: 41.8, Avg: 43.0, Max: 44.0, Diff: 2.1, Sum: 343.6] [Processed Buffers: Min: 67, Avg: 70.9, Max: 73, Diff: 6, Sum: 567] [Scan RS (ms): Min: 0.0, Avg: 0.8, Max: 1.9, Diff: 1.9, Sum: 6.2] [Object Copy (ms): Min: 101.3, Avg: 101.6, Max: 101.7, Diff: 0.3, Sum: 812.6] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [GC Worker Other (ms): Min: 0.0, 
Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [GC Worker Total (ms): Min: 145.2, Avg: 145.6, Max: 145.6, Diff: 0.4, Sum: 1164.6] [GC Worker End (ms): Min: 745.1, Avg: 745.1, Max: 745.1, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.1 ms] [Other: 2.2 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.1 ms] [Eden: 111.0M(111.0M)->0.0B(124.0M) Survivors: 14.0M->16.0M Heap: 445.6M(2502.0M)->448.6M(2814.0M)] [Times: user=1.20 sys=0.05, real=0.15 secs] 0.787: [GC pause (young), 0.1625321 secs] [Parallel Time: 160.0 ms, GC Workers: 8] [GC Worker Start (ms): Min: 786.6, Avg: 786.7, Max: 786.9, Diff: 0.4] [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.8] [Update RS (ms): Min: 46.4, Avg: 47.0, Max: 49.0, Diff: 2.5, Sum: 376.0] [Processed Buffers: Min: 75, Avg: 78.0, Max: 79, Diff: 4, Sum: 624] [Scan RS (ms): Min: 0.0, Avg: 0.9, Max: 1.5, Diff: 1.5, Sum: 7.4] [Object Copy (ms): Min: 110.6, Avg: 111.7, Max: 112.0, Diff: 1.4, Sum: 893.5] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.3] [GC Worker Total (ms): Min: 159.6, Avg: 159.9, Max: 160.0, Diff: 0.4, Sum: 1279.0] [GC Worker End (ms): Min: 946.5, Avg: 946.5, Max: 946.6, Diff: 0.1] [Code Root Fixup: 0.0 ms] [Clear CT: 0.1 ms] [Other: 2.4 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.2 ms] [Eden: 124.0M(124.0M)->0.0B(135.0M) Survivors: 16.0M->18.0M Heap: 572.6M(2814.0M)->576.6M(3064.0M)] [Times: user=1.37 sys=0.00, real=0.16 secs] 0.981: [GC pause (young), 0.2063055 secs] [Parallel Time: 204.1 ms, GC Workers: 8] [GC Worker Start (ms): Min: 980.8, Avg: 980.9, Max: 981.0, Diff: 0.2] [Ext Root Scanning (ms): Min: 0.1, Avg: 0.3, Max: 0.3, Diff: 0.2, Sum: 2.1] [Update RS (ms): Min: 55.9, Avg: 57.8, Max: 58.8, Diff: 2.9, Sum: 462.8] [Processed Buffers: Min: 100, Avg: 101.5, Max: 103, Diff: 3, Sum: 812] [Scan RS (ms): Min: 0.0, Avg: 1.0, Max: 3.1, Diff: 3.1, Sum: 8.3] [Object Copy (ms): Min: 144.7, Avg: 144.8, Max: 144.9, Diff: 0.1, Sum: 1158.3] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.3] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [GC Worker Total (ms): Min: 203.8, Avg: 204.0, Max: 204.0, Diff: 0.2, Sum: 1631.9] [GC Worker End (ms): Min: 1184.9, Avg: 1184.9, Max: 1184.9, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Clear CT: 0.1 ms] [Other: 2.1 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.1 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.1 ms] [Eden: 135.0M(135.0M)->0.0B(143.0M) Survivors: 18.0M->20.0M Heap: 711.6M(3064.0M)->714.6M(3264.0M)] [Times: user=1.40 sys=0.11, real=0.21 secs] CPU Load Is -1.0 Start Stop Sleep CPU Load Is 0.9166222455142531 Start Stop Sleep CPU Load Is 0.907013989900451 Start Stop Sleep CPU Load Is 0.9085635227776081 Start Stop Sleep CPU Load Is 0.909945506396622 Note that all the logged GC occurs during the construction of my graph of Nodes, which is /before/ my algorithm (modifyGraph) starts, There is no log of GC activity once the algorithm starts, but there is significant (100%) CPU usage. My questions are: * Why is the G1 garbage collector consuming so much CPU time? What is it doing? * Why is the G1 garbage collector not logging anything? The only reason I even know it's the garbage collector consuming my CPU time is that (a) I only see this behaviour when the G1 collector is enabled and (b) the load on the CPU correlates with the value of -XX:ParallelGCThreads. 
* Are there particular object-graph structures that the G1 garbage collector will struggle with? Should complex graphs be considered bad coding practice? * How can I write my code to avoid this behaviour in the G1 garbage collector? For example, if all my Nodes are in an array, will this fix it? * Should this be considered a bug in the G1 garbage collector? This is far beyond 'a small increase in CPU usage'. Just to demonstrate the issue further, I timed my calls to modifyGraph() and trialled different GC parameters: * -XX:+UseG1GC -XX:ParallelGCThreads=1 took 82.393 seconds and CPU load was 0.1247 * -XX:+UseG1GC -XX:ParallelGCThreads=4 took 19.829 seconds and CPU load was 0.5960 * -XX:+UseG1GC -XX:ParallelGCThreads=8 took 14.815 seconds and CPU load was 0.9184 * -XX:+UseConcMarkSweepGC took 0.322 seconds and CPU load was 0.1119 regardless of the setting of -XX:ParallelGCThreads So using the CMS GC made my application 44x faster (14.815 seconds versus 0.322 seconds) and placed 1/8th of the load (0.9184 versus 0.1119) on the CPU. If my code represents some kind of hypothetical worst case for the G1 garbage collector, I think it should be documented and/or fixed somehow. Regards, Peter. On Tue, Jun 3, 2014 at 3:16 PM, Tao Mao > wrote: And, use ?XX:+PrintGCDetails ?XX:+PrintGCTimeStamps to get more log. Thanks. -Tao On Tue, Jun 3, 2014 at 2:13 PM, Tao Mao > wrote: Hi Peter, What was your actual question? Try -XX:ParallelGCThreads= if you want less CPU usage from GC. Thanks. Tao On Tue, Jun 3, 2014 at 11:49 AM, Peter Harvey > wrote: Small correction. The last example of output was with "-XX:+UseConcMarkSweepGC -verbose:gc". On Tue, Jun 3, 2014 at 12:41 PM, Peter Harvey > wrote: I have an algorithm (at bottom of email) which builds a graph of 'Node' objects with random connections between them. It then repeatedly processes a queue of those Nodes, adding new Nodes to the queue as it goes. This is a single-threaded algorithm that will never terminate. Our actual production code is much more complex, but I've trimmed it down as much as possible. On Windows 7 with JRE 7u60, enabling the G1 garbage collector will cause the JRE to consume all 8 cores of my CPU. No other garbage collector does this. You can see the differences in CPU load in the example output below. It's also worth nothing that "-verbose:gc" with the G1 garbage collector prints nothing after my algorithm starts. Presumably the G1 garbage collector is doing something (concurrent mark?), but it's not printing anything about it. When run with VM args "-XX:+UseG1GC -verbose:gc" I get output like this (note the huge CPU load value which should not be this high for a single-threaded algorithm on an 8 core CPU): [GC pause (young) 62M->62M(254M), 0.0394214 secs] [GC pause (young) 73M->83M(508M), 0.0302781 secs] [GC pause (young) 106M->111M(1016M), 0.0442273 secs] [GC pause (young) 157M->161M(1625M), 0.0660902 secs] [GC pause (young) 235M->240M(2112M), 0.0907231 secs] [GC pause (young) 334M->337M(2502M), 0.1356917 secs] [GC pause (young) 448M->450M(2814M), 0.1219090 secs] [GC pause (young) 574M->577M(3064M), 0.1778062 secs] [GC pause (young) 712M->715M(3264M), 0.1878443 secs] CPU Load Is -1.0 Start Stop Sleep CPU Load Is 0.9196154547182949 Start Stop Sleep CPU Load Is 0.9150735995043818 ... 
When run with VM args "-XX:+UseParallelGC -verbose:gc" I get output like this: [GC 65536K->64198K(249344K), 0.0628289 secs] [GC 129734K->127974K(314880K), 0.1583369 secs] [Full GC 127974K->127630K(451072K), 0.9675224 secs] [GC 258702K->259102K(451072K), 0.3543645 secs] [Full GC 259102K->258701K(732672K), 1.8085702 secs] [GC 389773K->390181K(790528K), 0.3332060 secs] [GC 579109K->579717K(803328K), 0.5126388 secs] [Full GC 579717K->578698K(1300480K), 4.0647303 secs] [GC 780426K->780842K(1567232K), 0.4364933 secs] CPU Load Is -1.0 Start Stop Sleep CPU Load Is 0.03137771539054431 Start Stop Sleep CPU Load Is 0.032351299224373145 ... When run with VM args "-verbose:gc" I get output like this: [GC 69312K->67824K(251136K), 0.1533803 secs] [GC 137136K->135015K(251136K), 0.0970460 secs] [GC 137245K(251136K), 0.0095245 secs] [GC 204327K->204326K(274368K), 0.1056259 secs] [GC 273638K->273636K(343680K), 0.1081515 secs] [GC 342948K->342946K(412992K), 0.1181966 secs] [GC 412258K->412257K(482304K), 0.1126966 secs] [GC 481569K->481568K(551808K), 0.1156015 secs] [GC 550880K->550878K(620928K), 0.1184089 secs] [GC 620190K->620189K(690048K), 0.1209312 secs] [GC 689501K->689499K(759552K), 0.1199338 secs] [GC 758811K->758809K(828864K), 0.1162532 secs] CPU Load Is -1.0 Start Stop Sleep CPU Load Is 0.10791719146608299 Start [GC 821213K(828864K), 0.1966807 secs] Stop Sleep CPU Load Is 0.1540065314146181 Start Stop Sleep [GC 821213K(1328240K), 0.1962688 secs] CPU Load Is 0.08427292195744103 ... Why is the G1 garbage collector consuming so much CPU time? Is it stuck in the mark phase as I am modifying the graph structure? I'm not a subscriber to the list, so please CC me in any response. Thanks, Peter. -- import java.lang.management.ManagementFactory; import com.sun.management.OperatingSystemMXBean; import java.util.Random; @SuppressWarnings("restriction") public class Node { private static OperatingSystemMXBean os = (OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean(); private Node next; private Node[] others = new Node[10]; public static void main(String[] args) throws InterruptedException { // Build a graph of Nodes Node head = buildGraph(); while (true) { // Print CPU load for this process System.out.println("CPU Load Is " + os.getProcessCpuLoad()); System.out.println(); // Modify the graph System.out.println("Start"); head = modifyGraph(head); System.out.println("Stop"); // Sleep, as otherwise we tend to DoS the host computer... 
System.out.println("Sleep"); Thread.sleep(1000); } } private static Node buildGraph() { // Create a collection of Node objects Node[] array = new Node[10000000]; for (int i = 0; i < array.length; i++) { array[i] = new Node(); } // Each Node refers to 10 other random Nodes Random random = new Random(12); for (int i = 0; i < array.length; i++) { for (int j = 0; j < array[i].others.length; j++) { int k = random.nextInt(array.length); array[i].others[j] = array[k]; } } // The first Node serves as the head of a queue return array[0]; } private static Node modifyGraph(Node head) { // Perform a million iterations for (int i = 0; i < 1000000; i++) { // Pop a Node off the head of the queue Node node = head; head = node.next; node.next = null; // Add the other Nodes to the head of the queue for (Node other : node.others) { other.next = head; head = other; } } return head; } } -- *Actenum Corporation* Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com -- *Actenum Corporation* Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com -- *Actenum Corporation* Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com -- Actenum Corporation Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From per.liden at oracle.com Wed Jun 4 09:21:56 2014 From: per.liden at oracle.com (Per Liden) Date: Wed, 04 Jun 2014 11:21:56 +0200 Subject: RFR(s): 8044768: Backout fix for JDK-8040807 Message-ID: <538EE534.2000600@oracle.com> Hi, Requesting reviews on this anti-delta to backout the fix for JDK-8040807. It turns out that there are still issues here, which for some reason didn't show up in the testing I did :( Bug: https://bugs.openjdk.java.net/browse/JDK-8044768 Webrev: http://cr.openjdk.java.net/~pliden/8044768/webrev.0/ Original bug: https://bugs.openjdk.java.net/browse/JDK-8040807 Original webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ Original review: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-May/010102.html Thanks! /Per From bengt.rutisson at oracle.com Wed Jun 4 10:35:02 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 04 Jun 2014 12:35:02 +0200 Subject: RFR(s): 8044768: Backout fix for JDK-8040807 In-Reply-To: <538EE534.2000600@oracle.com> References: <538EE534.2000600@oracle.com> Message-ID: <538EF656.1030306@oracle.com> Hi Per, Looks good. Bengt On 2014-06-04 11:21, Per Liden wrote: > Hi, > > Requesting reviews on this anti-delta to backout the fix for > JDK-8040807. It turns out that there are still issues here, which for > some reason didn't show up in the testing I did :( > > Bug: https://bugs.openjdk.java.net/browse/JDK-8044768 > Webrev: http://cr.openjdk.java.net/~pliden/8044768/webrev.0/ > > Original bug: https://bugs.openjdk.java.net/browse/JDK-8040807 > Original webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ > Original review: > http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-May/010102.html > > Thanks! > /Per From erik.helin at oracle.com Wed Jun 4 12:02:45 2014 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 04 Jun 2014 14:02:45 +0200 Subject: RFR(s): 8044768: Backout fix for JDK-8040807 In-Reply-To: <538EE534.2000600@oracle.com> References: <538EE534.2000600@oracle.com> Message-ID: <1938981.x4JAnBs7PM@ehelin-desktop> Hi Per, looks good. 
Thanks, Erik On Wednesday 04 June 2014 11.21.56 Per Liden wrote: > Hi, > > Requesting reviews on this anti-delta to backout the fix for > JDK-8040807. It turns out that there are still issues here, which for > some reason didn't show up in the testing I did :( > > Bug: https://bugs.openjdk.java.net/browse/JDK-8044768 > Webrev: http://cr.openjdk.java.net/~pliden/8044768/webrev.0/ > > Original bug: https://bugs.openjdk.java.net/browse/JDK-8040807 > Original webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ > Original review: > http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-May/010102.html > > Thanks! > /Per From per.liden at oracle.com Wed Jun 4 12:14:36 2014 From: per.liden at oracle.com (Per Liden) Date: Wed, 04 Jun 2014 14:14:36 +0200 Subject: RFR(s): 8044768: Backout fix for JDK-8040807 In-Reply-To: <1938981.x4JAnBs7PM@ehelin-desktop> References: <538EE534.2000600@oracle.com> <1938981.x4JAnBs7PM@ehelin-desktop> Message-ID: <538F0DAB.6050004@oracle.com> Thanks Bengt, Erik! /Per On 2014-06-04 14:02, Erik Helin wrote: > Hi Per, > > looks good. > > Thanks, > Erik > > On Wednesday 04 June 2014 11.21.56 Per Liden wrote: >> Hi, >> >> Requesting reviews on this anti-delta to backout the fix for >> JDK-8040807. It turns out that there are still issues here, which for >> some reason didn't show up in the testing I did :( >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8044768 >> Webrev: http://cr.openjdk.java.net/~pliden/8044768/webrev.0/ >> >> Original bug: https://bugs.openjdk.java.net/browse/JDK-8040807 >> Original webrev: http://cr.openjdk.java.net/~pliden/8040807/webrev.1/ >> Original review: >> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-May/010102.html >> >> Thanks! >> /Per From andrey.x.zakharov at oracle.com Wed Jun 4 14:40:47 2014 From: andrey.x.zakharov at oracle.com (Andrey Zakharov) Date: Wed, 04 Jun 2014 18:40:47 +0400 Subject: RFR: 8041506 - The test gc/g1/TestHumongousShrinkHeap.java reports that memory is not de-committed In-Reply-To: <53835E9B.7080102@oracle.com> References: <5374A502.5030409@oracle.com> <53835E9B.7080102@oracle.com> Message-ID: <538F2FEF.30105@oracle.com> Hi, Dmitry. Thanks for corrections. Here is updated webrev: http://cr.openjdk.java.net/~fzhinkin/azakharov/8041946/webrev.00/ testing: http://aurora.ru.oracle.com/functional/faces/ChessBoard.xhtml?reportName=J2SEFailures¶meters=[batchNames]501230.ute.hs_jtreg.accept.full bug: https://bugs.openjdk.java.net/browse/JDK-8041946 On 26.05.2014 19:32, Dmitry Fazunenko wrote: > Hi Andrey, > > Sorry, it took too long from me to review you change. > I have several comments: > - overall fix looks good > - I think you need to change the subject: you fix 8041946, not 8041506 > - Replace 'TestHumongousShrinkHeap' with 'TestShrinkDefragmentedHeap' > - Make MemoryUsagePrinter as static inner class of test (to avoid > possible conflicts with other tests) > - It would be good if you provide more text description to the test, like > * allocate small objects mixed with humongous ones > "ssssHssssHssssHssssHssssH" > * release all allocated object except the last humongous one > ".............................................H" > * invoke gc and check that memory returned to the system (amount of > committed memory got down) Done > - I'm not sure that you can predict the expected amount of committed > memory at the end... 
I wouldn't use the expectedCommitted in the test > (there are many memory consumers, not only your test, so the final > committed should be either less or greater than expectedCommitted ) Well, I have tested it a lot with JFR command line options, on all platforms. I found a lag with JMX on Solaris, and just put sleep before measure. Also I replaced run/othervm with ProcessBuilder. I'm planning to replace it in other early our CMM tests. > - I think you don't need to touch 'test/TEST.groups'. There is > :needs_g1gc tests group (hs/test/closed/TEST.group) which lists all g1 > specific tests. > - Please provide information on how you tested your change. http://aurora.ru.oracle.com/functional/faces/ChessBoard.xhtml?reportName=J2SEFailures¶meters=[batchNames]501230.ute.hs_jtreg.accept.full Thanks > > Thanks, > Dima > > > On 15.05.2014 15:29, Andrey Zakharov wrote: >> Hi. >> To proper testing of free list sorting we need to defragment memory >> with small young and humongous objects >> This is test scenario: >> Make enough space for new objects to prevent it going old. >> - allocate bunch of small objects, and a bit of humongous >> several times. >> >> Free almost all of allocated stuff. Check that heap shrinks after GC. >> >> webrev: http://cr.openjdk.java.net/~jwilhelm/8041506/webrev.02/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8041506 >> >> Thanks. >> > From jon.masamitsu at oracle.com Thu Jun 5 17:53:40 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 05 Jun 2014 10:53:40 -0700 Subject: G1 GC consuming all CPU time Message-ID: <5390AEA4.4030605@oracle.com> Forwarding this for Peter Harvey because I don't know what happened to it (while it was waiting to be moderated). ============================================== Hi, I've constructed a more practical microbenchmark that demonstrates the previously-described issue in the G1 collector. The code at the end of this email will repeatedly prepend a million randomly-valued nodes to a doubly-linked list, and then apply a simple merge sort to that list. The merge sort manipulates many reference values, resulting in the same issues as described earlier. Using just -XX:+UseG1GC with no other options, the VM will seemingly try to balance the number of concurrent refinement threads. But no matter how many threads it chooses to use, performance is significantly degraded when compare to the CMS collector. When using the CMS collector my microbenchmark has output like: Took 752 to prepend 1000000 and then sort all 1000000 Took 2114 to prepend 1000000 and then sort all 2000000 Took 2672 to prepend 1000000 and then sort all 3000000 Took 2752 to prepend 1000000 and then sort all 4000000 Took 2056 to prepend 1000000 and then sort all 5000000 When using the G1 collector my microbenchmark has output like: Took 1693 to prepend 1000000 and then sort all 1000000 Took 5774 to prepend 1000000 and then sort all 2000000 Took 9546 to prepend 1000000 and then sort all 3000000 Took 15480 to prepend 1000000 and then sort all 4000000 Took 20235 to prepend 1000000 and then sort all 5000000 With the -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeRSetStats -XX:G1SummarizeRSetStatsPeriod=1 options enabled I get diagnostic output like: Concurrent RS processed 29981518 cards Of 117869 completed buffers: 18647 ( 15.8%) by conc RS threads. 99222 ( 84.2%) by mutator threads. Concurrent RS processed 68819227 cards Of 273465 completed buffers: 164272 ( 60.1%) by conc RS threads. 109193 ( 39.9%) by mutator threads. 
My original code was an extreme corner case of graph manipulation (though yes, we do ship a commercial product with that kind of code in it). I hope that 'merge sort on a linked list of random data' can serve as a more useful example of where the G1 collector will not perform well. From what I understand, any algorithm that modifies a large number of references connecting many small objects will bring out this behaviour in the G1 collector. For example, I would suspect that large reference-based heap structures (where inserted nodes have random values) may also cause issues for the G1 collector.

Regards,
Peter.

----

package linkedlist;

import java.util.Random;

public class List {

    // Node in the linked list
    public final static class Node {
        Node prev;
        Node next;
        double value;
    }

    // Random number generator
    private final Random random = new Random(12);

    // Dummy node for the head
    private final Node head = new Node();

    // Split the list at the given node, and sort the right-hand side
    private static Node splitAndSort(Node node, boolean ascending) {
        // Split the list at the given node
        if (node.prev != null)
            node.prev.next = null;
        node.prev = null;
        // Ensure we have at LEAST two elements
        if (node.next == null)
            return node;
        // Find the midpoint to split the list
        Node mid = node.next;
        Node end = node.next;
        do {
            end = end.next;
            if (end != null) {
                end = end.next;
                mid = mid.next;
            }
        } while (end != null);
        // Sort the two sides
        Node list2 = splitAndSort(mid, ascending);
        Node list1 = splitAndSort(node, ascending);
        // Merge the two lists (setting prev only)
        node = null;
        while (true) {
            if (list1 == null) {
                list2.prev = node;
                node = list2;
                break;
            } else if (list2 == null) {
                list1.prev = node;
                node = list1;
                break;
            } else if (ascending == (list1.value < list2.value)) {
                list2.prev = node;
                node = list2;
                list2 = list2.next;
            } else {
                list1.prev = node;
                node = list1;
                list1 = list1.next;
            }
        }
        // Fix all the nexts (based on the prevs)
        while (node.prev != null) {
            node.prev.next = node;
            node = node.prev;
        }
        return node;
    }

    // Sort the nodes in ascending order
    public void sortNodes() {
        if (head.next != null) {
            head.next = splitAndSort(head.next, true);
            head.next.prev = head;
        }
    }

    // Prepend a number of nodes with random values
    public void prependNodes(int count) {
        for (int i = 0; i < count; i++) {
            Node node = new Node();
            if (head.next != null) {
                node.next = head.next;
                head.next.prev = node;
            }
            node.value = random.nextDouble();
            node.prev = head;
            head.next = node;
        }
    }

    public static void main(String[] args) {
        List list = new List();
        int count = 0;
        long start = System.currentTimeMillis();
        while (true) {
            // Append a million random entries
            list.prependNodes(1000000);
            count += 1000000;
            // Sort the entire list
            list.sortNodes();
            // Print the time taken for this pass
            long end = System.currentTimeMillis();
            System.out.println("Took " + (end - start) + " to prepend 1000000 and then sort all " + count);
            start = end;
        }
    }
}

From John.Coomes at oracle.com Fri Jun 6 14:06:11 2014
From: John.Coomes at oracle.com (John Coomes)
Date: Fri, 6 Jun 2014 07:06:11 -0700
Subject: RFR(S): 8026396 - Remove information duplication in the collector policy
In-Reply-To: <5362F22A.1030505@oracle.com>
References: <5362F22A.1030505@oracle.com>
Message-ID: <21393.51923.883180.555814@mykonos.us.oracle.com>

Jesper Wilhelmsson (jesper.wilhelmsson at oracle.com) wrote:
> Hi,
>
> Another step towards cleaner collector policy code.
> > This cleanup removes the need to keep the generation sizing flags in sync with > the collector policy version of the same variables during setup. The collector > policy variables are initialized in the start and then used throughout the setup > code. In the end we write the values back to the flags if needed. > > This change builds upon the merged collector policy (8027643) currently in review. > > Webrev: http://cr.openjdk.java.net/~jwilhelm/8026396/webrev/ > > Bug: https://bugs.openjdk.java.net/browse/JDK-8026396 This looks good to me. There seems to be a chance for underflow here: 534 _min_gen1_size = MIN2(_initial_gen1_size, _min_heap_byte_size - _min_gen0_size); but your change is ok, since that also existed in the original code. -John From jesper.wilhelmsson at oracle.com Mon Jun 9 07:45:55 2014 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Mon, 09 Jun 2014 09:45:55 +0200 Subject: RFR(S): 8026396 - Remove information duplication in the collector policy In-Reply-To: <21393.51923.883180.555814@mykonos.us.oracle.com> References: <5362F22A.1030505@oracle.com> <21393.51923.883180.555814@mykonos.us.oracle.com> Message-ID: <53956633.5050107@oracle.com> Thanks John! I'll have a look at the underflow issue. /Jesper John Coomes skrev 6/6/14 16:06: > Jesper Wilhelmsson (jesper.wilhelmsson at oracle.com) wrote: >> Hi, >> >> Another step towards cleaner collector policy code. >> >> This cleanup removes the need to keep the generation sizing flags in sync with >> the collector policy version of the same variables during setup. The collector >> policy variables are initialized in the start and then used throughout the setup >> code. In the end we write the values back to the flags if needed. >> >> This change builds upon the merged collector policy (8027643) currently in review. >> >> Webrev: http://cr.openjdk.java.net/~jwilhelm/8026396/webrev/ >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8026396 > > This looks good to me. > > There seems to be a chance for underflow here: > > 534 _min_gen1_size = MIN2(_initial_gen1_size, _min_heap_byte_size - _min_gen0_size); > > but your change is ok, since that also existed in the original code. > > -John > From andrey.x.zakharov at oracle.com Mon Jun 9 14:31:29 2014 From: andrey.x.zakharov at oracle.com (Andrey Zakharov) Date: Mon, 09 Jun 2014 18:31:29 +0400 Subject: RFR: 8041946 - CMM Testing: 8u40 an allocated humongous object at the end of the heap should not prevents shrinking the heap Message-ID: <5395C541.1000207@oracle.com> Hi, everyone! Please, review this test for new feature in G1 - sorted free list which make possible shrinking of the defragmented heap. To proper testing of free list sorting we need to defragment memory with small young and humongous objects. This is test scenario: - make enough space for new objects to prevent it going old. - allocate bunch of small objects, and a bit of humongous several times (ssssHssssHssssHssssHssssHssssHssssHssssH) - free almost all of allocated stuff. Check that heap shrinks after GC. (-----------H) Webrev: http://cr.openjdk.java.net/~fzhinkin/azakharov/8041946/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8041946 I have tested it along all major platforms and it works fine. There is lag on Solaris MXBeans about memory usage, so I need sleep by 1s. It will be very nicely if somebody advice me about method which "flush" memory usage info to remove this sleep. Thanks. 
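[Editor's note: a sketch of the kind of "flush" Andrey asks about above. As far as I know there is no JMX call that forces the MemoryMXBean to refresh, so the usual workaround is to poll the committed size with a bounded timeout instead of sleeping a fixed second. The class name and the threshold/timeout parameters below are illustrative assumptions, not part of the reviewed test:]

import java.lang.management.ManagementFactory;

class CommittedMemoryPoller {
    // Poll the heap's committed size until it drops to the expected
    // value or the timeout expires, instead of sleeping a fixed 1s.
    static long waitForCommittedBelow(long expectedBytes, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        long committed;
        do {
            committed = ManagementFactory.getMemoryMXBean()
                                         .getHeapMemoryUsage()
                                         .getCommitted();
            if (committed <= expectedBytes) {
                break; // the shrink became visible through JMX
            }
            Thread.sleep(50); // short poll interval
        } while (System.currentTimeMillis() < deadline);
        return committed; // caller asserts on the last observed value
    }
}

[On platforms where the bean updates promptly this returns almost immediately; the one-second worst case then only applies when the Solaris lag is real.]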
From igor.ignatyev at oracle.com Tue Jun 10 14:48:52 2014 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 10 Jun 2014 18:48:52 +0400 Subject: RFR(XS) : 8044575 : testlibrary_tests/whitebox/vm_flags/UintxTest.java failed: assert(!res || TypeEntriesAtCall::arguments_profiling_enabled()) failed: no profiling of arguments In-Reply-To: <538CF2C3.8090107@oracle.com> References: <538CE366.90806@oracle.com> <538CF2C3.8090107@oracle.com> Message-ID: <53971AD4.2000001@oracle.com> Hi GC-people, Could some of you look at the change? Igor On 06/03/2014 01:55 AM, Vladimir Kozlov wrote: > Hi Igor, > > Looks good to me but I would ask GC group to comment on this change. > > Thanks, > Vladimir > > On 6/2/14 1:49 PM, Igor Ignatyev wrote: >> webrev: http://cr.openjdk.java.net/~iignatyev/8044575/webrev.00/ >> 4 lines changed: 0 ins; 2 del; 2 mod; >> >> Hi all, >> >> Please review patch: >> >> Problem: >> the test changes 'TypeProfileLevel' via WhiteBox during execution, but >> 'TypeProfileLevel' isn't supposed to be changed and there's the asserts >> based on that. the test w/ '-Xcomp and -XX:-TieredCompilation' triggers >> one of these asserts. >> >> Fix: >> - as a flag to change, the test uses 'VerifyGCStartAt' instead of >> 'TypeProfileLevel'. 'VerifyGCStartAt' is safe to change during execution >> - removed 'System.out.println' which was left by accident >> >> jbs: https://bugs.openjdk.java.net/browse/JDK-8044575 >> testing: failing tests locally w/ different flags combinations From dmitry.fazunenko at oracle.com Tue Jun 10 15:08:37 2014 From: dmitry.fazunenko at oracle.com (Dmitry Fazunenko) Date: Tue, 10 Jun 2014 19:08:37 +0400 Subject: RFR(XS) : 8044575 : testlibrary_tests/whitebox/vm_flags/UintxTest.java failed: assert(!res || TypeEntriesAtCall::arguments_profiling_enabled()) failed: no profiling of arguments In-Reply-To: <53971AD4.2000001@oracle.com> References: <538CE366.90806@oracle.com> <538CF2C3.8090107@oracle.com> <53971AD4.2000001@oracle.com> Message-ID: <53971F75.9050403@oracle.com> Looks good to me. On 10.06.2014 18:48, Igor Ignatyev wrote: > Hi GC-people, > > Could some of you look at the change? > > Igor > > On 06/03/2014 01:55 AM, Vladimir Kozlov wrote: >> Hi Igor, >> >> Looks good to me but I would ask GC group to comment on this change. >> >> Thanks, >> Vladimir >> >> On 6/2/14 1:49 PM, Igor Ignatyev wrote: >>> webrev: http://cr.openjdk.java.net/~iignatyev/8044575/webrev.00/ >>> 4 lines changed: 0 ins; 2 del; 2 mod; >>> >>> Hi all, >>> >>> Please review patch: >>> >>> Problem: >>> the test changes 'TypeProfileLevel' via WhiteBox during execution, but >>> 'TypeProfileLevel' isn't supposed to be changed and there's the asserts >>> based on that. the test w/ '-Xcomp and -XX:-TieredCompilation' triggers >>> one of these asserts. >>> >>> Fix: >>> - as a flag to change, the test uses 'VerifyGCStartAt' instead of >>> 'TypeProfileLevel'. 
'VerifyGCStartAt' is safe to change during >>> execution >>> - removed 'System.out.println' which was left by accident >>> >>> jbs: https://bugs.openjdk.java.net/browse/JDK-8044575 >>> testing: failing tests locally w/ different flags combinations From jon.masamitsu at oracle.com Tue Jun 10 16:09:49 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 10 Jun 2014 09:09:49 -0700 Subject: RFR(XS) : 8044575 : testlibrary_tests/whitebox/vm_flags/UintxTest.java failed: assert(!res || TypeEntriesAtCall::arguments_profiling_enabled()) failed: no profiling of arguments In-Reply-To: <53971AD4.2000001@oracle.com> References: <538CE366.90806@oracle.com> <538CF2C3.8090107@oracle.com> <53971AD4.2000001@oracle.com> Message-ID: <53972DCD.5030105@oracle.com> Igor, Does it matter that VerifyGCStartAt is a diagnostic flag? diagnostic(uintx, VerifyGCStartAt, 0, \ "GC invoke count where +VerifyBefore/AfterGC kicks in") \ Otherwise, looks good. Jon On 06/10/2014 07:48 AM, Igor Ignatyev wrote: > Hi GC-people, > > Could some of you look at the change? > > Igor > > On 06/03/2014 01:55 AM, Vladimir Kozlov wrote: >> Hi Igor, >> >> Looks good to me but I would ask GC group to comment on this change. >> >> Thanks, >> Vladimir >> >> On 6/2/14 1:49 PM, Igor Ignatyev wrote: >>> webrev: http://cr.openjdk.java.net/~iignatyev/8044575/webrev.00/ >>> 4 lines changed: 0 ins; 2 del; 2 mod; >>> >>> Hi all, >>> >>> Please review patch: >>> >>> Problem: >>> the test changes 'TypeProfileLevel' via WhiteBox during execution, but >>> 'TypeProfileLevel' isn't supposed to be changed and there's the asserts >>> based on that. the test w/ '-Xcomp and -XX:-TieredCompilation' triggers >>> one of these asserts. >>> >>> Fix: >>> - as a flag to change, the test uses 'VerifyGCStartAt' instead of >>> 'TypeProfileLevel'. 'VerifyGCStartAt' is safe to change during >>> execution >>> - removed 'System.out.println' which was left by accident >>> >>> jbs: https://bugs.openjdk.java.net/browse/JDK-8044575 >>> testing: failing tests locally w/ different flags combinations From igor.ignatyev at oracle.com Tue Jun 10 16:26:25 2014 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 10 Jun 2014 20:26:25 +0400 Subject: RFR(XS) : 8044575 : testlibrary_tests/whitebox/vm_flags/UintxTest.java failed: assert(!res || TypeEntriesAtCall::arguments_profiling_enabled()) failed: no profiling of arguments In-Reply-To: <53972DCD.5030105@oracle.com> References: <538CE366.90806@oracle.com> <538CF2C3.8090107@oracle.com> <53971AD4.2000001@oracle.com> <53972DCD.5030105@oracle.com> Message-ID: <539731B1.3000702@oracle.com> Jon, > Does it matter that VerifyGCStartAt is a diagnostic flag? no it doesn't. Jon/Vladimir/Dima, thanks for review. Igor On 06/10/2014 08:09 PM, Jon Masamitsu wrote: > Igor, > > Does it matter that VerifyGCStartAt is a diagnostic flag? > > diagnostic(uintx, VerifyGCStartAt, > 0, \ > "GC invoke count where +VerifyBefore/AfterGC kicks in") \ > > Otherwise, looks good. > > Jon > > On 06/10/2014 07:48 AM, Igor Ignatyev wrote: >> Hi GC-people, >> >> Could some of you look at the change? >> >> Igor >> >> On 06/03/2014 01:55 AM, Vladimir Kozlov wrote: >>> Hi Igor, >>> >>> Looks good to me but I would ask GC group to comment on this change. 
>>> >>> Thanks, >>> Vladimir >>> >>> On 6/2/14 1:49 PM, Igor Ignatyev wrote: >>>> webrev: http://cr.openjdk.java.net/~iignatyev/8044575/webrev.00/ >>>> 4 lines changed: 0 ins; 2 del; 2 mod; >>>> >>>> Hi all, >>>> >>>> Please review patch: >>>> >>>> Problem: >>>> the test changes 'TypeProfileLevel' via WhiteBox during execution, but >>>> 'TypeProfileLevel' isn't supposed to be changed and there's the asserts >>>> based on that. the test w/ '-Xcomp and -XX:-TieredCompilation' triggers >>>> one of these asserts. >>>> >>>> Fix: >>>> - as a flag to change, the test uses 'VerifyGCStartAt' instead of >>>> 'TypeProfileLevel'. 'VerifyGCStartAt' is safe to change during >>>> execution >>>> - removed 'System.out.println' which was left by accident >>>> >>>> jbs: https://bugs.openjdk.java.net/browse/JDK-8044575 >>>> testing: failing tests locally w/ different flags combinations > From bengt.rutisson at oracle.com Wed Jun 11 10:33:01 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 11 Jun 2014 12:33:01 +0200 Subject: RFR (S): JDK-8046518: G1: Double calls to register_concurrent_cycle_end() Message-ID: <5398305D.7070800@oracle.com> Hi all, Can I have a review for this change? http://cr.openjdk.java.net/~brutisso/8046518/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8046518 Background: When we abort a concurrent cycle due to a Full GC in G1 we call ConcurrentMark::abort(). That will set _has_aborted flag and then call register_concurrent_cycle_end(). The concurrent marking thread will see the _has_aborted flag in its ConcurrentMarkThread::run() method, abort the execution and then call register_concurrent_cycle_end(). Currently this works since the code inside register_concurrent_cycle_end() is guarded by _concurrent_cycle_started which it then resets. So, the double calls will not necessarily result in too much extra work being done. But one of the things that register_concurrent_cycle_end() does is to call report_gc_end() on the concurrent GC tracer. That prevents further use of it for this GC. This means that inside the ConcurrentMarkThread::run() method we can not rely on the tracer. Removing the call to register_concurrent_cycle_end() in ConcurrentMark::abort() and relying on the call in ConcurrentMarkThread::run() seems to be a reasonable approach. Thanks, Bengt -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.karlsson at oracle.com Wed Jun 11 11:10:43 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 11 Jun 2014 13:10:43 +0200 Subject: RFR (S): JDK-8046518: G1: Double calls to register_concurrent_cycle_end() In-Reply-To: <5398305D.7070800@oracle.com> References: <5398305D.7070800@oracle.com> Message-ID: <53983933.2020204@oracle.com> On 2014-06-11 12:33, Bengt Rutisson wrote: > > Hi all, > > Can I have a review for this change? > > http://cr.openjdk.java.net/~brutisso/8046518/webrev.00/ > > https://bugs.openjdk.java.net/browse/JDK-8046518 > > Background: > When we abort a concurrent cycle due to a Full GC in G1 we call > ConcurrentMark::abort(). That will set _has_aborted flag and then call > register_concurrent_cycle_end(). > > The concurrent marking thread will see the _has_aborted flag in its > ConcurrentMarkThread::run() method, abort the execution and then call > register_concurrent_cycle_end(). > > Currently this works since the code inside > register_concurrent_cycle_end() is guarded by > _concurrent_cycle_started which it then resets. 
> So, the double calls will not necessarily result in too much extra work
> being done. But one of the things that register_concurrent_cycle_end()
> does is to call report_gc_end() on the concurrent GC tracer. That
> prevents further use of it for this GC. This means that inside the
> ConcurrentMarkThread::run() method we can not rely on the tracer.
>
> Removing the call to register_concurrent_cycle_end() in
> ConcurrentMark::abort() and relying on the call in
> ConcurrentMarkThread::run() seems to be a reasonable approach.

The double call was deliberately put there to make sure that we end the tracing of the concurrent GC before starting to trace the Full GC. Why do you need to change this? I guess it has to do with your other GCId changes?

thanks,
StefanK

> Thanks,
> Bengt

From jesper.wilhelmsson at oracle.com Wed Jun 11 13:19:49 2014
From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson)
Date: Wed, 11 Jun 2014 15:19:49 +0200
Subject: RFR: 8041946 - CMM Testing: 8u40 an allocated humongous object at the end of the heap should not prevents shrinking the heap
In-Reply-To: <5395C541.1000207@oracle.com>
References: <5395C541.1000207@oracle.com>
Message-ID: <53985775.2090609@oracle.com>

Hi Andrey,

As it is used, the constant MINIMAL_HEAP_SIZE does not define the minimal heap size but the minimal young gen size. Would you consider calling it MINIMAL_YOUNG_SIZE instead?

Besides that it looks ok.
/Jesper

Andrey Zakharov skrev 9/6/14 16:31:
> Hi, everyone!
> Please, review this test for new feature in G1 - sorted free list which make
> possible shrinking of the defragmented heap.
> To proper testing of free list sorting we need to defragment memory with small
> young and humongous objects.
> This is test scenario:
> - make enough space for new objects to prevent it going old.
> - allocate bunch of small objects, and a bit of humongous several times
> (ssssHssssHssssHssssHssssHssssHssssHssssH)
> - free almost all of allocated stuff. Check that heap shrinks after GC.
> (-----------H)
>
> Webrev: http://cr.openjdk.java.net/~fzhinkin/azakharov/8041946/webrev.00/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8041946
>
> I have tested it along all major platforms and it works fine. There is lag on
> Solaris MXBeans about memory usage, so I need sleep by 1s.
> It will be very nicely if somebody advice me about method which "flush" memory
> usage info to remove this sleep.
> Thanks.

From bengt.rutisson at oracle.com Wed Jun 11 14:22:37 2014
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Wed, 11 Jun 2014 16:22:37 +0200
Subject: RFR (S): JDK-8046518: G1: Double calls to register_concurrent_cycle_end()
In-Reply-To: <53983933.2020204@oracle.com>
References: <5398305D.7070800@oracle.com> <53983933.2020204@oracle.com>
Message-ID: <5398662D.5000600@oracle.com>

Hi Stefan,

Thanks for looking at this!

On 6/11/14 1:10 PM, Stefan Karlsson wrote:
>
> On 2014-06-11 12:33, Bengt Rutisson wrote:
>>
>> Hi all,
>>
>> Can I have a review for this change?
>>
>> http://cr.openjdk.java.net/~brutisso/8046518/webrev.00/
>>
>> https://bugs.openjdk.java.net/browse/JDK-8046518
>>
>> Background:
>> When we abort a concurrent cycle due to a Full GC in G1 we call
>> ConcurrentMark::abort(). That will set _has_aborted flag and then
>> call register_concurrent_cycle_end().
>>
>> The concurrent marking thread will see the _has_aborted flag in its
>> ConcurrentMarkThread::run() method, abort the execution and then
>> call register_concurrent_cycle_end().
>>
>> Currently this works since the code inside
>> register_concurrent_cycle_end() is guarded by
>> _concurrent_cycle_started which it then resets. So, the double calls
>> will not necessarily result in too much extra work being done. But
>> one of the things that register_concurrent_cycle_end() does is to
>> call report_gc_end() on the concurrent GC tracer. That prevents
>> further use of it for this GC. This means that inside the
>> ConcurrentMarkThread::run() method we can not rely on the tracer.
>>
>> Removing the call to register_concurrent_cycle_end() in
>> ConcurrentMark::abort() and relying on the call in
>> ConcurrentMarkThread::run() seems to be a reasonable approach.
>
> The double call was deliberately put there to make sure that we end
> the tracing of the concurrent GC before starting to trace the Full GC.

I figured there was a reason. I just couldn't remember. We would get overlapping GC events without this extra call. Thanks for pointing that out!

> Why do you need to change this? I guess it has to do with your other
> GCId changes?

Right. It is for the GCId change. The problem is that calling register_concurrent_cycle_end() will reset the GCId to be -1. When we get to the logging, which is done in ConcurrentMarkThread::run(), I want to add the GCId to this log entry:

if (cm()->has_aborted()) {
  if (G1Log::fine()) {
    gclog_or_tty->gclog_stamp(g1h->gc_tracer_cm()->gc_id());
    gclog_or_tty->print_cr("[GC concurrent-mark-abort]");
  }
}

But with the current code the GCId is always -1 here.

I guess one workaround I can do is to in abort() store the last aborted GC id and use that for logging. It just seems a bit fragile that we reset the concurrent gc tracer while we still have the concurrent mark running.

Bengt

> thanks,
> StefanK
>
>> Thanks,
>> Bengt

From andrey.x.zakharov at oracle.com Wed Jun 11 15:39:16 2014
From: andrey.x.zakharov at oracle.com (Andrey Zakharov)
Date: Wed, 11 Jun 2014 19:39:16 +0400
Subject: RFR: 8041946 - CMM Testing: 8u40 an allocated humongous object at the end of the heap should not prevents shrinking the heap
In-Reply-To: <53985775.2090609@oracle.com>
References: <5395C541.1000207@oracle.com> <53985775.2090609@oracle.com>
Message-ID: <53987824.8010305@oracle.com>

Hi, Jesper. Thanks for pointing that out.
Here is the updated webrev:

http://cr.openjdk.java.net/~fzhinkin/azakharov/8041946/webrev.01/

Changed the const name from MINIMAL_HEAP_SIZE to MINIMAL_YOUNG_SIZE.
Tested locally only, as these are very minor changes.

Thanks.

On 11.06.2014 17:19, Jesper Wilhelmsson wrote:
> Hi Andrey,
>
> As it is used, the constant MINIMAL_HEAP_SIZE does not define the
> minimal heap size but the minimal young gen size. Would you consider
> calling it MINIMAL_YOUNG_SIZE instead?
>
> Besides that it looks ok.
> /Jesper
>
> Andrey Zakharov skrev 9/6/14 16:31:
>> Hi, everyone!
>> Please, review this test for new feature in G1 - sorted free list
>> which make
>> possible shrinking of the defragmented heap.
>> To proper testing of free list sorting we need to defragment memory
>> with small
>> young and humongous objects.
>> This is test scenario:
>> - make enough space for new objects to prevent it going old.
>> - allocate bunch of small objects, and a bit of humongous several >> times >> (ssssHssssHssssHssssHssssHssssHssssHssssH) >> - free almost all of allocated stuff. Check that heap shrinks after >> GC. >> (-----------H) >> >> Webrev: >> http://cr.openjdk.java.net/~fzhinkin/azakharov/8041946/webrev.00/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8041946 >> >> I have tested it along all major platforms and it works fine. There >> is lag on >> Solaris MXBeans about memory usage, so I need sleep by 1s. >> It will be very nicely if somebody advice me about method which >> "flush" memory >> usage info to remove this sleep. >> Thanks. >> >> >> >> >> From jesper.wilhelmsson at oracle.com Wed Jun 11 18:49:51 2014 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 11 Jun 2014 20:49:51 +0200 Subject: RFR: 8041946 - CMM Testing: 8u40 an allocated humongous object at the end of the heap should not prevents shrinking the heap In-Reply-To: <53987824.8010305@oracle.com> References: <5395C541.1000207@oracle.com> <53985775.2090609@oracle.com> <53987824.8010305@oracle.com> Message-ID: <5398A4CF.9040408@oracle.com> Looks good! /Jesper Andrey Zakharov skrev 11/6/14 17:39: > Hi, Jesper. Thanks for point. > Here is updated webrev: > > http://cr.openjdk.java.net/~fzhinkin/azakharov/8041946/webrev.01/ > > Changed const name from MINIMAL_HEAP_SIZE to MINIMAL_YOUNG_SIZE > Tested locally as very minor changes > > Thanks. > > > > On 11.06.2014 17:19, Jesper Wilhelmsson wrote: >> Hi Andrey, >> >> As it is used, the constant MINIMAL_HEAP_SIZE does not define the minimal heap >> size but the minimal young gen size. Would you consider calling it >> MINIMAL_YOUNG_SIZE instead? >> >> Besides that it looks ok. >> /Jesper >> >> Andrey Zakharov skrev 9/6/14 16:31: >>> Hi, everyone! >>> Please, review this test for new feature in G1 - sorted free list which make >>> possible shrinking of the defragmented heap. >>> To proper testing of free list sorting we need to defragment memory with small >>> young and humongous objects. >>> This is test scenario: >>> - make enough space for new objects to prevent it going old. >>> - allocate bunch of small objects, and a bit of humongous several times >>> (ssssHssssHssssHssssHssssHssssHssssHssssH) >>> - free almost all of allocated stuff. Check that heap shrinks after GC. >>> (-----------H) >>> >>> Webrev: http://cr.openjdk.java.net/~fzhinkin/azakharov/8041946/webrev.00/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8041946 >>> >>> I have tested it along all major platforms and it works fine. There is lag on >>> Solaris MXBeans about memory usage, so I need sleep by 1s. >>> It will be very nicely if somebody advice me about method which "flush" memory >>> usage info to remove this sleep. >>> Thanks. >>> >>> >>> >>> >>> > From bengt.rutisson at oracle.com Thu Jun 12 08:35:37 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 12 Jun 2014 10:35:37 +0200 Subject: RFR (S): JDK-8046518: G1: Double calls to register_concurrent_cycle_end() In-Reply-To: <5398662D.5000600@oracle.com> References: <5398305D.7070800@oracle.com> <53983933.2020204@oracle.com> <5398662D.5000600@oracle.com> Message-ID: <53996659.9000703@oracle.com> Hi all, I'm withdrawing this review request. I closed the bug as will not fix. Bengt On 2014-06-11 16:22, Bengt Rutisson wrote: > > Hi Stefan, > > Thanks for looking at this! > > On 6/11/14 1:10 PM, Stefan Karlsson wrote: >> >> On 2014-06-11 12:33, Bengt Rutisson wrote: >>> >>> Hi all, >>> >>> Can I have a review for this change? 
>>> >>> http://cr.openjdk.java.net/~brutisso/8046518/webrev.00/ >>> >>> https://bugs.openjdk.java.net/browse/JDK-8046518 >>> >>> Background: >>> When we abort a concurrent cycle due to a Full GC in G1 we call >>> ConcurrentMark::abort(). That will set _has_aborted flag and then >>> call register_concurrent_cycle_end(). >>> >>> The concurrent marking thread will see the _has_aborted flag in its >>> ConcurrentMarkThread::run() method, abort the execution and then >>> call register_concurrent_cycle_end(). >>> >>> Currently this works since the code inside >>> register_concurrent_cycle_end() is guarded by >>> _concurrent_cycle_started which it then resets. So, the double calls >>> will not necessarily result in too much extra work being done. But >>> one of the things that register_concurrent_cycle_end() does is to >>> call report_gc_end() on the concurrent GC tracer. That prevents >>> further use of it for this GC. This means that inside the >>> ConcurrentMarkThread::run() method we can not rely on the tracer. >>> >>> Removing the call to register_concurrent_cycle_end() in >>> ConcurrentMark::abort() and relying on the call in >>> ConcurrentMarkThread::run() seems to be a reasonable approach. >> >> The double call was deliberately put there to make sure that we end >> the tracing of the concurrent GC before starting to trace teh Full GC. > > I figured there was a reason. I just couldn't remember. We would get > overlapping GC events without this extra call. Thanks for pointing > that out! > >> Why do you need to change this? I guess it has to do with your other >> GCId changes? > > Right. It is for the GCId change. The problem is that calling > register_concurrent_cycle_end() will reset the GCId to be -1. When we > get to the logging, which is done in ConcurrentMarkThread::run(), I > want to add the GCId to this log entry: > > if (cm()->has_aborted()) { > if (G1Log::fine()) { > gclog_or_tty->gclog_stamp(g1h->gc_tracer_cm()->gc_id()); > gclog_or_tty->print_cr("[GC concurrent-mark-abort]"); > } > } > > But with the current code the GCId is always -1 here. > > I guess one workaround I can do is to in abort() store the last > aborted GC id and use that for logging. It just seems a bit fragile > that we reset the concurrent gc tracer while we still have the > concurrent mark running. > > Bengt > > >> >> thanks, >> StefanK >> >>> >>> Thanks, >>> Bengt >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.karlsson at oracle.com Thu Jun 12 08:47:35 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 12 Jun 2014 10:47:35 +0200 Subject: RFR: 8046670: Make CMS metadata aware closures applicable for other collectors Message-ID: <53996927.1000102@oracle.com> Hi all, Please, review this patch to make the metadata-tracing oop closures used by CMS available to other collectors. This patch is needed by the G1 Class Unloading work. http://cr.openjdk.java.net/~stefank/8046670/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8046670 thanks, StefanK From per.liden at oracle.com Thu Jun 12 10:09:46 2014 From: per.liden at oracle.com (Per Liden) Date: Thu, 12 Jun 2014 12:09:46 +0200 Subject: RFR(s): 8044796: G1: Enabled G1CollectedHeap::stop() Message-ID: <53997C6A.2010209@oracle.com> Hi, Here's another (hopefully last) attempt at fixing issue with stopping G1's concurrent threads at VM shutdown. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8044796
Webrev: http://cr.openjdk.java.net/~pliden/8044796/webrev.0/

The previous attempt tried to abort any ongoing concurrent mark to speed up the shutdown phase. This turned out to be a bad idea as it opened up another race, which could result in threads getting stuck again. So, this time I just wait for concurrent mark to complete before terminating. We've talked internally here about some alternatives to force an abort, but it seems all alternatives complicate the code way too much and introduce new states which are hard to verify and it just isn't worth it.

What worries me a bit is that the problems potentially introduced by a change like this are very hard to detect as they tend to be race conditions and show up only now and then. The previous fix had gone through a fair bit of testing without showing any problems. This new fix has gone through 5 iterations of GC nightlies (Aurora adhoc submissions), 3 iterations of gc-test-suite and passed all JTReg G1 tests.

About the fix. Since I no longer try to abort concurrent work the stop() function became just a call to stop_conc_gc_threads(). Since stop_conc_gc_threads() isn't used anywhere else I simply moved its contents to stop() and removed stop_conc_gc_threads().

Thanks!
/Per

From stefan.karlsson at oracle.com Thu Jun 12 10:40:42 2014
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 12 Jun 2014 12:40:42 +0200
Subject: RFR: 8046670: Make CMS metadata aware closures applicable for other collectors
In-Reply-To: <53996927.1000102@oracle.com>
References: <53996927.1000102@oracle.com>
Message-ID: <539983AA.50604@oracle.com>

On 2014-06-12 10:47, Stefan Karlsson wrote:
> Hi all,
>
> Please, review this patch to make the metadata-tracing oop closures
> used by CMS available to other collectors. This patch is needed by the
> G1 Class Unloading work.
>
> http://cr.openjdk.java.net/~stefank/8046670/webrev.00/

New patch:
http://cr.openjdk.java.net/~stefank/8046670/webrev.01/

The old patch didn't include the new iterator.inline.hpp file. I've added the file and made sure that we include it where needed. I've verified that this builds without precompiled header.

I've also verified that we unload classes when running Kitchensink with CMS.

thanks,
StefanK

> https://bugs.openjdk.java.net/browse/JDK-8046670
>
> thanks,
> StefanK

From bengt.rutisson at oracle.com Thu Jun 12 11:24:57 2014
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Thu, 12 Jun 2014 13:24:57 +0200
Subject: RFR(s): 8044796: G1: Enabled G1CollectedHeap::stop()
In-Reply-To: <53997C6A.2010209@oracle.com>
References: <53997C6A.2010209@oracle.com>
Message-ID: <53998E09.6020107@oracle.com>

Hi Per,

Thanks for doing such thorough testing!

As far as I can tell this looks good.

Bengt

On 2014-06-12 12:09, Per Liden wrote:
> Hi,
>
> Here's another (hopefully last) attempt at fixing issue with stopping
> G1's concurrent threads at VM shutdown.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8044796
> Webrev: http://cr.openjdk.java.net/~pliden/8044796/webrev.0/
>
> The previous attempt tried to abort any ongoing concurrent mark to
> speed up the shutdown phase. This turned out to be a bad idea as it
> opened up another race, which could result in threads getting stuck
> again. So, this time I just wait for concurrent mark to complete
> before terminating.
We've talked internally here about some > alternatives to force an abort, but it seems all alternatives > complicates the code way too much and introduces new states which is > hard to verify and it just isn't worth it. > > What worries me a bit is that the problems potentially introduced by a > change like this are very hard to detect as they tend to be race > conditions and show up only now and then. The previous fix had gone > through a fair bit of testing without showing any problems. This new > fix has gone thought 5 iterations of GC nightlies (Aurora adhoc > submissions), 3 iterations of gc-test-suite and passed all JTReg G1 > tests. > > About the fix. Since I no longer try to abort concurrent work the > stop() function became just a call to stop_conc_gc_threads(). Since > stop_conc_gc_threads() isn't used anywhere else I simply moved its > contents to stop() and removed stop_conc_gc_threads(). > > Thanks! > /Per From per.liden at oracle.com Thu Jun 12 11:34:30 2014 From: per.liden at oracle.com (Per Liden) Date: Thu, 12 Jun 2014 13:34:30 +0200 Subject: RFR(s): 8044796: G1: Enabled G1CollectedHeap::stop() In-Reply-To: <53998E09.6020107@oracle.com> References: <53997C6A.2010209@oracle.com> <53998E09.6020107@oracle.com> Message-ID: <53999046.6060809@oracle.com> Thanks for reviewing Bengt! /Per On 06/12/2014 01:24 PM, Bengt Rutisson wrote: > > Hi Per, > > Thanks for doing such thorough testing! > > As far as I can tell this looks good. > > Bengt > > > On 2014-06-12 12:09, Per Liden wrote: >> Hi, >> >> Here's another (hopefully last) attempt at fixing issue with stopping >> G1's concurrent threads at VM shutdown. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8044796 >> Webrev: http://cr.openjdk.java.net/~pliden/8044796/webrev.0/ >> >> The previous attempt tried to abort any ongoing concurrent mark to >> speed up the shutdown phase. This turned out to be a bad idea as it >> opened up another race, which could result in threads getting stuck >> again. So, this time I just wait for concurrent mark to complete >> before terminating. We've talked internally here about some >> alternatives to force an abort, but it seems all alternatives >> complicates the code way too much and introduces new states which is >> hard to verify and it just isn't worth it. >> >> What worries me a bit is that the problems potentially introduced by a >> change like this are very hard to detect as they tend to be race >> conditions and show up only now and then. The previous fix had gone >> through a fair bit of testing without showing any problems. This new >> fix has gone thought 5 iterations of GC nightlies (Aurora adhoc >> submissions), 3 iterations of gc-test-suite and passed all JTReg G1 >> tests. >> >> About the fix. Since I no longer try to abort concurrent work the >> stop() function became just a call to stop_conc_gc_threads(). Since >> stop_conc_gc_threads() isn't used anywhere else I simply moved its >> contents to stop() and removed stop_conc_gc_threads(). >> >> Thanks! 
>> /Per

From graham at vast.com Fri Jun 13 04:16:48 2014
From: graham at vast.com (graham sanderson)
Date: Thu, 12 Jun 2014 23:16:48 -0500
Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled
Message-ID: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com>

Hi, I hope this is the right list for this question:

I was investigating abortable preclean timeouts in our app (and associated long remark pause) so had a look at the old jdk6 code I had on my box, wondered about recording eden chunks during certain eden slow allocation paths (I wasn't sure if TLAB allocation is just a CAS bump), and saw what looked perfect in the latest code, so was excited to install 1.7.0_60-b19.

I wanted to ask what you consider the stability of these two options to be (I'm pretty sure at least the first one is new in this release).

I have just installed locally on my mac, and am aware of http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809 which I could reproduce, however I wasn't able to reproduce it without -XX:-UseCMSCompactAtFullCollection (is this your understanding too?)

We are running our application with 8 gig young generation (6.4g eden), on boxes with 32 cores... so parallelism is good for short pauses.

we already have

-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled

we have seen a few long(ish) initial marks, so

-XX:+CMSParallelInitialMarkEnabled sounds good

as for

-XX:+CMSEdenChunksRecordAlways

my question is: what constitutes a slow path such that an eden chunk is potentially recorded? TLAB allocation, or more horrific things; basically (and I'll test our app with -XX:+CMSPrintEdenSurvivorChunks) is it likely that I'll actually get fewer samples using -XX:+CMSEdenChunksRecordAlways in a highly multithreaded app than I would with sampling, or put another way, what sort of app allocation patterns if any might avoid the slow path altogether and might leave me with just one chunk?

Thanks,

Graham

P.S. less relevant I think, but our old generation is 16g
P.P.S. I suspect the abortable preclean timeouts mostly happen after a burst of very high allocation rate followed by an almost complete lull... this is one of the patterns that can happen in our application

From erik.helin at oracle.com Fri Jun 13 08:53:49 2014
From: erik.helin at oracle.com (Erik Helin)
Date: Fri, 13 Jun 2014 10:53:49 +0200
Subject: RFR: 8046670: Make CMS metadata aware closures applicable for other collectors
In-Reply-To: <539983AA.50604@oracle.com>
References: <53996927.1000102@oracle.com> <539983AA.50604@oracle.com>
Message-ID: <2138756.9fqg2igiCv@ehelin-desktop>

Hi Stefan,

looks good, reviewed.

Thanks,
Erik

On Thursday 12 June 2014 12.40.42 Stefan Karlsson wrote:
> On 2014-06-12 10:47, Stefan Karlsson wrote:
> > Hi all,
> >
> > Please, review this patch to make the metadata-tracing oop closures
> > used by CMS available to other collectors. This patch is needed by the
> > G1 Class Unloading work.
> >
> > http://cr.openjdk.java.net/~stefank/8046670/webrev.00/
>
> New patch:
> http://cr.openjdk.java.net/~stefank/8046670/webrev.01/
>
> The old patch didn't include the new iterator.inline.hpp file. I've
> added the file and made sure that we include it where needed. I've
> verified that this builds without precompiled header.
> > I've also verified that we unload classes when running Kitchensink with CMS. > > thanks, > StefanK > > > https://bugs.openjdk.java.net/browse/JDK-8046670 > > > > thanks, > > StefanK From stefan.johansson at oracle.com Fri Jun 13 10:34:50 2014 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Fri, 13 Jun 2014 12:34:50 +0200 Subject: RFR(s): 8044796: G1: Enabled G1CollectedHeap::stop() In-Reply-To: <53997C6A.2010209@oracle.com> References: <53997C6A.2010209@oracle.com> Message-ID: <539AD3CA.1070403@oracle.com> Hi Per, The change looks good. Hopefully there are no more rare corner cases to trip over and if there are I think it's good to get the change in to find them. StefanJ On 2014-06-12 12:09, Per Liden wrote: > Hi, > > Here's another (hopefully last) attempt at fixing issue with stopping > G1's concurrent threads at VM shutdown. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8044796 > Webrev: http://cr.openjdk.java.net/~pliden/8044796/webrev.0/ > > The previous attempt tried to abort any ongoing concurrent mark to > speed up the shutdown phase. This turned out to be a bad idea as it > opened up another race, which could result in threads getting stuck > again. So, this time I just wait for concurrent mark to complete > before terminating. We've talked internally here about some > alternatives to force an abort, but it seems all alternatives > complicates the code way too much and introduces new states which is > hard to verify and it just isn't worth it. > > What worries me a bit is that the problems potentially introduced by a > change like this are very hard to detect as they tend to be race > conditions and show up only now and then. The previous fix had gone > through a fair bit of testing without showing any problems. This new > fix has gone thought 5 iterations of GC nightlies (Aurora adhoc > submissions), 3 iterations of gc-test-suite and passed all JTReg G1 > tests. > > About the fix. Since I no longer try to abort concurrent work the > stop() function became just a call to stop_conc_gc_threads(). Since > stop_conc_gc_threads() isn't used anywhere else I simply moved its > contents to stop() and removed stop_conc_gc_threads(). > > Thanks! > /Per From per.liden at oracle.com Fri Jun 13 11:32:39 2014 From: per.liden at oracle.com (Per Liden) Date: Fri, 13 Jun 2014 13:32:39 +0200 Subject: RFR(s): 8044796: G1: Enabled G1CollectedHeap::stop() In-Reply-To: <539AD3CA.1070403@oracle.com> References: <53997C6A.2010209@oracle.com> <539AD3CA.1070403@oracle.com> Message-ID: <539AE157.2000809@oracle.com> Thanks Stefan! /Per On 06/13/2014 12:34 PM, Stefan Johansson wrote: > Hi Per, > > The change looks good. Hopefully there are no more rare corner cases to > trip over and if there are I think it's good to get the change in to > find them. > > StefanJ > > On 2014-06-12 12:09, Per Liden wrote: >> Hi, >> >> Here's another (hopefully last) attempt at fixing issue with stopping >> G1's concurrent threads at VM shutdown. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8044796 >> Webrev: http://cr.openjdk.java.net/~pliden/8044796/webrev.0/ >> >> The previous attempt tried to abort any ongoing concurrent mark to >> speed up the shutdown phase. This turned out to be a bad idea as it >> opened up another race, which could result in threads getting stuck >> again. So, this time I just wait for concurrent mark to complete >> before terminating. 
>> We've talked internally here about some alternatives to force an abort,
>> but it seems all alternatives complicate the code way too much and
>> introduce new states which are hard to verify and it just isn't worth it.
>>
>> What worries me a bit is that the problems potentially introduced by a
>> change like this are very hard to detect as they tend to be race
>> conditions and show up only now and then. The previous fix had gone
>> through a fair bit of testing without showing any problems. This new
>> fix has gone through 5 iterations of GC nightlies (Aurora adhoc
>> submissions), 3 iterations of gc-test-suite and passed all JTReg G1
>> tests.
>>
>> About the fix. Since I no longer try to abort concurrent work the
>> stop() function became just a call to stop_conc_gc_threads(). Since
>> stop_conc_gc_threads() isn't used anywhere else I simply moved its
>> contents to stop() and removed stop_conc_gc_threads().
>>
>> Thanks!
>> /Per

From graham at vast.com Fri Jun 13 15:48:42 2014
From: graham at vast.com (graham sanderson)
Date: Fri, 13 Jun 2014 10:48:42 -0500
Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled
Message-ID: <22EB562F-F475-4946-911B-0475E02A8837@vast.com>

Apologies, wrong mailing list - resent over in hotspot-gc-use (hopefully this email attaches itself to the right thread)

From jon.masamitsu at oracle.com Mon Jun 16 18:27:43 2014
From: jon.masamitsu at oracle.com (Jon Masamitsu)
Date: Mon, 16 Jun 2014 11:27:43 -0700
Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled
In-Reply-To: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com>
References: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com>
Message-ID: <539F371F.10008@oracle.com>

On 06/12/2014 09:16 PM, graham sanderson wrote:
> Hi, I hope this is the right list for this question:
>
> I was investigating abortable preclean timeouts in our app (and
> associated long remark pause) so had a look at the old jdk6 code I had
> on my box, wondered about recording eden chunks during certain eden
> slow allocation paths (I wasn't sure if TLAB allocation is just a CAS
> bump), and saw what looked perfect in the latest code, so was excited
> to install 1.7.0_60-b19.
>
> I wanted to ask what you consider the stability of these two options
> to be (I'm pretty sure at least the first one is new in this release).
>
> I have just installed locally on my mac, and am aware of
> http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809 which I
> could reproduce, however I wasn't able to reproduce it without
> -XX:-UseCMSCompactAtFullCollection (is this your understanding too?)

Yes.

> We are running our application with 8 gig young generation (6.4g
> eden), on boxes with 32 cores... so parallelism is good for short pauses.
>
> we already have
>
> -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled
>
> we have seen a few long(ish) initial marks, so
>
> -XX:+CMSParallelInitialMarkEnabled sounds good
>
> as for
>
> -XX:+CMSEdenChunksRecordAlways
>
> my question is: what constitutes a slow path such that an eden chunk is
> potentially recorded? TLAB allocation, or more horrific things;
> basically (and I'll test our app with -XX:+CMSPrintEdenSurvivorChunks)
> is it likely that I'll actually get fewer samples using
> -XX:+CMSEdenChunksRecordAlways in a highly multithreaded app than I
> would with sampling, or put another way,
what sort of app allocation > patterns if any might avoid the slow path altogether and might leave > me with just one chunk? Fast path allocation is done from TLAB's. If you have to get a new TLAB, the call to get the new TLAB comes from compiled code but the call is into the JVM and that is the slow path where the sampling is done. Jon > > Thanks, > > Graham > > P.S. less relevant I think, but our old generation is 16g > P.P.S. I suspect the abortable preclean timeouts mostly happen after a > burst of very high allocation rate followed by an almost complete > lull? this is one of the patterns that can happen in our application -------------- next part -------------- An HTML attachment was scrubbed... URL: From graham at vast.com Mon Jun 16 18:54:14 2014 From: graham at vast.com (graham sanderson) Date: Mon, 16 Jun 2014 13:54:14 -0500 Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled In-Reply-To: <539F371F.10008@oracle.com> References: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com> <539F371F.10008@oracle.com> Message-ID: <2E44F2B4-D697-495C-A88F-816F6B33D913@vast.com> Thanks Jon; that?s exactly what i was hoping On Jun 16, 2014, at 1:27 PM, Jon Masamitsu wrote: > > On 06/12/2014 09:16 PM, graham sanderson wrote: >> Hi, I hope this is the right list for this question: >> >> I was investigating abortable preclean timeouts in our app (and associated long remark pause) so had a look at the old jdk6 code I had on my box, wondered about recording eden chunks during certain eden slow allocation paths (I wasn?t sure if TLAB allocation is just a CAS bump), and saw what looked perfect in the latest code, so was excited to install 1.7.0_60-b19 >> >> I wanted to ask what you consider the stability of these two options to be (I?m pretty sure at least the first one is new in this release) >> >> I have just installed locally on my mac, and am aware of http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809 which I could reproduce, however I wasn?t able to reproduce it without -XX:-UseCMSCompactAtFullCollection (is this >> your understanding too?) > > Yes. > >> >> We are running our application with 8 gig young generation (6.4g eden), on boxes with 32 cores? so parallelism is good for short pauses >> >> we already have >> >> -XX:+UseParNewGC >> -XX:+UseConcMarkSweepGC >> -XX:+CMSParallelRemarkEnabled >> >> we have seen a few long(isn) initial marks, so >> >> -XX:+CMSParallelInitialMarkEnabled sounds good >> >> as for >> >> -XX:+CMSEdenChunksRecordAlways >> >> my question is: what constitutes a slow path such an eden chunk is potentially recorded? TLAB allocation, or more horrific things; basically (and I?ll test our app with -XX:+CMSPrintEdenSurvivorChunks) is it likely that I?ll actually get less samples using -XX:+CMSEdenChunksRecordAlways in a highly multithread app than I would with sampling, or put another way? what sort of app allocation patterns if any might avoid the slow path altogether and might leave me with just one chunk? > > Fast path allocation is done from TLAB's. If you have to get > a new TLAB, the call to get the new TLAB comes from compiled > code but the call is into the JVM and that is the slow path where > the sampling is done. > > Jon > >> >> Thanks, >> >> Graham >> >> P.S. less relevant I think, but our old generation is 16g >> P.P.S. I suspect the abortable preclean timeouts mostly happen after a burst of very high allocation rate followed by an almost complete lull? 
this is one of the patterns that can happen in our application > From sbergman at redhat.com Tue Jun 17 07:15:16 2014 From: sbergman at redhat.com (Stephan Bergmann) Date: Tue, 17 Jun 2014 09:15:16 +0200 Subject: History of finalizer execution and gc progress? Message-ID: <539FEB04.90704@redhat.com> Hi all, Does anybody recollect historical details of how execution of (potentially long-running) finalizers impacted overall gc progress? From the behavior of a small test program run on OpenJDK 8, it looks like recent JVMs at least offload all finalizer calls to a single dedicated thread, so that a blocking finalizer blocks finalization (and thus reclamation) of other garbage objects with explicit finalizers, but reclamation of other garbage proceeds unhindered.
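A minimal sketch of the kind of small test program described here (class names, sizes and counts are invented for illustration):

    // One object whose finalize() blocks forever, plus a steady stream of
    // finalizable and plain garbage. On a recent JVM the plain garbage is
    // still reclaimed, while finalizable objects queue up behind the
    // blocked finalizer thread.
    public class BlockingFinalizerTest {
        static class Blocking {
            @Override protected void finalize() throws Throwable {
                new java.util.concurrent.CountDownLatch(1).await(); // never returns
            }
        }
        static class WellBehaved {
            @Override protected void finalize() throws Throwable { /* quick */ }
        }
        public static void main(String[] args) throws Exception {
            new Blocking(); // becomes garbage; will block the finalizer thread
            while (true) {
                for (int i = 0; i < 100000; i++) {
                    new WellBehaved();             // piles up behind the blocked finalizer
                    byte[] plain = new byte[1024]; // plain garbage, reclaimed as usual
                }
                System.gc();
                System.out.println("free: " + Runtime.getRuntime().freeMemory());
                Thread.sleep(1000);
            }
        }
    }

But how was the behavior in the past?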
Was it so that in older JVMs (still in use around 2005) execution of a blocking finalizer could block reclamation of /all/ garbage, even of those objects that did not have explicit finalizers? > > (I'm asking because in LibreOffice we have a dedicated thread to which we offload the actual work done by certain objects' finalize methods, introduced around 2005 to work around memory starvation in case one of those finalizers took too long. But I can't remember whether that was because no garbage at all was reclaimed in such a scenario---and we could drop our additional thread again today---, or because it blocked finalization of unrelated objects with explicit finalizers---in which case we would need to keep our additional thread.) > > Stephan From per.liden at oracle.com Tue Jun 17 12:12:08 2014 From: per.liden at oracle.com (Per Liden) Date: Tue, 17 Jun 2014 14:12:08 +0200 Subject: RFR(s): 8046231: G1: Code root location ... from nmethod ... not in strong code roots for region Message-ID: <53A03098.2070207@oracle.com> Could I please have this fix reviewed. Summary: nmethods are only registered with the heap if nmethod::detect_scavenge_root_oops() returns true. However, in case the nmethod only contains oops to humongous objects detect_scavenge_root_oops() will return false and the nmethod will not be registered. This will later cause heap verification to fail. There are several ways in which this can be fixed. One alternative is to adjust the verification to ignore humongous oops (since these objects will never move). Another alternative is to just register the method regardless of what detect_scavenge_root_oops() says. Since we might want to allow humongous objects to move in the future this is the proposed fix. Bug: https://bugs.openjdk.java.net/browse/JDK-8046231 Webrev: http://cr.openjdk.java.net/~pliden/8046231/webrev.0/ Testing: * gc-test-suite * manual ad-hoc testing Thanks! /Per From sbergman at redhat.com Tue Jun 17 12:47:59 2014 From: sbergman at redhat.com (Stephan Bergmann) Date: Tue, 17 Jun 2014 14:47:59 +0200 Subject: History of finalizer execution and gc progress? In-Reply-To: <1E2D3150-AF19-488C-B904-35D025A91F0D@kodewerk.com> References: <539FEB04.90704@redhat.com> <1E2D3150-AF19-488C-B904-35D025A91F0D@kodewerk.com> Message-ID: <53A038FF.5010001@redhat.com> On 06/17/2014 11:00 AM, Kirk Pepperdine wrote: > finalization uses a helper thread to do the actual work so there should be no direct impact. However, finalization needs to complete before a collection can finally reclaim the memory used by the object. Until then the object will need to be processed as an object waiting for finalization. Sure, but that doesn't address my question, whether in ca. 2005 JVMs one "malicious" blocking finalizer invocation could have blocked /all/ garbage reclamation (incl. of objects without explicit finalizers). > If your finalize method kills the helper thread, I think you have to wait for a GC cycle to end for a new finalization sequence to be triggered and that could be disruptive. No killing of any JVM's helper threads involved. > That said, your (albeit brief) description of the problem suggests that you can manage object clean up on your own which implies that you don?t need finalization. No. Let me rephrase: Assume we continuously create and let become unreachable again three kinds of objects. 
A objects don't have explicit finalizers; B objects have explicit finalizers (that are "well behaved" and execute quickly); C objects have explicit finalizers that are "malicious" and can take arbitrarily long to execute. Now, the question is whether it would have been common behavior for a ca. 2005 JVM that a very long-running finalizer execution for a C object would have prevented timely reclamation of A objects (in addition to reclamation of B objects and other C objects). Stephan > On Jun 17, 2014, at 9:15 AM, Stephan Bergmann wrote: >> Does anybody recollect historical details of how execution of (potentially long-running) finalizers impacted overall gc progress? >> >> From the behavior of a small test program run on OpenJDK 8, it looks like recent JVMs at least offload all finalizer calls to a single dedicated thread, so that a blocking finalizer blocks finalization (and thus reclamation) of other garbage objects with explicit finalizers, but reclamation of other garbage proceeds unhindered. >> >> But how was the behavior in the past? Was it so that in older JVMs (still in use around 2005) execution of a blocking finalizer could block reclamation of /all/ garbage, even of those objects that did not have explicit finalizers? >> >> (I'm asking because in LibreOffice we have a dedicated thread to which we offload the actual work done by certain objects' finalize methods, introduced around 2005 to work around memory starvation in case one of those finalizers took too long. But I can't remember whether that was because no garbage at all was reclaimed in such a scenario---and we could drop our additional thread again today---, or because it blocked finalization of unrelated objects with explicit finalizers---in which case we would need to keep our additional thread.) From kirk at kodewerk.com Tue Jun 17 13:05:10 2014 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Tue, 17 Jun 2014 15:05:10 +0200 Subject: History of finalizer execution and gc progress? In-Reply-To: <53A038FF.5010001@redhat.com> References: <539FEB04.90704@redhat.com> <1E2D3150-AF19-488C-B904-35D025A91F0D@kodewerk.com> <53A038FF.5010001@redhat.com> Message-ID: <5A9D4815-07DE-4799-8E13-C295924B0D78@kodewerk.com> Hi Stephan, >> finalization uses a helper thread to do the actual work so there should be no direct impact. However, finalization needs to complete before a collection can finally reclaim the memory used by the object. Until then the object will need to be processed as an object waiting for finalization. > > Sure, but that doesn't address my question, whether in ca. 2005 JVMs one "malicious" blocking finalizer invocation could have blocked /all/ garbage reclamation (incl. of objects without explicit finalizers). To clarify my answer, no, not that I?ve seen in the code or have experienced. > >> If your finalize method kills the helper thread, I think you have to wait for a GC cycle to end for a new finalization sequence to be triggered and that could be disruptive. > > No killing of any JVM's helper threads involved. > >> That said, your (albeit brief) description of the problem suggests that you can manage object clean up on your own which implies that you don?t need finalization. > > No. Let me rephrase: Assume we continuously create and let become unreachable again three kinds of objects. A objects don't have explicit finalizers; B objects have explicit finalizers (that are "well behaved" and execute quickly); C objects have explicit finalizers that are "malicious" and can take arbitrarily long to execute. 
> > Now, the question is whether it would have been common behavior for a ca. 2005 JVM that a very long-running finalizer execution for a C object would have prevented timely reclamation of A objects (in addition to reclamation of B objects and other C objects). I might wrap C in a PhantomReference and deal with them on my own if: 1) I was unable to deal with them as a closable resource (IOWs, I didn't have a handle on the complete lifecycle for whatever reason), or 2) it was disruptive to having B finalized. Regards, Kirk From jon.masamitsu at oracle.com Tue Jun 17 20:29:20 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 17 Jun 2014 13:29:20 -0700 Subject: RFR(s): 8046231: G1: Code root location ... from nmethod ... not in strong code roots for region In-Reply-To: <53A03098.2070207@oracle.com> References: <53A03098.2070207@oracle.com> Message-ID: <53A0A520.8010203@oracle.com> On 6/17/2014 5:12 AM, Per Liden wrote: > Could I please have this fix reviewed. > > Summary: nmethods are only registered with the heap if > nmethod::detect_scavenge_root_oops() returns true. However, in case > the nmethod only contains oops to humongous objects > detect_scavenge_root_oops() will return false and the nmethod will not > be registered. This will later cause heap verification to fail. > > There are several ways in which this can be fixed. One alternative is > to adjust the verification to ignore humongous oops (since these > objects will never move). Another alternative is to just register the > method regardless of what detect_scavenge_root_oops() says. Since we > might want to allow humongous objects to move in the future this is > the proposed fix. Per, Do you have any measurements on how many more nmethods get registered with this approach (registering an nmethod regardless of the return from detect_scavenge_root_oops())? Jon > > Bug: https://bugs.openjdk.java.net/browse/JDK-8046231 > Webrev: http://cr.openjdk.java.net/~pliden/8046231/webrev.0/ > > Testing: > * gc-test-suite > * manual ad-hoc testing > > Thanks! > /Per > From andrey.x.zakharov at oracle.com Wed Jun 18 12:31:14 2014 From: andrey.x.zakharov at oracle.com (Andrey Zakharov) Date: Wed, 18 Jun 2014 16:31:14 +0400 Subject: RFR: 8026847 [TESTBUG] gc/g1/TestSummarizeRSetStats* tests launch 32bit jvm with UseCompressedOops Message-ID: <53A18692.20507@oracle.com> Hi, all. The "UseCompressedOops" option is being used in the gc/g1/TestSummarizeRSetStats* tests, but it isn't needed for those tests. I have also asked Thomas Schatzl about this option and he confirmed it is useless there. So here is a simple patch - just removing it. webrev: http://cr.openjdk.java.net/~fzhinkin/azakharov/8026847/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8026847 I have tested it locally for 32-bit and 64-bit JDKs and also in Aurora (batch 514909.ute.hs_jtreg.accept.full). Please, review it. Thanks. From per.liden at oracle.com Wed Jun 18 12:35:16 2014 From: per.liden at oracle.com (Per Liden) Date: Wed, 18 Jun 2014 14:35:16 +0200 Subject: RFR(s): 8046231: G1: Code root location ... from nmethod ... not in strong code roots for region In-Reply-To: <53A0A520.8010203@oracle.com> References: <53A03098.2070207@oracle.com> <53A0A520.8010203@oracle.com> Message-ID: <53A18784.20705@oracle.com> Jon, On 06/17/2014 10:29 PM, Jon Masamitsu wrote: > > On 6/17/2014 5:12 AM, Per Liden wrote: >> Could I please have this fix reviewed. >> >> Summary: nmethods are only registered with the heap if >> nmethod::detect_scavenge_root_oops() returns true.
However, in case >> the nmethod only contains oops to humongous objects >> detect_scavenge_root_oops() will return false and the nmethod will not >> be registered. This will later cause heap verification to fail. >> >> There are several ways in which this can be fixed. One alternative is >> to adjust the verification to ignore humongous oops (since these >> objects will never move). Another alternative is to just register the >> method regardless of what detect_scavenge_root_oops() says. Since we >> might want to allow humongous objects to move in the future this is >> the proposed fix. > > Per, > > Do you have any measurements on how many more nmethods get registered > with this approach (registering an nmethod regardless return from > detect_scavenge_root_oops()? I don't have any numbers, but I'm fairly confident that it's a small number. The only nmethods that weren't registered before this change were methods in classes loaded by the BootClassLoader, which only had humongous oops in them. All methods loaded by a SystemClassLoader would have been registered anyway. /Per > > Jon > >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8046231 >> Webrev: http://cr.openjdk.java.net/~pliden/8046231/webrev.0/ >> >> Testing: >> * gc-test-suite >> * manual ad-hoc testing >> >> Thanks! >> /Per >> > From jresch at cleversafe.com Wed Jun 18 19:20:36 2014 From: jresch at cleversafe.com (Jason Resch) Date: Wed, 18 Jun 2014 14:20:36 -0500 Subject: Reference Processing in G1 remark phase vs. throughput collector Message-ID: <53A1E684.1020000@cleversafe.com> Hello, We've recently been experimenting with the G1 collector for our application, and we noticed something odd with reference processing times in the G1. It is not clear to us if this is expected or indicative of a bug, but I thought I would mention it to this list to see if there is a reasonable explanation for this result. We are seeing that during the remark phase when non-strong references are processed, it takes around 20 times longer than the throughput collector spends processing the same number of references. As an example, here is some output for references processing times we observed: 2014-05-23T19:58:12.805+0000: 11446.605: [GC remark 11446.618: [GC ref-proc11446.618: [SoftReference, 0 refs, 0.0040400 secs]11446.622: [WeakReference, 11131810 refs, 8.7176900 secs]11455.340: [FinalReference, 2273593 refs, 2.0022000 secs]11457.342: [PhantomReference, 297950 refs, 0.3004680 secs]11457.643: [JNI Weak Reference, 0.0000040 secs], 13.7534950 secs], 13.8035420 secs] We see the G1 spent 8.7 seconds were spent processing 11 million weak references 2014-05-30T05:57:24.002+0000: 32724.998: [Full GC32726.138: [SoftReference, 154 refs, 0.0050380 secs]32726.143: [WeakReference, 7713339 refs, 0.3449380 secs]32726.488: [FinalReference, 1966941 refs, 0.1005860 secs]32726.588: [PhantomReference, 650797 refs, 0.0631680 secs]32726.652: [JNI Weak Reference, 0.0000060 secs] [PSYoungGen: 1012137K->0K(14784384K)] [ParOldGen: 16010001K->5894387K(16384000K)] 17022139K->5894387K(31168384K) [PSPermGen: 39256K->39256K(39552K)], 4.3463290 secs] [Times: user=98.05 sys=0.00, real=4.35 secs] While the throughput collector spent 0.34 seconds processing 7.7 million weak references In summary, the G1 collector processed weak references at a rate of 1.27 million per second, while the throughput collector processed them at 22.36 million references per second. 
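A reproducer sketch for this kind of measurement (counts and heap sizes are placeholders; run it once with -XX:+UseG1GC and once with -XX:+UseParallelGC, adding -XX:+PrintGCDetails -XX:+PrintReferenceGC and a suitably large -Xmx to get per-type reference timings like the log lines quoted above):

    import java.lang.ref.WeakReference;
    import java.util.ArrayList;
    import java.util.List;

    // Each round creates millions of WeakReferences whose referents are
    // already unreachable, so every collection has a large
    // reference-processing phase to time.
    public class WeakRefProcessingBench {
        public static void main(String[] args) {
            int count = 10000000;
            for (int round = 0; round < 20; round++) {
                // keep the Reference objects themselves alive for this round
                List<WeakReference<Object>> refs =
                        new ArrayList<WeakReference<Object>>(count);
                for (int i = 0; i < count; i++) {
                    // referent dies immediately, so the ref gets discovered
                    refs.add(new WeakReference<Object>(new Object()));
                }
                System.gc(); // discovers and clears ~count weak refs this cycle
                System.out.println("round " + round + ": " + refs.size() + " refs");
            }
        }
    }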
Is there a fundamental design reason that explains why the G1 collector should be so much slower in this regard, or might there be ways to improve upon it? Jason -------------- next part -------------- An HTML attachment was scrubbed... URL: From Peter.B.Kessler at Oracle.COM Wed Jun 18 22:35:47 2014 From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler) Date: Wed, 18 Jun 2014 15:35:47 -0700 Subject: History of finalizer execution and gc progress? In-Reply-To: <539FEB04.90704@redhat.com> References: <539FEB04.90704@redhat.com> Message-ID: <53A21443.5060607@Oracle.COM> As far back as I can remember (I never worked on the implementation of the "classic" JVM), the JVM discovers objects that should have their finalize() method called and puts them on a queue to be handled by Java library code. That means the JVM can continue collecting garbage even if that queue isn't being drained. Also as far back as I can remember, the Java library code calls the finalize() method from a thread dedicated to that. Cf. the static block at the end of [1] and [2], to cite only openly-available sources. As to "still in use around 2005", it would help to have a JDK version number. One of the things I used to do to dissuade people from using finalize() methods was to create and drop an object with a finalize() method that blocked, because that *would* block any subsequent calls from the Java library code to finalize() methods. Some people worked around that (or in real life :-) by calling Runtime.runFinalization() (usually by calling System.runFinalization()), which spins up an additional thread to drain the queue of discovered objects with non-trivial finalize() methods. Runtime.runFinalization() has been around since the beginning[3], though of course the specification is a little vague. You might be able to disable the collection of unreachable objects by misusing calls to the JNI functions GetPrimitiveArrayCritical and friends.[4] As long as you have all the infrastructure of your own thread, etc., I would recommend that you switch your uses of finalize() to WeakReferences (or PhantomReferences) and your own threads to drain your own queues. Then you would own all your code and wouldn't have to worry about interactions with other object types. ... peter -------- [1]http://hg.openjdk.java.net/jdk6/jdk6/jdk/file/a68f89bda2cf/src/share/classes/java/lang/ref/Finalizer.java [2]http://hg.openjdk.java.net/jdk9/jdk9/jdk/file/27561aede285/src/share/classes/java/lang/ref/Finalizer.java [3]http://titanium.cs.berkeley.edu/doc/java-langspec-1.0/javalang.doc15.html#6892 [4] http://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/functions.html#GetPrimitiveArrayCritical On 06/17/14 00:15, Stephan Bergmann wrote: > Hi all, > > Does anybody recollect historical details of how execution of (potentially long-running) finalizers impacted overall gc progress? > > From the behavior of a small test program run on OpenJDK 8, it looks like recent JVMs at least offload all finalizer calls to a single dedicated thread, so that a blocking finalizer blocks finalization (and thus reclamation) of other garbage objects with explicit finalizers, but reclamation of other garbage proceeds unhindered. > > But how was the behavior in the past? Was it so that in older JVMs (still in use around 2005) execution of a blocking finalizer could block reclamation of /all/ garbage, even of those objects that did not have explicit finalizers? 
> > (I'm asking because in LibreOffice we have a dedicated thread to which we offload the actual work done by certain objects' finalize methods, introduced around 2005 to work around memory starvation in case one of those finalizers took too long. But I can't remember whether that was because no garbage at all was reclaimed in such a scenario---and we could drop our additional thread again today---, or because it blocked finalization of unrelated objects with explicit finalizers---in which case we would need to keep our additional thread.) > > Stephan From graham at vast.com Thu Jun 19 03:19:48 2014 From: graham at vast.com (graham sanderson) Date: Wed, 18 Jun 2014 22:19:48 -0500 Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled In-Reply-To: <2E44F2B4-D697-495C-A88F-816F6B33D913@vast.com> References: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com> <539F371F.10008@oracle.com> <2E44F2B4-D697-495C-A88F-816F6B33D913@vast.com> Message-ID: The options are working great and as expected (faster initial mark, and no long pauses after abortable preclean timeout). One weird thing though which I?m curious about: I?m showing some data for six JVMs (calling them nodes - they are on separate machines) all with : Linux version 2.6.32-431.3.1.el6.x86_64 (mockbuild at c6b10.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Jan 3 21:39:27 UTC 2014 JDK 1.7.0_60-b19 16 gig old gen 8 gig new (6.4 gig eden) -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled 256 gig RAM 16 cores (sandy bridge) Nodes 4-6 also have -XX:+CMSEdenChunksRecordAlways -XX:+CMSParallelInitialMarkEnabled There are some application level config differences (which limit amount of certain objects kept in memory before flushing to disk) - 1&4 have the same app config, 2&5 have the same app config, 3&6 have the same app config This first dump, shows two days worth of total times application threads were stopped via grepping logs for Total time for which application threads were stopped and summing the values. worst case 4 minutes over the day is not too bad, so this isn?t a big issue 2014-06-17 : 1 154.623 2014-06-17 : 2 90.3006 2014-06-17 : 3 75.3602 2014-06-17 : 4 180.618 2014-06-17 : 5 107.668 2014-06-17 : 6 99.7783 ------- 2014-06-18 : 1 190.741 2014-06-18 : 2 82.8865 2014-06-18 : 3 90.0098 2014-06-18 : 4 239.702 2014-06-18 : 5 149.332 2014-06-18 : 6 138.03 Notably however if you look via JMX/visualGC, the total GC time is actually lower on nodes 4 to 6 than the equivalent nodes 1 to 3. Now I know that biased lock revocation and other things cause safe point, so I figure something other than GC must be the cause? so I just did a count of log lines with Total time for which application threads were stopped and got this: 2014-06-17 : 1 19282 2014-06-17 : 2 6784 2014-06-17 : 3 1275 2014-06-17 : 4 26356 2014-06-17 : 5 14491 2014-06-17 : 6 8402 ------- 2014-06-18 : 1 20943 2014-06-18 : 2 1134 2014-06-18 : 3 1129 2014-06-18 : 4 30289 2014-06-18 : 5 16508 2014-06-18 : 6 11459 I can?t cycle these nodes right now (to try each new parameter individually), but am curious whether you can think of why adding these parameters would have such a large effect on the number of safe point stops - e.g. 1129 vs 11459 for otherwise identically configured nodes with very similar workload. Note the ratio is highest on nodes 2 vs node 5 which spill the least into the old generation (so certainly fewer CMS cycles, and also fewer young gen collections) if that sparks any ideas. Thanks, Graham. 
P.S. It is entirely possible I don?t know exactly what Total time for which application threads were stopped refers to in all cases (I?m assuming it is a safe point stop) On Jun 16, 2014, at 1:54 PM, graham sanderson wrote: > Thanks Jon; that?s exactly what i was hoping > > On Jun 16, 2014, at 1:27 PM, Jon Masamitsu wrote: > >> >> On 06/12/2014 09:16 PM, graham sanderson wrote: >>> Hi, I hope this is the right list for this question: >>> >>> I was investigating abortable preclean timeouts in our app (and associated long remark pause) so had a look at the old jdk6 code I had on my box, wondered about recording eden chunks during certain eden slow allocation paths (I wasn?t sure if TLAB allocation is just a CAS bump), and saw what looked perfect in the latest code, so was excited to install 1.7.0_60-b19 >>> >>> I wanted to ask what you consider the stability of these two options to be (I?m pretty sure at least the first one is new in this release) >>> >>> I have just installed locally on my mac, and am aware of http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809 which I could reproduce, however I wasn?t able to reproduce it without -XX:-UseCMSCompactAtFullCollection (is this >>> your understanding too?) >> >> Yes. >> >>> >>> We are running our application with 8 gig young generation (6.4g eden), on boxes with 32 cores? so parallelism is good for short pauses >>> >>> we already have >>> >>> -XX:+UseParNewGC >>> -XX:+UseConcMarkSweepGC >>> -XX:+CMSParallelRemarkEnabled >>> >>> we have seen a few long(isn) initial marks, so >>> >>> -XX:+CMSParallelInitialMarkEnabled sounds good >>> >>> as for >>> >>> -XX:+CMSEdenChunksRecordAlways >>> >>> my question is: what constitutes a slow path such an eden chunk is potentially recorded? TLAB allocation, or more horrific things; basically (and I?ll test our app with -XX:+CMSPrintEdenSurvivorChunks) is it likely that I?ll actually get less samples using -XX:+CMSEdenChunksRecordAlways in a highly multithread app than I would with sampling, or put another way? what sort of app allocation patterns if any might avoid the slow path altogether and might leave me with just one chunk? >> >> Fast path allocation is done from TLAB's. If you have to get >> a new TLAB, the call to get the new TLAB comes from compiled >> code but the call is into the JVM and that is the slow path where >> the sampling is done. >> >> Jon >> >>> >>> Thanks, >>> >>> Graham >>> >>> P.S. less relevant I think, but our old generation is 16g >>> P.P.S. I suspect the abortable preclean timeouts mostly happen after a burst of very high allocation rate followed by an almost complete lull? this is one of the patterns that can happen in our application >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1574 bytes Desc: not available URL: From stefan.karlsson at oracle.com Thu Jun 19 07:16:38 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 19 Jun 2014 09:16:38 +0200 Subject: RFR: Remove unused _copy_metadata_obj_cl in G1CopyingKeepAliveClosure Message-ID: <53A28E56.5040504@oracle.com> Please, review this small patch to remove the unused G1CopyingKeepAliveClosure::_copy_metadata_obj_cl. 
http://cr.openjdk.java.net/~stefank/8047323/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8047323 thanks, StefanK From mikael.gerdin at oracle.com Thu Jun 19 07:47:53 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 19 Jun 2014 09:47:53 +0200 Subject: RFR: 8046670: Make CMS metadata aware closures applicable for other collectors In-Reply-To: <539983AA.50604@oracle.com> References: <53996927.1000102@oracle.com> <539983AA.50604@oracle.com> Message-ID: <3791251.MY5aGxyT38@mgerdin03> Hi Stefan, On Thursday 12 June 2014 12.40.42 Stefan Karlsson wrote: > On 2014-06-12 10:47, Stefan Karlsson wrote: > > Hi all, > > > > Please, review this patch to make the metadata-tracing oop closures > > used by CMS available to other collectors. This patch is needed by the > > G1 Class Unloading work. > > > > http://cr.openjdk.java.net/~stefank/8046670/webrev.00/ > > New patch: > http://cr.openjdk.java.net/~stefank/8046670/webrev.01/ > The change looks good. /Mikael > The old patch didn't include the new iterator.inline.hpp file. I've > added the file and made sure that we include it where needed. I've > verified that this builds without precompiled header. > > I've also verified that we unload classes when running Kitchensink with CMS. > > thanks, > StefanK > > > https://bugs.openjdk.java.net/browse/JDK-8046670 > > > > thanks, > > StefanK From mikael.gerdin at oracle.com Thu Jun 19 07:48:43 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 19 Jun 2014 09:48:43 +0200 Subject: RFR: Remove unused _copy_metadata_obj_cl in G1CopyingKeepAliveClosure In-Reply-To: <53A28E56.5040504@oracle.com> References: <53A28E56.5040504@oracle.com> Message-ID: <3272310.UF6rd0vlGd@mgerdin03> Stefan, On Thursday 19 June 2014 09.16.38 Stefan Karlsson wrote: > Please, review this small patch to remove the unused > G1CopyingKeepAliveClosure::_copy_metadata_obj_cl. > > http://cr.openjdk.java.net/~stefank/8047323/webrev.00/ Looks good. /Mikael > https://bugs.openjdk.java.net/browse/JDK-8047323 > > thanks, > StefanK From thomas.schatzl at oracle.com Thu Jun 19 08:12:34 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 19 Jun 2014 10:12:34 +0200 Subject: RFR: Remove unused _copy_metadata_obj_cl in G1CopyingKeepAliveClosure In-Reply-To: <53A28E56.5040504@oracle.com> References: <53A28E56.5040504@oracle.com> Message-ID: <1403165554.2621.1.camel@cirrus> Hi, On Thu, 2014-06-19 at 09:16 +0200, Stefan Karlsson wrote: > Please, review this small patch to remove the unused > G1CopyingKeepAliveClosure::_copy_metadata_obj_cl. > > http://cr.openjdk.java.net/~stefank/8047323/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8047323 I think the change should also remove the metadata_obj_cl from the constructor as it is obsolete too. Thanks, Thomas From dmitry.fazunenko at oracle.com Thu Jun 19 08:23:37 2014 From: dmitry.fazunenko at oracle.com (Dmitry Fazunenko) Date: Thu, 19 Jun 2014 12:23:37 +0400 Subject: RFR: 8026847 [TESTBUG] gc/g1/TestSummarizeRSetStats* tests launch 32bit jvm with UseCompressedOops In-Reply-To: <53A18692.20507@oracle.com> References: <53A18692.20507@oracle.com> Message-ID: <53A29E09.3000002@oracle.com> Looks good. Thanks, Dima On 18.06.2014 16:31, Andrey Zakharov wrote: > Hi, all. > "UseCompressedOops" options is being used In > gc/g1/TestSummarizeRSetStats* tests. > But it doesn't needed for those tests. Also I have asked Thomas > Schatzl about this options and he confirmed useless. > So here is simple patch - just removing. 
> > webrev: > http://cr.openjdk.java.net/~fzhinkin/azakharov/8026847/webrev.00/ > bug: > https://bugs.openjdk.java.net/browse/JDK-8026847 > > I have tested it locally for 32 and 64bits JDK and also in Aurora > (batch 514909.ute.hs_jtreg.accept.full). > Please, review it. > Thanks. From stefan.karlsson at oracle.com Thu Jun 19 08:26:02 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 19 Jun 2014 10:26:02 +0200 Subject: RFR: Remove unused _copy_metadata_obj_cl in G1CopyingKeepAliveClosure In-Reply-To: <1403165554.2621.1.camel@cirrus> References: <53A28E56.5040504@oracle.com> <1403165554.2621.1.camel@cirrus> Message-ID: <53A29E9A.5030309@oracle.com> On 2014-06-19 10:12, Thomas Schatzl wrote: > Hi, > > On Thu, 2014-06-19 at 09:16 +0200, Stefan Karlsson wrote: >> Please, review this small patch to remove the unused >> G1CopyingKeepAliveClosure::_copy_metadata_obj_cl. >> >> http://cr.openjdk.java.net/~stefank/8047323/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8047323 > > I think the change should also remove the metadata_obj_cl from the > constructor as it is obsolete too. Updated patch: http://cr.openjdk.java.net/~stefank/8047323/webrev.01 thanks for catching this, StefanK > > Thanks, > Thomas > From thomas.schatzl at oracle.com Thu Jun 19 09:01:40 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 19 Jun 2014 11:01:40 +0200 Subject: RFR: Remove unused _copy_metadata_obj_cl in G1CopyingKeepAliveClosure In-Reply-To: <53A29E9A.5030309@oracle.com> References: <53A28E56.5040504@oracle.com> <1403165554.2621.1.camel@cirrus> <53A29E9A.5030309@oracle.com> Message-ID: <1403168500.2621.3.camel@cirrus> Hi Stefan, On Thu, 2014-06-19 at 10:26 +0200, Stefan Karlsson wrote: > On 2014-06-19 10:12, Thomas Schatzl wrote: > > Hi, > > > > On Thu, 2014-06-19 at 09:16 +0200, Stefan Karlsson wrote: > >> Please, review this small patch to remove the unused > >> G1CopyingKeepAliveClosure::_copy_metadata_obj_cl. > >> > >> http://cr.openjdk.java.net/~stefank/8047323/webrev.00/ > >> https://bugs.openjdk.java.net/browse/JDK-8047323 > > > > I think the change should also remove the metadata_obj_cl from the > > constructor as it is obsolete too. > > Updated patch: > http://cr.openjdk.java.net/~stefank/8047323/webrev.01 Looks good. Thanks, Thomas From stefan.karlsson at oracle.com Thu Jun 19 09:20:38 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 19 Jun 2014 11:20:38 +0200 Subject: RFR: Remove unused _copy_metadata_obj_cl in G1CopyingKeepAliveClosure In-Reply-To: <1403168500.2621.3.camel@cirrus> References: <53A28E56.5040504@oracle.com> <1403165554.2621.1.camel@cirrus> <53A29E9A.5030309@oracle.com> <1403168500.2621.3.camel@cirrus> Message-ID: <53A2AB66.3080701@oracle.com> On 2014-06-19 11:01, Thomas Schatzl wrote: > Hi Stefan, > > On Thu, 2014-06-19 at 10:26 +0200, Stefan Karlsson wrote: >> On 2014-06-19 10:12, Thomas Schatzl wrote: >>> Hi, >>> >>> On Thu, 2014-06-19 at 09:16 +0200, Stefan Karlsson wrote: >>>> Please, review this small patch to remove the unused >>>> G1CopyingKeepAliveClosure::_copy_metadata_obj_cl. >>>> >>>> http://cr.openjdk.java.net/~stefank/8047323/webrev.00/ >>>> https://bugs.openjdk.java.net/browse/JDK-8047323 >>> I think the change should also remove the metadata_obj_cl from the >>> constructor as it is obsolete too. >> Updated patch: >> http://cr.openjdk.java.net/~stefank/8047323/webrev.01 > Looks good. Thanks. 
StefanK > > Thanks, > Thomas > > From stefan.karlsson at oracle.com Thu Jun 19 12:45:13 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 19 Jun 2014 14:45:13 +0200 Subject: RFR: 8047326: Add a version of CompiledIC_at that doesn't create a new RelocIterator Message-ID: <53A2DB59.9050605@oracle.com> Hi all, I have a patch that we have been using in the G1 Class Unloading project to lower the remark times. This changes Compiler code, so I would like to get feedback from the Compiler team. http://cr.openjdk.java.net/~stefank/8047362/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8047362 The patch builds upon the patch in: http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-June/014358.html Summary from the bug report: --- Creation of RelocIterators show up high in profiles of the remark phase, in the G1 Class Unloading project. There's a pattern in the nmethod/codecache code to create a RelocIterator and then materialize a CompiledIC: RelocIterator iter(this, low_boundary); while(iter.next()) { if (iter.type() == relocInfo::virtual_call_type) { CompiledIC *ic = CompiledIC_at(iter.reloc()); CompiledIC_at is implemented as: new CompiledIC(call_site->code(), nativeCall_at(call_site->addr())); And one of the first thing CompiledIC::CompiledIC(const nmethod* nm, NativeCall* call) does is to create a new RelocIterator: ... address ic_call = call->instruction_address(); ... RelocIterator iter(nm, ic_call, ic_call+1); bool ret = iter.next(); assert(ret == true, "relocInfo must exist at this address"); assert(iter.addr() == ic_call, "must find ic_call"); I would like to propose that we pass down the RelocIterator that we already have, instead of creating a new. --- I've previously received feedback that this seems like reasonable thing to do, but that the parameter to the new CompileIC_at should take a const RelocIterator* instead of RelocIterator*. I couldn't do that without changing a significant amount of Compiler code, so I have left it out for now. Any opinions on how to handle that? To give an idea of the performance difference, I temporarily added the following code: void CodeCache::iterate_through_CIs(int style) { int count; FOR_ALL_ALIVE_NMETHODS(nm) { RelocIterator iter(nm); while(iter.next()) { if (iter.type() == relocInfo::virtual_call_type || iter.type() == relocInfo::opt_virtual_call_type) { if (style > 0) { CompiledIC *ic = style == 1 ? CompiledIC_at(&iter) : CompiledIC_at(iter.reloc()); if (ic->ic_destination() == (address)0xdeadb000) { gclog_or_tty->print_cr("ShouldNotReachHere"); } } } } } } and then measured how long time it took to execute iterate_through_CIs(style) 1000 times with style == {0, 1, 2}. The results are: iterate_through_CIs(0): 1.210833 s // No CompiledICs created iterate_through_CIs(1): 1.976557 s // New style iterate_through_CIs(2): 9.924209 s // Old style Testing: A similar version has been used and thoroughly been tested together with the other G1 Class Unloading changes. This exact version has so far only been tested with Kitchensink and SpecJVM2008 compiler.compiler. What test lists would be appropriate to test this with? thanks, StefanK From andreas.sjoberg at oracle.com Thu Jun 19 13:27:23 2014 From: andreas.sjoberg at oracle.com (=?ISO-8859-1?Q?Andreas_Sj=F6berg?=) Date: Thu, 19 Jun 2014 15:27:23 +0200 Subject: RFR(S): JDK-8047330: Remove unrolled card loops in G1 SparsePRTEntry Message-ID: <53A2E53B.3050508@oracle.com> Hi all, can I please have reviews for this patch that removes the unrolled for-loops in sparsePRT.cpp. 
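As an aside on why the hand-unrolled loops can go: a JIT will typically unroll a simple counted loop on its own, so the straightforward form below usually performs as well as the hand-unrolled one while being easier to keep correct (a generic Java illustration, not the sparsePRT.cpp code itself):

    // Generic illustration of the trade-off; not HotSpot code.
    class CopyLoops {
        static void copySimple(int[] from, int[] to, int n) {
            for (int i = 0; i < n; i++) {
                to[i] = from[i]; // the JIT can unroll/vectorize this by itself
            }
        }

        static void copyUnrolled(int[] from, int[] to, int n) {
            int i = 0;
            for (; i + 4 <= n; i += 4) { // manual unroll by 4
                to[i]     = from[i];
                to[i + 1] = from[i + 1];
                to[i + 2] = from[i + 2];
                to[i + 3] = from[i + 3];
            }
            for (; i < n; i++) { // remainder
                to[i] = from[i];
            }
        }
    }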
I ran some performance benchmarks and could not see any benefits in keeping the unrolled for loops. SPECjbb2013 shows a 3.48% increase on Linux x64 actually. Webrev: http://cr.openjdk.java.net/~jwilhelm/8047330/webrev/ Testing: jprt, specjbb2005, specjvm2008, specjbb2013 Thanks, Andreas From stefan.karlsson at oracle.com Thu Jun 19 15:36:44 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 19 Jun 2014 17:36:44 +0200 Subject: RFR: 8047326: Add a version of CompiledIC_at that doesn't create a new RelocIterator In-Reply-To: <53A2DB59.9050605@oracle.com> References: <53A2DB59.9050605@oracle.com> Message-ID: <53A3038C.9020004@oracle.com> This was meant for the hotspot-dev list. BCC:ing hotspot-gc-dev. On 2014-06-19 14:45, Stefan Karlsson wrote: > Hi all, > > I have a patch that we have been using in the G1 Class Unloading > project to lower the remark times. This changes Compiler code, so I > would like to get feedback from the Compiler team. > > http://cr.openjdk.java.net/~stefank/8047362/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8047362 > > The patch builds upon the patch in: > http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-June/014358.html > > > Summary from the bug report: > --- > Creation of RelocIterators show up high in profiles of the remark > phase, in the G1 Class Unloading project. > > There's a pattern in the nmethod/codecache code to create a > RelocIterator and then materialize a CompiledIC: > > RelocIterator iter(this, low_boundary); > while(iter.next()) { > if (iter.type() == relocInfo::virtual_call_type) { > CompiledIC *ic = CompiledIC_at(iter.reloc()); > > CompiledIC_at is implemented as: > new CompiledIC(call_site->code(), nativeCall_at(call_site->addr())); > > And one of the first thing CompiledIC::CompiledIC(const nmethod* nm, > NativeCall* call) does is to create a new RelocIterator: > ... > address ic_call = call->instruction_address(); > ... > RelocIterator iter(nm, ic_call, ic_call+1); > bool ret = iter.next(); > assert(ret == true, "relocInfo must exist at this address"); > assert(iter.addr() == ic_call, "must find ic_call"); > > I would like to propose that we pass down the RelocIterator that we > already have, instead of creating a new. > --- > > > I've previously received feedback that this seems like reasonable > thing to do, but that the parameter to the new CompileIC_at should > take a const RelocIterator* instead of RelocIterator*. I couldn't do > that without changing a significant amount of Compiler code, so I have > left it out for now. Any opinions on how to handle that? > > > To give an idea of the performance difference, I temporarily added the > following code: > void CodeCache::iterate_through_CIs(int style) { > int count; > FOR_ALL_ALIVE_NMETHODS(nm) { > RelocIterator iter(nm); > while(iter.next()) { > if (iter.type() == relocInfo::virtual_call_type || > iter.type() == relocInfo::opt_virtual_call_type) { > if (style > 0) { > CompiledIC *ic = style == 1 ? CompiledIC_at(&iter) : > CompiledIC_at(iter.reloc()); > if (ic->ic_destination() == (address)0xdeadb000) { > gclog_or_tty->print_cr("ShouldNotReachHere"); > } > } > } > } > } > } > > and then measured how long time it took to execute > iterate_through_CIs(style) 1000 times with style == {0, 1, 2}. 
> > The results are: > iterate_through_CIs(0): 1.210833 s // No CompiledICs created > iterate_through_CIs(1): 1.976557 s // New style > iterate_through_CIs(2): 9.924209 s // Old style > > > Testing: > A similar version has been used and thoroughly been tested together > with the other G1 Class Unloading changes. This exact version has so > far only been tested with Kitchensink and SpecJVM2008 > compiler.compiler. What test lists would be appropriate to test this > with? > > > thanks, > StefanK > From jon.masamitsu at oracle.com Thu Jun 19 17:23:17 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 19 Jun 2014 10:23:17 -0700 Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled In-Reply-To: References: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com> <539F371F.10008@oracle.com> <2E44F2B4-D697-495C-A88F-816F6B33D913@vast.com> Message-ID: <53A31C85.3040600@oracle.com> Graham, I don't have any guesses about what is causing the difference in the number of safepoints. As you note the messages you are counting are not specifically GC pauses. The differences you are seeing are huge (1129 vs 11459, which, by the way, I don't see those numbers in the tables; did you mean 31129 vs 611459?). If those numbers are really GC related I would expect some dramatic effect on the application behavior. Sometimes better GC means applications run faster so maybe more work is getting done in nodes 4-6. The scale of the difference is surprising though. Jon On 6/18/2014 8:19 PM, graham sanderson wrote: > The options are working great and as expected (faster initial mark, > and no long pauses after abortable preclean timeout). One weird thing > though which I?m curious about: I?m showing some data for six JVMs > (calling them nodes - they are on separate machines) > > all with : > > Linux version 2.6.32-431.3.1.el6.x86_64 > (mockbuild at c6b10.bsys.dev.centos.org > ) (gcc version 4.4.7 > 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Jan 3 21:39:27 UTC 2014 > JDK 1.7.0_60-b19 > 16 gig old gen > 8 gig new (6.4 gig eden) > -XX:+UseParNewGC > -XX:+UseConcMarkSweepGC > -XX:+CMSParallelRemarkEnabled > 256 gig RAM > 16 cores (sandy bridge) > > Nodes 4-6 also have > > -XX:+CMSEdenChunksRecordAlways > -XX:+CMSParallelInitialMarkEnabled > > There are some application level config differences (which limit > amount of certain objects kept in memory before flushing to disk) - > 1&4 have the same app config, 2&5 have the same app config, 3&6 have > the same app config > > This first dump, shows two days worth of total times application > threads were stopped via grepping logs for Total time for which > application threads were stopped and summing the values. worst case 4 > minutes over the day is not too bad, so this isn?t a big issue > > 2014-06-17 : 1 154.623 > 2014-06-17 : 2 90.3006 > 2014-06-17 : 3 75.3602 > 2014-06-17 : 4 180.618 > 2014-06-17 : 5 107.668 > 2014-06-17 : 6 99.7783 > ------- > 2014-06-18 : 1 190.741 > 2014-06-18 : 2 82.8865 > 2014-06-18 : 3 90.0098 > 2014-06-18 : 4 239.702 > 2014-06-18 : 5 149.332 > 2014-06-18 : 6 138.03 > > Notably however if you look via JMX/visualGC, the total GC time is > actually lower on nodes 4 to 6 than the equivalent nodes 1 to 3. Now I > know that biased lock revocation and other things cause safe point, so > I figure something other than GC must be the cause? 
so I just did a > count of log lines with Total time for which application threads were > stopped and got this: > > 2014-06-17 : 1 19282 > 2014-06-17 : 2 6784 > 2014-06-17 : 3 1275 > 2014-06-17 : 4 26356 > 2014-06-17 : 5 14491 > 2014-06-17 : 6 8402 > ------- > 2014-06-18 : 1 20943 > 2014-06-18 : 2 1134 > 2014-06-18 : 3 1129 > 2014-06-18 : 4 30289 > 2014-06-18 : 5 16508 > 2014-06-18 : 6 11459 > > I can?t cycle these nodes right now (to try each new parameter > individually), but am curious whether you can think of why adding > these parameters would have such a large effect on the number of safe > point stops - e.g. 1129 vs 11459 for otherwise identically configured > nodes with very similar workload. Note the ratio is highest on nodes 2 > vs node 5 which spill the least into the old generation (so certainly > fewer CMS cycles, and also fewer young gen collections) if that sparks > any ideas. > > Thanks, > > Graham. > > P.S. It is entirely possible I don?t know exactly what Total time for > which application threads were stopped refers to in all cases (I?m > assuming it is a safe point stop) > > On Jun 16, 2014, at 1:54 PM, graham sanderson > wrote: > >> Thanks Jon; that?s exactly what i was hoping >> >> On Jun 16, 2014, at 1:27 PM, Jon Masamitsu > > wrote: >> >>> >>> On 06/12/2014 09:16 PM, graham sanderson wrote: >>>> Hi, I hope this is the right list for this question: >>>> >>>> I was investigating abortable preclean timeouts in our app (and >>>> associated long remark pause) so had a look at the old jdk6 code I >>>> had on my box, wondered about recording eden chunks during certain >>>> eden slow allocation paths (I wasn?t sure if TLAB allocation is >>>> just a CAS bump), and saw what looked perfect in the latest code, >>>> so was excited to install 1.7.0_60-b19 >>>> >>>> I wanted to ask what you consider the stability of these two >>>> options to be (I?m pretty sure at least the first one is new in >>>> this release) >>>> >>>> I have just installed locally on my mac, and am aware of >>>> http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809 which I >>>> could reproduce, however I wasn?t able to reproduce it without >>>> -XX:-UseCMSCompactAtFullCollection (is this your understanding too?) >>> >>> Yes. >>> >>>> >>>> We are running our application with 8 gig young generation (6.4g >>>> eden), on boxes with 32 cores? so parallelism is good for short pauses >>>> >>>> we already have >>>> >>>> -XX:+UseParNewGC >>>> -XX:+UseConcMarkSweepGC >>>> -XX:+CMSParallelRemarkEnabled >>>> >>>> we have seen a few long(isn) initial marks, so >>>> >>>> -XX:+CMSParallelInitialMarkEnabled sounds good >>>> >>>> as for >>>> >>>> -XX:+CMSEdenChunksRecordAlways >>>> >>>> my question is: what constitutes a slow path such an eden chunk is >>>> potentially recorded? TLAB allocation, or more horrific things; >>>> basically (and I?ll test our app >>>> with -XX:+CMSPrintEdenSurvivorChunks) is it likely that I?ll >>>> actually get less samples using -XX:+CMSEdenChunksRecordAlways in a >>>> highly multithread app than I would with sampling, or put another >>>> way? what sort of app allocation patterns if any might avoid the >>>> slow path altogether and might leave me with just one chunk? >>> >>> Fast path allocation is done from TLAB's. If you have to get >>> a new TLAB, the call to get the new TLAB comes from compiled >>> code but the call is into the JVM and that is the slow path where >>> the sampling is done. >>> >>> Jon >>> >>>> >>>> Thanks, >>>> >>>> Graham >>>> >>>> P.S. 
less relevant I think, but our old generation is 16g >>>> P.P.S. I suspect the abortable preclean timeouts mostly happen >>>> after a burst of very high allocation rate followed by an almost >>>> complete lull? this is one of the patterns that can happen in our >>>> application >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.schatzl at oracle.com Thu Jun 19 17:31:05 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 19 Jun 2014 19:31:05 +0200 Subject: RFR: 8026847 [TESTBUG] gc/g1/TestSummarizeRSetStats* tests launch 32bit jvm with UseCompressedOops In-Reply-To: <53A18692.20507@oracle.com> References: <53A18692.20507@oracle.com> Message-ID: <1403199065.2621.4.camel@cirrus> Hi, On Wed, 2014-06-18 at 16:31 +0400, Andrey Zakharov wrote: > Hi, all. > "UseCompressedOops" options is being used In > gc/g1/TestSummarizeRSetStats* tests. > But it doesn't needed for those tests. Also I have asked Thomas Schatzl > about this options and he confirmed useless. > So here is simple patch - just removing. > > webrev: > http://cr.openjdk.java.net/~fzhinkin/azakharov/8026847/webrev.00/ > bug: > https://bugs.openjdk.java.net/browse/JDK-8026847 Looks good. Thomas From graham at vast.com Thu Jun 19 19:22:08 2014 From: graham at vast.com (graham sanderson) Date: Thu, 19 Jun 2014 14:22:08 -0500 Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled In-Reply-To: <53A31C85.3040600@oracle.com> References: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com> <539F371F.10008@oracle.com> <2E44F2B4-D697-495C-A88F-816F6B33D913@vast.com> <53A31C85.3040600@oracle.com> Message-ID: <80970B7A-B6AE-4125-AA99-4D846F3EF2DB@vast.com> Ok, thanks? The 3 before the 1129 and the 6 before the 11459 are the node numbers I?ll dig around in the source for any way of finding out what the cause of safe points are (I?m not aware of a better -XX: option) ? frankly I?ll probably just report this in the user hotspot-gc-use thread (it isn?t causing any real issues) and see if other people report it too. Thanks, Graham On Jun 19, 2014, at 12:23 PM, Jon Masamitsu wrote: > Graham, > > I don't have any guesses about what is causing the difference in > the number of safepoints. As you note the messages you are > counting are not specifically GC pauses. The differences you > are seeing are huge (1129 vs 11459, which, by the way, I don't > see those numbers in the tables; did you mean 31129 vs 611459?). > If those numbers are really GC related I would expect some > dramatic effect on the application behavior. > > Sometimes better GC means applications run faster so > maybe more work is getting done in nodes 4-6. > The scale of the difference is surprising though. > > Jon > > On 6/18/2014 8:19 PM, graham sanderson wrote: >> The options are working great and as expected (faster initial mark, and no long pauses after abortable preclean timeout). 
One weird thing though which I?m curious about: I?m showing some data for six JVMs (calling them nodes - they are on separate machines) >> >> all with : >> >> Linux version 2.6.32-431.3.1.el6.x86_64 (mockbuild at c6b10.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Jan 3 21:39:27 UTC 2014 >> JDK 1.7.0_60-b19 >> 16 gig old gen >> 8 gig new (6.4 gig eden) >> -XX:+UseParNewGC >> -XX:+UseConcMarkSweepGC >> -XX:+CMSParallelRemarkEnabled >> 256 gig RAM >> 16 cores (sandy bridge) >> >> Nodes 4-6 also have >> >> -XX:+CMSEdenChunksRecordAlways >> -XX:+CMSParallelInitialMarkEnabled >> >> There are some application level config differences (which limit amount of certain objects kept in memory before flushing to disk) - 1&4 have the same app config, 2&5 have the same app config, 3&6 have the same app config >> >> This first dump, shows two days worth of total times application threads were stopped via grepping logs for Total time for which application threads were stopped and summing the values. worst case 4 minutes over the day is not too bad, so this isn?t a big issue >> >> 2014-06-17 : 1 154.623 >> 2014-06-17 : 2 90.3006 >> 2014-06-17 : 3 75.3602 >> 2014-06-17 : 4 180.618 >> 2014-06-17 : 5 107.668 >> 2014-06-17 : 6 99.7783 >> ------- >> 2014-06-18 : 1 190.741 >> 2014-06-18 : 2 82.8865 >> 2014-06-18 : 3 90.0098 >> 2014-06-18 : 4 239.702 >> 2014-06-18 : 5 149.332 >> 2014-06-18 : 6 138.03 >> >> Notably however if you look via JMX/visualGC, the total GC time is actually lower on nodes 4 to 6 than the equivalent nodes 1 to 3. Now I know that biased lock revocation and other things cause safe point, so I figure something other than GC must be the cause? so I just did a count of log lines with Total time for which application threads were stopped and got this: >> >> 2014-06-17 : 1 19282 >> 2014-06-17 : 2 6784 >> 2014-06-17 : 3 1275 >> 2014-06-17 : 4 26356 >> 2014-06-17 : 5 14491 >> 2014-06-17 : 6 8402 >> ------- >> 2014-06-18 : 1 20943 >> 2014-06-18 : 2 1134 >> 2014-06-18 : 3 1129 >> 2014-06-18 : 4 30289 >> 2014-06-18 : 5 16508 >> 2014-06-18 : 6 11459 >> >> I can?t cycle these nodes right now (to try each new parameter individually), but am curious whether you can think of why adding these parameters would have such a large effect on the number of safe point stops - e.g. 1129 vs 11459 for otherwise identically configured nodes with very similar workload. Note the ratio is highest on nodes 2 vs node 5 which spill the least into the old generation (so certainly fewer CMS cycles, and also fewer young gen collections) if that sparks any ideas. >> >> Thanks, >> >> Graham. >> >> P.S. 
It is entirely possible I don't know exactly what Total time for which application threads were stopped refers to in all cases (I'm assuming it is a safe point stop) >> >> On Jun 16, 2014, at 1:54 PM, graham sanderson wrote: >> >>> Thanks Jon; that's exactly what I was hoping >>> >>> On Jun 16, 2014, at 1:27 PM, Jon Masamitsu wrote: >>> >>>> >>>> On 06/12/2014 09:16 PM, graham sanderson wrote: >>>>> Hi, I hope this is the right list for this question: >>>>> >>>>> I was investigating abortable preclean timeouts in our app (and associated long remark pause) so had a look at the old jdk6 code I had on my box, wondered about recording eden chunks during certain eden slow allocation paths (I wasn't sure if TLAB allocation is just a CAS bump), and saw what looked perfect in the latest code, so was excited to install 1.7.0_60-b19 >>>>> >>>>> I wanted to ask what you consider the stability of these two options to be (I'm pretty sure at least the first one is new in this release) >>>>> >>>>> I have just installed locally on my mac, and am aware of http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809 which I could reproduce, however I wasn't able to reproduce it without -XX:-UseCMSCompactAtFullCollection >>>>> (is this your understanding too?) >>>> >>>> Yes. >>>> >>>>> >>>>> We are running our application with 8 gig young generation (6.4g eden), on boxes with 32 cores... so parallelism is good for short pauses >>>>> >>>>> we already have >>>>> >>>>> -XX:+UseParNewGC >>>>> -XX:+UseConcMarkSweepGC >>>>> -XX:+CMSParallelRemarkEnabled >>>>> >>>>> we have seen a few long(ish) initial marks, so >>>>> >>>>> -XX:+CMSParallelInitialMarkEnabled sounds good >>>>> >>>>> as for >>>>> >>>>> -XX:+CMSEdenChunksRecordAlways >>>>> >>>>> my question is: what constitutes a slow path such that an eden chunk is potentially recorded? TLAB allocation, or more horrific things; basically (and I'll test our app with -XX:+CMSPrintEdenSurvivorChunks) is it likely that I'll actually get fewer samples using -XX:+CMSEdenChunksRecordAlways in a highly multithreaded app than I would with sampling, or put another way... what sort of app allocation patterns if any might avoid the slow path altogether and might leave me with just one chunk? >>>> >>>> Fast path allocation is done from TLAB's. If you have to get >>>> a new TLAB, the call to get the new TLAB comes from compiled >>>> code but the call is into the JVM and that is the slow path where >>>> the sampling is done. >>>> >>>> Jon >>>> >>>>> >>>>> Thanks, >>>>> >>>>> Graham >>>>> >>>>> P.S. less relevant I think, but our old generation is 16g >>>>> P.P.S. I suspect the abortable preclean timeouts mostly happen after a burst of very high allocation rate followed by an almost complete lull... this is one of the patterns that can happen in our application >>>> >>> >> >
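[An illustrative aside, not part of the original thread: the grep-and-sum bookkeeping Graham describes above can be reproduced with a small standalone tool. The sketch below assumes the log line format produced by -XX:+PrintGCApplicationStoppedTime, i.e. "Total time for which application threads were stopped: 0.0123450 seconds"; the tool name stopsum is made up.]

// stopsum.cpp - sum and count safepoint stop times from a GC log.
// A minimal sketch; build: g++ -o stopsum stopsum.cpp  run: ./stopsum < gc.log
#include <cstdio>
#include <iostream>
#include <string>

int main() {
  const std::string marker =
      "Total time for which application threads were stopped: ";
  std::string line;
  double total_seconds = 0.0;
  long stops = 0;
  while (std::getline(std::cin, line)) {
    std::string::size_type pos = line.find(marker);
    if (pos == std::string::npos) {
      continue;  // not a stopped-time line
    }
    // The duration in seconds immediately follows the marker text.
    double seconds = 0.0;
    if (std::sscanf(line.c_str() + pos + marker.size(), "%lf", &seconds) == 1) {
      total_seconds += seconds;
      ++stops;  // one safepoint stop per matching line
    }
  }
  std::cout << stops << " stops, " << total_seconds
            << " seconds stopped in total" << std::endl;
  return 0;
}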
From graham at vast.com Fri Jun 20 14:39:14 2014 From: graham at vast.com (graham sanderson) Date: Fri, 20 Jun 2014 09:39:14 -0500 Subject: CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled In-Reply-To: <80970B7A-B6AE-4125-AA99-4D846F3EF2DB@vast.com> References: <43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com> <539F371F.10008@oracle.com> <2E44F2B4-D697-495C-A88F-816F6B33D913@vast.com> <53A31C85.3040600@oracle.com> <80970B7A-B6AE-4125-AA99-4D846F3EF2DB@vast.com> Message-ID: <29C39CCF-DBFD-4554-8072-70BF1FDE098E@vast.com> All is well in the universe (except for some mild stupidity on my part). The anomalous safepoints are caused by VisualVM it would seem, which I had been using on and off to watch the GC visually, and just happened to have left it connected to certain nodes for coincidental lengths of time that seemed to produce a somewhat correlated pattern. I don't have a dev JVM so I can't do -XX:+TraceSafepoint, but whatever it is doing, it is doing it once every 1 or 2 seconds, even if you turn all monitoring off (but remain connected). On Jun 19, 2014, at 2:22 PM, graham sanderson wrote: > Ok, thanks... > > The 3 before the 1129 and the 6 before the 11459 are the node numbers. > > I'll dig around in the source for any way of finding out what the cause of safe points is (I'm not aware of a better -XX: option)... frankly I'll probably just report this in the user hotspot-gc-use thread (it isn't causing any real issues) and see if other people report it too. > > Thanks, > > Graham > > On Jun 19, 2014, at 12:23 PM, Jon Masamitsu wrote: > >> Graham, >> >> I don't have any guesses about what is causing the difference in >> the number of safepoints. As you note the messages you are >> counting are not specifically GC pauses. The differences you >> are seeing are huge (1129 vs 11459, which, by the way, I don't >> see those numbers in the tables; did you mean 31129 vs 611459?). >> If those numbers are really GC related I would expect some >> dramatic effect on the application behavior. >> >> Sometimes better GC means applications run faster so >> maybe more work is getting done in nodes 4-6. >> The scale of the difference is surprising though. >> >> Jon >> >> On 6/18/2014 8:19 PM, graham sanderson wrote: >>> The options are working great and as expected (faster initial mark, and no long pauses after abortable preclean timeout). One weird thing though which I'm curious about: I'm showing some data for six JVMs (calling them nodes - they are on separate machines) >>> >>> all with: >>> >>> Linux version 2.6.32-431.3.1.el6.x86_64 (mockbuild at c6b10.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Jan 3 21:39:27 UTC 2014 >>> JDK 1.7.0_60-b19 >>> 16 gig old gen >>> 8 gig new (6.4 gig eden) >>> -XX:+UseParNewGC >>> -XX:+UseConcMarkSweepGC >>> -XX:+CMSParallelRemarkEnabled >>> 256 gig RAM >>> 16 cores (sandy bridge) >>> >>> Nodes 4-6 also have >>> >>> -XX:+CMSEdenChunksRecordAlways >>> -XX:+CMSParallelInitialMarkEnabled >>> >>> There are some application level config differences (which limit the amount of certain objects kept in memory before flushing to disk) - 1&4 have the same app config, 2&5 have the same app config, 3&6 have the same app config >>> >>> This first dump shows two days worth of total times application threads were stopped via grepping logs for Total time for which application threads were stopped and summing the values.
Worst case 4 minutes over the day is not too bad, so this isn't a big issue >>> >>> 2014-06-17 : 1 154.623 >>> 2014-06-17 : 2 90.3006 >>> 2014-06-17 : 3 75.3602 >>> 2014-06-17 : 4 180.618 >>> 2014-06-17 : 5 107.668 >>> 2014-06-17 : 6 99.7783 >>> ------- >>> 2014-06-18 : 1 190.741 >>> 2014-06-18 : 2 82.8865 >>> 2014-06-18 : 3 90.0098 >>> 2014-06-18 : 4 239.702 >>> 2014-06-18 : 5 149.332 >>> 2014-06-18 : 6 138.03 >>> >>> Notably however if you look via JMX/visualGC, the total GC time is actually lower on nodes 4 to 6 than the equivalent nodes 1 to 3. Now I know that biased lock revocation and other things cause safe points, so I figure something other than GC must be the cause... so I just did a count of log lines with Total time for which application threads were stopped and got this: >>> >>> 2014-06-17 : 1 19282 >>> 2014-06-17 : 2 6784 >>> 2014-06-17 : 3 1275 >>> 2014-06-17 : 4 26356 >>> 2014-06-17 : 5 14491 >>> 2014-06-17 : 6 8402 >>> ------- >>> 2014-06-18 : 1 20943 >>> 2014-06-18 : 2 1134 >>> 2014-06-18 : 3 1129 >>> 2014-06-18 : 4 30289 >>> 2014-06-18 : 5 16508 >>> 2014-06-18 : 6 11459 >>> >>> I can't cycle these nodes right now (to try each new parameter individually), but am curious whether you can think of why adding these parameters would have such a large effect on the number of safe point stops - e.g. 1129 vs 11459 for otherwise identically configured nodes with very similar workload. Note the ratio is highest on nodes 2 vs node 5 which spill the least into the old generation (so certainly fewer CMS cycles, and also fewer young gen collections) if that sparks any ideas. >>> >>> Thanks, >>> >>> Graham. >>> >>> P.S. It is entirely possible I don't know exactly what Total time for which application threads were stopped refers to in all cases (I'm assuming it is a safe point stop) >>> >>> On Jun 16, 2014, at 1:54 PM, graham sanderson wrote: >>> >>>> Thanks Jon; that's exactly what I was hoping >>>> >>>> On Jun 16, 2014, at 1:27 PM, Jon Masamitsu wrote: >>>> >>>>> >>>>> On 06/12/2014 09:16 PM, graham sanderson wrote: >>>>>> Hi, I hope this is the right list for this question: >>>>>> >>>>>> I was investigating abortable preclean timeouts in our app (and associated long remark pause) so had a look at the old jdk6 code I had on my box, wondered about recording eden chunks during certain eden slow allocation paths (I wasn't sure if TLAB allocation is just a CAS bump), and saw what looked perfect in the latest code, so was excited to install 1.7.0_60-b19 >>>>>> >>>>>> I wanted to ask what you consider the stability of these two options to be (I'm pretty sure at least the first one is new in this release) >>>>>> >>>>>> I have just installed locally on my mac, and am aware of http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809 which I could reproduce, however I wasn't able to reproduce it without -XX:-UseCMSCompactAtFullCollection >>>>>> (is this your understanding too?) >>>>> >>>>> Yes. >>>>> >>>>>> >>>>>> We are running our application with 8 gig young generation (6.4g eden), on boxes with 32 cores... so parallelism is good for short pauses >>>>>> >>>>>> we already have >>>>>> >>>>>> -XX:+UseParNewGC >>>>>> -XX:+UseConcMarkSweepGC >>>>>> -XX:+CMSParallelRemarkEnabled >>>>>> >>>>>> we have seen a few long(ish) initial marks, so >>>>>> >>>>>> -XX:+CMSParallelInitialMarkEnabled sounds good >>>>>> >>>>>> as for >>>>>> >>>>>> -XX:+CMSEdenChunksRecordAlways >>>>>> >>>>>> my question is: what constitutes a slow path such that an eden chunk is potentially recorded?
TLAB allocation, or more horrific things; basically (and I'll test our app with -XX:+CMSPrintEdenSurvivorChunks) is it likely that I'll actually get fewer samples using -XX:+CMSEdenChunksRecordAlways in a highly multithreaded app than I would with sampling, or put another way... what sort of app allocation patterns if any might avoid the slow path altogether and might leave me with just one chunk? >>>>> >>>>> Fast path allocation is done from TLAB's. If you have to get >>>>> a new TLAB, the call to get the new TLAB comes from compiled >>>>> code but the call is into the JVM and that is the slow path where >>>>> the sampling is done. >>>>> >>>>> Jon >>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Graham >>>>>> >>>>>> P.S. less relevant I think, but our old generation is 16g >>>>>> P.P.S. I suspect the abortable preclean timeouts mostly happen after a burst of very high allocation rate followed by an almost complete lull... this is one of the patterns that can happen in our application >>>>> >>>> >>> >> > From thomas.schatzl at oracle.com Mon Jun 23 10:05:29 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 23 Jun 2014 12:05:29 +0200 Subject: RFR(s): 8046231: G1: Code root location ... from nmethod ... not in strong code roots for region In-Reply-To: <53A03098.2070207@oracle.com> References: <53A03098.2070207@oracle.com> Message-ID: <1403517929.2753.22.camel@cirrus> Hi Per, On Tue, 2014-06-17 at 14:12 +0200, Per Liden wrote: > Could I please have this fix reviewed. > > Summary: nmethods are only registered with the heap if > nmethod::detect_scavenge_root_oops() returns true. However, in case the > nmethod only contains oops to humongous objects > detect_scavenge_root_oops() will return false and the nmethod will not > be registered.
This will later cause heap verification to fail. >> >> There are several ways in which this can be fixed. One alternative is to >> adjust the verification to ignore humongous oops (since these objects >> will never move). Another alternative is to just register the method >> regardless of what detect_scavenge_root_oops() says. Since we might want >> to allow humongous objects to move in the future this is the proposed fix. > > I agree that this is the better solution. > >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8046231 >> Webrev: http://cr.openjdk.java.net/~pliden/8046231/webrev.0/ >> >> Testing: >> * gc-test-suite >> * manual ad-hoc testing > > looks good. > > Thanks, > Thomas > From erik.helin at oracle.com Mon Jun 23 12:39:11 2014 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 23 Jun 2014 14:39:11 +0200 Subject: RFR(S): JDK-8047330: Remove unrolled card loops in G1 SparsePRTEntry In-Reply-To: <53A2E53B.3050508@oracle.com> References: <53A2E53B.3050508@oracle.com> Message-ID: <1846572.Plu0fjq2zP@ehelin-laptop> On Thursday 19 June 2014 15:27:23 PM Andreas Sjöberg wrote: > Hi all, > > can I please have reviews for this patch that removes the unrolled > for-loops in sparsePRT.cpp. > > I ran some performance benchmarks and could not see any benefits in > keeping the unrolled for loops. SPECjbb2013 shows a 3.48% increase on > Linux x64 actually. > > Webrev: http://cr.openjdk.java.net/~jwilhelm/8047330/webrev/ Looks good, Reviewed! Thanks, Erik > Testing: jprt, specjbb2005, specjvm2008, specjbb2013 > > Thanks, > Andreas From erik.helin at oracle.com Mon Jun 23 13:47:02 2014 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 23 Jun 2014 15:47:02 +0200 Subject: RFR(s): 8046231: G1: Code root location ... from nmethod ... not in strong code roots for region In-Reply-To: <53A03098.2070207@oracle.com> References: <53A03098.2070207@oracle.com> Message-ID: <2481990.7gl4ni0RtK@ehelin-laptop> On Tuesday 17 June 2014 14:12:08 PM Per Liden wrote: > Could I please have this fix reviewed. > > Summary: nmethods are only registered with the heap if > nmethod::detect_scavenge_root_oops() returns true. However, in case the > nmethod only contains oops to humongous objects > detect_scavenge_root_oops() will return false and the nmethod will not > be registered. This will later cause heap verification to fail. > > There are several ways in which this can be fixed. One alternative is to > adjust the verification to ignore humongous oops (since these objects > will never move). Another alternative is to just register the method > regardless of what detect_scavenge_root_oops() says. Since we might want > to allow humongous objects to move in the future this is the proposed fix. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8046231 > Webrev: http://cr.openjdk.java.net/~pliden/8046231/webrev.0/ Looks good, Reviewed. Thanks, Erik > Testing: > * gc-test-suite > * manual ad-hoc testing > > Thanks! > /Per From per.liden at oracle.com Mon Jun 23 14:18:14 2014 From: per.liden at oracle.com (Per Liden) Date: Mon, 23 Jun 2014 16:18:14 +0200 Subject: RFR(s): 8046231: G1: Code root location ... from nmethod ... not in strong code roots for region In-Reply-To: <2481990.7gl4ni0RtK@ehelin-laptop> References: <53A03098.2070207@oracle.com> <2481990.7gl4ni0RtK@ehelin-laptop> Message-ID: <53A83726.90809@oracle.com> Thanks Erik! /Per On 06/23/2014 03:47 PM, Erik Helin wrote: > On Tuesday 17 June 2014 14:12:08 PM Per Liden wrote: >> Could I please have this fix reviewed.
>> >> Summary: nmethods are only registered with the heap if >> nmethod::detect_scavenge_root_oops() returns true. However, in case the >> nmethod only contains oops to humongous objects >> detect_scavenge_root_oops() will return false and the nmethod will not >> be registered. This will later cause heap verification to fail. >> >> There are several ways in which this can be fixed. One alternative is to >> adjust the verification to ignore humongous oops (since these objects >> will never move). Another alternative is to just register the method >> regardless of what detect_scavenge_root_oops() says. Since we might want >> to allow humongous objects to move in the future this is the proposed fix. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8046231 >> Webrev: http://cr.openjdk.java.net/~pliden/8046231/webrev.0/ > > Looks good, Reviewed. > > Thanks, > Erik > >> Testing: >> * gc-test-suite >> * manual ad-hoc testing >> >> Thanks! >> /Per > From mikael.gerdin at oracle.com Mon Jun 23 14:25:40 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 23 Jun 2014 16:25:40 +0200 Subject: RFR: 8047819: G1 HeapRegionDCTOC does not need to inherit ContiguousSpaceDCTOC Message-ID: <87729984.51aIKtkShi@mgerdin03> Hi! As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] G1 needs to stop using the special version of DirtyCardToOopClosure. This also makes the code easier to follow since G1 never actually relies on the functionality from Filtering_DCTOC and ContiguousSpaceDCTOC. This change is part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 which are needed to refactor the HeapRegion class and its superclasses in order to simplify the G1 class unloading change which is coming. Bug: https://bugs.openjdk.java.net/browse/JDK-8047819 Webrev: http://cr.openjdk.java.net/~mgerdin/8047819/webrev/ [1] https://bugs.openjdk.java.net/browse/JDK-8047818 Thanks /Mikael From mikael.gerdin at oracle.com Mon Jun 23 14:25:53 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 23 Jun 2014 16:25:53 +0200 Subject: RFR: 8047820: G1 Block offset table does not need to support generic Space classes Message-ID: <11148293.8zVS3laxSo@mgerdin03> Hi! As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] G1's block offset table needs to be modified to work with Space subclasses which are not subclasses of ContiguousSpace. Just change the code to have knowledge of G1OffsetTableContigSpace. This change is part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 which are needed to refactor the HeapRegion class and its superclasses in order to simplify the G1 class unloading change which is coming. Bug: https://bugs.openjdk.java.net/browse/JDK-8047820 Webrev: http://cr.openjdk.java.net/~mgerdin/8047820/webrev/ [1] https://bugs.openjdk.java.net/browse/JDK-8047818 Thanks /Mikael From mikael.gerdin at oracle.com Mon Jun 23 14:26:00 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 23 Jun 2014 16:26 +0200 Subject: RFR: 8047821: G1 Does not use the save_marks functionality as intended Message-ID: <1878024.dB1lCmA3nF@mgerdin03> Hi! As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] and as a general cleanup we should rename the save_marks and set_saved_marks methods on HeapRegion. They are not used with oops_since_saved_marks_iterate and cause more confusion than anything.
This change is part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 which are needed to refactor the HeapRegion class and its superclasses in order to simplify the G1 class unloading change which is coming. Bug: https://bugs.openjdk.java.net/browse/JDK-8047821 Webrev: http://cr.openjdk.java.net/~mgerdin/8047821/webrev/ [1] https://bugs.openjdk.java.net/browse/JDK-8047818 Thanks /Mikael From mikael.gerdin at oracle.com Mon Jun 23 14:26:03 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 23 Jun 2014 16:26:03 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces Message-ID: <17336712.znP3JIk1Pt@mgerdin03> Hi! When G1 is modified to unload classes without doing full collections the old HeapRegions can contain unparseable objects. This makes ContiguousSpace unsuitable as a base class for HeapRegion since it assumes that all objects below _top are parseable. Modify G1OffsetTableContigSpace to implement allocation with a separate _top and reimplement some Space pure virtuals to make object iteration work as expected. This change is the last part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 which are needed to refactor the HeapRegion class and its superclasses in order to simplify the G1 class unloading change which is coming. This change depends on the 19, 20 and 21 changes. Bug: https://bugs.openjdk.java.net/browse/JDK-8047818 Webrev: http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ Notes: The moving of set_offset_range is due to an introduced circular dependency between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp Thanks /Mikael From mikael.gerdin at oracle.com Mon Jun 23 14:51:54 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 23 Jun 2014 16:51:54 +0200 Subject: RFR(S): JDK-8047330: Remove unrolled card loops in G1 SparsePRTEntry In-Reply-To: <53A2E53B.3050508@oracle.com> References: <53A2E53B.3050508@oracle.com> Message-ID: <2772002.dt1otWjIrg@mgerdin03> Hi Andreas, On Thursday 19 June 2014 15.27.23 Andreas Sjöberg wrote: > Hi all, > > can I please have reviews for this patch that removes the unrolled > for-loops in sparsePRT.cpp. > > I ran some performance benchmarks and could not see any benefits in > keeping the unrolled for loops. SPECjbb2013 shows a 3.48% increase on > Linux x64 actually. > > Webrev: http://cr.openjdk.java.net/~jwilhelm/8047330/webrev/ It looks like you can remove the define as well: 36 #define UNROLL_CARD_LOOPS 1 UnrollFactor should also be useless now, but it seems like it's being used to align up the number of cards. I suggest you leave UnrollFactor for a second cleanup. /Mikael > > Testing: jprt, specjbb2005, specjvm2008, specjbb2013 > > Thanks, > Andreas From markus.gronlund at oracle.com Mon Jun 23 15:26:51 2014 From: markus.gronlund at oracle.com (=?iso-8859-1?B?TWFya3VzIEdy9m5sdW5k?=) Date: Mon, 23 Jun 2014 08:26:51 -0700 (PDT) Subject: FW: RFR(S): 8047812: Ensure ClassLoaderDataGraph::classes_unloading_do only delivers klasses from CLDs with non-reclaimed class loader oops Message-ID: <2bf3b050-e3cf-43c2-ac64-61ebe0320061@default> Sending this to the Hotspot-GC-dev group as well. /Markus From: Markus Grönlund Sent: den 23 juni 2014 17:03 To: hotspot-runtime-dev; serviceability-dev Subject: RFR(S): 8047812: Ensure ClassLoaderDataGraph::classes_unloading_do only delivers klasses from CLDs with non-reclaimed class loader oops Greetings, Kindly asking for reviews for the following change:
Bug: https://bugs.openjdk.java.net/browse/JDK-8047812 Webrev: http://cr.openjdk.java.net/~mgronlun/8047812/webrev01 Description: The "8038212: Method::is_valid_method() check has performance regression impact for stackwalking" - changeset introduced a change in how the ClassLoaderDataGraph::_unloading list of ClassLoaderData's is purged. This change to the purging of the CLD's works the same as before for most GC's, but when using CMS GC, SystemDictionary::do_unloading() is called twice with no explicit purge call in between. On the second call (post-sweep), we can now get stale class loader oops delivered as part of the Klass closure callbacks from the _unloading list. Again, this is because there is no explicit purge call in between these two entries to SystemDictionary::do_unloading() - and being CMS and concurrent, it is very hard to accommodate a timely and proper purge call here. The first do_unloading call comes after CMS concurrent marking, and the second comes from a Full GC triggered while sweeping the CMS heap. This fix ensures that the unloading purge mechanism works correctly also for the CMS collector, in that only CLDs with non-reclaimed class loader oops will deliver klasses from the _unloading list. In addition, this will ensure a single "logical" pass is achieved when iterating the unloading list in-between purges (avoiding the processing of the same data twice). This fix is precipitated by nightly testing failures with CMS after the introduction of "8038212: Method::is_valid_method() check has performance regression impact for stackwalking" - for example "nsk/sysdict/vm/stress/jck12a//sysdictj12a008" which is crashing because of following up stale klass loader oops from the ClassLoaderDataGraph::_unloading list. Thanks Markus From stefan.karlsson at oracle.com Mon Jun 23 15:45:47 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 23 Jun 2014 17:45:47 +0200 Subject: FW: RFR(S): 8047812: Ensure ClassLoaderDataGraph::classes_unloading_do only delivers klasses from CLDs with non-reclaimed class loader oops In-Reply-To: <2bf3b050-e3cf-43c2-ac64-61ebe0320061@default> References: <2bf3b050-e3cf-43c2-ac64-61ebe0320061@default> Message-ID: <53A84BAB.2040709@oracle.com> Markus, You need to include all three mailing lists in the same mail, or else the mail threads will diverge. thanks, StefanK On 2014-06-23 17:26, Markus Grönlund wrote: > > Sending this to the Hotspot-GC-dev group as well. > > /Markus > > *From:* Markus Grönlund > *Sent:* den 23 juni 2014 17:03 > *To:* hotspot-runtime-dev; serviceability-dev > *Subject:* RFR(S): 8047812: Ensure > ClassLoaderDataGraph::classes_unloading_do only delivers klasses from > CLDs with non-reclaimed class loader oops > > Greetings, > > Kindly asking for reviews for the following change: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8047812 > > Webrev: http://cr.openjdk.java.net/~mgronlun/8047812/webrev01 > > > Description: > > The "8038212: Method::is_valid_method() check has performance regression > impact for stackwalking" - changeset introduced a change in how the > ClassLoaderDataGraph::_unloading list of ClassLoaderData's is purged. > > This change to the purging of the CLD's works the same as before for > most GC's, but when using CMS GC, SystemDictionary::do_unloading() is > called twice with no explicit purge call in between.
On the second > call (post-sweep), we can now get stale class loader oops delivered as > part of the Klass closure callbacks from the _unloading list. Again, > this is because there is no explicit purge call in between these two > entries to SystemDictionary::do_unloading() - and being CMS and > concurrent, it is very hard to accommodate a timely and proper purge > call here. > > The first do_unloading call comes after CMS concurrent marking, and > the second comes from a Full GC triggered while sweeping the CMS heap. > > This fix ensures that the unloading purge mechanism works correctly also > for the CMS collector, in that only CLDs with non-reclaimed class > loader oops will deliver klasses from the _unloading list. In > addition, this will ensure a single "logical" pass is achieved when > iterating the unloading list in-between purges (avoiding the > processing of the same data twice). > > This fix is precipitated by nightly testing failures with CMS after > the introduction of 8038212: Method::is_valid_method() check has > performance regression > impact for stackwalking" - for example > "nsk/sysdict/vm/stress/jck12a//sysdictj12a008" which is crashing > because of following up stale klass loader oops from the > ClassLoaderDataGraph::_unloading list. > > Thanks > > Markus > From stefan.karlsson at oracle.com Tue Jun 24 09:05:08 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 24 Jun 2014 11:05:08 +0200 Subject: RFR: 8047819: G1 HeapRegionDCTOC does not need to inherit ContiguousSpaceDCTOC In-Reply-To: <87729984.51aIKtkShi@mgerdin03> References: <87729984.51aIKtkShi@mgerdin03> Message-ID: <53A93F44.6050907@oracle.com> On 2014-06-23 16:25, Mikael Gerdin wrote: > Hi! > > As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] > G1 needs to stop using the special version of DirtyCardToOopClosure. > This also makes the code easier to follow since G1 never actually > relies on the functionality from Filtering_DCTOC and ContiguousSpaceDCTOC. > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 > which are needed to refactor the HeapRegion class and its superclasses > in order to simplify the G1 class unloading change which is coming. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047819 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047819/webrev/ Looks good. StefanK > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > > Thanks > /Mikael From stefan.karlsson at oracle.com Tue Jun 24 09:23:36 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 24 Jun 2014 11:23:36 +0200 Subject: RFR: 8047820: G1 Block offset table does not need to support generic Space classes In-Reply-To: <11148293.8zVS3laxSo@mgerdin03> References: <11148293.8zVS3laxSo@mgerdin03> Message-ID: <53A94398.2080301@oracle.com> On 2014-06-23 16:25, Mikael Gerdin wrote: > Hi! > > As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] > G1's block offset table needs to be modified to work with Space subclasses > which are not subclasses of ContiguousSpace. Just change the code to have > knowledge of G1OffsetTableContigSpace. > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 > which are needed to refactor the HeapRegion class and its superclasses > in order to simplify the G1 class unloading change which is coming.
> > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047820 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/ http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.cpp.udiff.html http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.inline.hpp.udiff.html I talked to Mikael and we decided to do the changes from obj->size() to block_size() in a later change. thanks, StefanK > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > > Thanks > /Mikael From stefan.karlsson at oracle.com Tue Jun 24 10:25:25 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 24 Jun 2014 12:25:25 +0200 Subject: RFR: 8047821: G1 Does not use the save_marks functionality as intended In-Reply-To: <1878024.dB1lCmA3nF@mgerdin03> References: <1878024.dB1lCmA3nF@mgerdin03> Message-ID: <53A95215.9020108@oracle.com> On 2014-06-23 16:26, Mikael Gerdin wrote: > Hi! > > As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] > and as a general cleanup we should rename the save_marks and set_saved_marks > methods on HeapRegion. They are not used with oops_since_saved_marks_iterate > and cause more confusion than anything. > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 > which are needed to refactor the HeapRegion class and its superclasses > in order to simplify the G1 class unloading change which is coming. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047821 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047821/webrev/ Looks good, but it would be nice if you could remove these as well: http://cr.openjdk.java.net/~mgerdin/8047821/webrev/src/share/vm/gc_implementation/g1/heapRegion.hpp.frames.html 583 // Apply "cl->do_oop" to (the addresses of) all reference fields in objects 584 // allocated in the current region before the last call to "save_mark". 585 void oop_before_save_marks_iterate(ExtendedOopClosure* cl); and 205 // Requires that the region "mr" be dense with objects, and begin and end 206 // with an object. 207 void oops_in_mr_iterate(MemRegion mr, ExtendedOopClosure* cl); and 396 void HeapRegion::oops_in_mr_iterate(MemRegion mr, ExtendedOopClosure* cl) { 397 HeapWord* p = mr.start(); 398 HeapWord* e = mr.end(); 399 oop obj; 400 while (p < e) { 401 obj = oop(p); 402 p += obj->oop_iterate(cl); 403 } 404 assert(p == e, "bad memregion: doesn't end on obj boundary"); 405 } thanks, StefanK > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > > Thanks > /Mikael From stefan.karlsson at oracle.com Tue Jun 24 11:06:46 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 24 Jun 2014 13:06:46 +0200 Subject: RFR: 8047821: G1 Does not use the save_marks functionality as intended In-Reply-To: <53A95215.9020108@oracle.com> References: <1878024.dB1lCmA3nF@mgerdin03> <53A95215.9020108@oracle.com> Message-ID: <53A95BC6.60401@oracle.com> On 2014-06-24 12:25, Stefan Karlsson wrote: > > On 2014-06-23 16:26, Mikael Gerdin wrote: >> Hi! >> >> As part of a larger effort to detach G1's HeapRegion from >> ContiguousSpace[1] >> and as a general cleanup we should rename the save_marks and >> set_saved_marks >> methods on HeapRegion. They are not used with >> oops_since_saved_marks_iterate >> and cause more confusion than anything. 
>> >> This change is part of a set of 4 changes: 8047818, 8047819, 8047820, >> 8047821 >> which are needed to refactor the HeapRegion class and its superclasses >> in order to simplify the G1 class unloading change which is coming. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8047821 >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8047821/webrev/ > > Looks good, but it would be nice if you could remove these as well: > > http://cr.openjdk.java.net/~mgerdin/8047821/webrev/src/share/vm/gc_implementation/g1/heapRegion.hpp.frames.html > This should also be removed: 572 // Allows logical separation between objects allocated before and after. 573 void save_marks(); StefanK > > 583 // Apply "cl->do_oop" to (the addresses of) all reference > fields in objects > 584 // allocated in the current region before the last call to > "save_mark". > 585 void oop_before_save_marks_iterate(ExtendedOopClosure* cl); > > and > > 205 // Requires that the region "mr" be dense with objects, and > begin and end > 206 // with an object. > 207 void oops_in_mr_iterate(MemRegion mr, ExtendedOopClosure* cl); > > and > > 396 void HeapRegion::oops_in_mr_iterate(MemRegion mr, > ExtendedOopClosure* cl) { > 397 HeapWord* p = mr.start(); > 398 HeapWord* e = mr.end(); > 399 oop obj; > 400 while (p < e) { > 401 obj = oop(p); > 402 p += obj->oop_iterate(cl); > 403 } > 404 assert(p == e, "bad memregion: doesn't end on obj boundary"); > 405 } > > > thanks, > StefanK > >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8047818 >> >> Thanks >> /Mikael > From mikael.gerdin at oracle.com Tue Jun 24 11:33:01 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 24 Jun 2014 13:33:01 +0200 Subject: RFR: 8047821: G1 Does not use the save_marks functionality as intended In-Reply-To: <53A95BC6.60401@oracle.com> References: <1878024.dB1lCmA3nF@mgerdin03> <53A95215.9020108@oracle.com> <53A95BC6.60401@oracle.com> Message-ID: <4395417.SDKTkeWyTO@mgerdin03> On Tuesday 24 June 2014 13.06.46 Stefan Karlsson wrote: > On 2014-06-24 12:25, Stefan Karlsson wrote: > > On 2014-06-23 16:26, Mikael Gerdin wrote: > >> Hi! > >> > >> As part of a larger effort to detach G1's HeapRegion from > >> ContiguousSpace[1] > >> and as a general cleanup we should rename the save_marks and > >> set_saved_marks > >> methods on HeapRegion. They are not used with > >> oops_since_saved_marks_iterate > >> and cause more confusion than anything. > >> > >> This change is part of a set of 4 changes: 8047818, 8047819, 8047820, > >> 8047821 > >> which are needed to refactor the HeapRegion class and its superclasses > >> in order to simplify the G1 class unloading change which is coming. > >> > >> Bug: > >> https://bugs.openjdk.java.net/browse/JDK-8047821 > >> Webrev: > >> http://cr.openjdk.java.net/~mgerdin/8047821/webrev/ > > > > Looks good, but it would be nice if you could remove these as well: > > > > http://cr.openjdk.java.net/~mgerdin/8047821/webrev/src/share/vm/gc_impleme > > ntation/g1/heapRegion.hpp.frames.html > This should also be removed: > > 572 // Allows logical separation between objects allocated before and > after. 573 void save_marks(); Will do. Thanks /Mikael > > StefanK > > > 583 // Apply "cl->do_oop" to (the addresses of) all reference > > > > fields in objects > > > > 584 // allocated in the current region before the last call to > > > > "save_mark". 
> > > > 585 void oop_before_save_marks_iterate(ExtendedOopClosure* cl); > > > > and > > > > 205 // Requires that the region "mr" be dense with objects, and > > > > begin and end > > > > 206 // with an object. > > 207 void oops_in_mr_iterate(MemRegion mr, ExtendedOopClosure* cl); > > > > and > > > > 396 void HeapRegion::oops_in_mr_iterate(MemRegion mr, > > > > ExtendedOopClosure* cl) { > > > > 397 HeapWord* p = mr.start(); > > 398 HeapWord* e = mr.end(); > > 399 oop obj; > > 400 while (p < e) { > > 401 obj = oop(p); > > 402 p += obj->oop_iterate(cl); > > 403 } > > 404 assert(p == e, "bad memregion: doesn't end on obj boundary"); > > 405 } > > > > thanks, > > StefanK > > > >> [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > >> > >> Thanks > >> /Mikael From mikael.gerdin at oracle.com Tue Jun 24 12:10:47 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 24 Jun 2014 14:10:47 +0200 Subject: RFR: 8047821: G1 Does not use the save_marks functionality as intended In-Reply-To: <1878024.dB1lCmA3nF@mgerdin03> References: <1878024.dB1lCmA3nF@mgerdin03> Message-ID: <2070242.8rk8uRrq1v@mgerdin03> Hi! On Monday 23 June 2014 16.26.00 Mikael Gerdin wrote: > Hi! > > As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] > and as a general cleanup we should rename the save_marks and > set_saved_marks methods on HeapRegion. They are not used with > oops_since_saved_marks_iterate and cause more confusion than anything. > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, > 8047821 which are needed to refactor the HeapRegion class and its > superclasses in order to simplify the G1 class unloading change which is > coming. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047821 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047821/webrev/ Stefan discovered some more dead code in HeapRegion, here are a new set of webrevs: http://cr.openjdk.java.net/~mgerdin/8047821/webrev.0_to_1/ http://cr.openjdk.java.net/~mgerdin/8047821/webrev.1/ /Mikael > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > > Thanks > /Mikael From mikael.gerdin at oracle.com Tue Jun 24 12:11:59 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 24 Jun 2014 14:11:59 +0200 Subject: RFR: 8047820: G1 Block offset table does not need to support generic Space classes In-Reply-To: <53A94398.2080301@oracle.com> References: <11148293.8zVS3laxSo@mgerdin03> <53A94398.2080301@oracle.com> Message-ID: <1942487.fCe74F8AUE@mgerdin03> On Tuesday 24 June 2014 11.23.36 Stefan Karlsson wrote: > On 2014-06-23 16:25, Mikael Gerdin wrote: > > Hi! > > > > As part of a larger effort to detach G1's HeapRegion from > > ContiguousSpace[1] G1's block offset table needs to be modified to work > > with Space subclasses which are not subclasses of ContiguousSpace. Just > > change the code to have knowledge of G1OffsetTableContigSpace. > > > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, > > 8047821 which are needed to refactor the HeapRegion class and its > > superclasses in order to simplify the G1 class unloading change which is > > coming. 
> > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8047820 > > Webrev: > > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/ > > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.cpp.udiff.html > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.inline.hpp.udiff.html > > I talked to Mikael and we decided to do the changes from obj->size() to > block_size() in a later change. And here's the webrev reflecting that change. http://cr.openjdk.java.net/~mgerdin/8047820/webrev.1/ /Mikael > > thanks, > StefanK > > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > > > > Thanks > > /Mikael From thomas.schatzl at oracle.com Tue Jun 24 12:20:30 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 24 Jun 2014 14:20:30 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <17336712.znP3JIk1Pt@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> Message-ID: <1403612430.2662.14.camel@cirrus> Hi, On Mon, 2014-06-23 at 16:26 +0200, Mikael Gerdin wrote: > Hi! > > When G1 is modified to unload classes without doing full collections the old > HeapRegions can contain unparseable objects. This makes ContiguousSpace > unsuitable as a base class for HeapRegion since it assumes that all objects > below _top are parseable. > > Modify G1OffsetTableContigSpace to implement allocation with a separate _top > and reimplement some Space pure virtuals to make object iteration work as > expected. > > This change is the last part of a set of 4 changes: 8047818, 8047819, 8047820, > 8047821 which are needed to refactor the HeapRegion class and its superclasses > in order to simplify the G1 class unloading change which is coming. > This change depends on the 19, 20 and 21 changes. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047818 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > > Notes: > The moving of set_offset_range is due to an introduced circular dependency > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp a few minor nits: - in G1OffsetTableContigSpace::cas_allocate_inner(), the method should access _top directly per coding guidelines - just a note: _top should be declared volatile as it is used in the CAS, although the code is correct. However there is already an issue for that https://bugs.openjdk.java.net/browse/JDK-8033552, so I suggest postponing this. - extra newline after G1OffsetTableContigSpace::allocate_inner() - extra newline after G1BlockOffsetSharedArray::set_offset_array() Thanks, Thomas From thomas.schatzl at oracle.com Tue Jun 24 12:25:13 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 24 Jun 2014 14:25:13 +0200 Subject: RFR: 8047819: G1 HeapRegionDCTOC does not need to inherit ContiguousSpaceDCTOC In-Reply-To: <87729984.51aIKtkShi@mgerdin03> References: <87729984.51aIKtkShi@mgerdin03> Message-ID: <1403612713.2662.15.camel@cirrus> Hi, On Mon, 2014-06-23 at 16:25 +0200, Mikael Gerdin wrote: > Hi! > > As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] > G1 needs to stop using the special version of DirtyCardToOopClosure. > This also makes the code easier to follow since G1 never actually > relies on the functionality from Filtering_DCTOC and ContiguousSpaceDCTOC.
> > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, 8047821 > which are needed to refactor the HeapRegion class and its superclasses > in order to simplify the G1 class unloading change which is coming. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047819 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047819/webrev/ > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 looks good. Thomas From stefan.karlsson at oracle.com Tue Jun 24 13:32:44 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 24 Jun 2014 15:32:44 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <17336712.znP3JIk1Pt@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> Message-ID: <53A97DFC.2040809@oracle.com> On 2014-06-23 16:26, Mikael Gerdin wrote: > Hi! > > When G1 is modified to unload classes without doing full collections the old > HeapRegions can contain unparseable objects. This makes ContiguousSpace > unsuitable as a base class for HeapRegion since it assumes that all objects > below _top are parseable. > > Modify G1OffsetTableContigSpace to implement allocation with a separate _top > and reimplement some Space pure virtuals to make object iteration work as > expected. > > This change is the last part of a set of 4 changes: 8047818, 8047819, 8047820, > 8047821 which are needed to refactor the HeapRegion class and its superclasses > in order to simplify the G1 class unloading change which is coming. > This change depends on the 19, 20 and 21 changes. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047818 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ http://cr.openjdk.java.net/~mgerdin/8047818/webrev/src/share/vm/gc_implementation/g1/heapRegion.hpp.udiff.html + inline HeapWord* cas_allocate_inner(size_t size); + inline HeapWord* allocate_inner(size_t size); Could you move these declarations to a point after the variable declarations? + inline void set_top(HeapWord* value) { _top = value; } No need for the inline keyword here. void G1OffsetTableContigSpace::clear(bool mangle_space) { - ContiguousSpace::clear(mangle_space); + set_top(bottom()); + CompactibleSpace::clear(mangle_space); ContiguousSpace::clear calls void set_saved_mark() { _saved_mark_word = top(); } I think it might be worth being a bit defensive and doing the same from G1OffsetTableContigSpace::clear. +void G1OffsetTableContigSpace::object_iterate(ObjectClosure* blk) { + HeapWord* p = bottom(); + if (!block_is_obj(p)) { + p += block_size(p); + } + while (p < top()) { + blk->do_object(oop(p)); + p += block_size(p); + } Shouldn't blk->do_object(oop(p)) be guarded by: if (!block_is_obj(p)) G1OffsetTableContigSpace:: G1OffsetTableContigSpace(G1BlockOffsetSharedArray* sharedOffsetArray, MemRegion mr) : + _top(bottom()), _offsets(sharedOffsetArray, mr), _par_alloc_lock(Mutex::leaf, "OffsetTableContigSpace par alloc lock", true), _gc_time_stamp(0) { _offsets.set_space(this); // false ==> we'll do the clearing if there's clearing to be done. - ContiguousSpace::initialize(mr, false, SpaceDecorator::Mangle); + CompactibleSpace::initialize(mr, false, SpaceDecorator::Mangle); bottom() is used before _bottom has been initialized to the correct value. As the code stands we set _top to NULL. _bottom is initialized through this call chain: CompactibleSpace::initialize. Space::initialize set_bottom And the SA agent needs to be updated. 
=) thanks, StefanK > > Notes: > The moving of set_offset_range is due to an introduced circular dependency > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > > Thanks > /Mikael > From stefan.karlsson at oracle.com Tue Jun 24 13:39:29 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 24 Jun 2014 15:39:29 +0200 Subject: RFR: 8047820: G1 Block offset table does not need to support generic Space classes In-Reply-To: <1942487.fCe74F8AUE@mgerdin03> References: <11148293.8zVS3laxSo@mgerdin03> <53A94398.2080301@oracle.com> <1942487.fCe74F8AUE@mgerdin03> Message-ID: <53A97F91.40204@oracle.com> On 2014-06-24 14:11, Mikael Gerdin wrote: > On Tuesday 24 June 2014 11.23.36 Stefan Karlsson wrote: >> On 2014-06-23 16:25, Mikael Gerdin wrote: >>> Hi! >>> >>> As part of a larger effort to detach G1's HeapRegion from >>> ContiguousSpace[1] G1's block offset table needs to be modified to work >>> with Space subclasses which are not subclasses of ContiguousSpace. Just >>> change the code to have knowledge of G1OffsetTableContigSpace. >>> >>> This change is part of a set of 4 changes: 8047818, 8047819, 8047820, >>> 8047821 which are needed to refactor the HeapRegion class and its >>> superclasses in order to simplify the G1 class unloading change which is >>> coming. >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8047820 >>> Webrev: >>> http://cr.openjdk.java.net/~mgerdin/8047820/webrev/ >> http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.cpp.udiff.html >> http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.inline.hpp.udiff.html >> >> I talked to Mikael and we decided to do the changes from obj->size() to >> block_size() in a later change. > And here's the webrev reflecting that change. > > http://cr.openjdk.java.net/~mgerdin/8047820/webrev.1/ Looks good. thanks, StefanK > > /Mikael > >> thanks, >> StefanK >> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8047818 >>> >>> Thanks >>> /Mikael From stefan.karlsson at oracle.com Tue Jun 24 13:40:08 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 24 Jun 2014 15:40:08 +0200 Subject: RFR: 8047821: G1 Does not use the save_marks functionality as intended In-Reply-To: <2070242.8rk8uRrq1v@mgerdin03> References: <1878024.dB1lCmA3nF@mgerdin03> <2070242.8rk8uRrq1v@mgerdin03> Message-ID: <53A97FB8.4050809@oracle.com> On 2014-06-24 14:10, Mikael Gerdin wrote: > Hi! > > On Monday 23 June 2014 16.26.00 Mikael Gerdin wrote: >> Hi! >> >> As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] >> and as a general cleanup we should rename the save_marks and >> set_saved_marks methods on HeapRegion. They are not used with >> oops_since_saved_marks_iterate and cause more confusion than anything. >> >> This change is part of a set of 4 changes: 8047818, 8047819, 8047820, >> 8047821 which are needed to refactor the HeapRegion class and its >> superclasses in order to simplify the G1 class unloading change which is >> coming. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8047821 >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8047821/webrev/ > Stefan discovered some more dead code in HeapRegion, here are a new set of > webrevs: > > http://cr.openjdk.java.net/~mgerdin/8047821/webrev.0_to_1/ Looks good.
thanks, StefanK > http://cr.openjdk.java.net/~mgerdin/8047821/webrev.1/ > > /Mikael > >> [1] https://bugs.openjdk.java.net/browse/JDK-8047818 >> >> Thanks >> /Mikael From thomas.schatzl at oracle.com Tue Jun 24 14:06:40 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 24 Jun 2014 16:06:40 +0200 Subject: RFR: 8047820: G1 Block offset table does not need to support generic Space classes In-Reply-To: <1942487.fCe74F8AUE@mgerdin03> References: <11148293.8zVS3laxSo@mgerdin03> <53A94398.2080301@oracle.com> <1942487.fCe74F8AUE@mgerdin03> Message-ID: <1403618800.2662.17.camel@cirrus> Hi all, On Tue, 2014-06-24 at 14:11 +0200, Mikael Gerdin wrote: > On Tuesday 24 June 2014 11.23.36 Stefan Karlsson wrote: > > On 2014-06-23 16:25, Mikael Gerdin wrote: > > > Hi! > > > > > > As part of a larger effort to detach G1's HeapRegion from > > > ContiguousSpace[1] G1's block offset table needs to be modified to work > > > with Space subclasses which are not subclasses of ContiguousSpace. Just > > > change the code to have knowledge of G1OffsetTableContigSpace. > > > > > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, > > > 8047821 which are needed to refactor the HeapRegion class and its > > > superclasses in order to simplify the G1 class unloading change which is > > > coming. > > > > > > Bug: > > > https://bugs.openjdk.java.net/browse/JDK-8047820 > > > Webrev: > > > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/ > > > > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.cpp.udiff.html > > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.inline.hpp.udiff.html > > > > I talked to Mikael and we decided to do the changes from obj->size() to > > block_size() in a later change. > > And here's the webrev reflecting that change. > > http://cr.openjdk.java.net/~mgerdin/8047820/webrev.1/ > Looks good, Thomas From jon.masamitsu at oracle.com Tue Jun 24 14:26:52 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 24 Jun 2014 07:26:52 -0700 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <17336712.znP3JIk1Pt@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> Message-ID: <53A98AAC.6060408@oracle.com> Mikael, Did you consider creating a base class for ContiguousSpace and G1OffsetTableContigSpace that has a _top but does not assume parsability? Could allocate_inner() have been called allocate_impl() as it is in ContiguousSpace? I don't know what the "inner" in the name is telling me. I've just started on the review so more to come. Jon On 06/23/2014 07:26 AM, Mikael Gerdin wrote: > Hi! > > When G1 is modified to unload classes without doing full collections the old > HeapRegions can contain unparseable objects. This makes ContiguousSpace > unsuitable as a base class for HeapRegion since it assumes that all objects > below _top are parseable. > > Modify G1OffsetTableContigSpace to implement allocation with a separate _top > and reimplement some Space pure virtuals to make object iteration work as > expected. > > This change is the last part of a set of 4 changes: 8047818, 8047819, 8047820, > 8047821 which are needed to refactor the HeapRegion class and its superclasses > in order to simplify the G1 class unloading change which is coming. > This change depends on the 19, 20 and 21 changes.
> > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047818 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > > Notes: > The moving of set_offset_range is due to an introduced circular dependency > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > > Thanks > /Mikael > From mikael.gerdin at oracle.com Tue Jun 24 14:43:41 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 24 Jun 2014 16:43:41 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <53A98AAC.6060408@oracle.com> References: <17336712.znP3JIk1Pt@mgerdin03> <53A98AAC.6060408@oracle.com> Message-ID: <2594075.LDRc6zFiFv@mgerdin03> Jon, On Tuesday 24 June 2014 07.26.52 Jon Masamitsu wrote: > Mikael, > > Did you consider creating a base class for ContiguousSpace and > G1OffsetTableContigSpace that has a _top but does not assume > parsability? I did consider it but I thought that the added complexity of having even more levels of inheritance were not worth the benefit of sharing the _top field. Especially since G1 attempts to hack around the semantics around the _top field with respect to concurrent access. See the disjunction I removed from allocate_impl and when G1 calls cas_allocate and allocate. > > Could allocate_inner() have been called allocate_impl() as it is > in ContiguousSpace? I don't know what the "inner" in the name > is telling me. "inner" tries to signal that this function is internal and wrapped by other methods providing the external API. I don't have a particular naming preference here, if the other reviewers are fine with "impl" or don't have a preference I'm fine with changing it. > > I've just started on the review so more to come. Great! Thanks /Mikael > > Jon > > On 06/23/2014 07:26 AM, Mikael Gerdin wrote: > > Hi! > > > > When G1 is modified to unload classes without doing full collections the > > old HeapRegions can contain unparseable objects. This makes > > ContiguousSpace unsuitable as a base class for HeapRegion since it > > assumes that all objects below _top are parseable. > > > > Modify G1OffsetTableContigSpace to implement allocation with a separate > > _top and reimplement some Space pure virtuals to make object iteration > > work as expected. > > > > This change is the last part of a set of 4 changes: 8047818, 8047819, > > 8047820, 8047821 which are needed to refactor the HeapRegion class and > > its superclasses in order to simplify the G1 class unloading change which > > is coming. This change depends on the 19, 20 and 21 changes. > > > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8047818 > > Webrev: > > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > > > > Notes: > > The moving of set_offset_range is due to an introduced circular dependency > > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > > > > Thanks > > /Mikael From mikael.gerdin at oracle.com Tue Jun 24 15:39:33 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 24 Jun 2014 17:39:33 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <1403612430.2662.14.camel@cirrus> References: <17336712.znP3JIk1Pt@mgerdin03> <1403612430.2662.14.camel@cirrus> Message-ID: <6630007.I6e4KLTY5F@mgerdin03> On Tuesday 24 June 2014 14.20.30 Thomas Schatzl wrote: > Hi, > > On Mon, 2014-06-23 at 16:26 +0200, Mikael Gerdin wrote: > > Hi! > > > > When G1 is modified to unload classes without doing full collections the > > old HeapRegions can contain unparseable objects. 
This makes > > ContiguousSpace unsuitable as a base class for HeapRegion since it > > assumes that all objects below _top are parseable. > > > > Modify G1OffsetTableContigSpace to implement allocation with a separate > > _top and reimplement some Space pure virtuals to make object iteration > > work as expected. > > > > This change is the last part of a set of 4 changes: 8047818, 8047819, > > 8047820, 8047821 which are needed to refactor the HeapRegion class and > > its superclasses in order to simplify the G1 class unloading change which > > is coming. This change depends on the 19, 20 and 21 changes. > > > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8047818 > > Webrev: > > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > > > > Notes: > > The moving of set_offset_range is due to an introduced circular dependency > > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > > a few minor nits: > > - in G1OffsetTableContigSpace::cas_allocate_inner(), the method should > access _top directly per coding guidelines I interpret this as a request to change to HeapWord* obj = _top; Should I change other uses of top() as well? I could only find https://wiki.openjdk.java.net/display/HotSpot/StyleGuide#StyleGuide-Accessors as a reference here. Do you interpret that as "only use public accessors if outside the class"? There have been a few requests for renaming and changing the *allocate functions to be either exact copies of the ones in ContiguousSpace or even to break *allocate and _top into a separate class, how do you feel about this? > - just a note: _top should be declared volatile as it is used in the > CAS, although the code is correct. However there is already an issue for > that https://bugs.openjdk.java.net/browse/JDK-8033552, so I suggest > postponing this. Ok. > - extra newline after G1OffsetTableContigSpace::allocate_inner() > - extra newline after G1BlockOffsetSharedArray::set_offset_array() I'll remove these. /Mikael > > Thanks, > Thomas From jon.masamitsu at oracle.com Tue Jun 24 16:34:57 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 24 Jun 2014 09:34:57 -0700 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <2594075.LDRc6zFiFv@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> <53A98AAC.6060408@oracle.com> <2594075.LDRc6zFiFv@mgerdin03> Message-ID: <53A9A8B1.2080301@oracle.com> On 06/24/2014 07:43 AM, Mikael Gerdin wrote: > Jon, > > On Tuesday 24 June 2014 07.26.52 Jon Masamitsu wrote: >> Mikael, >> >> Did you consider creating a base class for ContiguousSpace and >> G1OffsetTableContigSpace that has a _top but does not assume >> parsability? > I did consider it but I thought that the added complexity of having even more > levels of inheritance were not worth the benefit of sharing the _top field. > Especially since G1 attempts to hack around the semantics around the _top > field with respect to concurrent access. See the disjunction I removed from > allocate_impl and when G1 calls cas_allocate and allocate. Ok. That's enough of a reason. > >> Could allocate_inner() have been called allocate_impl() as it is >> in ContiguousSpace? I don't know what the "inner" in the name >> is telling me. > "inner" tries to signal that this function is internal and wrapped by other > methods providing the external API. I don't have a particular naming > preference here, if the other reviewers are fine with "impl" or don't have a > preference I'm fine with changing it. 
If no one objects, the "impl" has more of a meaning to me. http://cr.openjdk.java.net/~mgerdin/8047818/webrev/src/share/vm/gc_implementation/g1/heapRegion.inline.hpp.frames.html You use the if-then {return ...} return NULL style. 52 inline HeapWord* G1OffsetTableContigSpace::allocate_inner(size_t size) { 53 HeapWord* obj = top(); 54 if (pointer_delta(end(), obj) >= size) { 55 HeapWord* new_top = obj + size; 56 assert(is_aligned(obj) && is_aligned(new_top), "checking alignment"); 57 set_top(new_top); 58 return obj; 59 } 60 return NULL; 61 } The ContiguousSpace uses the if-then {return ...} else {return NULL} style. Any reason not to use the same style? block_is_obj() seems more like an is_in() method. 89 inline bool 90 HeapRegion::block_is_obj(const HeapWord* p) const { 91 return p < top(); 92 } Could you add a specification for this block_size()? 94 inline size_t 95 HeapRegion::block_size(const HeapWord *addr) const { 96 const HeapWord* current_top = top(); 97 if (addr < current_top) { 98 return oop(addr)->size(); 99 } else { 100 assert(addr == current_top, "just checking"); 101 return pointer_delta(end(), addr); 102 } 103 } Jon > >> I've just started on the review so more to come. > Great! > > Thanks > /Mikael > >> Jon >> >> On 06/23/2014 07:26 AM, Mikael Gerdin wrote: >>> Hi! >>> >>> When G1 is modified to unload classes without doing full collections the >>> old HeapRegions can contain unparseable objects. This makes >>> ContiguousSpace unsuitable as a base class for HeapRegion since it >>> assumes that all objects below _top are parseable. >>> >>> Modify G1OffsetTableContigSpace to implement allocation with a separate >>> _top and reimplement some Space pure virtuals to make object iteration >>> work as expected. >>> >>> This change is the last part of a set of 4 changes: 8047818, 8047819, >>> 8047820, 8047821 which are needed to refactor the HeapRegion class and >>> its superclasses in order to simplify the G1 class unloading change which >>> is coming. This change depends on the 19, 20 and 21 changes. >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8047818 >>> Webrev: >>> http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ >>> >>> Notes: >>> The moving of set_offset_range is due to an introduced circular dependency >>> between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp >>> >>> Thanks >>> /Mikael From mikael.gerdin at oracle.com Wed Jun 25 06:56:56 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 25 Jun 2014 08:56:56 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <53A9A8B1.2080301@oracle.com> References: <17336712.znP3JIk1Pt@mgerdin03> <2594075.LDRc6zFiFv@mgerdin03> <53A9A8B1.2080301@oracle.com> Message-ID: <3844327.z641kvG9QC@mgerdin03> Jon, On Tuesday 24 June 2014 09.34.57 Jon Masamitsu wrote: > On 06/24/2014 07:43 AM, Mikael Gerdin wrote: > > Jon, > > > > On Tuesday 24 June 2014 07.26.52 Jon Masamitsu wrote: > >> Mikael, > >> > >> Did you consider creating a base class for ContiguousSpace and > >> G1OffsetTableContigSpace that has a _top but does not assume > >> parsability? > > > > I did consider it but I thought that the added complexity of having even > > more levels of inheritance were not worth the benefit of sharing the _top > > field. Especially since G1 attempts to hack around the semantics around > > the _top field with respect to concurrent access. See the disjunction I > > removed from allocate_impl and when G1 calls cas_allocate and allocate. > > Ok. That's enough of a reason. Thanks. 
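For reference while we are discussing it, the shared CAS path would look roughly like the ContiguousSpace version, i.e. something like the sketch below (not the exact webrev code, and the method name follows the "impl" suggestion):

inline HeapWord* G1OffsetTableContigSpace::par_allocate_impl(size_t size, HeapWord* const end_value) {
  do {
    HeapWord* obj = top();
    if (pointer_delta(end_value, obj) >= size) {
      HeapWord* new_top = obj + size;
      // Racy bump of _top: only one thread's CAS can move obj to new_top.
      HeapWord* result = (HeapWord*)Atomic::cmpxchg_ptr(new_top, top_addr(), obj);
      // result is the old value of _top: if it equals obj the exchange
      // succeeded and [obj, new_top) is ours, otherwise another thread won
      // the race and we retry with the updated top.
      if (result == obj) {
        assert(is_aligned(obj) && is_aligned(new_top), "checking alignment");
        return obj;
      }
    } else {
      return NULL; // not enough room left in the space
    }
  } while (true);
}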
I'm planning on filing an RFE for unifying the allocation code between G1 and ContiguousSpace somehow. > >> Could allocate_inner() have been called allocate_impl() as it is > >> in ContiguousSpace? I don't know what the "inner" in the name > >> is telling me. > > "inner" tries to signal that this function is internal and wrapped by > > other > > methods providing the external API. I don't have a particular naming > > preference here, if the other reviewers are fine with "impl" or don't have > > a preference I'm fine with changing it. > > If no one objects, the "impl" has more of a meaning to me. Ok, naming it "impl" also makes it closer to the ContiguousSpace version. > > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/src/share/vm/gc_implementation/g1/heapRegion.inline.hpp.frames.html > > You use the if-then {return ...} return NULL style. > > 52 inline HeapWord* G1OffsetTableContigSpace::allocate_inner(size_t size) > { 53 HeapWord* obj = top(); > 54 if (pointer_delta(end(), obj) >= size) { > 55 HeapWord* new_top = obj + size; > 56 assert(is_aligned(obj) && is_aligned(new_top), "checking > alignment"); 57 set_top(new_top); > 58 return obj; > 59 } > 60 return NULL; > 61 } > > The ContiguousSpace uses the if-then {return ...} else {return NULL} style. > > Any reason not to use the same style? No good reason. I'll keep the new ones in the same style. > > block_is_obj() seems more like an is_in() method. > > 89 inline bool > 90 HeapRegion::block_is_obj(const HeapWord* p) const { > 91 return p < top(); > 92 } > It's actually the same as ContiguousSpace::block_is_obj. When G1 class unloading is integrated the implementation of HeapRegion::block_is_obj will change. > > Could you add a specification for this block_size()? > > 94 inline size_t > 95 HeapRegion::block_size(const HeapWord *addr) const { > 96 const HeapWord* current_top = top(); > 97 if (addr < current_top) { > 98 return oop(addr)->size(); > 99 } else { > 100 assert(addr == current_top, "just checking"); > 101 return pointer_delta(end(), addr); > 102 } > 103 } Similar to block_is_obj this is currently the same as the ContiguousSpace variant. When G1 class unloading is integrated the implementation will change. The other declarations of block_size are not that clearly specified, can you elaborate on what kind of specification you are looking for? /Mikael > > Jon > >> I've just started on the review so more to come. > Great! > > Thanks > /Mikael > >> Jon >> >> On 06/23/2014 07:26 AM, Mikael Gerdin wrote: >>> Hi! >>> >>> When G1 is modified to unload classes without doing full collections the >>> old HeapRegions can contain unparseable objects. This makes >>> ContiguousSpace unsuitable as a base class for HeapRegion since it >>> assumes that all objects below _top are parseable. >>> >>> Modify G1OffsetTableContigSpace to implement allocation with a separate >>> _top and reimplement some Space pure virtuals to make object iteration >>> work as expected. >>> >>> This change is the last part of a set of 4 changes: 8047818, 8047819, >>> 8047820, 8047821 which are needed to refactor the HeapRegion class and >>> its superclasses in order to simplify the G1 class unloading change which >>> is coming. This change depends on the 19, 20 and 21 changes.
> >>> > >>> Bug: > >>> https://bugs.openjdk.java.net/browse/JDK-8047818 > >>> Webrev: > >>> http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > >>> > >>> Notes: > >>> The moving of set_offset_range is due to an introduced circular > >>> dependency > >>> between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > >>> > >>> Thanks > >>> /Mikael From andreas.sjoberg at oracle.com Wed Jun 25 07:02:26 2014 From: andreas.sjoberg at oracle.com (=?ISO-8859-1?Q?Andreas_Sj=F6berg?=) Date: Wed, 25 Jun 2014 09:02:26 +0200 Subject: RFR(S): JDK-8047330: Remove unrolled card loops in G1 SparsePRTEntry In-Reply-To: <2772002.dt1otWjIrg@mgerdin03> References: <53A2E53B.3050508@oracle.com> <2772002.dt1otWjIrg@mgerdin03> Message-ID: <53AA7402.1040608@oracle.com> Hi! Following Mikael's review and some offline comments from Thomas I've made these changes in addition to removing the unrolled card loops: * removed the now unused define * added braces for the for-loop in SparsePRTEntry::init * changed the implementation of copy_cards to use memcpy New webrev: http://cr.openjdk.java.net/~jwilhelm/8047330/webrev.02/ Thanks On 06/23/2014 04:51 PM, Mikael Gerdin wrote: > Hi Andreas, > > On Thursday 19 June 2014 15.27.23 Andreas Sj?berg wrote: >> Hi all, >> >> can I please have reviews for this patch that removes the unrolled >> for-loops in sparsePRT.cpp. >> >> I ran some performance benchmarks and could not see any benefits in >> keeping the unrolled for loops. SPECjbb2013 shows a 3.48% increase on >> Linux x64 actually. >> >> Webrev: http://cr.openjdk.java.net/~jwilhelm/8047330/webrev/ > > It looks like you can remove the define as well: > 36 #define UNROLL_CARD_LOOPS 1 > > UnrollFactor should also be useless now, but it seems like it's being used to > align up the number of cards. I suggest you leave UnrollFactor for a second > cleanup. > > /Mikael > >> >> Testing: jprt, specjbb2005, specjvm2008, specjbb2013 >> >> Thanks, >> Andreas > From mikael.gerdin at oracle.com Wed Jun 25 11:25:01 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 25 Jun 2014 13:25:01 +0200 Subject: RFR: 8047820: G1 Block offset table does not need to support generic Space classes In-Reply-To: <11148293.8zVS3laxSo@mgerdin03> References: <11148293.8zVS3laxSo@mgerdin03> Message-ID: <1643242.qRoacBket5@mgerdin03> Hi! On Monday 23 June 2014 16.25.53 Mikael Gerdin wrote: > Hi! > > As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] > G1's block offset table needs to be modified to work with Space subclasses > which are not subclasses of ContiguousSpace. Just change the code to have > knowledge of G1OffsetTableContigSpace. > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, > 8047821 which are needed to refactor the HeapRegion class and its > superclasses in order to simplify the G1 class unloading change which is > coming. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047820 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047820/webrev/ I discovered that I accidentally put the set_offset_array change in the 8047818 webrev. It is actually needed in this change to make the JVM compile without precompiled headers. 
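To make the precompiled headers point concrete, the effect is the usual one (a contrived illustration with made-up file names, not the actual G1 headers):

// a.hpp
inline int helper() { return 42; }

// b.cpp
#include "precompiled.hpp" // happens to pull in a.hpp transitively
// #include "a.hpp"        // the include b.cpp actually needs -- forgotten
int use_helper() { return helper(); } // builds with PCH, fails without it

which is why the missing set_offset_array piece only shows up in non-PCH builds.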
Here's the new full webrev: http://cr.openjdk.java.net/~mgerdin/8047820/webrev.2/ Incremental webrev: http://cr.openjdk.java.net/~mgerdin/8047820/webrev.1_to_2/ Thanks /Mikael > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > > Thanks > /Mikael From thomas.schatzl at oracle.com Wed Jun 25 11:28:09 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 25 Jun 2014 13:28:09 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <6630007.I6e4KLTY5F@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> <1403612430.2662.14.camel@cirrus> <6630007.I6e4KLTY5F@mgerdin03> Message-ID: <1403695689.2769.12.camel@cirrus> Hi, On Tue, 2014-06-24 at 17:39 +0200, Mikael Gerdin wrote: > On Tuesday 24 June 2014 14.20.30 Thomas Schatzl wrote: > > Hi, > > > > On Mon, 2014-06-23 at 16:26 +0200, Mikael Gerdin wrote: > > > Hi! > > > > > > When G1 is modified to unload classes without doing full collections the > > > old HeapRegions can contain unparseable objects. This makes > > > ContiguousSpace unsuitable as a base class for HeapRegion since it > > > assumes that all objects below _top are parseable. > > > > > > Modify G1OffsetTableContigSpace to implement allocation with a separate > > > _top and reimplement some Space pure virtuals to make object iteration > > > work as expected. > > > > > > This change is the last part of a set of 4 changes: 8047818, 8047819, > > > 8047820, 8047821 which are needed to refactor the HeapRegion class and > > > its superclasses in order to simplify the G1 class unloading change which > > > is coming. This change depends on the 19, 20 and 21 changes. > > > > > > Bug: > > > https://bugs.openjdk.java.net/browse/JDK-8047818 > > > Webrev: > > > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > > > > > > Notes: > > > The moving of set_offset_range is due to an introduced circular dependency > > > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > > a few minor nits: > > - in G1OffsetTableContigSpace::cas_allocate_inner(), the method should > > access _top directly per coding guidelines > I interpret this as a request to change to > HeapWord* obj = _top; > Should I change other uses of top() as well? > > I could only find > https://wiki.openjdk.java.net/display/HotSpot/StyleGuide#StyleGuide-Accessors as a reference here. Do you interpret that as "only use public > > accessors if outside the class"? I remember having been made aware that we are supposed to use members directly within a class and its descendants a (small) few times when I started here - because I otherwise tend to add accessors except for very simple ones, mainly for private variables. I may have misunderstood something too. Looking through the code, this might be (again) G1 code specific where it's done (relatively) frequently in code that is not almost copy&paste from CMS. I remember some changes that also actively removed by-accessor accesses within a given class hierarchy (e.g. collector policy). So in the end I may have misunderstood something, and as I did not really search for something "written down and generally accepted" at that time I am grateful to be corrected. I like this way better too. So keep it as is.
Thanks, Thomas From mikael.gerdin at oracle.com Wed Jun 25 11:32:23 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 25 Jun 2014 13:32:23 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <1403695689.2769.12.camel@cirrus> References: <17336712.znP3JIk1Pt@mgerdin03> <6630007.I6e4KLTY5F@mgerdin03> <1403695689.2769.12.camel@cirrus> Message-ID: <4341532.UT0yUrG55d@mgerdin03> Hi Thomas, On Wednesday 25 June 2014 13.28.09 Thomas Schatzl wrote: > Hi, > > On Tue, 2014-06-24 at 17:39 +0200, Mikael Gerdin wrote: > > On Tuesday 24 June 2014 14.20.30 Thomas Schatzl wrote: > > > Hi, > > > > > > On Mon, 2014-06-23 at 16:26 +0200, Mikael Gerdin wrote: > > > > Hi! > > > > > > > > When G1 is modified to unload classes without doing full collections > > > > the > > > > old HeapRegions can contain unparseable objects. This makes > > > > ContiguousSpace unsuitable as a base class for HeapRegion since it > > > > assumes that all objects below _top are parseable. > > > > > > > > Modify G1OffsetTableContigSpace to implement allocation with a > > > > separate > > > > _top and reimplement some Space pure virtuals to make object iteration > > > > work as expected. > > > > > > > > This change is the last part of a set of 4 changes: 8047818, 8047819, > > > > 8047820, 8047821 which are needed to refactor the HeapRegion class and > > > > its superclasses in order to simplify the G1 class unloading change > > > > which > > > > is coming. This change depends on the 19, 20 and 21 changes. > > > > > > > > Bug: > > > > https://bugs.openjdk.java.net/browse/JDK-8047818 > > > > Webrev: > > > > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > > > > > > > > Notes: > > > > The moving of set_offset_range is due to an introduced circular > > > > dependency > > > > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > > > a few minor nits: > > > - in G1OffsetTableContigSpace::cas_allocate_inner(), the method should > > > access _top directly per coding guidelines > > I interpret this as a request to change to > > HeapWord* obj = _top; > > Should I change other uses of top() as well? > > I could only find > > https://wiki.openjdk.java.net/display/HotSpot/StyleGuide#StyleGuide-Accessors as a reference here. Do you interpret that as "only use public > > accessors if outside the class"? > I remember having been made aware that we are supposed to use members > directly within a class and its descendants a (small) few times when I > started here - because I otherwise tend to add accessors except for very > simple ones, mainly for private variables. I may have misunderstood > something too. > > Looking through the code, this might be (again) G1 code specific where it's > done (relatively) frequently in code that is not almost copy&paste from CMS. > > I remember some changes that also actively removed by-accessor accesses > within a given class hierarchy (e.g. collector policy). > > So in the end I may have misunderstood something, and as I did not really > search for something "written down and generally accepted" at that time I am > grateful to be corrected. I like this way better too. > > So keep it as is. Ok, I see your point. I agree that adding accessors for all members just for the sake of it is generally not that useful. In this case I was sort of mimicking the code in ContiguousSpace.
Per Jon and Stefan's requests I've copied the ContiguousSpace versions of allocate_impl and par_allocate_impl straight off instead of having slightly different own versions, so in the end I added a top_addr() as well since ContiguousSpace has one. A new webrev is coming. /Mikael > > Thanks, > Thomas From mikael.gerdin at oracle.com Wed Jun 25 11:50:57 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 25 Jun 2014 13:50:57 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <17336712.znP3JIk1Pt@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> Message-ID: <2049395.4GI85LboNd@mgerdin03> Hi! On Monday 23 June 2014 16.26.03 Mikael Gerdin wrote: > Hi! > > When G1 is modified to unload classes without doing full collections the old > HeapRegions can contain unparseable objects. This makes ContiguousSpace > unsuitable as a base class for HeapRegion since it assumes that all objects > below _top are parseable. > > Modify G1OffsetTableContigSpace to implement allocation with a separate _top > and reimplement some Space pure virtuals to make object iteration work as > expected. > > This change is the last part of a set of 4 changes: 8047818, 8047819, > 8047820, 8047821 which are needed to refactor the HeapRegion class and its > superclasses in order to simplify the G1 class unloading change which is > coming. This change depends on the 19, 20 and 21 changes. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8047818 > Webrev: > http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ Based on review comments from Jon, Stefan and Thomas (thanks!) here's a second version of this webrev. A quick summary of the incremental changes: * SA Support * taking {par_,}allocate_impl from ContiguousSpace * fix for building without precompiled headers * setting _saved_mark_word in clear() * initialization order problem with _top vs _bottom * object_iterate block_is_obj check * added a short specification for block_is_obj and block_size Note that the set_offset_array change was moved to the 8047820 webrev since it's needed to get that change to compile without precompiled headers. Full webrev: http://cr.openjdk.java.net/~mgerdin/8047818/webrev.1/ Incremental webrev: http://cr.openjdk.java.net/~mgerdin/8047818/webrev.0_to_1/ /Mikael > > Notes: > The moving of set_offset_range is due to an introduced circular dependency > between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp > > Thanks > /Mikael From stefan.karlsson at oracle.com Wed Jun 25 11:58:06 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 25 Jun 2014 13:58:06 +0200 Subject: RFR: 8047820: G1 Block offset table does not need to support generic Space classes In-Reply-To: <1643242.qRoacBket5@mgerdin03> References: <11148293.8zVS3laxSo@mgerdin03> <1643242.qRoacBket5@mgerdin03> Message-ID: <53AAB94E.90609@oracle.com> On 2014-06-25 13:25, Mikael Gerdin wrote: > Hi! > > On Monday 23 June 2014 16.25.53 Mikael Gerdin wrote: >> Hi! >> >> As part of a larger effort to detach G1's HeapRegion from ContiguousSpace[1] >> G1's block offset table needs to be modified to work with Space subclasses >> which are not subclasses of ContiguousSpace. Just change the code to have >> knowledge of G1OffsetTableContigSpace. >> >> This change is part of a set of 4 changes: 8047818, 8047819, 8047820, >> 8047821 which are needed to refactor the HeapRegion class and its >> superclasses in order to simplify the G1 class unloading change which is >> coming. 
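(A note on the "initialization order problem with _top vs _bottom" bullet in the summary above, since it is a classic C++ trap: members are initialized in declaration order, not in initializer-list order. A minimal illustration with made-up names, not the actual G1 declarations:)

class SpaceSketch {
  HeapWord* _top;     // declared before _bottom...
  HeapWord* _bottom;
 public:
  SpaceSketch(HeapWord* bottom)
    : _bottom(bottom),
      _top(_bottom) {} // ...so _top is initialized first and reads
                       // _bottom before _bottom has been set
};
// Declaring _bottom before _top (or assigning _top in the constructor
// body) fixes it; gcc warns about such mismatches with -Wreorder.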
>> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8047820 >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8047820/webrev/ > I discovered that I accidentally put the set_offset_array change in the > 8047818 webrev. It is actually needed in this change to make the JVM compile > without precompiled headers. > > Here's the new full webrev: > http://cr.openjdk.java.net/~mgerdin/8047820/webrev.2/ > > Incremental webrev: > http://cr.openjdk.java.net/~mgerdin/8047820/webrev.1_to_2/ Looks good. StefanK > > Thanks > /Mikael > >> [1] https://bugs.openjdk.java.net/browse/JDK-8047818 >> >> Thanks >> /Mikael From jon.masamitsu at oracle.com Wed Jun 25 13:50:06 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 25 Jun 2014 06:50:06 -0700 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <2049395.4GI85LboNd@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> <2049395.4GI85LboNd@mgerdin03> Message-ID: <53AAD38E.4040507@oracle.com> On 6/25/2014 4:50 AM, Mikael Gerdin wrote: > Hi! > > On Monday 23 June 2014 16.26.03 Mikael Gerdin wrote: >> Hi! >> >> When G1 is modified to unload classes without doing full collections the old >> HeapRegions can contain unparseable objects. This makes ContiguousSpace >> unsuitable as a base class for HeapRegion since it assumes that all objects >> below _top are parseable. >> >> Modify G1OffsetTableContigSpace to implement allocation with a separate _top >> and reimplement some Space pure virtuals to make object iteration work as >> expected. >> >> This change is the last part of a set of 4 changes: 8047818, 8047819, >> 8047820, 8047821 which are needed to refactor the HeapRegion class and its >> superclasses in order to simplify the G1 class unloading change which is >> coming. This change depends on the 19, 20 and 21 changes. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8047818 >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > Based on review comments from Jon, Stefan and Thomas (thanks!) here's a second > version of this webrev. > > A quick summary of the incremental changes: > > * SA Support > * taking {par_,}allocate_impl from ContiguousSpace > * fix for building without precompiled headers > * setting _saved_mark_word in clear() > * initialization order problem with _top vs _bottom > * object_iterate block_is_obj check > * added a short specification for block_is_obj and block_size > > Note that the set_offset_array change was moved to the 8047820 webrev since > it's needed to get that change to compile without precompiled headers. > > Full webrev: > http://cr.openjdk.java.net/~mgerdin/8047818/webrev.1/ > > Incremental webrev: > http://cr.openjdk.java.net/~mgerdin/8047818/webrev.0_to_1/ Looks good. Thanks for the changes. Reviewed. Jon > > /Mikael > >> Notes: >> The moving of set_offset_range is due to an introduced circular dependency >> between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp >> >> Thanks >> /Mikael From jon.masamitsu at oracle.com Wed Jun 25 14:04:13 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 25 Jun 2014 07:04:13 -0700 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <3844327.z641kvG9QC@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> <2594075.LDRc6zFiFv@mgerdin03> <53A9A8B1.2080301@oracle.com> <3844327.z641kvG9QC@mgerdin03> Message-ID: <53AAD6DD.7@oracle.com> On 6/24/2014 11:56 PM, Mikael Gerdin wrote: > [...] 
> Similar to block_is_obj this is currently the same as the ContiguousSpace > variant. When G1 class unloading is integrated the implementation will change. > > The other declarations of block_size are not that clearly specified, can you > elaborate on what kind of specification you are looking for? The specification you put in was fine. I just wanted it clearer about the action when p >= top (in which case, to me, "block_size" is not a particularly descriptive name, never has been). Thanks. Jon > > /Mikael > >> Jon >> >>>> I've just started on the review so more to come. >>> Great! >>> >>> Thanks >>> /Mikael >>> >>>> Jon >>>> >>>> On 06/23/2014 07:26 AM, Mikael Gerdin wrote: >>>>> Hi! >>>>> >>>>> When G1 is modified to unload classes without doing full collections the >>>>> old HeapRegions can contain unparseable objects. This makes >>>>> ContiguousSpace unsuitable as a base class for HeapRegion since it >>>>> assumes that all objects below _top are parseable. >>>>> >>>>> Modify G1OffsetTableContigSpace to implement allocation with a separate >>>>> _top and reimplement some Space pure virtuals to make object iteration >>>>> work as expected. >>>>> >>>>> This change is the last part of a set of 4 changes: 8047818, 8047819, >>>>> 8047820, 8047821 which are needed to refactor the HeapRegion class and >>>>> its superclasses in order to simplify the G1 class unloading change >>>>> which >>>>> is coming. This change depends on the 19, 20 and 21 changes. >>>>> >>>>> Bug: >>>>> https://bugs.openjdk.java.net/browse/JDK-8047818 >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ >>>>> >>>>> Notes: >>>>> The moving of set_offset_range is due to an introduced circular >>>>> dependency >>>>> between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp >>>>> >>>>> Thanks >>>>> /Mikael From stefan.karlsson at oracle.com Wed Jun 25 14:05:40 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 25 Jun 2014 16:05:40 +0200 Subject: RFR: 8047818: G1 HeapRegions can no longer be ContiguousSpaces In-Reply-To: <2049395.4GI85LboNd@mgerdin03> References: <17336712.znP3JIk1Pt@mgerdin03> <2049395.4GI85LboNd@mgerdin03> Message-ID: <53AAD734.7090103@oracle.com> On 2014-06-25 13:50, Mikael Gerdin wrote: > Hi! > > On Monday 23 June 2014 16.26.03 Mikael Gerdin wrote: >> Hi! >> >> When G1 is modified to unload classes without doing full collections the old >> HeapRegions can contain unparseable objects. This makes ContiguousSpace >> unsuitable as a base class for HeapRegion since it assumes that all objects >> below _top are parseable. >> >> Modify G1OffsetTableContigSpace to implement allocation with a separate _top >> and reimplement some Space pure virtuals to make object iteration work as >> expected. >> >> This change is the last part of a set of 4 changes: 8047818, 8047819, >> 8047820, 8047821 which are needed to refactor the HeapRegion class and its >> superclasses in order to simplify the G1 class unloading change which is >> coming. This change depends on the 19, 20 and 21 changes. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8047818 >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8047818/webrev/ > Based on review comments from Jon, Stefan and Thomas (thanks!) here's a second > version of this webrev. 
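(Spelling out the block_size() contract Jon asked about earlier in the thread, paraphrased from the code quoted there; a suggested comment, not necessarily the wording that went into the webrev:)

// A region is covered by "blocks": every address in [bottom(), end()) is
// in exactly one block. For addr < top() the block is the object starting
// at addr and the result is oop(addr)->size(). For addr == top() the block
// is the single unallocated tail [top(), end()) and the result is
// pointer_delta(end(), addr). Calling this with addr > top() is an error,
// caught by the assert.
inline size_t HeapRegion::block_size(const HeapWord* addr) const;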
> > A quick summary of the incremental changes: > > * SA Support > > * taking {par_,}allocate_impl from ContiguousSpace > > * fix for building without precompiled headers > > * setting _saved_mark_word in clear() > > * initialization order problem with _top vs _bottom > > * object_iterate block_is_obj check > > * added a short specification for block_is_obj and block_size > > > > Note that the set_offset_array change was moved to the 8047820 webrev since > > it's needed to get that change to compile without precompiled headers. > > > > Full webrev: > > http://cr.openjdk.java.net/~mgerdin/8047818/webrev.1/ > > > > Incremental webrev: > > http://cr.openjdk.java.net/~mgerdin/8047818/webrev.0_to_1/ Looks good. StefanK > > /Mikael > >> Notes: >> The moving of set_offset_range is due to an introduced circular dependency >> between g1BlockOffsetTable.inline.hpp and heapRegion.inline.hpp >> >> Thanks >> /Mikael From erik.helin at oracle.com Wed Jun 25 15:01:58 2014 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 25 Jun 2014 17:01:58 +0200 Subject: RFR: 8047821: G1 Does not use the save_marks functionality as intended In-Reply-To: <2070242.8rk8uRrq1v@mgerdin03> References: <1878024.dB1lCmA3nF@mgerdin03> <2070242.8rk8uRrq1v@mgerdin03> Message-ID: <2377352.slVLfJONC9@ehelin-laptop> On Tuesday 24 June 2014 14:10:47 PM Mikael Gerdin wrote: > Hi! > > On Monday 23 June 2014 16.26.00 Mikael Gerdin wrote: > > Hi! > > > > As part of a larger effort to detach G1's HeapRegion from > > ContiguousSpace[1] and as a general cleanup we should rename the > > save_marks and > > set_saved_marks methods on HeapRegion. They are not used with > > oops_since_saved_marks_iterate and cause more confusion than anything. > > > > This change is part of a set of 4 changes: 8047818, 8047819, 8047820, > > 8047821 which are needed to refactor the HeapRegion class and its > > superclasses in order to simplify the G1 class unloading change which is > > coming. > > > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8047821 > > Webrev: > > http://cr.openjdk.java.net/~mgerdin/8047821/webrev/ > > Stefan discovered some more dead code in HeapRegion, here are a new set of > webrevs: > > http://cr.openjdk.java.net/~mgerdin/8047821/webrev.0_to_1/ > http://cr.openjdk.java.net/~mgerdin/8047821/webrev.1/ Looks good, Reviewed! Thanks, Erik > > /Mikael > > > [1] https://bugs.openjdk.java.net/browse/JDK-8047818 > > > > Thanks > > /Mikael From thomas.schatzl at oracle.com Thu Jun 26 07:16:53 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 26 Jun 2014 09:16:53 +0200 Subject: Ping to re-review JDK-8035400, JDK-8035401 and JDK-8040977 Message-ID: <1403767013.2656.11.camel@cirrus> Hi all, can I get re-reviews for the following issues (Bengt, Mikael?) that have been lingering for at least a month so that I can complete them? They have actually been blocking me from getting reviews for a few more changesets for some time now.
JDK-8035400: Move G1ParScanThreadState into its own files Most recent email in review thread: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-May/010133.html Diff to recent changes: http://cr.openjdk.java.net/~tschatzl/8035400/webrev.1_to_2/ Latest changes (complete): http://cr.openjdk.java.net/~tschatzl/8035400/webrev.2/ JDK-8035401: Fix visibility of G1ParScanThreadState members Most recent email in review thread: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-May/010134.html Diff to recent changes: http://cr.openjdk.java.net/~tschatzl/8035401/webrev.1_to_2/ Latest changes (complete): http://cr.openjdk.java.net/~tschatzl/8035401/webrev.2/ JDK-8040977: G1 crashes when run with -XX:-G1DeferredRSUpdate Most recent email in review thread: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-May/010132.html Latest changes (complete): http://cr.openjdk.java.net/~tschatzl/8040977/webrev.1/ Thanks, Thomas From bengt.rutisson at oracle.com Thu Jun 26 11:28:49 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 26 Jun 2014 13:28:49 +0200 Subject: RFR (XS): JDK-8040977: G1 crashes when run with -XX:-G1DeferredRSUpdate In-Reply-To: <1401187156.2682.12.camel@cirrus> References: <1398171391.3002.24.camel@cirrus> <5360D563.1070303@oracle.com> <1401187156.2682.12.camel@cirrus> Message-ID: <53AC03F1.1090309@oracle.com> Hi Thomas, Sorry for the very late reply. I think the dependency between G1ParScanClosure and G1ParScanThreadState is still very awkward, but I think your change is a step in the right direction. Thanks for fixing this. Reviewed. Bengt On 2014-05-27 12:39, Thomas Schatzl wrote: > Hi Bengt, > > thanks for the review. > > On Wed, 2014-04-30 at 12:50 +0200, Bengt Rutisson wrote: >> Hi Thomas, >> >> On 2014-04-22 14:56, Thomas Schatzl wrote: >>> Hi all, >>> >>> can I have reviews for this change? It fixes the wrong order of >>> declaration of members of G1ParScanThreadState that causes crashes when >>> G1DeferredRSUpdate is disabled. >>> >>> The change is based on the changes for 8035400 and 8035401 posted recently. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8040977 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8040977/webrev/ >> I realize that this fixes the code but I would really appreciate a more >> stable way of handling the dependencies. >> >> As it is now we end up calling methods on a G1ParScanThreadState >> instance while we are setting it up. This seems broken to me and will >> probably lead to similar initialization order issues again. Best would >> be to not pass "this" to the constructor of G1ParScanClosure and instead >> manage the circular dependency between G1ParScanClosure and >> G1ParScanThreadState more explicitly after they have both been properly >> set up. >> >> Second best would be to at least pass the worker id/queue num as a >> separate parameter to avoid having to call methods on an uninitialized >> object. > I fixed this by implementing the former idea. Also added some > > New webrev at > http://cr.openjdk.java.net/~tschatzl/8040977/webrev.1/ > > (Sorry, I already had merged the changes before making a diff webrev - > however, most changes in the VM code have been redone anyway. The test > case stayed the same). > > Thanks, > Thomas > From mikael.gerdin at oracle.com Thu Jun 26 11:33:28 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 26 Jun 2014 13:33:28 +0200 Subject: RFR: 8048214: Linker error when compiling G1SATBCardTableModRefBS after include order changes Message-ID: <7134937.oekIdTSvyk@mgerdin03> Hi all!
A small build issue occurs with the change for 8047818 due to some strange include order effects. The symptom is that a template function in G1SATBCardTableModRefBS is not instantiated when compiling on Windows and the link of jvm.dll fails. Since 8047818 is already reviewed and is a change we want to keep separate I'd like to push the fix for this issue before 8047818 instead of folding it into that change. My suggested fix is to move the implementations of the callers of the template function into the cpp file as well. They override virtual functions so they should not have been inlined in the first place (since we always call through a base class pointer to the BarrierSet). Webrev: http://cr.openjdk.java.net/~mgerdin/8048214/webrev Bug: https://bugs.openjdk.java.net/browse/JDK-8048214 Thanks /Mikael From bengt.rutisson at oracle.com Thu Jun 26 11:31:08 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 26 Jun 2014 13:31:08 +0200 Subject: RFR (M/L): JDK-8035400: Move G1ParScanThreadState into its own files In-Reply-To: <1401187161.2682.13.camel@cirrus> References: <1397829156.2717.24.camel@cirrus> <536A1685.7080704@oracle.com> <1401187161.2682.13.camel@cirrus> Message-ID: <53AC047C.4090101@oracle.com> Hi Thomas, I think this looks good now. Thanks, Bengt On 2014-05-27 12:39, Thomas Schatzl wrote: > Hi Bengt, > > thanks for the review and sorry for the long delay... > > On Wed, 2014-05-07 at 13:18 +0200, Bengt Rutisson wrote: >> Hi Thomas, >> >> On 2014-04-18 15:52, Thomas Schatzl wrote: >>> Hi all, >>> >>> can I have reviews for the above change? It moves G1ParScanThreadState >>> into G1ParScanThreadState*pp files. >>> >>> The only changes are limited to: >>> - adding a "#pragma warning( disable:4355 ) // 'this' : used in base >>> member initializer list" to shut visual C up about the problem (which >>> should be cleaned up at some point - I found an issue that slipped >>> through because of that, JDK-8040977) >> As I commented in the review of JDK-8040977 I would prefer to make the >> change to not pass this as a parameter to the constructor. That would >> also remove the need for disabling the warning. Maybe in that case base >> this review on top of the fix for JDK-8040977 rather than the other way >> around? > I do not see an advantage either way. Since this would require me to make > significant changes to all patches, I would prefer keeping the order > this way if you do not mind. > > In the latest JDK-8040977 I removed the need for the pragma as > requested. > >>> - added necessary include file references; I hope the AIX guys can >>> compile that change to avoid troubles. It compiles fine with all Oracle >>> supported archs. >> You also moved the definition of the destructor of G1ParScanThreadState >> from the hpp file to the cpp file. Makes sense, but was not strictly >> needed for this change, right? > Fixed that. This has been an oversight when separating out the changes. > >>> There will be another CR for fixing up visibility and cleaning up stuff >>> a little. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8035400 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8035400/webrev/ >> It is a bit hard to review moved code. But except for the comment >> regarding JDK-8040977 above I think it looks good. >> >> I think you can clean up the includes a bit more if you have time.
Seems >> like these includes in g1CollectedHeap.cpp are for example not needed >> anymore: >> >> #include "oops/oop.inline.hpp" >> #include "oops/oop.pcgc.inline.hpp" > I tried to clean up the includes a little more. However you cannot move > these particular includes because they are still needed for evacuation > failure handling. > > I also rebased the change on the current hotspot jdk9 gc repo. > > Diff webrev at > http://cr.openjdk.java.net/~tschatzl/8035400/webrev.1_to_2/ > > Complete webrev at > http://cr.openjdk.java.net/~tschatzl/8035400/webrev.2/ > > Thanks, > Thomas > From stefan.karlsson at oracle.com Thu Jun 26 11:28:45 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 26 Jun 2014 13:28:45 +0200 Subject: RFR: 8048214: Linker error when compiling G1SATBCardTableModRefBS after include order changes In-Reply-To: <7134937.oekIdTSvyk@mgerdin03> References: <7134937.oekIdTSvyk@mgerdin03> Message-ID: <53AC03ED.9070808@oracle.com> On 2014-06-26 13:33, Mikael Gerdin wrote: > Hi all! > > A small build issue occurs with the change for 8047818 due to some strange > include order effects. > The symptom is that a template function in G1SATBCardTableModRefBS is not > instantiated when compiling on Windows and the link of jvm.dll fails. > > Since 8047818 is already reviewed and is a change we want to keep separate I'd > like to push the fix for this issue before 8047818 instead of folding it into > that change. > > My suggested fix is to move the implementations of the callers of the template > function into the cpp file as well. They override virtual functions so they > should not have been inlined in the first place (since we always call through > a base class pointer to the BarrierSet). > > Webrev: http://cr.openjdk.java.net/~mgerdin/8048214/webrev Looks good. StefanK > Bug: https://bugs.openjdk.java.net/browse/JDK-8048214 > > Thanks > /Mikael From bengt.rutisson at oracle.com Thu Jun 26 11:39:24 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 26 Jun 2014 13:39:24 +0200 Subject: RFR: 8048214: Linker error when compiling G1SATBCardTableModRefBS after include order changes In-Reply-To: <53AC03ED.9070808@oracle.com> References: <7134937.oekIdTSvyk@mgerdin03> <53AC03ED.9070808@oracle.com> Message-ID: <53AC066C.4000909@oracle.com> On 2014-06-26 13:28, Stefan Karlsson wrote: > > On 2014-06-26 13:33, Mikael Gerdin wrote: >> Hi all! >> >> A small build issue occurs with the change for 8047818 due to some >> strange >> include order effects. >> The symptom is that a template function in G1SATBCardTableModRefBS is >> not >> instantiated when compiling on Windows and the link of jvm.dll fails. >> >> Since 8047818 is already reviewed and is a change we want to keep >> separate I'd >> like to push the fix for this issue before 8047818 instead of folding >> it into >> that change. >> >> My suggested fix is to move the implementations of the callers of the >> template >> function into the cpp file as well. They override virtual functions >> so they >> should not have been inlined in the first place (since we always call >> through >> a base class pointer to the BarrierSet). >> >> Webrev: http://cr.openjdk.java.net/~mgerdin/8048214/webrev > > Looks good. 
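(For the archives, a boiled-down version of the 8048214 failure mode, with generic names rather than the actual G1SATBCardTableModRefBS code:)

// header (sketch)
struct BarrierSetSketch {
  virtual void write_ref(int* p) = 0;
};
struct G1BarrierSketch : public BarrierSetSketch {
  template <class T> void write_ref_work(T* p); // defined only in the .cpp
  virtual void write_ref(int* p) { write_ref_work(p); } // inline caller in the header
};

// .cpp
template <class T> void G1BarrierSketch::write_ref_work(T* p) { /* ... */ }

// Nothing in the .cpp itself calls write_ref_work<int>, so whether that
// instantiation is ever emitted depends on which other translation unit
// happens to compile the inlined write_ref(), i.e. on include order.
// Moving the write_ref() body into the .cpp puts the call next to the
// template definition, and since write_ref() is virtual and always invoked
// through the base class pointer, keeping it inline in the header bought
// nothing anyway.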
+1 Bengt > > StefanK > >> Bug: https://bugs.openjdk.java.net/browse/JDK-8048214 >> >> Thanks >> /Mikael > From thomas.schatzl at oracle.com Thu Jun 26 11:41:59 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 26 Jun 2014 13:41:59 +0200 Subject: RFR: 8048214: Linker error when compiling G1SATBCardTableModRefBS after include order changes In-Reply-To: <7134937.oekIdTSvyk@mgerdin03> References: <7134937.oekIdTSvyk@mgerdin03> Message-ID: <1403782919.2656.33.camel@cirrus> Hi, On Thu, 2014-06-26 at 13:33 +0200, Mikael Gerdin wrote: > Hi all! > > A small build issue occurs with the change for 8047818 due to some strange > include order effects. > The symptom is that a template function in G1SATBCardTableModRefBS is not > instantiated when compiling on Windows and the link of jvm.dll fails. > > Since 8047818 is already reviewed and is a change we want to keep separate I'd > like to push the fix for this issue before 8047818 instead of folding it into > that change. > > My suggested fix is to move the implementations of the callers of the template > function into the cpp file as well. They override virtual functions so they > should not have been inlined in the first place (since we always call through > a base class pointer to the BarrierSet). > > Webrev: http://cr.openjdk.java.net/~mgerdin/8048214/webrev > Bug: https://bugs.openjdk.java.net/browse/JDK-8048214 Looks good. Thomas From bengt.rutisson at oracle.com Thu Jun 26 11:43:17 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 26 Jun 2014 13:43:17 +0200 Subject: RFR (M): JDK-8035401: Fix visibility of G1ParScanThreadState members In-Reply-To: <1401187165.2682.14.camel@cirrus> References: <1397827791.2717.23.camel@cirrus> <5356CC15.7070507@oracle.com> <1398198974.2532.32.camel@cirrus> <5356DFB3.7050400@oracle.com> <5360E01E.7020906@oracle.com> <1401187165.2682.14.camel@cirrus> Message-ID: <53AC0755.3050300@oracle.com> Hi Thomas, Looks good. One question though. In g1ParScanThreadState.hpp you have: 131 inline HeapWord* allocate(GCAllocPurpose purpose, size_t word_sz); 132 inline HeapWord* allocate_slow(GCAllocPurpose purpose, size_t word_sz); 133 inline void undo_allocation(GCAllocPurpose purpose, HeapWord* obj, size_t word_sz); But the methods are implemented in g1ParScanThreadState.cpp. Shouldn't the implementation be placed in g1ParScanThreadState.inline.hpp? Thanks, Bengt On 2014-05-27 12:39, Thomas Schatzl wrote: > Hi Bengt, > > thanks for the review. > > On Wed, 2014-04-30 at 13:35 +0200, Bengt Rutisson wrote: >> Hi Thomas, >> >> Over all this looks good to me too. >> >> One question for g1ParScanThreadState.cpp. You have marked the >> deal_with_reference() methods as "inline" even though they are in the >> same cpp file. Does that have any effect? > I moved them to the inline files. > >> 394 >> 395 template <class T> inline void >> G1ParScanThreadState::deal_with_reference(T* ref_to_scan) { >> 396 if (!has_partial_array_mask(ref_to_scan)) { >> 397 // Note: we can use "raw" versions of "region_containing" because >> 398 // "obj_to_scan" is definitely in the heap, and is not in a >> 399 // humongous region.
>> 400 HeapRegion* r = _g1h->heap_region_containing_raw(ref_to_scan); >> 401 do_oop_evac(ref_to_scan, r); >> 402 } else { >> 403 do_oop_partial_array((oop*)ref_to_scan); >> 404 } >> 405 } >> 406 >> 407 inline void G1ParScanThreadState::deal_with_reference(StarTask ref) { >> 408 assert(verify_task(ref), "sanity"); >> 409 if (ref.is_narrow()) { >> 410 deal_with_reference((narrowOop*)ref); >> 411 } else { >> 412 deal_with_reference((oop*)ref); >> 413 } >> 414 } >> >> Also, I think that you have to declare methods that should be inlined >> before the place where they are being used on some platforms (Solaris). >> In this case I think it means that they should be declared before >> steal_and_trim_queue(). > Moved them to the inline file. > >> Personally I also find the new deal_with_reference(StarTask ref) a >> little confusing. With that method and the two methods generated by >> deal_with_reference(T* ref_to_scan) I get kind of unsure which method >> that will be executed by a call like: >> >> 156 StarTask stolen_task; >> 157 while (task_queues->steal(queue_num(), hash_seed(), stolen_task)) { >> 158 assert(verify_task(stolen_task), "sanity"); >> 159 deal_with_reference(stolen_task); >> >> All three deal_with_reference() methods are potential matches. I assume >> the compiler prefers the deal_with_reference(StarTask ref) but it makes >> me unsure when I read the code. > Changed to dispatch_reference(). > >> One minor nit: >> >> g1ParScanThreadState.hpp >> You have changed the indentation of private/protected/public keywords to >> have one space indentation. That's fine as I think that is the standard, >> but since the whole file used no space indentation I would also have >> been fine with leaving that. However now the last "public" keyword is >> still having no space before it. Can you indent that too? >> >> 218 public: > Fixed. Also removed superfluous newlines at the end of files. > > Also re-checked again for performance regressions, none found. > > Diff to last revision > http://cr.openjdk.java.net/~tschatzl/8035401/webrev.1_to_2/ > > Full diff: > http://cr.openjdk.java.net/~tschatzl/8035401/webrev.2/ > > (based on 8035400) > > Thanks, > Thomas > > >> Thanks, >> Bengt >> >> >> On 2014-04-22 23:31, Jon Masamitsu wrote: >>> On 4/22/14 1:36 PM, Thomas Schatzl wrote: >>>> Hi Jon, >>>> >>>> On Tue, 2014-04-22 at 13:07 -0700, Jon Masamitsu wrote: >>>>> Thomas, >>>>> >>>>> What I see in these changes are >>>>> >>>>> 1) no semantic changes >>>> No. >>>> >>>>> 2) some methods in .hpp files moved to .cpp files >>>> Yes, because they were only referenced by the cpp file, so I thought it >>>> would be good to move them there. They will be inlined as needed anyway >>>> (and I think for some of them they were never inlined due to their >>>> size). >>>> >>>> I will do some more runs with the inline's added again. >>>> >>>>> 3) creation of steal_and_trim_queue() with definition in >>>>> a .cpp file (I may have missed additional such new >>>>> methods) >>>> There are none except queue_is_empty(), see below. >>>> >>>>> 4) change in visibility as the CR says >>>> That's the main change. >>>> >>>>> 5) no performance regressions as stated in your RFR >>>> No. Checked the results for the usual benchmarks (specjvm2008, >>>> specjbb05/2013) again right now, and there are no significant >>>> differences in the scores (on x64 and sparc), and for specjbb05/2013 the >>>> average gc pause time, and the object copy time (assuming that this is >>>> the part that will be affected most) stay the same as in the baseline. 
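With the rename the dispatching entry point now reads as below, same body as in the code quoted above, only the name changed:

inline void G1ParScanThreadState::dispatch_reference(StarTask ref) {
  assert(verify_task(ref), "sanity");
  if (ref.is_narrow()) {
    deal_with_reference((narrowOop*)ref); // StarTask was tagged as a narrowOop*
  } else {
    deal_with_reference((oop*)ref);       // plain oop*
  }
}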
>>>> >>>>> If that's what constitutes the change, looks good. >>>> Thanks. >>>> >>>>> Reviewed. >>>>> >>>>> If there is something more significant that I have >>>>> overlooked, please point me at it and I'll look again. >>>> There is not. Sorry, I should have pointed out the changes in more >>>> detail instead of you making guesses. >>>> >>>> Additional minor changes: >>>> >>>> - G1ParScanThreadState accesses members directly instead of using >>>> getters (e.g. _refs instead of refs()). >>>> >>>> - fixed some newlines in method declarations, removing newlines >>>> >>>> - removed refs() to avoid direct access from outside, and adding a new >>>> method queue_is_empty() (only used in asserts as refs()->is_empty(), and >>>> I did not want to expose refs() just for the asserts). >>> All looks good. >>> >>> Reviewed. >>> >>> Jon >>> >>>> Thanks, >>>> Thomas >>>> >>>>> On 4/18/14 6:29 AM, Thomas Schatzl wrote: >>>>>> Hi all, >>>>>> >>>>>> can I have reviews for this change? After moving >>>>>> G1ParScanThreadState, >>>>>> this change cleans up visibility, making a whole lot of stuff private. >>>>>> >>>>>> CR: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8035401 >>>>>> >>>>>> Webrev: >>>>>> http://cr.openjdk.java.net/~tschatzl/8035401/webrev/ >>>>>> >>>>>> Testing: >>>>>> perf testing indicated no changes, jprt >>>>>> >>>>>> Thanks, >>>>>> Thomas >>>>>> >>>>>> > From mikael.gerdin at oracle.com Thu Jun 26 11:58:59 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 26 Jun 2014 13:58:59 +0200 Subject: RFR (M/L): JDK-8035400: Move G1ParScanThreadState into its own files In-Reply-To: <53AC047C.4090101@oracle.com> References: <1397829156.2717.24.camel@cirrus> <1401187161.2682.13.camel@cirrus> <53AC047C.4090101@oracle.com> Message-ID: <4767727.8Tv2L2B5Jo@mgerdin03> Hi, On Thursday 26 June 2014 13.31.08 Bengt Rutisson wrote: > Hi Thomas, > > I think this looks good now. +1 Looks good /Mikael > > Thanks, > Bengt > > On 2014-05-27 12:39, Thomas Schatzl wrote: > > Hi Bengt, > > > > thanks for the review and sorry for the long delay... > > > > On Wed, 2014-05-07 at 13:18 +0200, Bengt Rutisson wrote: > >> Hi Thomas, > >> > >> On 2014-04-18 15:52, Thomas Schatzl wrote: > >>> Hi all, > >>> > >>> can I have reviews for the above change? It moves > >>> G1ParScanThreadState > >>> > >>> into G1ParScanThreadState*pp files. > >>> > >>> The only changes are limited to: > >>> - adding a "#pragma warning( disable:4355 ) // 'this' : used in base > >>> > >>> member initializer list" to shut visual C up about the problem (which > >>> should be cleaned up at some point - I found an issue that slipped > >>> through because of that, JDK-8040977) > >> > >> As I commented in the review of JDK-8040977 I would prefer to make the > >> change to not pass this as a parameter to the constructor. That would > >> also remove the need for disabling the warning. Maybe in that case base > >> this review on top of the fix for JDK-8040977 rather than the other way > >> around? > > > > I do not see an advantage either way. Since this would require me make > > significant changes to all patches, I would prefer keeping the order > > this way if you do not mind. > > > > In the latest JDK-8040977 I removed the need for the pragma as > > requested. > > > >>> - added necessary include file references; I hope the AIX guys can > >>> > >>> compile that change to avoid troubles. It compiles fine with all Oracle > >>> supported archs. 
> >> > >> You also moved the definition of the destructor of G1ParScanThreadState > >> from the hpp file to the cpp file. Makes sense, but was not strictly > >> needed for this change, right? > > > > Fixed that. This has been an oversight when separating out the changes. > > > >>> There will be another CR for fixing up visibility and cleaning up stuff > >>> a little. > >>> > >>> CR: > >>> https://bugs.openjdk.java.net/browse/JDK-8035400 > >>> > >>> Webrev: > >>> http://cr.openjdk.java.net/~tschatzl/8035400/webrev/ > >> > >> It is a bit hard to review moved code. But except for the comment > >> regarding JDK-8040977 above I think it looks good. > >> > >> I think you can clean up the includes a bit more if you have time. Seems > >> like these includes in g1CollectedHeap.cpp are for example not needed > >> anymore: > >> > >> #include "oops/oop.inline.hpp" > >> #include "oops/oop.pcgc.inline.hpp" > > > > I tried to clean up the includes a little more. However you cannot move > > these particular includes because they are still needed for evacuation > > failure handling. > > > > I also rebased the change on the current hotspot jdk9 gc repo. > > > > Diff webrev at > > http://cr.openjdk.java.net/~tschatzl/8035400/webrev.1_to_2/ > > > > Complete webrev at > > http://cr.openjdk.java.net/~tschatzl/8035400/webrev.2/ > > > > Thanks, > > > > Thomas From thomas.schatzl at oracle.com Thu Jun 26 11:59:52 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 26 Jun 2014 13:59:52 +0200 Subject: RFR (M/L): JDK-8035400: Move G1ParScanThreadState into its own files In-Reply-To: <4767727.8Tv2L2B5Jo@mgerdin03> References: <1397829156.2717.24.camel@cirrus> <1401187161.2682.13.camel@cirrus> <53AC047C.4090101@oracle.com> <4767727.8Tv2L2B5Jo@mgerdin03> Message-ID: <1403783992.2656.37.camel@cirrus> Hi Bengt, Mikael, On Thu, 2014-06-26 at 13:58 +0200, Mikael Gerdin wrote: > Hi, > > On Thursday 26 June 2014 13.31.08 Bengt Rutisson wrote: > > Hi Thomas, > > > > I think this looks good now. > > +1 > Looks good > /Mikael thanks for the reviews. Thomas From bengt.rutisson at oracle.com Thu Jun 26 12:07:20 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 26 Jun 2014 14:07:20 +0200 Subject: RFR: Backport of: JDK-8043607: Add a GC id as a log decoration similar to PrintGCTimeStamps Message-ID: <53AC0CF8.5090903@oracle.com> Hi all, Can I have a couple of reviews for this backport? The fix for JDK-8043607 applied cleanly to the 8 update repository, but as I mentioned in the original review I would like to change the default value for the logging flag when I backport this. Here's the webrev for the backport to 8u: http://cr.openjdk.java.net/~brutisso/8043607/webrev.8u.00/ The only change compared to what was pushed to JDK 9 is that in globals.hpp the default value for PrintGCID is different. JDK8: http://cr.openjdk.java.net/~brutisso/8043607/webrev.8u.00/src/share/vm/runtime/globals.hpp.udiff.html JDK9: http://hg.openjdk.java.net/jdk9/hs-gc/hotspot/rev/dabee7bb3a8f#l29.7 Thanks, Bengt From mikael.gerdin at oracle.com Thu Jun 26 12:13:25 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 26 Jun 2014 14:13:25 +0200 Subject: RFR (M): JDK-8035401: Fix visibility of G1ParScanThreadState members In-Reply-To: <53AC0755.3050300@oracle.com> References: <1397827791.2717.23.camel@cirrus> <1401187165.2682.14.camel@cirrus> <53AC0755.3050300@oracle.com> Message-ID: <4783916.GRNXiRhIx2@mgerdin03> Bengt, On Thursday 26 June 2014 13.43.17 Bengt Rutisson wrote: > Hi Thomas, > > Looks good. 
One question though. In g1ParScanThreadState.hpp you have:
>
>     131   inline HeapWord* allocate(GCAllocPurpose purpose, size_t word_sz);
>     132   inline HeapWord* allocate_slow(GCAllocPurpose purpose, size_t word_sz);
>     133   inline void undo_allocation(GCAllocPurpose purpose, HeapWord* obj, size_t word_sz);
>
> But the methods are implemented in g1ParScanThreadState.cpp. Shouldn't
> the implementation be placed in g1ParScanThreadState.inline.hpp?

These are now private methods and are only ever called from the .cpp file,
so they don't need to be visible outside it. The inline keyword is sort of
a hint to the compiler to inline them into the caller; I'm not sure it's
strictly needed, though.

I think the change in webrev.2 looks good.

/Mikael

>
> Thanks,
> Bengt
>
> On 2014-05-27 12:39, Thomas Schatzl wrote:
> > Hi Bengt,
> >
> > thanks for the review.
> >
> > On Wed, 2014-04-30 at 13:35 +0200, Bengt Rutisson wrote:
> >> Hi Thomas,
> >>
> >> Over all this looks good to me too.
> >>
> >> One question for g1ParScanThreadState.cpp. You have marked the
> >> deal_with_reference() methods as "inline" even though they are in the
> >> same cpp file. Does that have any effect?
> >
> > I moved them to the inline files.
> >
> >>     394
> >>     395 template <class T> inline void
> >>     G1ParScanThreadState::deal_with_reference(T* ref_to_scan) {
> >>     396   if (!has_partial_array_mask(ref_to_scan)) {
> >>     397     // Note: we can use "raw" versions of "region_containing" because
> >>     398     // "obj_to_scan" is definitely in the heap, and is not in a
> >>     399     // humongous region.
> >>     400     HeapRegion* r = _g1h->heap_region_containing_raw(ref_to_scan);
> >>     401     do_oop_evac(ref_to_scan, r);
> >>     402   } else {
> >>     403     do_oop_partial_array((oop*)ref_to_scan);
> >>     404   }
> >>     405 }
> >>     406
> >>     407 inline void G1ParScanThreadState::deal_with_reference(StarTask ref) {
> >>     408   assert(verify_task(ref), "sanity");
> >>     409   if (ref.is_narrow()) {
> >>     410     deal_with_reference((narrowOop*)ref);
> >>     411   } else {
> >>     412     deal_with_reference((oop*)ref);
> >>     413   }
> >>     414 }
> >>
> >> Also, I think that you have to declare methods that should be inlined
> >> before the place where they are being used on some platforms (Solaris).
> >> In this case I think it means that they should be declared before
> >> steal_and_trim_queue().
> >
> > Moved them to the inline file.
> >
> >> Personally I also find the new deal_with_reference(StarTask ref) a
> >> little confusing. With that method and the two methods generated by
> >> deal_with_reference(T* ref_to_scan) I get kind of unsure which method
> >> will be executed by a call like:
> >>     156   StarTask stolen_task;
> >>     157   while (task_queues->steal(queue_num(), hash_seed(), stolen_task)) {
> >>     158     assert(verify_task(stolen_task), "sanity");
> >>     159     deal_with_reference(stolen_task);
> >>
> >> All three deal_with_reference() methods are potential matches. I assume
> >> the compiler prefers deal_with_reference(StarTask ref), but it makes
> >> me unsure when I read the code.
> >
> > Changed to dispatch_reference().
> >
> >> One minor nit:
> >>
> >> g1ParScanThreadState.hpp
> >> You have changed the indentation of private/protected/public keywords to
> >> have one space indentation. That's fine as I think that is the standard,
> >> but since the whole file used no space indentation I would also have
> >> been fine with leaving that. However, the last "public" keyword still
> >> has no space before it. Can you indent that too?
> >
> >>     218  public:
> > Fixed.
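The pattern under discussion, in miniature - a sketch with made-up names, not code from the webrev: an inline member function may be declared in the header and defined in the .cpp, as long as every call site lives in that same translation unit.

    // g1Foo.hpp (sketch)
    class G1Foo {
     public:
      G1Foo() : _count(0) {}
      void tick();
     private:
      inline size_t bump(size_t n);   // defined in g1Foo.cpp; only called there
      size_t _count;
    };

    // g1Foo.cpp (sketch)
    inline size_t G1Foo::bump(size_t n) {
      _count += n;          // legal: the single TU that calls bump() also
      return _count;        // sees its definition
    }

    void G1Foo::tick() {
      bump(1);
    }

If a caller in another translation unit ever appears, the definition has to move to the .inline.hpp; dropping the inline keyword altogether, as Bengt suggests below, sidesteps the question and lets the compiler decide.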
Also removed superfluous newlines at the end of files. > > > > Also re-checked again for performance regressions, none found. > > > > Diff to last revision > > http://cr.openjdk.java.net/~tschatzl/8035401/webrev.1_to_2/ > > > > Full diff: > > http://cr.openjdk.java.net/~tschatzl/8035401/webrev.2/ > > > > (based on 8035400) > > > > Thanks, > > Thomas > > > >> Thanks, > >> Bengt > >> > >> On 2014-04-22 23:31, Jon Masamitsu wrote: > >>> On 4/22/14 1:36 PM, Thomas Schatzl wrote: > >>>> Hi Jon, > >>>> > >>>> On Tue, 2014-04-22 at 13:07 -0700, Jon Masamitsu wrote: > >>>>> Thomas, > >>>>> > >>>>> What I see in these changes are > >>>>> > >>>>> 1) no semantic changes > >>>> > >>>> No. > >>>> > >>>>> 2) some methods in .hpp files moved to .cpp files > >>>> > >>>> Yes, because they were only referenced by the cpp file, so I thought it > >>>> would be good to move them there. They will be inlined as needed anyway > >>>> (and I think for some of them they were never inlined due to their > >>>> size). > >>>> > >>>> I will do some more runs with the inline's added again. > >>>> > >>>>> 3) creation of steal_and_trim_queue() with definition in > >>>>> a .cpp file (I may have missed additional such new > >>>>> methods) > >>>> > >>>> There are none except queue_is_empty(), see below. > >>>> > >>>>> 4) change in visibility as the CR says > >>>> > >>>> That's the main change. > >>>> > >>>>> 5) no performance regressions as stated in your RFR > >>>> > >>>> No. Checked the results for the usual benchmarks (specjvm2008, > >>>> specjbb05/2013) again right now, and there are no significant > >>>> differences in the scores (on x64 and sparc), and for specjbb05/2013 > >>>> the > >>>> average gc pause time, and the object copy time (assuming that this is > >>>> the part that will be affected most) stay the same as in the baseline. > >>>> > >>>>> If that's what constitutes the change, looks good. > >>>> > >>>> Thanks. > >>>> > >>>>> Reviewed. > >>>>> > >>>>> If there is something more significant that I have > >>>>> overlooked, please point me at it and I'll look again. > >>>> > >>>> There is not. Sorry, I should have pointed out the changes in more > >>>> detail instead of you making guesses. > >>>> > >>>> Additional minor changes: > >>>> > >>>> - G1ParScanThreadState accesses members directly instead of using > >>>> getters (e.g. _refs instead of refs()). > >>>> > >>>> - fixed some newlines in method declarations, removing newlines > >>>> > >>>> - removed refs() to avoid direct access from outside, and adding a new > >>>> method queue_is_empty() (only used in asserts as refs()->is_empty(), > >>>> and > >>>> I did not want to expose refs() just for the asserts). > >>> > >>> All looks good. > >>> > >>> Reviewed. > >>> > >>> Jon > >>> > >>>> Thanks, > >>>> > >>>> Thomas > >>>>> > >>>>> On 4/18/14 6:29 AM, Thomas Schatzl wrote: > >>>>>> Hi all, > >>>>>> > >>>>>> can I have reviews for this change? After moving > >>>>>> > >>>>>> G1ParScanThreadState, > >>>>>> this change cleans up visibility, making a whole lot of stuff > >>>>>> private. 
> >>>>>> CR:
> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8035401
> >>>>>>
> >>>>>> Webrev:
> >>>>>> http://cr.openjdk.java.net/~tschatzl/8035401/webrev/
> >>>>>>
> >>>>>> Testing:
> >>>>>> perf testing indicated no changes, jprt
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Thomas

From thomas.schatzl at oracle.com  Thu Jun 26 12:19:38 2014
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 26 Jun 2014 14:19:38 +0200
Subject: RFR (M): JDK-8035401: Fix visibility of G1ParScanThreadState members
In-Reply-To: <53AC0755.3050300@oracle.com>
References: <1397827791.2717.23.camel@cirrus> <5356CC15.7070507@oracle.com> <1398198974.2532.32.camel@cirrus> <5356DFB3.7050400@oracle.com> <5360E01E.7020906@oracle.com> <1401187165.2682.14.camel@cirrus> <53AC0755.3050300@oracle.com>
Message-ID: <1403785178.2656.40.camel@cirrus>

Hi Bengt,

  thanks for looking at this...

On Thu, 2014-06-26 at 13:43 +0200, Bengt Rutisson wrote:
> Hi Thomas,
>
> Looks good. One question though. In g1ParScanThreadState.hpp you have:
>
>     131   inline HeapWord* allocate(GCAllocPurpose purpose, size_t word_sz);
>     132   inline HeapWord* allocate_slow(GCAllocPurpose purpose, size_t word_sz);
>     133   inline void undo_allocation(GCAllocPurpose purpose, HeapWord* obj, size_t word_sz);
>
> But the methods are implemented in g1ParScanThreadState.cpp. Shouldn't
> the implementation be placed in g1ParScanThreadState.inline.hpp?
>
if an inlined method is only used in one place, at least G1 code often
places that method into the cpp file directly. It is not required to
make them publicly visible.

I do not have a preference either. What do you think?

Thomas

From bengt.rutisson at oracle.com  Thu Jun 26 12:34:24 2014
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Thu, 26 Jun 2014 14:34:24 +0200
Subject: RFR (M): JDK-8035401: Fix visibility of G1ParScanThreadState members
In-Reply-To: <1403785178.2656.40.camel@cirrus>
References: <1397827791.2717.23.camel@cirrus> <5356CC15.7070507@oracle.com> <1398198974.2532.32.camel@cirrus> <5356DFB3.7050400@oracle.com> <5360E01E.7020906@oracle.com> <1401187165.2682.14.camel@cirrus> <53AC0755.3050300@oracle.com> <1403785178.2656.40.camel@cirrus>
Message-ID: <53AC1350.4070903@oracle.com>

On 2014-06-26 14:19, Thomas Schatzl wrote:
> Hi Bengt,
>
> thanks for looking at this...
>
> On Thu, 2014-06-26 at 13:43 +0200, Bengt Rutisson wrote:
>> Hi Thomas,
>>
>> Looks good. One question though. In g1ParScanThreadState.hpp you have:
>>
>>     131   inline HeapWord* allocate(GCAllocPurpose purpose, size_t word_sz);
>>     132   inline HeapWord* allocate_slow(GCAllocPurpose purpose, size_t word_sz);
>>     133   inline void undo_allocation(GCAllocPurpose purpose, HeapWord* obj, size_t word_sz);
>>
>> But the methods are implemented in g1ParScanThreadState.cpp. Shouldn't
>> the implementation be placed in g1ParScanThreadState.inline.hpp?
>>
> if an inlined method is only used in one place, at least G1 code often
> places that method into the cpp file directly. It is not required to
> make them publicly visible.
>
> I do not have a preference either. What do you think?

I think I would prefer to remove the inline keyword in that case. The
compiler probably does a better job deciding whether or not it is a good
idea to inline these methods.

Either way is fine with me. Reviewed.
:) Bengt > > Thomas > > From mikael.gerdin at oracle.com Thu Jun 26 12:39:19 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 26 Jun 2014 14:39:19 +0200 Subject: RFR (XS): JDK-8040977: G1 crashes when run with -XX:-G1DeferredRSUpdate In-Reply-To: <53AC03F1.1090309@oracle.com> References: <1398171391.3002.24.camel@cirrus> <1401187156.2682.12.camel@cirrus> <53AC03F1.1090309@oracle.com> Message-ID: <6725819.L3vCke9nVL@mgerdin03> Thomas, On Thursday 26 June 2014 13.28.49 Bengt Rutisson wrote: > Hi Thomas, > > Sorry for the very late reply. > > I think the dependency between G1ParScanClosure is still very awkward, > but I think your change is a step in the right direction. Thanks for > fixing this. I agree with Bengt. The change seems reasonable. /Mikael > > Reviewed. > Bengt > > On 2014-05-27 12:39, Thomas Schatzl wrote: > > Hi Bengt, > > > > thanks for the review. > > > > On Wed, 2014-04-30 at 12:50 +0200, Bengt Rutisson wrote: > >> Hi Thomas, > >> > >> On 2014-04-22 14:56, Thomas Schatzl wrote: > >>> Hi all, > >>> > >>> can I have reviews for this change? It fixes wrong order of > >>> > >>> declaration of members of G1ParScanThreadState that causes crashes when > >>> G1DeferredRSUpdate is disabled. > >>> > >>> The change is based on the changes for 8035400 and8035401 posted > >>> recently. > >>> > >>> CR: > >>> https://bugs.openjdk.java.net/browse/JDK-8040977 > >>> > >>> Webrev: > >>> http://cr.openjdk.java.net/~tschatzl/8040977/webrev/ > >> > >> I realize that this fixes the code but I would really appreciate a more > >> stable way of handling the dependencies. > >> > >> As it it now we end up calling methods on a G1ParScanThreadState > >> instance while we are setting it up. This seems broken to me and will > >> probably lead to similar initialization order issues again. Best would > >> be to not pass "this" to the constructor of G1ParScanClosure and instead > >> manage the circular dependency between G1ParScanClosure and > >> G1ParScanThreadState more explicitly after they have both been properly > >> set up. > >> > >> Second best would be to at least pass the worker id/queue num as a > >> separate parameter to avoid having to call methods on an uninitialized > >> object. > > > > I fixed this implementing the former idea. Also added some > > > > New webrev at > > http://cr.openjdk.java.net/~tschatzl/8040977/webrev.1/ > > > > (Sorry, I already had merged the changes before making a diff webrev - > > however, most changes in the VM code have been redone anyway. The test > > case stayed the same). > > > > Thanks, > > > > Thomas From mikael.gerdin at oracle.com Thu Jun 26 14:16:36 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 26 Jun 2014 16:16:36 +0200 Subject: RFR: 8047326: Add a version of CompiledIC_at that doesn't create a new RelocIterator In-Reply-To: <53A3038C.9020004@oracle.com> References: <53A2DB59.9050605@oracle.com> <53A3038C.9020004@oracle.com> Message-ID: <5113836.YgHIQ7lM1y@mgerdin03> Hi, On Thursday 19 June 2014 17.36.44 Stefan Karlsson wrote: > This was meant for the hotspot-dev list. BCC:ing hotspot-gc-dev. > > On 2014-06-19 14:45, Stefan Karlsson wrote: > > Hi all, > > > > I have a patch that we have been using in the G1 Class Unloading > > project to lower the remark times. This changes Compiler code, so I > > would like to get feedback from the Compiler team. > > > > http://cr.openjdk.java.net/~stefank/8047362/webrev.00/ The change looks good. 
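To make the pattern concrete before the quoted summary below, a sketch of the before/after call shape (names taken from the quoted code; abridged, not the actual webrev):

    // Old shape: CompiledIC_at(reloc) internally builds a second
    // RelocIterator per inline cache, just to re-find the call site.
    RelocIterator iter(nm);
    while (iter.next()) {
      if (iter.type() == relocInfo::virtual_call_type) {
        CompiledIC* ic = CompiledIC_at(iter.reloc());   // hidden re-iteration
        // ... inspect or clean ic ...
      }
    }

    // New shape: hand the live iterator over; no second walk of the
    // relocation info is needed.
    RelocIterator iter2(nm);
    while (iter2.next()) {
      if (iter2.type() == relocInfo::virtual_call_type) {
        CompiledIC* ic = CompiledIC_at(&iter2);
        // ... inspect or clean ic ...
      }
    }

The timing numbers quoted further down (1.98 s for the new style against 9.92 s for the old, over the same walk) suggest most of the old cost was exactly that hidden re-iteration.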
I had an offline discussion with Stefan about this and we think that it
would actually suffice to pass down the Relocation* since it appears to
contain all the information needed to create the CompiledIC objects.
However, in the interest of moving forward with changes built on top of
this, we will look at that for a future cleanup.

/Mikael

> > https://bugs.openjdk.java.net/browse/JDK-8047362
> >
> > The patch builds upon the patch in:
> > http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-June/014358.html
> >
> > Summary from the bug report:
> > ---
> > Creation of RelocIterators shows up high in profiles of the remark
> > phase, in the G1 Class Unloading project.
> >
> > There's a pattern in the nmethod/codecache code to create a
> > RelocIterator and then materialize a CompiledIC:
> >
> >   RelocIterator iter(this, low_boundary);
> >   while(iter.next()) {
> >     if (iter.type() == relocInfo::virtual_call_type) {
> >       CompiledIC *ic = CompiledIC_at(iter.reloc());
> >
> > CompiledIC_at is implemented as:
> >   new CompiledIC(call_site->code(), nativeCall_at(call_site->addr()));
> >
> > And one of the first things CompiledIC::CompiledIC(const nmethod* nm,
> > NativeCall* call) does is to create a new RelocIterator:
> >   ...
> >   address ic_call = call->instruction_address();
> >   ...
> >   RelocIterator iter(nm, ic_call, ic_call+1);
> >   bool ret = iter.next();
> >   assert(ret == true, "relocInfo must exist at this address");
> >   assert(iter.addr() == ic_call, "must find ic_call");
> >
> > I would like to propose that we pass down the RelocIterator that we
> > already have, instead of creating a new one.
> > ---
> >
> > I've previously received feedback that this seems like a reasonable
> > thing to do, but that the parameter to the new CompiledIC_at should
> > take a const RelocIterator* instead of RelocIterator*. I couldn't do
> > that without changing a significant amount of Compiler code, so I have
> > left it out for now. Any opinions on how to handle that?
> >
> > To give an idea of the performance difference, I temporarily added the
> > following code:
> >
> >   void CodeCache::iterate_through_CIs(int style) {
> >     int count;
> >     FOR_ALL_ALIVE_NMETHODS(nm) {
> >       RelocIterator iter(nm);
> >       while(iter.next()) {
> >         if (iter.type() == relocInfo::virtual_call_type ||
> >             iter.type() == relocInfo::opt_virtual_call_type) {
> >           if (style > 0) {
> >             CompiledIC *ic = style == 1 ? CompiledIC_at(&iter)
> >                                         : CompiledIC_at(iter.reloc());
> >             if (ic->ic_destination() == (address)0xdeadb000) {
> >               gclog_or_tty->print_cr("ShouldNotReachHere");
> >             }
> >           }
> >         }
> >       }
> >     }
> >   }
> >
> > and then measured how long it took to execute
> > iterate_through_CIs(style) 1000 times with style == {0, 1, 2}.
> >
> > The results are:
> >   iterate_through_CIs(0): 1.210833 s  // No CompiledICs created
> >   iterate_through_CIs(1): 1.976557 s  // New style
> >   iterate_through_CIs(2): 9.924209 s  // Old style
> >
> > Testing:
> > A similar version has been used and thoroughly tested together
> > with the other G1 Class Unloading changes. This exact version has so
> > far only been tested with Kitchensink and SpecJVM2008
> > compiler.compiler. What test lists would be appropriate to test this
> > with?
> > > > > > thanks, > > StefanK From stefan.karlsson at oracle.com Thu Jun 26 14:10:05 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 26 Jun 2014 16:10:05 +0200 Subject: RFR: 8047326: Add a version of CompiledIC_at that doesn't create a new RelocIterator In-Reply-To: <5113836.YgHIQ7lM1y@mgerdin03> References: <53A2DB59.9050605@oracle.com> <53A3038C.9020004@oracle.com> <5113836.YgHIQ7lM1y@mgerdin03> Message-ID: <53AC29BD.4070004@oracle.com> On 2014-06-26 16:16, Mikael Gerdin wrote: > Hi, > > On Thursday 19 June 2014 17.36.44 Stefan Karlsson wrote: >> This was meant for the hotspot-dev list. BCC:ing hotspot-gc-dev. >> >> On 2014-06-19 14:45, Stefan Karlsson wrote: >>> Hi all, >>> >>> I have a patch that we have been using in the G1 Class Unloading >>> project to lower the remark times. This changes Compiler code, so I >>> would like to get feedback from the Compiler team. >>> >>> http://cr.openjdk.java.net/~stefank/8047362/webrev.00/ > The change looks good. > > I had an offline discussion with Steafan about this and we think that it would > actually suffice to pass down the Relocation* since it appears to contain all > the information needed to create the CompiledIC objects. > However in the interest of moving forward with changes built on top of this we > will look at that for a future cleanup. Thanks. StefanK > > /Mikael > >>> https://bugs.openjdk.java.net/browse/JDK-8047362 >>> >>> The patch builds upon the patch in: >>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-June/014358.html >>> >>> >>> Summary from the bug report: >>> --- >>> Creation of RelocIterators show up high in profiles of the remark >>> phase, in the G1 Class Unloading project. >>> >>> There's a pattern in the nmethod/codecache code to create a >>> >>> RelocIterator and then materialize a CompiledIC: >>> RelocIterator iter(this, low_boundary); >>> while(iter.next()) { >>> >>> if (iter.type() == relocInfo::virtual_call_type) { >>> >>> CompiledIC *ic = CompiledIC_at(iter.reloc()); >>> >>> CompiledIC_at is implemented as: >>> new CompiledIC(call_site->code(), nativeCall_at(call_site->addr())); >>> >>> And one of the first thing CompiledIC::CompiledIC(const nmethod* nm, >>> NativeCall* call) does is to create a new RelocIterator: >>> ... >>> address ic_call = call->instruction_address(); >>> ... >>> >>> RelocIterator iter(nm, ic_call, ic_call+1); >>> bool ret = iter.next(); >>> assert(ret == true, "relocInfo must exist at this address"); >>> assert(iter.addr() == ic_call, "must find ic_call"); >>> >>> I would like to propose that we pass down the RelocIterator that we >>> already have, instead of creating a new. >>> --- >>> >>> >>> I've previously received feedback that this seems like reasonable >>> thing to do, but that the parameter to the new CompileIC_at should >>> take a const RelocIterator* instead of RelocIterator*. I couldn't do >>> that without changing a significant amount of Compiler code, so I have >>> left it out for now. Any opinions on how to handle that? >>> >>> >>> To give an idea of the performance difference, I temporarily added the >>> following code: >>> void CodeCache::iterate_through_CIs(int style) { >>> >>> int count; >>> FOR_ALL_ALIVE_NMETHODS(nm) { >>> >>> RelocIterator iter(nm); >>> while(iter.next()) { >>> >>> if (iter.type() == relocInfo::virtual_call_type || >>> >>> iter.type() == relocInfo::opt_virtual_call_type) { >>> >>> if (style > 0) { >>> >>> CompiledIC *ic = style == 1 ? 
CompiledIC_at(&iter) : >>> CompiledIC_at(iter.reloc()); >>> >>> if (ic->ic_destination() == (address)0xdeadb000) { >>> >>> gclog_or_tty->print_cr("ShouldNotReachHere"); >>> >>> } >>> >>> } >>> >>> } >>> >>> } >>> >>> } >>> >>> } >>> >>> and then measured how long time it took to execute >>> iterate_through_CIs(style) 1000 times with style == {0, 1, 2}. >>> >>> The results are: >>> iterate_through_CIs(0): 1.210833 s // No CompiledICs created >>> iterate_through_CIs(1): 1.976557 s // New style >>> iterate_through_CIs(2): 9.924209 s // Old style >>> >>> Testing: >>> A similar version has been used and thoroughly been tested together >>> >>> with the other G1 Class Unloading changes. This exact version has so >>> far only been tested with Kitchensink and SpecJVM2008 >>> compiler.compiler. What test lists would be appropriate to test this >>> with? >>> >>> >>> thanks, >>> StefanK From mikael.gerdin at oracle.com Thu Jun 26 14:18:58 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 26 Jun 2014 16:18:58 +0200 Subject: RFR: 8047326: Add a version of CompiledIC_at that doesn't create a new RelocIterator In-Reply-To: <5113836.YgHIQ7lM1y@mgerdin03> References: <53A2DB59.9050605@oracle.com> <53A3038C.9020004@oracle.com> <5113836.YgHIQ7lM1y@mgerdin03> Message-ID: <1898495.XQJiaMVt6f@mgerdin03> I replied to the wrong list, sorry. Forwarding my review to hotspot-dev. /Mikael On Thursday 26 June 2014 16.16.36 Mikael Gerdin wrote: > Hi, > > On Thursday 19 June 2014 17.36.44 Stefan Karlsson wrote: > > This was meant for the hotspot-dev list. BCC:ing hotspot-gc-dev. > > > > On 2014-06-19 14:45, Stefan Karlsson wrote: > > > Hi all, > > > > > > I have a patch that we have been using in the G1 Class Unloading > > > project to lower the remark times. This changes Compiler code, so I > > > would like to get feedback from the Compiler team. > > > > > > http://cr.openjdk.java.net/~stefank/8047362/webrev.00/ > > The change looks good. > > I had an offline discussion with Steafan about this and we think that it > would actually suffice to pass down the Relocation* since it appears to > contain all the information needed to create the CompiledIC objects. > However in the interest of moving forward with changes built on top of this > we will look at that for a future cleanup. > > /Mikael > > > > https://bugs.openjdk.java.net/browse/JDK-8047362 > > > > > > The patch builds upon the patch in: > > > http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-June/014358.html > > > > > > > > > Summary from the bug report: > > > --- > > > Creation of RelocIterators show up high in profiles of the remark > > > phase, in the G1 Class Unloading project. > > > > > > There's a pattern in the nmethod/codecache code to create a > > > > > > RelocIterator and then materialize a CompiledIC: > > > RelocIterator iter(this, low_boundary); > > > while(iter.next()) { > > > > > > if (iter.type() == relocInfo::virtual_call_type) { > > > > > > CompiledIC *ic = CompiledIC_at(iter.reloc()); > > > > > > CompiledIC_at is implemented as: > > > new CompiledIC(call_site->code(), nativeCall_at(call_site->addr())); > > > > > > And one of the first thing CompiledIC::CompiledIC(const nmethod* nm, > > > NativeCall* call) does is to create a new RelocIterator: > > > ... > > > address ic_call = call->instruction_address(); > > > ... 
> > > > > > RelocIterator iter(nm, ic_call, ic_call+1); > > > bool ret = iter.next(); > > > assert(ret == true, "relocInfo must exist at this address"); > > > assert(iter.addr() == ic_call, "must find ic_call"); > > > > > > I would like to propose that we pass down the RelocIterator that we > > > already have, instead of creating a new. > > > --- > > > > > > > > > I've previously received feedback that this seems like reasonable > > > thing to do, but that the parameter to the new CompileIC_at should > > > take a const RelocIterator* instead of RelocIterator*. I couldn't do > > > that without changing a significant amount of Compiler code, so I have > > > left it out for now. Any opinions on how to handle that? > > > > > > > > > To give an idea of the performance difference, I temporarily added the > > > following code: > > > void CodeCache::iterate_through_CIs(int style) { > > > > > > int count; > > > FOR_ALL_ALIVE_NMETHODS(nm) { > > > > > > RelocIterator iter(nm); > > > while(iter.next()) { > > > > > > if (iter.type() == relocInfo::virtual_call_type || > > > > > > iter.type() == relocInfo::opt_virtual_call_type) { > > > > > > if (style > 0) { > > > > > > CompiledIC *ic = style == 1 ? CompiledIC_at(&iter) : > > > CompiledIC_at(iter.reloc()); > > > > > > if (ic->ic_destination() == (address)0xdeadb000) { > > > > > > gclog_or_tty->print_cr("ShouldNotReachHere"); > > > > > > } > > > > > > } > > > > > > } > > > > > > } > > > > > > } > > > > > > } > > > > > > and then measured how long time it took to execute > > > iterate_through_CIs(style) 1000 times with style == {0, 1, 2}. > > > > > > The results are: > > > iterate_through_CIs(0): 1.210833 s // No CompiledICs created > > > iterate_through_CIs(1): 1.976557 s // New style > > > iterate_through_CIs(2): 9.924209 s // Old style > > > > > > Testing: > > > A similar version has been used and thoroughly been tested together > > > > > > with the other G1 Class Unloading changes. This exact version has so > > > far only been tested with Kitchensink and SpecJVM2008 > > > compiler.compiler. What test lists would be appropriate to test this > > > with? > > > > > > > > > thanks, > > > StefanK From andreas.sjoberg at oracle.com Thu Jun 26 14:24:23 2014 From: andreas.sjoberg at oracle.com (=?ISO-8859-1?Q?Andreas_Sj=F6berg?=) Date: Thu, 26 Jun 2014 16:24:23 +0200 Subject: RFR JDK-8047328: Change typedef CardIdx_t from int to uint16_t Message-ID: <53AC2D17.60507@oracle.com> Hi all, could I please have reviews for this patch that changes the typedef CardIdx_t from int to uint16_t. The motivation behind this patch is to reduce the memory footprint caused by the G1 remembered sets. This adds a _next_null field to the SparsePRTEntry class which keeps track of where the next possible insert could be. The other modifications are to make use of the fact that we know exactly how many cards are contained in the SparsePRTEntry, and no longer have to compare against a designated NullEntry value. webrev: http://cr.openjdk.java.net/~jwilhelm/8047328/webrev/ Thanks, Andreas From thomas.schatzl at oracle.com Thu Jun 26 14:38:49 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 26 Jun 2014 16:38:49 +0200 Subject: RFR(S): JDK-8047330: Remove unrolled card loops in G1 SparsePRTEntry In-Reply-To: <53AA7402.1040608@oracle.com> References: <53A2E53B.3050508@oracle.com> <2772002.dt1otWjIrg@mgerdin03> <53AA7402.1040608@oracle.com> Message-ID: <1403793529.2656.50.camel@cirrus> Hi, On Wed, 2014-06-25 at 09:02 +0200, Andreas Sj?berg wrote: > Hi! 
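To make the two SparsePRTEntry patches in this and the previous mail concrete, a rough sketch of the entry shape they touch. The field names follow the RFRs; the card count, types and method bodies are illustrative assumptions, not the webrev code:

    #include <stdint.h>
    #include <string.h>     // memcpy

    typedef uint16_t CardIdx_t;   // was int; a card index within one region
                                  // fits in 16 bits at the 512-byte card size

    class SparsePRTEntry {
     public:
      enum { CardsPerEntry = 4 };       // assumed value, for illustration

      void init() { _next_null = 0; }   // no per-slot NullEntry sentinel needed

      void add_card(CardIdx_t c) {
        if (_next_null < CardsPerEntry) _cards[_next_null++] = c;
      }

      // 8047330-style copy: one memcpy instead of a hand-unrolled card loop.
      void copy_cards(SparsePRTEntry* e) const {
        memcpy(e->_cards, _cards, _next_null * sizeof(CardIdx_t));
        e->_next_null = _next_null;
      }

     private:
      CardIdx_t _cards[CardsPerEntry];
      int _next_null;   // index of the next free slot; slots below it hold cards
    };

_next_null is what lets the code know exactly how many cards an entry contains, and halving the width of CardIdx_t is where the remembered-set footprint reduction comes from.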
>
> Following Mikael's review and some offline comments from Thomas I've
> made these changes in addition to removing the unrolled card loops:
>
> * removed the now unused define
> * added braces for the for-loop in SparsePRTEntry::init
> * changed the implementation of copy_cards to use memcpy
>
> New webrev: http://cr.openjdk.java.net/~jwilhelm/8047330/webrev.02/

looks okay to me.

Thomas

From jon.masamitsu at oracle.com  Thu Jun 26 17:46:55 2014
From: jon.masamitsu at oracle.com (Jon Masamitsu)
Date: Thu, 26 Jun 2014 10:46:55 -0700
Subject: Request for Review (vs) - 8034056: assert(_heap_alignment >= _space_alignment) failed: heap_alignment less than space_alignment
Message-ID: <53AC5C8F.4040203@oracle.com>

8034056: assert(_heap_alignment >= _space_alignment) failed: heap_alignment less than space_alignment

In the calculation of the heap alignment there was an exception for the
UseParallelGC collector for the case of large pages. The fix was to
remove that exception.

The fix was tested on a machine that exhibited the failure (thanks, Stefan J).

http://cr.openjdk.java.net/~jmasa/8034056/webrev.00/

https://bugs.openjdk.java.net/browse/JDK-8034056

Thanks.

Jon

From erik.helin at oracle.com  Fri Jun 27 10:55:55 2014
From: erik.helin at oracle.com (Erik Helin)
Date: Fri, 27 Jun 2014 12:55:55 +0200
Subject: FW: RFR(S): 8047812: Ensure ClassLoaderDataGraph::classes_unloading_do only delivers klasses from CLDs with non-reclaimed class loader oops
In-Reply-To: <2bf3b050-e3cf-43c2-ac64-61ebe0320061@default>
References: <2bf3b050-e3cf-43c2-ac64-61ebe0320061@default>
Message-ID: <18720451.n6H3skUL5s@ehelin-laptop>

Looks good, Reviewed.

Thanks,
Erik

On Monday 23 June 2014 08:26:51 AM Markus Grönlund wrote:
> Sending this to the Hotspot-GC-dev group as well.
>
> /Markus
>
> From: Markus Grönlund
> Sent: den 23 juni 2014 17:03
> To: hotspot-runtime-dev; serviceability-dev
> Subject: RFR(S): 8047812: Ensure ClassLoaderDataGraph::classes_unloading_do
> only delivers klasses from CLDs with non-reclaimed class loader oops
>
> Greetings,
>
> Kindly asking for reviews for the following change:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8047812
>
> Webrev: http://cr.openjdk.java.net/~mgronlun/8047812/webrev01
>
> Description:
>
> The "8038212: Method::is_valid_method() check has performance regression
> impact for stackwalking" changeset introduced a change in how the
> ClassLoaderDataGraph::_unloading list of ClassLoaderData's is purged.
>
> This change to the purging of the CLDs works the same as before for most
> GCs, but when using the CMS GC, SystemDictionary::do_unloading() is called
> twice with no explicit purge call in between. On the second call
> (post-sweep), we can now get stale class loader oops delivered as part of
> the Klass closure callbacks from the _unloading list. Again, this is
> because there is no explicit purge call in between these two entries to
> SystemDictionary::do_unloading() - and being CMS and concurrent, it is very
> hard to accommodate a timely and proper purge call here.
>
> The first do_unloading call comes after CMS concurrent marking, and the
> second comes from a Full GC triggered while sweeping the CMS heap.
>
> This fix ensures that the unloading purge mechanism works correctly also
> for the CMS collector, in that only CLDs with non-reclaimed class loader
> oops will deliver klasses from the _unloading list.
In addition, this will > ensure a single "logical" pass is achieved when iterating the unloading > list in-between purges (avoiding the processing of the same data twice). > > This fix is precipitated by nightly testing failures with CMS after the > introduction of 8038212: Method::is_valid_method() check has performance > regression impact for stackwalking" - for example > "nsk/sysdict/vm/stress/jck12a//sysdictj12a008" which is crashing because of > following up stale klass loader oop's from the > ClassLoaderDataGraph::_unloading list. > > > > Thanks > > Markus From tprintezis at twitter.com Fri Jun 27 13:00:11 2014 From: tprintezis at twitter.com (Tony Printezis) Date: Fri, 27 Jun 2014 09:00:11 -0400 Subject: The GCLocker blues... Message-ID: <53AD6ADB.10301@twitter.com> Hi all, (trying again from my Twitter address; moderator: feel free to disregard the original I accidentally sent from my personal address) We have recently noticed an interesting problem which seems to happen quite frequently under certain circumstances. Immediately after a young GC, a second one happens which seems unnecessary given that it starts with an empty or almost empty eden. Here's an example: {Heap before GC invocations=2 (full 0): par new generation total 471872K, used 433003K [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000) eden space 419456K, 100% used [0x00000007bae00000, 0x00000007d47a0000, 0x00000007d47a0000) from space 52416K, 25% used [0x00000007d47a0000, 0x00000007d54dacb0, 0x00000007d7ad0000) to space 52416K, 0% used [0x00000007d7ad0000, 0x00000007d7ad0000, 0x00000007dae00000) tenured generation total 524288K, used 0K [0x00000007dae00000, 0x00000007fae00000, 0x00000007fae00000) the space 524288K, 0% used [0x00000007dae00000, 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) compacting perm gen total 21248K, used 2549K [0x00000007fae00000, 0x00000007fc2c0000, 0x0000000800000000) the space 21248K, 12% used [0x00000007fae00000, 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) No shared spaces configured. 1.119: [GC (Allocation Failure)[ParNew: 433003K->15843K(471872K), 0.0103090 secs] 433003K->15843K(996160K), 0.0103320 secs] [Times: user=0.03 sys=0.00, real=0.01 secs] Heap after GC invocations=3 (full 0): par new generation total 471872K, used 15843K [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000) eden space 419456K, 0% used [0x00000007bae00000, 0x00000007bae00000, 0x00000007d47a0000) from space 52416K, 30% used [0x00000007d7ad0000, 0x00000007d8a48c88, 0x00000007dae00000) to space 52416K, 0% used [0x00000007d47a0000, 0x00000007d47a0000, 0x00000007d7ad0000) tenured generation total 524288K, used 0K [0x00000007dae00000, 0x00000007fae00000, 0x00000007fae00000) the space 524288K, 0% used [0x00000007dae00000, 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) compacting perm gen total 21248K, used 2549K [0x00000007fae00000, 0x00000007fc2c0000, 0x0000000800000000) the space 21248K, 12% used [0x00000007fae00000, 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) No shared spaces configured. 
} {Heap before GC invocations=3 (full 0): par new generation total 471872K, used 24002K [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000) eden space 419456K, 1% used [0x00000007bae00000, 0x00000007bb5f7c50, 0x00000007d47a0000) from space 52416K, 30% used [0x00000007d7ad0000, 0x00000007d8a48c88, 0x00000007dae00000) to space 52416K, 0% used [0x00000007d47a0000, 0x00000007d47a0000, 0x00000007d7ad0000) tenured generation total 524288K, used 0K [0x00000007dae00000, 0x00000007fae00000, 0x00000007fae00000) the space 524288K, 0% used [0x00000007dae00000, 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) compacting perm gen total 21248K, used 2549K [0x00000007fae00000, 0x00000007fc2c0000, 0x0000000800000000) the space 21248K, 12% used [0x00000007fae00000, 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) No shared spaces configured. 1.130: [GC (GCLocker Initiated GC)[ParNew: 24002K->12748K(471872K), 0.0123930 secs] 24002K->12748K(996160K), 0.0124130 secs] [Times: user=0.04 sys=0.01, real=0.01 secs] Heap after GC invocations=4 (full 0): par new generation total 471872K, used 12748K [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000) eden space 419456K, 0% used [0x00000007bae00000, 0x00000007bae00000, 0x00000007d47a0000) from space 52416K, 24% used [0x00000007d47a0000, 0x00000007d5413320, 0x00000007d7ad0000) to space 52416K, 0% used [0x00000007d7ad0000, 0x00000007d7ad0000, 0x00000007dae00000) tenured generation total 524288K, used 0K [0x00000007dae00000, 0x00000007fae00000, 0x00000007fae00000) the space 524288K, 0% used [0x00000007dae00000, 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) compacting perm gen total 21248K, used 2549K [0x00000007fae00000, 0x00000007fc2c0000, 0x0000000800000000) the space 21248K, 12% used [0x00000007fae00000, 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) No shared spaces configured. } Notice that: * The timestamp of the second GC (1.130) is almost equal to the timestamp of the first GC plus the duration of the first GC (1.119 + 0.0103320 = 1.1293). In this test young GCs normally happen at a frequency of one every 100ms-110ms or so. * The eden at the start of the second GC is almost empty (1% occupancy). We've also seen it very often with a completely empty eden. * (the big hint) The second GC is GClocker-initiated. This happens most often with ParNew (in some cases, more than 30% of the GCs are those unnecessary ones) but also happens with ParallelGC too but less frequently (maybe 1%-1.5% of the GCs are those unnecessary ones). I was unable to reproduce it with G1. I can reproduce it with with latest JDK 7, JDK 8, and also the latest hotspot-gc/hotspot workspace. Are you guys looking into this (and is there a CR?)? I have a small test I can reproduce it with and a diagnosis / proposed fix(es) if you're interested. Tony -- Tony Printezis | JVM/GC Engineer / VM Team | Twitter @TonyPrintezis tprintezis at twitter.com From jon.masamitsu at oracle.com Fri Jun 27 15:25:49 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Fri, 27 Jun 2014 08:25:49 -0700 Subject: The GCLocker blues... In-Reply-To: <53AD6ADB.10301@twitter.com> References: <53AD6ADB.10301@twitter.com> Message-ID: <53AD8CFD.4080903@oracle.com> Tony, I don't recall talk within the GC group about this type of problem. I didn't find a CR that relates to that behavior. If there is one, I don't think it is on anyone's radar. Can I infer that the problem does not occur in jdk6? Any theories on what's going on? 
Jon On 6/27/2014 6:00 AM, Tony Printezis wrote: > Hi all, > > (trying again from my Twitter address; moderator: feel free to > disregard the original I accidentally sent from my personal address) > > We have recently noticed an interesting problem which seems to happen > quite frequently under certain circumstances. Immediately after a > young GC, a second one happens which seems unnecessary given that it > starts with an empty or almost empty eden. Here's an example: > > {Heap before GC invocations=2 (full 0): > par new generation total 471872K, used 433003K [0x00000007bae00000, > 0x00000007dae00000, 0x00000007dae00000) > eden space 419456K, 100% used [0x00000007bae00000, > 0x00000007d47a0000, 0x00000007d47a0000) > from space 52416K, 25% used [0x00000007d47a0000, > 0x00000007d54dacb0, 0x00000007d7ad0000) > to space 52416K, 0% used [0x00000007d7ad0000, > 0x00000007d7ad0000, 0x00000007dae00000) > tenured generation total 524288K, used 0K [0x00000007dae00000, > 0x00000007fae00000, 0x00000007fae00000) > the space 524288K, 0% used [0x00000007dae00000, > 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) > compacting perm gen total 21248K, used 2549K [0x00000007fae00000, > 0x00000007fc2c0000, 0x0000000800000000) > the space 21248K, 12% used [0x00000007fae00000, > 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) > No shared spaces configured. > 1.119: [GC (Allocation Failure)[ParNew: 433003K->15843K(471872K), > 0.0103090 secs] 433003K->15843K(996160K), 0.0103320 secs] [Times: > user=0.03 sys=0.00, real=0.01 secs] > Heap after GC invocations=3 (full 0): > par new generation total 471872K, used 15843K [0x00000007bae00000, > 0x00000007dae00000, 0x00000007dae00000) > eden space 419456K, 0% used [0x00000007bae00000, > 0x00000007bae00000, 0x00000007d47a0000) > from space 52416K, 30% used [0x00000007d7ad0000, > 0x00000007d8a48c88, 0x00000007dae00000) > to space 52416K, 0% used [0x00000007d47a0000, > 0x00000007d47a0000, 0x00000007d7ad0000) > tenured generation total 524288K, used 0K [0x00000007dae00000, > 0x00000007fae00000, 0x00000007fae00000) > the space 524288K, 0% used [0x00000007dae00000, > 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) > compacting perm gen total 21248K, used 2549K [0x00000007fae00000, > 0x00000007fc2c0000, 0x0000000800000000) > the space 21248K, 12% used [0x00000007fae00000, > 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) > No shared spaces configured. > } > {Heap before GC invocations=3 (full 0): > par new generation total 471872K, used 24002K [0x00000007bae00000, > 0x00000007dae00000, 0x00000007dae00000) > eden space 419456K, 1% used [0x00000007bae00000, > 0x00000007bb5f7c50, 0x00000007d47a0000) > from space 52416K, 30% used [0x00000007d7ad0000, > 0x00000007d8a48c88, 0x00000007dae00000) > to space 52416K, 0% used [0x00000007d47a0000, > 0x00000007d47a0000, 0x00000007d7ad0000) > tenured generation total 524288K, used 0K [0x00000007dae00000, > 0x00000007fae00000, 0x00000007fae00000) > the space 524288K, 0% used [0x00000007dae00000, > 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) > compacting perm gen total 21248K, used 2549K [0x00000007fae00000, > 0x00000007fc2c0000, 0x0000000800000000) > the space 21248K, 12% used [0x00000007fae00000, > 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) > No shared spaces configured. 
> 1.130: [GC (GCLocker Initiated GC)[ParNew: 24002K->12748K(471872K), > 0.0123930 secs] 24002K->12748K(996160K), 0.0124130 secs] [Times: > user=0.04 sys=0.01, real=0.01 secs] > Heap after GC invocations=4 (full 0): > par new generation total 471872K, used 12748K [0x00000007bae00000, > 0x00000007dae00000, 0x00000007dae00000) > eden space 419456K, 0% used [0x00000007bae00000, > 0x00000007bae00000, 0x00000007d47a0000) > from space 52416K, 24% used [0x00000007d47a0000, > 0x00000007d5413320, 0x00000007d7ad0000) > to space 52416K, 0% used [0x00000007d7ad0000, > 0x00000007d7ad0000, 0x00000007dae00000) > tenured generation total 524288K, used 0K [0x00000007dae00000, > 0x00000007fae00000, 0x00000007fae00000) > the space 524288K, 0% used [0x00000007dae00000, > 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) > compacting perm gen total 21248K, used 2549K [0x00000007fae00000, > 0x00000007fc2c0000, 0x0000000800000000) > the space 21248K, 12% used [0x00000007fae00000, > 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) > No shared spaces configured. > } > > Notice that: > > * The timestamp of the second GC (1.130) is almost equal to the > timestamp of the first GC plus the duration of the first GC (1.119 + > 0.0103320 = 1.1293). In this test young GCs normally happen at a > frequency of one every 100ms-110ms or so. > * The eden at the start of the second GC is almost empty (1% > occupancy). We've also seen it very often with a completely empty eden. > * (the big hint) The second GC is GClocker-initiated. > > This happens most often with ParNew (in some cases, more than 30% of > the GCs are those unnecessary ones) but also happens with ParallelGC > too but less frequently (maybe 1%-1.5% of the GCs are those > unnecessary ones). I was unable to reproduce it with G1. > > I can reproduce it with with latest JDK 7, JDK 8, and also the latest > hotspot-gc/hotspot workspace. > > Are you guys looking into this (and is there a CR?)? I have a small > test I can reproduce it with and a diagnosis / proposed fix(es) if > you're interested. > > Tony > From tprintezis at twitter.com Fri Jun 27 16:41:38 2014 From: tprintezis at twitter.com (Tony Printezis) Date: Fri, 27 Jun 2014 12:41:38 -0400 Subject: The GCLocker blues... In-Reply-To: <53AD8CFD.4080903@oracle.com> References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> Message-ID: <53AD9EC2.6080803@twitter.com> Hi Jon, Great to hear from you! :-) I haven't actually tried running the test with JDK 6 (I could if it'd be helpful...). Yes, I know exactly what's going on. There's a race between one thread in jni_unlock() scheduling the GCLocker-initiated young GC (let's call it GC-L) and another thread also scheduling a young GC (let's call it GC-A) because it couldn't allocate due to the eden being full. Under certain circumstances, GC-A can happen first, with GC-L being scheduled and going ahead as soon as GC-A finishes. I'll open a CR and add a more detailed analysis to it. Tony On 6/27/14, 11:25 AM, Jon Masamitsu wrote: > Tony, > > I don't recall talk within the GC group about this type of > problem. I didn't find a CR that relates to that behavior. > If there is one, I don't think it is on anyone's radar. > > Can I infer that the problem does not occur in jdk6? > > Any theories on what's going on? 
> > Jon > > > On 6/27/2014 6:00 AM, Tony Printezis wrote: >> Hi all, >> >> (trying again from my Twitter address; moderator: feel free to >> disregard the original I accidentally sent from my personal address) >> >> We have recently noticed an interesting problem which seems to happen >> quite frequently under certain circumstances. Immediately after a >> young GC, a second one happens which seems unnecessary given that it >> starts with an empty or almost empty eden. Here's an example: >> >> {Heap before GC invocations=2 (full 0): >> par new generation total 471872K, used 433003K >> [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 100% used [0x00000007bae00000, >> 0x00000007d47a0000, 0x00000007d47a0000) >> from space 52416K, 25% used [0x00000007d47a0000, >> 0x00000007d54dacb0, 0x00000007d7ad0000) >> to space 52416K, 0% used [0x00000007d7ad0000, >> 0x00000007d7ad0000, 0x00000007dae00000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. >> 1.119: [GC (Allocation Failure)[ParNew: 433003K->15843K(471872K), >> 0.0103090 secs] 433003K->15843K(996160K), 0.0103320 secs] [Times: >> user=0.03 sys=0.00, real=0.01 secs] >> Heap after GC invocations=3 (full 0): >> par new generation total 471872K, used 15843K [0x00000007bae00000, >> 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 0% used [0x00000007bae00000, >> 0x00000007bae00000, 0x00000007d47a0000) >> from space 52416K, 30% used [0x00000007d7ad0000, >> 0x00000007d8a48c88, 0x00000007dae00000) >> to space 52416K, 0% used [0x00000007d47a0000, >> 0x00000007d47a0000, 0x00000007d7ad0000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. >> } >> {Heap before GC invocations=3 (full 0): >> par new generation total 471872K, used 24002K [0x00000007bae00000, >> 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 1% used [0x00000007bae00000, >> 0x00000007bb5f7c50, 0x00000007d47a0000) >> from space 52416K, 30% used [0x00000007d7ad0000, >> 0x00000007d8a48c88, 0x00000007dae00000) >> to space 52416K, 0% used [0x00000007d47a0000, >> 0x00000007d47a0000, 0x00000007d7ad0000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. 
>> 1.130: [GC (GCLocker Initiated GC)[ParNew: 24002K->12748K(471872K), >> 0.0123930 secs] 24002K->12748K(996160K), 0.0124130 secs] [Times: >> user=0.04 sys=0.01, real=0.01 secs] >> Heap after GC invocations=4 (full 0): >> par new generation total 471872K, used 12748K [0x00000007bae00000, >> 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 0% used [0x00000007bae00000, >> 0x00000007bae00000, 0x00000007d47a0000) >> from space 52416K, 24% used [0x00000007d47a0000, >> 0x00000007d5413320, 0x00000007d7ad0000) >> to space 52416K, 0% used [0x00000007d7ad0000, >> 0x00000007d7ad0000, 0x00000007dae00000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. >> } >> >> Notice that: >> >> * The timestamp of the second GC (1.130) is almost equal to the >> timestamp of the first GC plus the duration of the first GC (1.119 + >> 0.0103320 = 1.1293). In this test young GCs normally happen at a >> frequency of one every 100ms-110ms or so. >> * The eden at the start of the second GC is almost empty (1% >> occupancy). We've also seen it very often with a completely empty eden. >> * (the big hint) The second GC is GClocker-initiated. >> >> This happens most often with ParNew (in some cases, more than 30% of >> the GCs are those unnecessary ones) but also happens with ParallelGC >> too but less frequently (maybe 1%-1.5% of the GCs are those >> unnecessary ones). I was unable to reproduce it with G1. >> >> I can reproduce it with with latest JDK 7, JDK 8, and also the latest >> hotspot-gc/hotspot workspace. >> >> Are you guys looking into this (and is there a CR?)? I have a small >> test I can reproduce it with and a diagnosis / proposed fix(es) if >> you're interested. >> >> Tony >> > -- Tony Printezis | JVM/GC Engineer / VM Team | Twitter @TonyPrintezis tprintezis at twitter.com From tprintezis at twitter.com Fri Jun 27 16:45:52 2014 From: tprintezis at twitter.com (Tony Printezis) Date: Fri, 27 Jun 2014 12:45:52 -0400 Subject: The GCLocker blues... In-Reply-To: <53AD8CFD.4080903@oracle.com> References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> Message-ID: <53AD9FC0.1010302@twitter.com> Jon, https://bugs.openjdk.java.net/browse/JDK-8048556 Tony On 6/27/14, 11:25 AM, Jon Masamitsu wrote: > Tony, > > I don't recall talk within the GC group about this type of > problem. I didn't find a CR that relates to that behavior. > If there is one, I don't think it is on anyone's radar. > > Can I infer that the problem does not occur in jdk6? > > Any theories on what's going on? > > Jon > > > On 6/27/2014 6:00 AM, Tony Printezis wrote: >> Hi all, >> >> (trying again from my Twitter address; moderator: feel free to >> disregard the original I accidentally sent from my personal address) >> >> We have recently noticed an interesting problem which seems to happen >> quite frequently under certain circumstances. Immediately after a >> young GC, a second one happens which seems unnecessary given that it >> starts with an empty or almost empty eden. 
Here's an example: >> >> {Heap before GC invocations=2 (full 0): >> par new generation total 471872K, used 433003K >> [0x00000007bae00000, 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 100% used [0x00000007bae00000, >> 0x00000007d47a0000, 0x00000007d47a0000) >> from space 52416K, 25% used [0x00000007d47a0000, >> 0x00000007d54dacb0, 0x00000007d7ad0000) >> to space 52416K, 0% used [0x00000007d7ad0000, >> 0x00000007d7ad0000, 0x00000007dae00000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. >> 1.119: [GC (Allocation Failure)[ParNew: 433003K->15843K(471872K), >> 0.0103090 secs] 433003K->15843K(996160K), 0.0103320 secs] [Times: >> user=0.03 sys=0.00, real=0.01 secs] >> Heap after GC invocations=3 (full 0): >> par new generation total 471872K, used 15843K [0x00000007bae00000, >> 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 0% used [0x00000007bae00000, >> 0x00000007bae00000, 0x00000007d47a0000) >> from space 52416K, 30% used [0x00000007d7ad0000, >> 0x00000007d8a48c88, 0x00000007dae00000) >> to space 52416K, 0% used [0x00000007d47a0000, >> 0x00000007d47a0000, 0x00000007d7ad0000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. >> } >> {Heap before GC invocations=3 (full 0): >> par new generation total 471872K, used 24002K [0x00000007bae00000, >> 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 1% used [0x00000007bae00000, >> 0x00000007bb5f7c50, 0x00000007d47a0000) >> from space 52416K, 30% used [0x00000007d7ad0000, >> 0x00000007d8a48c88, 0x00000007dae00000) >> to space 52416K, 0% used [0x00000007d47a0000, >> 0x00000007d47a0000, 0x00000007d7ad0000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. 
>> 1.130: [GC (GCLocker Initiated GC)[ParNew: 24002K->12748K(471872K), >> 0.0123930 secs] 24002K->12748K(996160K), 0.0124130 secs] [Times: >> user=0.04 sys=0.01, real=0.01 secs] >> Heap after GC invocations=4 (full 0): >> par new generation total 471872K, used 12748K [0x00000007bae00000, >> 0x00000007dae00000, 0x00000007dae00000) >> eden space 419456K, 0% used [0x00000007bae00000, >> 0x00000007bae00000, 0x00000007d47a0000) >> from space 52416K, 24% used [0x00000007d47a0000, >> 0x00000007d5413320, 0x00000007d7ad0000) >> to space 52416K, 0% used [0x00000007d7ad0000, >> 0x00000007d7ad0000, 0x00000007dae00000) >> tenured generation total 524288K, used 0K [0x00000007dae00000, >> 0x00000007fae00000, 0x00000007fae00000) >> the space 524288K, 0% used [0x00000007dae00000, >> 0x00000007dae00000, 0x00000007dae00200, 0x00000007fae00000) >> compacting perm gen total 21248K, used 2549K [0x00000007fae00000, >> 0x00000007fc2c0000, 0x0000000800000000) >> the space 21248K, 12% used [0x00000007fae00000, >> 0x00000007fb07d7a0, 0x00000007fb07d800, 0x00000007fc2c0000) >> No shared spaces configured. >> } >> >> Notice that: >> >> * The timestamp of the second GC (1.130) is almost equal to the >> timestamp of the first GC plus the duration of the first GC (1.119 + >> 0.0103320 = 1.1293). In this test young GCs normally happen at a >> frequency of one every 100ms-110ms or so. >> * The eden at the start of the second GC is almost empty (1% >> occupancy). We've also seen it very often with a completely empty eden. >> * (the big hint) The second GC is GClocker-initiated. >> >> This happens most often with ParNew (in some cases, more than 30% of >> the GCs are those unnecessary ones) but also happens with ParallelGC >> too but less frequently (maybe 1%-1.5% of the GCs are those >> unnecessary ones). I was unable to reproduce it with G1. >> >> I can reproduce it with with latest JDK 7, JDK 8, and also the latest >> hotspot-gc/hotspot workspace. >> >> Are you guys looking into this (and is there a CR?)? I have a small >> test I can reproduce it with and a diagnosis / proposed fix(es) if >> you're interested. >> >> Tony >> > -- Tony Printezis | JVM/GC Engineer / VM Team | Twitter @TonyPrintezis tprintezis at twitter.com From monica.b at servergy.com Fri Jun 27 17:24:25 2014 From: monica.b at servergy.com (Monica Beckwith) Date: Fri, 27 Jun 2014 12:24:25 -0500 Subject: The GCLocker blues... In-Reply-To: <53AD9FC0.1010302@twitter.com> References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> <53AD9FC0.1010302@twitter.com> Message-ID: <53ADA8C9.8040104@servergy.com> Hi Tony/ Jon - AFAIK, this was observed and fixed a while back for G1. I will see if I can find the CR# for G1. -Monica On 6/27/14, 11:45 AM, Tony Printezis wrote: > Jon, > > https://bugs.openjdk.java.net/browse/JDK-8048556 > > Tony > > On 6/27/14, 11:25 AM, Jon Masamitsu wrote: >> Tony, >> >> I don't recall talk within the GC group about this type of >> problem. I didn't find a CR that relates to that behavior. >> If there is one, I don't think it is on anyone's radar. >> >> Can I infer that the problem does not occur in jdk6? >> >> Any theories on what's going on? >> >> Jon >> >> >> On 6/27/2014 6:00 AM, Tony Printezis wrote: >>> Hi all, >>> >>> (trying again from my Twitter address; moderator: feel free to >>> disregard the original I accidentally sent from my personal address) >>> >>> We have recently noticed an interesting problem which seems to >>> happen quite frequently under certain circumstances. 
>>> Immediately after a young GC, a second one happens which seems unnecessary given that it starts with an empty or almost empty eden. Here's an example:
>>>
>>> [GC log, analysis, and reproduction notes snipped; quoted in full in the message above]
>>>
>>> Tony

From Peter.B.Kessler at Oracle.COM  Fri Jun 27 18:35:18 2014
From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler)
Date: Fri, 27 Jun 2014 11:35:18 -0700
Subject: The GCLocker blues...
In-Reply-To: <53AD9EC2.6080803@twitter.com>
References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> <53AD9EC2.6080803@twitter.com>
Message-ID: <53ADB966.7090805@Oracle.COM>

I thought there was code somewhere to address this in the case that multiple threads requested collections (for whatever reason, not just GC-locker).

Something like: when the collection is requested, record the time (or an epoch number like the collection count), and then, when the requested collection is processed, compare that time (or epoch number) to the time (or epoch number) of the last collection. If there's been a collection since the request, assume the requested collection is redundant.

There it is: line 88 of

http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/b67a3f81b630/src/share/vm/gc_implementation/shared/vmGCOperations.cpp

VM_GC_Operation::skip_operation(). With a comment describing why it's there. Maybe it's not detailed enough to prevent an extra collection in your situation?

			... peter

On 06/27/14 09:41, Tony Printezis wrote:
> Hi Jon,
>
> Great to hear from you! :-)
>
> I haven't actually tried running the test with JDK 6 (I could if it'd be helpful...).
>
> Yes, I know exactly what's going on. There's a race between one thread in jni_unlock() scheduling the GCLocker-initiated young GC (let's call it GC-L) and another thread also scheduling a young GC (let's call it GC-A) because it couldn't allocate due to the eden being full. Under certain circumstances, GC-A can happen first, with GC-L being scheduled and going ahead as soon as GC-A finishes.
>
> I'll open a CR and add a more detailed analysis to it.
>
> Tony
>
> On 6/27/14, 11:25 AM, Jon Masamitsu wrote:
>> [Jon's message and Tony's original report snipped; quoted in full in the messages above]
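[For readers without the sources at hand, the check Peter points to compares
the collection counts captured when the VM operation was created against the
heap's current counts. Roughly; this is a simplified paraphrase of the jdk8u
code at that link, not a verbatim copy:

    // vmGCOperations.cpp (simplified): skip the requested collection if
    // some other collection has already completed since it was requested.
    bool VM_GC_Operation::skip_operation() const {
      bool skip = (_gc_count_before != Universe::heap()->total_collections());
      if (_full && skip) {
        skip = (_full_gc_count_before !=
                Universe::heap()->total_full_collections());
      }
      return skip;
    }

The counts (_gc_count_before and _full_gc_count_before) are stamped into the
operation by the requesting thread; the comparison happens in the operation's
prologue, before any safepoint work is done.]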
From jon.masamitsu at oracle.com  Fri Jun 27 19:18:20 2014
From: jon.masamitsu at oracle.com (Jon Masamitsu)
Date: Fri, 27 Jun 2014 12:18:20 -0700
Subject: The GCLocker blues...
In-Reply-To: <53ADB966.7090805@Oracle.COM>
References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> <53AD9EC2.6080803@twitter.com> <53ADB966.7090805@Oracle.COM>
Message-ID: <53ADC37C.6070102@oracle.com>

Peter,

Some of the cases that Tony points out have edens with small amounts of used space. That would indicate that a GC finished, the mutators were restarted, and then the GC needed by the GC-locker was requested.

Jon

On 6/27/2014 11:35 AM, Peter B. Kessler wrote:
> [Peter's message and the quoted exchange snipped; quoted in full in the messages above]

From Peter.B.Kessler at Oracle.COM  Fri Jun 27 21:04:32 2014
From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler)
Date: Fri, 27 Jun 2014 14:04:32 -0700
Subject: The GCLocker blues...
In-Reply-To: <53ADC37C.6070102@oracle.com>
References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> <53AD9EC2.6080803@twitter.com> <53ADB966.7090805@Oracle.COM> <53ADC37C.6070102@oracle.com>
Message-ID: <53ADDC60.6040702@Oracle.COM>

Right. VM_GC_Operation::skip_operation() could be made smarter. The current version just looks at the count of collections. There's lots of data available about why a collection was requested, and again when it gets to the prologue. E.g., young generation occupancy. There's all the time in the world to make a good decision, relative to the time for the collection (if one happens) or the savings (if a collection is avoided).

			... peter

On 06/27/14 12:18, Jon Masamitsu wrote:
> [Jon's message and the quoted exchange snipped; quoted in full in the messages above]
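[A sketch of the kind of additional check Peter is suggesting; purely
illustrative, not existing HotSpot code, and it treats the heap-wide used()
and capacity() as stand-ins for per-generation numbers:

    // Hypothetical guard: treat a GCLocker-initiated young GC as redundant
    // if the heap is still nearly empty, i.e. a collection almost
    // certainly just ran.
    bool gclocker_gc_looks_redundant(double threshold /* e.g. 0.05 */) {
      CollectedHeap* heap = Universe::heap();
      return (double)heap->used() < threshold * (double)heap->capacity();
    }

The hard part, as the rest of the thread shows, is that occupancy alone
cannot distinguish "a GC just ran" from "the application simply has not
allocated much yet", which is why the fix Tony sketches below is based on
GC counts instead.]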
From tprintezis at twitter.com  Mon Jun 30 13:28:21 2014
From: tprintezis at twitter.com (Tony Printezis)
Date: Mon, 30 Jun 2014 09:28:21 -0400
Subject: The GCLocker blues...
In-Reply-To: <53ADDC60.6040702@Oracle.COM>
References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> <53AD9EC2.6080803@twitter.com> <53ADB966.7090805@Oracle.COM> <53ADC37C.6070102@oracle.com> <53ADDC60.6040702@Oracle.COM>
Message-ID: <53B165F5.5090900@twitter.com>

Peter,

Yes, each GC VM op has the current GC counts, and if by the time it runs the GC counts are out-of-date, the op bails out in the prologue without a safepoint. However, as I described on the CR, this doesn't work here. By the time the thread gets the Heap_lock and reads the GC counts, another GC has already happened. So, when the GC VM op runs, the GC counts are up-to-date.

The GC count protocol works for threads that are trying to do allocations, given that those decisions are serialized using the Heap_lock. However, the thread in jni_unlock() is not part of that protocol, hence it is oblivious to those decisions.

Tony

On 6/27/14, 5:04 PM, Peter B. Kessler wrote:
> [Peter's message and the quoted exchange snipped; quoted in full in the messages above]

--
Tony Printezis | JVM/GC Engineer / VM Team | Twitter
@TonyPrintezis
tprintezis at twitter.com
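[The allocation-side protocol Tony describes looks roughly like this in the
collector policy code; simplified and from memory, so treat the exact names
and signatures as approximate:

    // Allocation-failure path (simplified): the GC count is read under the
    // Heap_lock, so a collection that races ahead of this request is
    // visible to VM_GC_Operation::skip_operation().
    HeapWord* allocate_with_gc_sketch(size_t size, bool is_tlab) {
      unsigned int gc_count_before;
      {
        MutexLocker ml(Heap_lock);
        gc_count_before = Universe::heap()->total_collections();
      }
      VM_GenCollectForAllocation op(size, is_tlab, gc_count_before);
      VMThread::execute(&op);
      return op.result();
    }

The jni_unlock() path requests its collection without going through this
Heap_lock/count handshake, which is the hole Tony is pointing at.]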
From tprintezis at twitter.com  Mon Jun 30 13:32:18 2014
From: tprintezis at twitter.com (Tony Printezis)
Date: Mon, 30 Jun 2014 09:32:18 -0400
Subject: The GCLocker blues...
In-Reply-To: <53ADDC60.6040702@Oracle.COM>
References: <53AD6ADB.10301@twitter.com> <53AD8CFD.4080903@oracle.com> <53AD9EC2.6080803@twitter.com> <53ADB966.7090805@Oracle.COM> <53ADC37C.6070102@oracle.com> <53ADDC60.6040702@Oracle.COM>
Message-ID: <53B166E2.40706@twitter.com>

PS: You're right that skip_operation() is not performance critical; however, I'm not sure it's possible to make it smarter in this case. The GC VM op that's about to do the GCLocker-initiated GC doesn't have enough information to know whether another GC happened meanwhile or not. The only way to do that is to attach the correct GC counts to the VM op (see the CR for this and another suggested fix).

On 6/27/14, 5:04 PM, Peter B. Kessler wrote:
> [Peter's message and the quoted exchange snipped; quoted in full in the messages above]

--
Tony Printezis | JVM/GC Engineer / VM Team | Twitter
@TonyPrintezis
tprintezis at twitter.com
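[A sketch of the direction Tony is suggesting; hypothetical code, not the
actual patch (see JDK-8048556 for the real proposal): the counts would be
captured at the moment the GCLocker decides a GC is needed, rather than
whenever the VM op finally reaches the prologue:

    // Hypothetical variant of the GCLocker unlock path: stamp the op with
    // the count observed when the stalled allocation set needs_gc(), not
    // the count at the moment the op runs.
    void gclocker_request_young_gc_sketch(unsigned int gc_count_at_stall) {
      VM_GenCollectForAllocation op(0 /* word_size: no allocation */,
                                    false /* is_tlab */,
                                    gc_count_at_stall);
      VMThread::execute(&op);
      // If an allocation-failure GC (Tony's GC-A) completes between the
      // stall and this op running, total_collections() has moved past
      // gc_count_at_stall and skip_operation() will skip GC-L.
    }

]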
From serkanozal86 at hotmail.com  Sat Jun 28 12:21:15 2014
From: serkanozal86 at hotmail.com (serkan özal)
Date: Sat, 28 Jun 2014 15:21:15 +0300
Subject: Compressed-OOPs on JVM
Message-ID:

Hi all,

As you know, sometimes, although compressed oops are used, if the Java heap size is < 4Gb and the heap can be placed in the low virtual address space (below 4Gb), then compressed oops can be used without encoding/decoding. (https://wikis.oracle.com/display/HotSpotInternals/CompressedOops)

In a 64-bit JVM with compressed oops enabled and with minimum heap size 1G and maximum heap size 1G, object references are 4 bytes. In this case, a compressed oop is a real native address. But in a 64-bit JVM with compressed oops enabled and with minimum heap size 4G and maximum heap size 8G, object references are also 4 bytes; in this case, though, a compressed oop needs to be encoded/decoded (by 3-bit shifting) before getting the real native address.

In both cases compressed oops are enabled, but how can I detect whether compressed oops are used as native addresses or whether they need to be encoded/decoded? If they are encoded/decoded, what is the value of the bit shift?

Thanks in advance.

--

Serkan ÖZAL
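[For the archives, the decoding in question is simple arithmetic; a C++
sketch of the compressed-oops modes HotSpot picks between (the shift of 3
follows from the default 8-byte object alignment):

    #include <cstdint>

    // narrow : the 4-byte compressed reference stored in an object field
    // base   : the heap base address (0 for unscaled and zero-based modes)
    // shift  : 0 for unscaled mode, 3 (log2 of 8-byte alignment) otherwise
    inline uintptr_t decode_oop(uint32_t narrow, uintptr_t base, unsigned shift) {
      return base + (static_cast<uintptr_t>(narrow) << shift);
    }

    // Mode selection, roughly:
    //   heap ends below 4GB  -> "unscaled":   base == 0, shift == 0
    //   heap ends below 32GB -> "zero-based": base == 0, shift == 3
    //   otherwise            -> "heap-based": base != 0, shift == 3

One way to see which mode a given JVM picked (on the JDK 7/8 builds of this
era, if memory serves) is the diagnostic flag combination
-XX:+UnlockDiagnosticVMOptions -XX:+PrintCompressedOopsMode, which prints
the chosen mode, base, and shift at startup.]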