From bartosz.markocki at gmail.com  Fri Apr  1 02:16:31 2011
From: bartosz.markocki at gmail.com (Bartek Markocki)
Date: Fri, 1 Apr 2011 11:16:31 +0200
Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%?
Message-ID: 

Hi all,

Can I ask any of you to review the attached extracts from our production GC log and share your thoughts about them?

We have a router-type web application running under Tomcat 6.0.28 with Java 1.6.0_21 (64-bit) on RHEL 5.2 (2.6.18-92.1.22.el5). The GC settings are:

-Xmx2048m -Xms2048m -XX:NewSize=1024m
-XX:PermSize=64m -XX:MaxPermSize=128m
-XX:ThreadStackSize=128
-XX:+DisableExplicitGC
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:+PrintGCDetails

What we did:

We recently switched from ParallelOld to CMS because of unacceptably long Full GC pauses. In preparation for the change we ran extensive GC tuning tests and found that the (simple) set of settings above fits our needs best. So far we are happy with what we see (frequency of minor collections/CMS cycles, STW pause times), with one exception.

What is the problem:

Some of our remark phases last much longer than others (up to 8x the average). A normal remark phase lasts between 55 and 90 ms; the longest lasted 538 ms. At first we thought this was due to the preceding abortable-preclean phase being aborted. On closer inspection we found that, depending on the volume of traffic (i.e., time of day), some of our abortable-preclean phases are indeed aborted due to the time limit (5 s). Despite that, most of the following remark phases still stay within an acceptable limit (up to 100 ms), so we kept digging. We found that the abnormally long remark phases are always preceded by an aborted abortable-preclean phase. The phase was always aborted due to the time limit, yet the young generation occupancy reported immediately afterwards shows that in all such cases the YG was far more than 50% occupied.

Per my (current :)) understanding, the abortable-preclean phase can be aborted either because of the time limit or because the YG becomes about 50% full (so that the remark phase falls roughly midway between two minor collections), whichever comes first. In our case the 'about 50%' condition never fires and the phase continues until it hits the time limit. The following remark phase then always lasts longer, i.e., 350-550 ms.

The big question:

What can we do to cut down the time of those long-lasting remark phases?

Below I enclose three samples from our GC log:
first one - a CMS cycle whose abortable-preclean phase was aborted due to the time limit, where the following remark phase does not show the abnormal behavior;
second one - an "ideal" CMS cycle;
third one - a CMS cycle whose abortable-preclean phase was aborted (due to the time limit, even though YG occupancy is much greater than 50%), where the following remark phase lasts 0.5 s.
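As a quick sanity check of the 'about 50%' condition, the "[YG occupancy: N K (943744 K)]" figures reported at remark in the three samples below can be recomputed as percentages. (The flag names mentioned in the comment, -XX:CMSScheduleRemarkEdenPenetration=50 and -XX:CMSMaxAbortablePrecleanTime=5000, are my understanding of the relevant CMS tunables and defaults for this JDK vintage, not something confirmed in this thread.)

```java
// Recomputes the "[YG occupancy: N K (943744 K)]" figures from the three
// log samples as percentages, to compare against the ~50% eden-penetration
// threshold (nominally -XX:CMSScheduleRemarkEdenPenetration=50) at which
// remark should be scheduled before the time limit
// (nominally -XX:CMSMaxAbortablePrecleanTime=5000 ms) kicks in.
public class YgOccupancy {
    static double pct(long usedK, long capacityK) {
        return 100.0 * usedK / capacityK;
    }

    public static void main(String[] args) {
        // sample 1 (preclean aborted on time, remark 0.06 s)
        System.out.printf("sample 1: %.0f%%%n", pct(409639, 943744)); // ~43%
        // sample 2 ("ideal" cycle, remark 0.09 s)
        System.out.printf("sample 2: %.0f%%%n", pct(479051, 943744)); // ~51%
        // sample 3 (preclean aborted on time, remark 0.55 s)
        System.out.printf("sample 3: %.0f%%%n", pct(791197, 943744)); // ~84%
    }
}
```

Only the third sample, where occupancy at remark time is far past the threshold, shows the long rescan.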
--
1142110.458: [GC 1142110.458: [ParNew: 888646K->45370K(943744K), 0.0728880 secs] 1852227K->1013124K(1992320K), 0.0739250 secs] [Times: user=0.33 sys=0.01, real=0.07 secs]
1142110.547: [GC [1 CMS-initial-mark: 967753K(1048576K)] 1013331K(1992320K), 0.0540170 secs] [Times: user=0.06 sys=0.00, real=0.05 secs]
1142110.602: [CMS-concurrent-mark-start]
1142111.010: [CMS-concurrent-mark: 0.408/0.408 secs] [Times: user=1.96 sys=0.07, real=0.41 secs]
1142111.011: [CMS-concurrent-preclean-start]
1142111.028: [CMS-concurrent-preclean: 0.016/0.017 secs] [Times: user=0.02 sys=0.00, real=0.02 secs]
1142111.028: [CMS-concurrent-abortable-preclean-start]
 CMS: abort preclean due to time 1142116.036: [CMS-concurrent-abortable-preclean: 4.858/5.007 secs] [Times: user=7.31 sys=0.57, real=5.00 secs]
1142116.050: [GC[YG occupancy: 409639 K (943744 K)]1142116.051: [Rescan (parallel) , 0.0389910 secs]1142116.090: [weak refs processing, 0.0156130 secs] [1 CMS-remark: 967753K(1048576K)] 1377393K(1992320K), 0.0554700 secs] [Times: user=0.50 sys=0.00, real=0.06 secs]
1142116.107: [CMS-concurrent-sweep-start]
1142117.721: [CMS-concurrent-sweep: 1.614/1.614 secs] [Times: user=2.41 sys=0.24, real=1.61 secs]
1142117.721: [CMS-concurrent-reset-start]
1142117.732: [CMS-concurrent-reset: 0.010/0.010 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
1142121.278: [GC 1142121.279: [ParNew: 884282K->52652K(943744K), 0.0680850 secs] 1200273K->372087K(1992320K), 0.0690040 secs] [Times: user=0.29 sys=0.01, real=0.07 secs]
1142133.508: [GC 1142133.508: [ParNew: 891564K->47435K(943744K), 0.0682080 secs] 1210999K->370280K(1992320K), 0.0691030 secs] [Times: user=0.29 sys=0.01, real=0.07 secs]
--
1165584.305: [GC 1165584.305: [ParNew: 896212K->59055K(943744K), 0.0761290 secs] 1857148K->1023947K(1992320K), 0.0771330 secs] [Times: user=0.33 sys=0.00, real=0.08 secs]
1165584.398: [GC [1 CMS-initial-mark: 964891K(1048576K)] 1024053K(1992320K), 0.0631010 secs] [Times: user=0.06 sys=0.00, real=0.06 secs]
1165584.463: [CMS-concurrent-mark-start]
1165584.933: [CMS-concurrent-mark: 0.423/0.471 secs] [Times: user=2.40 sys=0.21, real=0.47 secs]
1165584.934: [CMS-concurrent-preclean-start]
1165584.954: [CMS-concurrent-preclean: 0.018/0.021 secs] [Times: user=0.05 sys=0.00, real=0.02 secs]
1165584.955: [CMS-concurrent-abortable-preclean-start]
1165587.876: [CMS-concurrent-abortable-preclean: 2.884/2.921 secs] [Times: user=5.51 sys=0.65, real=2.92 secs]
1165587.892: [GC[YG occupancy: 479051 K (943744 K)]1165587.892: [Rescan (parallel) , 0.0746810 secs]1165587.967: [weak refs processing, 0.0168870 secs] [1 CMS-remark: 964891K(1048576K)] 1443943K(1992320K), 0.0925600 secs] [Times: user=0.91 sys=0.01, real=0.09 secs]
1165587.986: [CMS-concurrent-sweep-start]
1165589.670: [CMS-concurrent-sweep: 1.684/1.684 secs] [Times: user=3.39 sys=0.46, real=1.69 secs]
1165589.671: [CMS-concurrent-reset-start]
1165589.679: [CMS-concurrent-reset: 0.009/0.009 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
1165591.354: [GC 1165591.354: [ParNew: 897967K->54984K(943744K), 0.0862910 secs] 1236513K->397404K(1992320K), 0.0872930 secs] [Times: user=0.34 sys=0.00, real=0.09 secs]
1165598.887: [GC 1165598.888: [ParNew: 893896K->52086K(943744K), 0.0885510 secs] 1236316K->398587K(1992320K), 0.0895820 secs] [Times: user=0.31 sys=0.01, real=0.09 secs]
--
1166753.770: [GC 1166753.770: [ParNew: 899148K->57315K(943744K), 0.0782510 secs] 1862058K->1024198K(1992320K), 0.0793040 secs] [Times: user=0.32 sys=0.01, real=0.08 secs]
1166753.867: [GC [1 CMS-initial-mark: 966883K(1048576K)] 1024305K(1992320K), 0.0642680 secs] [Times: user=0.07 sys=0.00, real=0.07 secs]
1166753.932: [CMS-concurrent-mark-start]
1166754.471: [CMS-concurrent-mark: 0.486/0.538 secs] [Times: user=2.76 sys=0.28, real=0.54 secs]
1166754.471: [CMS-concurrent-preclean-start]
1166754.488: [CMS-concurrent-preclean: 0.015/0.017 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
1166754.488: [CMS-concurrent-abortable-preclean-start]
 CMS: abort preclean due to time 1166759.533: [CMS-concurrent-abortable-preclean: 4.895/5.044 secs] [Times: user=9.75 sys=1.21, real=5.05 secs]
1166759.549: [GC[YG occupancy: 791197 K (943744 K)]1166759.549: [Rescan (parallel) , 0.5387660 secs]1166760.088: [weak refs processing, 0.0139780 secs] [1 CMS-remark: 966883K(1048576K)] 1758080K(1992320K), 0.5537750 secs] [Times: user=5.58 sys=0.06, real=0.56 secs]
1166760.105: [CMS-concurrent-sweep-start]
1166760.688: [GC 1166760.689: [ParNew: 896188K->57161K(943744K), 0.0727850 secs] 1623884K->788963K(1992320K), 0.0737390 secs] [Times: user=0.31 sys=0.02, real=0.08 secs]
1166761.593: [CMS-concurrent-sweep: 1.363/1.488 secs] [Times: user=3.48 sys=0.49, real=1.49 secs]
1166761.593: [CMS-concurrent-reset-start]
1166761.602: [CMS-concurrent-reset: 0.009/0.009 secs] [Times: user=0.02 sys=0.01, real=0.01 secs]
1166767.947: [GC 1166767.948: [ParNew: 896053K->58188K(943744K), 0.0817680 secs] 1238926K->404605K(1992320K), 0.0828270 secs] [Times: user=0.31 sys=0.01, real=0.08 secs]
--

Thank you in advance,
Bartek

From y.s.ramakrishna at oracle.com  Fri Apr  1 09:30:19 2011
From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna)
Date: Fri, 01 Apr 2011 09:30:19 -0700
Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%?
In-Reply-To: 
References: 
Message-ID: <4D95FD9B.9080909@oracle.com>

Hi Bartek --

Try -XX:+CMSScavengeBeforeRemark as a temporary workaround for this, and let us know if the performance is reasonable or not. I'll look at your log (can you send me your whole GC log, showing the problem, off-list?). I think there's probably an open CR for this, which I'll dig up for you.

-- ramki

On 4/1/2011 2:16 AM, Bartek Markocki wrote:
> Hi all,
>
> Can I ask any of you to review the attached extracts from our
> production GC log and share your thoughts about them?
> [...]
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From
bartosz.markocki at gmail.com  Fri Apr  1 10:40:26 2011
From: bartosz.markocki at gmail.com (Bartek Markocki)
Date: Fri, 1 Apr 2011 19:40:26 +0200
Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%?
In-Reply-To: <4D95FD9B.9080909@oracle.com>
References: <4D95FD9B.9080909@oracle.com>
Message-ID: 

Hi Ramki,

On Fri, Apr 1, 2011 at 6:30 PM, Y. Srinivas Ramakrishna wrote:
> Try -XX:+CMSScavengeBeforeRemark as a temporary workaround
> for this, and let us know if the performance is reasonable
> or not.

We will try to push +CMSScavengeBeforeRemark to our production, but as we are talking about the production environment it might take some time to get back to you with the results.

> I'll look at your log (can you send me your whole GC log,
> showing the problem, off-list?).

Just did.

> I think there's probably an open CR for this, which i'll
> dig up for you.

Thanks a lot!

Bartek

> On 4/1/2011 2:16 AM, Bartek Markocki wrote:
>> Hi all,
>>
>> Can I ask any of you to review the attached extracts from our
>> production GC log and share your thoughts about them?
>> [...]

From aaisinzon at guidewire.com  Fri Apr  1 11:48:47 2011
From: aaisinzon at guidewire.com (Alex Aisinzon)
Date: Fri, 1 Apr 2011 11:48:47 -0700
Subject: G1 feedback
Message-ID: 

Hi all

Thoughts on this feedback about G1?
Take care

Alex A

From: Alex Aisinzon
Sent: Saturday, March 26, 2011 6:46 AM
To: hotspot-gc-use at openjdk.java.net
Subject: G1 feedback

Hi all

I experimented with G1 and Sun JDK 1.6 update 24 and ran two long-running tests (8 hours) with it:

With "-server -XX:+UseG1GC -XX:+UseCompressedOops -Xms24576m -Xmx24576m", most pauses were very short; 4 pauses were around 7 seconds and 4 around 34 seconds.

I then added the objective of keeping the longest pause around 1 second and used "-server -XX:+UseG1GC -XX:MaxGCPauseMillis=1000 -XX:+UseCompressedOops -Xms24576m -Xmx24576m". Most pauses were a little above 1 second, except for one pause at 8 seconds and one at 57 seconds.

The server is a dual X5570 (8 cores total) with 48 GB of RAM. Its average CPU utilization was around 60-65%, so it was not over-used.

Everything would be perfect if it were not for the 7/34-second and 8/57-second pauses. What would you recommend I do to either reduce these longer pauses, or to gain insight into what happened so that G1 can avoid these very rare but pretty long pauses in the future?

Thanks in advance

Alex A

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110401/b0b1871f/attachment.html

From todd at cloudera.com  Fri Apr  1 12:05:21 2011
From: todd at cloudera.com (Todd Lipcon)
Date: Fri, 1 Apr 2011 12:05:21 -0700
Subject: G1 feedback
In-Reply-To: 
References: 
Message-ID: 

Hi Alex,

I've had similar results - see my threads from a few months back on this mailing list.

The summary is that, when there is a tight pause bound, some regions accumulate cost estimates that are "stuck" higher than the goal. As these accumulate, they're never collected, which means that memory usage slowly grows until a full GC is required.
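The "stuck estimate" failure mode described above can be illustrated with a toy model (purely illustrative, not G1 source code; the class, method names, and numbers are all made up, and the decay rule sketches the patch idea described in the rest of this reply):

```java
// Toy model of a G1-style per-region cost estimate (NOT G1 source code).
// A region is chosen for collection only if its predicted cost fits the
// pause goal; an estimate inflated once (e.g. by a scheduler hiccup in the
// "other" time) can exceed the goal forever unless it is decayed.
public class RegionCostModel {
    static final double PAUSE_GOAL_MS = 1000.0;

    // Predicted cost: fixed "other" overhead plus RSet-scanning cost.
    static double predictedCostMs(double otherMs, double rsetScanMs) {
        return otherMs + rsetScanMs;
    }

    static boolean worthCollecting(double otherMs, double rsetScanMs) {
        return predictedCostMs(otherMs, rsetScanMs) <= PAUSE_GOAL_MS;
    }

    // Sketch of the fix: decay the "other" estimate back towards zero so a
    // single outlier sample cannot blacklist regions permanently.
    static double decayOther(double otherMs) {
        return otherMs * 0.5;
    }

    public static void main(String[] args) {
        double other = 4000.0; // inflated by one context-switched pause
        System.out.println(worthCollecting(other, 50.0)); // false: never picked
        for (int i = 0; i < 4; i++) {
            other = decayOther(other); // 4000 -> 2000 -> 1000 -> 500 -> 250
        }
        System.out.println(worthCollecting(other, 50.0)); // true: 250 + 50 ms fits
    }
}
```

Without the decay step, a mostly-garbage region carrying the inflated "other" term stays "too expensive" indefinitely, so the old generation only empties at the eventual full GC.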
The two situations that caused this in my experiments were:

1) The JVM got context-switched out for several scheduling quanta during the "other" portion of a non-young region collection. Because "other" time is treated as constant overhead, even mostly-garbage regions carried it as part of their estimate, causing all non-young regions to be deemed "too expensive". I fixed this with a patch that notices when the "other time" estimate is greater than the pause goal and decays it back towards 0.

2) A bad region accumulates many inter-region references in its remembered set, overflowing into the coarse RSet. Once the coarse RSet has been used, there's no facility to "uncoarsen" its entries, even after all the referring regions are dead. If the number of coarse entries is high enough that the time estimate is greater than the pause time goal, then again this region will never be collected, and memory fills up until a full GC. I wrote a patch to improve the time estimation for coarse RSet entries based on the liveness info in the coarse regions. This helped somewhat for my application.

Let me know if you'd like to try these patches out; I can dig them up again (or you might find them in the mailing list archives).

-Todd

On Fri, Apr 1, 2011 at 11:48 AM, Alex Aisinzon wrote:
> Hi all
>
> Thoughts on this feedback about G1?
> [...]

-- 
Todd Lipcon
Software Engineer, Cloudera

From aaisinzon at guidewire.com  Fri Apr  1 13:52:33 2011
From: aaisinzon at guidewire.com (Alex Aisinzon)
Date: Fri, 1 Apr 2011 13:52:33 -0700
Subject: G1 feedback
In-Reply-To: 
References: 
Message-ID: 

Hi Todd

This is very interesting. I feel I need to read some additional material about G1 to fully understand your explanation. I guess the research paper on it would help; I plan to read it. In any case, I am happy to give your patch a try. The challenge is that I am not well set up to rebuild G1. Is there a way I could get a Java binary/executable with the patch included?
Thanks in advance Alex A From: Todd Lipcon [mailto:todd at cloudera.com] Sent: Friday, April 01, 2011 12:05 PM To: Alex Aisinzon Cc: hotspot-gc-use at openjdk.java.net Subject: Re: G1 feedback Hi Alex, I've had similar results - see my threads from a few months back on this mailing list. The summary is that, when there is a tight pause bound, some regions will accumulate which have estimates that are "stuck" higher than the goal. As these accumulate, they're never collected, which means that memory usage slowly grows until a full GC is required. The two situations that caused this in my experiments were: 1) The JVM got context-switched out for several scheduling quanta during the "other" portion of a non-young region collection. Because "other" time is considered constant overhead, even mostly-garbage regions were carrying this as part of their estimate, causing all non-young regions to be deemed "too expensive". I fixed this with a patch to notice when the "other time" estimate was greater than the pause goal and decay it back towards 0. 2) A bad region accumulates many inter-region references in its "remember set", overflowing into the coarse rset. Once the coarse rset has been used, there's no facility to "uncoarsen" the rset entries even after all the referring regions are dead. If the number of coarse entries is high enough that the time estimate is greater than the pause time goal, then again, this region will never be collected, and memory will fill up until a full GC. I wrote a patch to improve the time estimation for coarse rset entries based on the liveness info in the coarse regions. This helped somewhat for my application. Let me know if you'd like to try these patches out, I can dig them up again (or you might find them in the mailing list archives). -Todd On Fri, Apr 1, 2011 at 11:48 AM, Alex Aisinzon wrote: Hi all Thoughts on this feedback about G1? 
Take care Alex A From: Alex Aisinzon Sent: Saturday, March 26, 2011 6:46 AM To: hotspot-gc-use at openjdk.java.net Subject: G1 feedback Hi all I experimented with G1 and Sun JDK 1.6 update 24 and ran two long running tests (8 hours) with it: With " -server -XX:+UseG1GC -XX:+UseCompressedOops -Xms24576m -Xmx24576m", most pauses were very short. 4 pauses were around 7 seconds and 4 at 34 seconds. I then added the objective of keeping the longest pause around 1 second and used "-server -XX:+UseG1GC -XX:MaxGCPauseMillis=1000 -XX:+UseCompressedOops -Xms24576m -Xmx24576m". Most pauses were a little above 1 second except for one pause at 8 seconds and one at 57 seconds. The server is a dual X5570 (8 cores total) and has 48GB of RAM. Its average CPU utilization was around 60-65% so it was not over-used. Everything would be perfect if it were not for the 7, 34 and 8, 57 seconds pauses. What would you recommend I do to either reduce these longer pauses or give insights into what happened so that G1 can avoid these very rare but pretty long pauses in the future? Thanks in advance Alex A _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -- Todd Lipcon Software Engineer, Cloudera -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110401/73b13690/attachment-0001.html From todd at cloudera.com Fri Apr 1 13:55:05 2011 From: todd at cloudera.com (Todd Lipcon) Date: Fri, 1 Apr 2011 13:55:05 -0700 Subject: G1 feedback In-Reply-To: References: Message-ID: On Fri, Apr 1, 2011 at 1:52 PM, Alex Aisinzon wrote: > Hi Todd > > > > This is very interesting. I feel I need to read some additional material > about G1 to fully understand your explanation. I guess the research paper on > it would help. I will plan to read it. > > In any case, I am happy to give your patch a try. 
The challenge is that I > am not well set to rebuild G1. Is there a way I could get the java > binary/executable with the patch included? > > > Sorry, I'm not well equipped to distribute a binary (and there might be licensing issues, I'm not even sure). -Todd > > > *From:* Todd Lipcon [mailto:todd at cloudera.com] > *Sent:* Friday, April 01, 2011 12:05 PM > *To:* Alex Aisinzon > *Cc:* hotspot-gc-use at openjdk.java.net > *Subject:* Re: G1 feedback > > > > Hi Alex, > > > > I've had similar results - see my threads from a few months back on this > mailing list. > > > > The summary is that, when there is a tight pause bound, some regions will > accumulate which have estimates that are "stuck" higher than the goal. As > these accumulate, they're never collected, which means that memory usage > slowly grows until a full GC is required. > > > > The two situations that caused this in my experiments were: > > 1) The JVM got context-switched out for several scheduling quanta during > the "other" portion of a non-young region collection. Because "other" time > is considered constant overhead, even mostly-garbage regions were carrying > this as part of their estimate, causing all non-young regions to be deemed > "too expensive". I fixed this with a patch to notice when the "other time" > estimate was greater than the pause goal and decay it back towards 0. > > > > 2) A bad region accumulates many inter-region references in its "remember > set", overflowing into the coarse rset. Once the coarse rset has been used, > there's no facility to "uncoarsen" the rset entries even after all the > referring regions are dead. If the number of coarse entries is high enough > that the time estimate is greater than the pause time goal, then again, this > region will never be collected, and memory will fill up until a full GC. I > wrote a patch to improve the time estimation for coarse rset entries based > on the liveness info in the coarse regions. This helped somewhat for my > application. 
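Todd's first fix lends itself to a small sketch. The Python below is only a hedged paraphrase of the idea he describes — the function name, decay factor, and units are invented for illustration; the actual change was a patch to HotSpot's C++ policy code:

```python
def decayed_other_time_ms(estimate_ms, pause_goal_ms, decay_factor=0.5):
    """Paraphrase of the fix: if the running 'other time' estimate
    exceeds the pause goal (e.g. because the JVM was descheduled
    mid-collection), decay it back toward zero so mostly-garbage
    regions are not permanently priced as 'too expensive'."""
    if estimate_ms > pause_goal_ms:
        return estimate_ms * decay_factor
    return estimate_ms
```

With a 1000 ms goal, a spurious 8000 ms sample would decay over successive updates (8000, 4000, 2000, 1000) instead of sticking above the goal forever.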
> > > > Let me know if you'd like to try these patches out, I can dig them up again > (or you might find them in the mailing list archives). > > > > -Todd > > On Fri, Apr 1, 2011 at 11:48 AM, Alex Aisinzon > wrote: > > Hi all > > > > Thoughts on this feedback about G1? > > > > Take care > > > > Alex A > > > > *From:* Alex Aisinzon > *Sent:* Saturday, March 26, 2011 6:46 AM > *To:* hotspot-gc-use at openjdk.java.net > *Subject:* G1 feedback > > > > Hi all > > > > I experimented with G1 and Sun JDK 1.6 update 24 and ran two long running > tests (8 hours) with it: > > With " -server -XX:+UseG1GC -XX:+UseCompressedOops -Xms24576m -Xmx24576m", > most pauses were very short. 4 pauses were around 7 seconds and 4 at 34 > seconds. > > I then added the objective of keeping the longest pause around 1 second and > used "-server -XX:+UseG1GC -XX:MaxGCPauseMillis=1000 -XX:+UseCompressedOops > -Xms24576m -Xmx24576m". Most pauses were a little above 1 second except for > one pause at 8 seconds and one at 57 seconds. > > The server is a dual X5570 (8 cores total) and has 48GB of RAM. Its average > CPU utilization was around 60-65% so it was not over-used. > > Everything would be perfect if it were not for the 7, 34 and 8, 57 seconds > pauses. > > What would you recommend I do to either reduce these longer pauses or give > insights into what happened so that G1 can avoid these very rare but pretty > long pauses in the future? > > > > Thanks in advance > > > > Alex A > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > -- > Todd Lipcon > Software Engineer, Cloudera > -- Todd Lipcon Software Engineer, Cloudera -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110401/592ed7f1/attachment.html From aaisinzon at guidewire.com Fri Apr 1 14:07:02 2011 From: aaisinzon at guidewire.com (Alex Aisinzon) Date: Fri, 1 Apr 2011 14:07:02 -0700 Subject: G1 feedback In-Reply-To: References: Message-ID: Todd Please share the patch. I will see if I can build it. Regards Alex Aisinzon From: Todd Lipcon [mailto:todd at cloudera.com] Sent: Friday, April 01, 2011 1:55 PM To: Alex Aisinzon Cc: hotspot-gc-use at openjdk.java.net Subject: Re: G1 feedback On Fri, Apr 1, 2011 at 1:52 PM, Alex Aisinzon wrote: Hi Todd This is very interesting. I feel I need to read some additional material about G1 to fully understand your explanation. I guess the research paper on it would help. I will plan to read it. In any case, I am happy to give your patch a try. The challenge is that I am not well set to rebuild G1. Is there a way I could get the java binary/executable with the patch included? Sorry, I'm not well equipped to distribute a binary (and there might be licensing issues, I'm not even sure). -Todd From: Todd Lipcon [mailto:todd at cloudera.com] Sent: Friday, April 01, 2011 12:05 PM To: Alex Aisinzon Cc: hotspot-gc-use at openjdk.java.net Subject: Re: G1 feedback Hi Alex, I've had similar results - see my threads from a few months back on this mailing list. The summary is that, when there is a tight pause bound, some regions will accumulate which have estimates that are "stuck" higher than the goal. As these accumulate, they're never collected, which means that memory usage slowly grows until a full GC is required. The two situations that caused this in my experiments were: 1) The JVM got context-switched out for several scheduling quanta during the "other" portion of a non-young region collection. 
Because "other" time is considered constant overhead, even mostly-garbage regions were carrying this as part of their estimate, causing all non-young regions to be deemed "too expensive". I fixed this with a patch to notice when the "other time" estimate was greater than the pause goal and decay it back towards 0. 2) A bad region accumulates many inter-region references in its "remember set", overflowing into the coarse rset. Once the coarse rset has been used, there's no facility to "uncoarsen" the rset entries even after all the referring regions are dead. If the number of coarse entries is high enough that the time estimate is greater than the pause time goal, then again, this region will never be collected, and memory will fill up until a full GC. I wrote a patch to improve the time estimation for coarse rset entries based on the liveness info in the coarse regions. This helped somewhat for my application. Let me know if you'd like to try these patches out, I can dig them up again (or you might find them in the mailing list archives). -Todd On Fri, Apr 1, 2011 at 11:48 AM, Alex Aisinzon wrote: Hi all Thoughts on this feedback about G1? Take care Alex A From: Alex Aisinzon Sent: Saturday, March 26, 2011 6:46 AM To: hotspot-gc-use at openjdk.java.net Subject: G1 feedback Hi all I experimented with G1 and Sun JDK 1.6 update 24 and ran two long running tests (8 hours) with it: With " -server -XX:+UseG1GC -XX:+UseCompressedOops -Xms24576m -Xmx24576m", most pauses were very short. 4 pauses were around 7 seconds and 4 at 34 seconds. I then added the objective of keeping the longest pause around 1 second and used "-server -XX:+UseG1GC -XX:MaxGCPauseMillis=1000 -XX:+UseCompressedOops -Xms24576m -Xmx24576m". Most pauses were a little above 1 second except for one pause at 8 seconds and one at 57 seconds. The server is a dual X5570 (8 cores total) and has 48GB of RAM. Its average CPU utilization was around 60-65% so it was not over-used. 
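Alex's own numbers put the outliers in perspective: the eight worst pauses consume well under 1% of the 8-hour run, so this is a latency problem rather than a throughput one. A quick back-of-the-envelope check (figures taken from the report above):

```python
run_s = 8 * 3600                   # 8-hour test run
outlier_pause_s = 4 * 7 + 4 * 34   # four ~7 s and four ~34 s pauses
fraction = outlier_pause_s / run_s # ~0.0057, i.e. under 0.6% of wall time
```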
Everything would be perfect if it were not for the 7, 34 and 8, 57 seconds pauses. What would you recommend I do to either reduce these longer pauses or give insights into what happened so that G1 can avoid these very rare but pretty long pauses in the future? Thanks in advance Alex A _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -- Todd Lipcon Software Engineer, Cloudera -- Todd Lipcon Software Engineer, Cloudera -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110401/e0bcb9d4/attachment-0001.html From todd at cloudera.com Fri Apr 1 14:15:25 2011 From: todd at cloudera.com (Todd Lipcon) Date: Fri, 1 Apr 2011 14:15:25 -0700 Subject: G1 feedback In-Reply-To: References: Message-ID: Here's a patch of all my local changes against a checkout from a couple months back (may not completely apply against JDK7 trunk). This has the fixes mentioned as well as a few other experiments I'd done. I expressly grant permission to the JDK team to include this patch or parts thereof in the JDK should they find the code useful. -Todd On Fri, Apr 1, 2011 at 2:07 PM, Alex Aisinzon wrote: > Todd > > > > Please share the patch. I will see if I can build it. > > > > Regards > > > > Alex Aisinzon > > > > *From:* Todd Lipcon [mailto:todd at cloudera.com] > *Sent:* Friday, April 01, 2011 1:55 PM > > *To:* Alex Aisinzon > *Cc:* hotspot-gc-use at openjdk.java.net > *Subject:* Re: G1 feedback > > > > On Fri, Apr 1, 2011 at 1:52 PM, Alex Aisinzon > wrote: > > Hi Todd > > > > This is very interesting. I feel I need to read some additional material > about G1 to fully understand your explanation. I guess the research paper on > it would help. I will plan to read it. > > In any case, I am happy to give your patch a try. The challenge is that I > am not well set to rebuild G1. 
Is there a way I could get the java > binary/executable with the patch included? > > > > > > Sorry, I'm not well equipped to distribute a binary (and there might be > licensing issues, I'm not even sure). > > > > -Todd > > > > > > *From:* Todd Lipcon [mailto:todd at cloudera.com] > *Sent:* Friday, April 01, 2011 12:05 PM > *To:* Alex Aisinzon > *Cc:* hotspot-gc-use at openjdk.java.net > *Subject:* Re: G1 feedback > > > > Hi Alex, > > > > I've had similar results - see my threads from a few months back on this > mailing list. > > > > The summary is that, when there is a tight pause bound, some regions will > accumulate which have estimates that are "stuck" higher than the goal. As > these accumulate, they're never collected, which means that memory usage > slowly grows until a full GC is required. > > > > The two situations that caused this in my experiments were: > > 1) The JVM got context-switched out for several scheduling quanta during > the "other" portion of a non-young region collection. Because "other" time > is considered constant overhead, even mostly-garbage regions were carrying > this as part of their estimate, causing all non-young regions to be deemed > "too expensive". I fixed this with a patch to notice when the "other time" > estimate was greater than the pause goal and decay it back towards 0. > > > > 2) A bad region accumulates many inter-region references in its "remember > set", overflowing into the coarse rset. Once the coarse rset has been used, > there's no facility to "uncoarsen" the rset entries even after all the > referring regions are dead. If the number of coarse entries is high enough > that the time estimate is greater than the pause time goal, then again, this > region will never be collected, and memory will fill up until a full GC. I > wrote a patch to improve the time estimation for coarse rset entries based > on the liveness info in the coarse regions. This helped somewhat for my > application. 
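The second fix — pricing coarse rset entries by liveness — can be sketched the same way. This is a hedged Python illustration with invented names and costs; the real patch modified HotSpot's C++ time-estimation code:

```python
def coarse_rset_scan_ms(n_coarse_entries, cost_per_entry_ms, live_fraction):
    """Instead of charging full price for every coarse rset entry,
    weight the scan estimate by how much of the referring coarse
    regions is actually live -- dead referrers need no scanning."""
    assert 0.0 <= live_fraction <= 1.0
    return n_coarse_entries * cost_per_entry_ms * live_fraction
```

A region with 10,000 coarse entries at 0.2 ms each looks like a 2-second scan when priced naively, but only about 200 ms if the coarse regions are 10% live — cheap enough to fit under a 1-second pause goal.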
> > > > Let me know if you'd like to try these patches out, I can dig them up again > (or you might find them in the mailing list archives). > > > > -Todd > > On Fri, Apr 1, 2011 at 11:48 AM, Alex Aisinzon > wrote: > > Hi all > > > > Thoughts on this feedback about G1? > > > > Take care > > > > Alex A > > > > *From:* Alex Aisinzon > *Sent:* Saturday, March 26, 2011 6:46 AM > *To:* hotspot-gc-use at openjdk.java.net > *Subject:* G1 feedback > > > > Hi all > > > > I experimented with G1 and Sun JDK 1.6 update 24 and ran two long running > tests (8 hours) with it: > > With " -server -XX:+UseG1GC -XX:+UseCompressedOops -Xms24576m -Xmx24576m", > most pauses were very short. 4 pauses were around 7 seconds and 4 at 34 > seconds. > > I then added the objective of keeping the longest pause around 1 second and > used "-server -XX:+UseG1GC -XX:MaxGCPauseMillis=1000 -XX:+UseCompressedOops > -Xms24576m -Xmx24576m". Most pauses were a little above 1 second except for > one pause at 8 seconds and one at 57 seconds. > > The server is a dual X5570 (8 cores total) and has 48GB of RAM. Its average > CPU utilization was around 60-65% so it was not over-used. > > Everything would be perfect if it were not for the 7, 34 and 8, 57 seconds > pauses. > > What would you recommend I do to either reduce these longer pauses or give > insights into what happened so that G1 can avoid these very rare but pretty > long pauses in the future? > > > > Thanks in advance > > > > Alex A > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > -- > Todd Lipcon > Software Engineer, Cloudera > > > > > -- > Todd Lipcon > Software Engineer, Cloudera > -- Todd Lipcon Software Engineer, Cloudera -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110401/18da6b85/attachment-0001.html -------------- next part -------------- A non-text attachment was scrubbed... Name: jdk7-g1-fixes.patch Type: text/x-patch Size: 36582 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110401/18da6b85/attachment-0001.bin From y.s.ramakrishna at oracle.com Wed Apr 6 16:53:01 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Wed, 06 Apr 2011 16:53:01 -0700 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: References: <4D95FD9B.9080909@oracle.com> Message-ID: <4D9CFCDD.9040000@oracle.com> Hi Bartek -- On 04/01/11 10:40, Bartek Markocki wrote: > Hi Ramki, > > On Fri, Apr 1, 2011 at 6:30 PM, Y. Srinivas Ramakrishna > wrote: >> Try -XX:+CMSScavengeBeforeRemark as a temporary workaround >> for this, and let us know if the performance is reasonable >> or not. > We will try to push the +CMSScavengeBeforeRemark to our production but > as we are talking about the production environment it might take some > time to return to you with the results. > >> I'll look at your log (can you send me your whole GC log, >> showing the problem, off-list?). > Just did. > >> I think there's probably an open CR for this, which i'll >> dig up for you. > Thanks a lot! The CR I had in mind is this one:- 6990419 CMS: Remaining work for 6572569: consistently skewed work distribution in (long) re-mark pauses It's an RFE, and I added you to the "Service Request". If you have a support contract with Oracle, please send the SR# to your support engineer, so he can do the needful. I looked at your logs and it seems very much like the problem I mention in this CR, although data from -XX:PrintCMSStatistics=2 would help ascertain if that was the issue.
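Bartek's report — remark pauses firing while young-gen occupancy is far past the ~50% abort point — can be checked mechanically against a PrintGCDetails log. Below is a hedged Python sketch; the regex assumes exactly the log shape quoted in this thread:

```python
import re

# Matches remark lines of the form:
#   1166759.549: [GC[YG occupancy: 791197 K (943744 K)]...
REMARK_RE = re.compile(r"\[GC\[YG occupancy: (\d+) K \((\d+) K\)\]")

def yg_occupancy_percent(log_text):
    """Young-gen occupancy (%) at each CMS remark pause in the log."""
    return [100.0 * int(used) / int(total)
            for used, total in REMARK_RE.findall(log_text)]
```

For the third log sample in this thread, 791197 K of 943744 K gives roughly 83.8% occupancy at remark — far beyond the point at which the abortable preclean was expected to bail out.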
(Note to self, make PrintCMSStatistics a manageable flag so it can be turned on in a live JVM rather than to restart a fresh run; ditto for CMSScavengeBeforeRemark: i'll file RFE's for those, although not sure when we can get them done.) An alternative workaround that might also work for you would be -XX:CMSWaitDuration=X where X = at least two times the maximum interscavenge duration observed by yr application. (The RFE is to, among other things, ergonomify that setting.) -- ramki > > Bartek > > >> On 4/1/2011 2:16 AM, Bartek Markocki wrote: >>> Hi all, >>> >>> Can I ask any of you to review the attached extracts from our >>> production GC log and share your thoughts about them? >>> >>> We have a router-type web application running under tomcat 6.0.28 with >>> Java 1.6.0_21 (64bit) on RHEL 5.2 (2.6.18-92.1.22.el5). The GC >>> settings are: >>> -Xmx2048m -Xms2048m -XX:NewSize=1024m >>> -XX:PermSize=64m -XX:MaxPermSize=128m >>> -XX:ThreadStackSize=128 >>> -XX:+DisableExplicitGC >>> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC >>> -XX:+PrintGCDetails >>> >>> What we did: >>> >>> Lately we have changed from ParallelOld to CMS due to unacceptable >>> long Full GC pauses times. In preparation to the change of the >>> collector we performed a lot of GC tuning related tests and found out >>> that the above (simple) set of settings fulfill our needs in the best >>> way. >>> So far we are happy with what we see (frequency of minor scans/CMS >>> cycles, times of STW pauses) with one exception. >>> >>> >>> What is the problem: >>> >>> Some of our remark phases last much longer than others (up to 8 times >>> on avg.). Normal remark phase lasts between 55 and 90ms, the longest >>> one lasted for 538ms. >>> At first we thought that this is due to aborting the preceding >>> abortable-preclean phase. After a closer look we found out that >>> depending on the volume of traffic (i.e., time of day) in fact some of >>> our abortable-preclean phases are aborted due to time limit (5sec). 
>>> Despite that most of the following remark phases times still are >>> within acceptable limit (up to 100ms). So we kept digging. As a result >>> of that we found out that the abnormal long remark phases are preceded >>> by aborted abortable-preclean phase. The phase was always aborted due >>> to the time limit however if we have a look at the following report >>> for the young generation occupancy in all cases we were able to find >>> that YG was occupied in far more than 50%. >>> Per my (current :)) understanding the abortable-preclean phase can be >>> aborted due of the time limit or because YG got full in about 50% (so >>> remark phase will happen midway during two minor collections) - >>> whatever comes first. In our case the 'about 50%' condition is not >>> executed and the phase continues until it hits the time limit. The >>> following remark phase always last longer, i.e., 350-550ms. >>> >>> >>> The big question: >>> >>> What can we do to cut down the time of those long lasting remark phases? >>> >>> >>> Below I enclose three samples from our GC log presenting: >>> first one - a CMS cycle that aborted the abortable-preclean phase due >>> time limit and the following remark phase does not show the abnormal >>> behavior. >>> second one - an "ideal" CMS cycle >>> third one - a CMS cycle with aborted the abortable-preclean phase (due >>> to time limit even though YG occupancy is much greater than 50%) and >>> the following remark phase lasts for 0.5second. 
>>> >>> -- >>> 1142110.458: [GC 1142110.458: [ParNew: 888646K->45370K(943744K), >>> 0.0728880 secs] 1852227K->1013124K(1992320K), 0.0739250 secs] [Times: >>> user=0.33 sys=0.01, real=0.07 secs] >>> 1142110.547: [GC [1 CMS-initial-mark: 967753K(1048576K)] >>> 1013331K(1992320K), 0.0540170 secs] [Times: user=0.06 sys=0.00, >>> real=0.05 secs] >>> 1142110.602: [CMS-concurrent-mark-start] >>> 1142111.010: [CMS-concurrent-mark: 0.408/0.408 secs] [Times: user=1.96 >>> sys=0.07, real=0.41 secs] >>> 1142111.011: [CMS-concurrent-preclean-start] >>> 1142111.028: [CMS-concurrent-preclean: 0.016/0.017 secs] [Times: >>> user=0.02 sys=0.00, real=0.02 secs] >>> 1142111.028: [CMS-concurrent-abortable-preclean-start] >>> CMS: abort preclean due to time 1142116.036: >>> [CMS-concurrent-abortable-preclean: 4.858/5.007 secs] [Times: >>> user=7.31 sys=0.57, real=5.00 secs] >>> 1142116.050: [GC[YG occupancy: 409639 K (943744 K)]1142116.051: >>> [Rescan (parallel) , 0.0389910 secs]1142116.090: [weak refs >>> processing, 0.0156130 secs] [1 CMS-remark: 967753K(1048576K)] >>> 1377393K(1992320K), 0.0554700 secs] [Times: user=0.50 sys=0.00, >>> real=0.06 secs] >>> 1142116.107: [CMS-concurrent-sweep-start] >>> 1142117.721: [CMS-concurrent-sweep: 1.614/1.614 secs] [Times: >>> user=2.41 sys=0.24, real=1.61 secs] >>> 1142117.721: [CMS-concurrent-reset-start] >>> 1142117.732: [CMS-concurrent-reset: 0.010/0.010 secs] [Times: >>> user=0.01 sys=0.00, real=0.01 secs] >>> 1142121.278: [GC 1142121.279: [ParNew: 884282K->52652K(943744K), >>> 0.0680850 secs] 1200273K->372087K(1992320K), 0.0690040 secs] [Times: >>> user=0.29 sys=0.01, real=0.07 secs] >>> 1142133.508: [GC 1142133.508: [ParNew: 891564K->47435K(943744K), >>> 0.0682080 secs] 1210999K->370280K(1992320K), 0.0691030 secs] [Times: >>> user=0.29 sys=0.01, real=0.07 secs] >>> -- >>> 1165584.305: [GC 1165584.305: [ParNew: 896212K->59055K(943744K), >>> 0.0761290 secs] 1857148K->1023947K(1992320K), 0.0771330 secs] [Times: >>> user=0.33 sys=0.00, 
real=0.08 secs] >>> 1165584.398: [GC [1 CMS-initial-mark: 964891K(1048576K)] >>> 1024053K(1992320K), 0.0631010 secs] [Times: user=0.06 sys=0.00, >>> real=0.06 secs] >>> 1165584.463: [CMS-concurrent-mark-start] >>> 1165584.933: [CMS-concurrent-mark: 0.423/0.471 secs] [Times: user=2.40 >>> sys=0.21, real=0.47 secs] >>> 1165584.934: [CMS-concurrent-preclean-start] >>> 1165584.954: [CMS-concurrent-preclean: 0.018/0.021 secs] [Times: >>> user=0.05 sys=0.00, real=0.02 secs] >>> 1165584.955: [CMS-concurrent-abortable-preclean-start] >>> 1165587.876: [CMS-concurrent-abortable-preclean: 2.884/2.921 secs] >>> [Times: user=5.51 sys=0.65, real=2.92 secs] >>> 1165587.892: [GC[YG occupancy: 479051 K (943744 K)]1165587.892: >>> [Rescan (parallel) , 0.0746810 secs]1165587.967: [weak refs >>> processing, 0.0168870 secs] [1 CMS-remark: 964891K(1048576K)] >>> 1443943K(1992320K), 0.0925600 secs] [Times: user=0.91 sys=0.01, >>> real=0.09 secs] >>> 1165587.986: [CMS-concurrent-sweep-start] >>> 1165589.670: [CMS-concurrent-sweep: 1.684/1.684 secs] [Times: >>> user=3.39 sys=0.46, real=1.69 secs] >>> 1165589.671: [CMS-concurrent-reset-start] >>> 1165589.679: [CMS-concurrent-reset: 0.009/0.009 secs] [Times: >>> user=0.01 sys=0.00, real=0.01 secs] >>> 1165591.354: [GC 1165591.354: [ParNew: 897967K->54984K(943744K), >>> 0.0862910 secs] 1236513K->397404K(1992320K), 0.0872930 secs] [Times: >>> user=0.34 sys=0.00, real=0.09 secs] >>> 1165598.887: [GC 1165598.888: [ParNew: 893896K->52086K(943744K), >>> 0.0885510 secs] 1236316K->398587K(1992320K), 0.0895820 secs] [Times: >>> user=0.31 sys=0.01, real=0.09 secs] >>> -- >>> 1166753.770: [GC 1166753.770: [ParNew: 899148K->57315K(943744K), >>> 0.0782510 secs] 1862058K->1024198K(1992320K), 0.0793040 secs] [Times: >>> user=0.32 sys=0.01, real=0.08 secs] >>> 1166753.867: [GC [1 CMS-initial-mark: 966883K(1048576K)] >>> 1024305K(1992320K), 0.0642680 secs] [Times: user=0.07 sys=0.00, >>> real=0.07 secs] >>> 1166753.932: [CMS-concurrent-mark-start] >>> 
1166754.471: [CMS-concurrent-mark: 0.486/0.538 secs] [Times: user=2.76 >>> sys=0.28, real=0.54 secs] >>> 1166754.471: [CMS-concurrent-preclean-start] >>> 1166754.488: [CMS-concurrent-preclean: 0.015/0.017 secs] [Times: >>> user=0.04 sys=0.00, real=0.01 secs] >>> 1166754.488: [CMS-concurrent-abortable-preclean-start] >>> CMS: abort preclean due to time 1166759.533: >>> [CMS-concurrent-abortable-preclean: 4.895/5.044 secs] [Times: >>> user=9.75 sys=1.21, real=5.05 secs] >>> 1166759.549: [GC[YG occupancy: 791197 K (943744 K)]1166759.549: >>> [Rescan (parallel) , 0.5387660 secs]1166760.088: [weak refs >>> processing, 0.0139780 secs] [1 CMS-remark: 966883K(1048576K)] >>> 1758080K(1992320K), 0.5537750 secs] [Times: user=5.58 sys=0.06, >>> real=0.56 secs] >>> 1166760.105: [CMS-concurrent-sweep-start] >>> 1166760.688: [GC 1166760.689: [ParNew: 896188K->57161K(943744K), >>> 0.0727850 secs] 1623884K->788963K(1992320K), 0.0737390 secs] [Times: >>> user=0.31 sys=0.02, real=0.08 secs] >>> 1166761.593: [CMS-concurrent-sweep: 1.363/1.488 secs] [Times: >>> user=3.48 sys=0.49, real=1.49 secs] >>> 1166761.593: [CMS-concurrent-reset-start] >>> 1166761.602: [CMS-concurrent-reset: 0.009/0.009 secs] [Times: >>> user=0.02 sys=0.01, real=0.01 secs] >>> 1166767.947: [GC 1166767.948: [ParNew: 896053K->58188K(943744K), >>> 0.0817680 secs] 1238926K->404605K(1992320K), 0.0828270 secs] [Times: >>> user=0.31 sys=0.01, real=0.08 secs] >>> -- >>> >>> Thank you in advance, >>> Bartek >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> From y.s.ramakrishna at oracle.com Wed Apr 6 16:59:54 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Wed, 06 Apr 2011 16:59:54 -0700 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? 
In-Reply-To: <4D9CFCDD.9040000@oracle.com> References: <4D95FD9B.9080909@oracle.com> <4D9CFCDD.9040000@oracle.com> Message-ID: <4D9CFE7A.6040901@oracle.com> Typo corrected:- On 04/06/11 16:53, Y. S. Ramakrishna wrote: ... > An alternative workaround that might also work > for you would be -XX:CMSWaitDuration=X That should have been: -XX:CMSMaxAbortablePrecleanTime=X > where X = at least two times the maximum interscavenge > duration observed by yr application. (The RFE > is to, among other things, ergonomify that setting.) Sorry about the typo. -- ramki From shane.cox at gmail.com Fri Apr 8 04:40:27 2011 From: shane.cox at gmail.com (Shane Cox) Date: Fri, 8 Apr 2011 07:40:27 -0400 Subject: understanding CMS logs Message-ID: I'm having trouble reconciling the timings in the following log entries. Could someone please explain? Take the first example (concurrent mark). The CPU time was ~25 seconds, and the elapsed time was ~43 seconds. So how can "user" time be 283 seconds? There are only 2 Parallel CMS threads. I would think user time would be more along the lines of 50 seconds. Obviously there is something that I'm misinterpreting. Also, if CPU time is 25 seconds but elapsed time is 43 seconds, does that mean the concurrent mark spent 18 seconds doing something other than executing on CPU? I'm wondering if some condition is extending the CM execution time because it cannot obtain adequate CPU resources.
Log excerpts: 2011-04-05T19:20:38.944-0400: 164034.340: [CMS-concurrent-mark: 25.221/43.448 secs] [Times: user=283.46 sys=8.05, real=43.44 secs] 2011-04-05T19:20:51.735-0400: 164047.131: [CMS-concurrent-preclean: 10.211/17.222 secs] [Times: user=72.78 sys=2.24, real=17.22 secs] GC Threads: "Gang worker#0 (Parallel GC Threads)" prio=10 tid=0x000000005f598000 nid=0x2ab5 runnable "Gang worker#1 (Parallel GC Threads)" prio=10 tid=0x000000005f59a000 nid=0x2ab6 runnable "Gang worker#2 (Parallel GC Threads)" prio=10 tid=0x000000005f59c000 nid=0x2ab7 runnable "Gang worker#3 (Parallel GC Threads)" prio=10 tid=0x000000005f59d800 nid=0x2ab8 runnable "Gang worker#4 (Parallel GC Threads)" prio=10 tid=0x000000005f59f800 nid=0x2ab9 runnable "Gang worker#5 (Parallel GC Threads)" prio=10 tid=0x000000005f5a1800 nid=0x2aba runnable "Gang worker#6 (Parallel GC Threads)" prio=10 tid=0x000000005f5a3000 nid=0x2abb runnable "Gang worker#7 (Parallel GC Threads)" prio=10 tid=0x000000005f5a5000 nid=0x2abc runnable "Concurrent Mark-Sweep GC Thread" prio=10 tid=0x000000005f698800 nid=0x2ac3 runnable "Gang worker#0 (Parallel CMS Threads)" prio=10 tid=0x000000005f694800 nid=0x2ac1 runnable "Gang worker#1 (Parallel CMS Threads)" prio=10 tid=0x000000005f696800 nid=0x2ac2 runnable Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110408/3f6d5350/attachment.html From bartosz.markocki at gmail.com Fri Apr 8 05:25:41 2011 From: bartosz.markocki at gmail.com (Bartek Markocki) Date: Fri, 8 Apr 2011 14:25:41 +0200 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: <4D9CFE7A.6040901@oracle.com> References: <4D95FD9B.9080909@oracle.com> <4D9CFCDD.9040000@oracle.com> <4D9CFE7A.6040901@oracle.com> Message-ID: Hi Ramki, Thanks for the information. Currently we are mid way to our production with the scavenge before remark option enabled. 
I'll try my best to add CMS statistics to the list of changes. As a separate action we will try to increase the max abortable preclean time from 5 to 11-12 seconds. I will update you as soon as we have the data. Thanks, Bartek On Thu, Apr 7, 2011 at 1:59 AM, Y. S. Ramakrishna wrote: > Typo corrected:- > > > On 04/06/11 16:53, Y. S. Ramakrishna wrote: > ... >> >> An alternative workaround that might also work >> for you would be -XX:CMSWaitDuration=X > > That should have been: > > -XX:CMSMaxAbortablePrecleanTime=X > >> where X = at least two times the maximum interscavenge >> duration observed by yr application. (The RFE >> is to, among other things, ergonomify that setting.) > > Sorry about the typo. > -- ramki > > From y.s.ramakrishna at oracle.com Fri Apr 8 16:21:10 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Fri, 08 Apr 2011 16:21:10 -0700 Subject: understanding CMS logs In-Reply-To: References: Message-ID: <4D9F9866.7080906@oracle.com> Hi Shane -- I transferred the snippet from below for easy reference:- >> Log excerpts: >> 2011-04-05T19:20:38.944-0400: 164034.340: [CMS-concurrent-mark: >> 25.221/43.448 secs] [Times: user=283.46 sys=8.05, real=43.44 secs] >> >> 2011-04-05T19:20:51.735-0400: 164047.131: [CMS-concurrent-preclean: >> 10.211/17.222 secs] [Times: user=72.78 sys=2.24, real=17.22 secs] >> ... > I'm having trouble reconciling the timings in the following log entries. > Could someone please explain? Take the first example (concurrent mark). > The cpu time was ~ 25 seconds, elapsed time was ~ 43 seconds. So how can > "user" time be 283 seconds? There are only 2 Parallel CMS threads. I would > think User time would be more along the lines of 50 seconds. Obviously > there is something that I'm misinterpreting. Yes and no. The interpretation that CMS concurrent mark took about 25 s of wall clock time executing, out of a total wall clock time of 43 seconds, is indeed correct. It is also correct that there are only 2 marking threads.
The [Times:...] part is however misleading in the sense that it is the time for the whole JVM process, not just the virtual time for the marking threads. In other words, for the real elapsed time of 43 seconds, the entire process executed 283.46 virtual seconds in user mode and 8.05 virtual seconds in system mode. I agree that this can be misleading as presented because it may be interpreted as virtual time attributed just to the marking threads, which is what you did. > > > Also, If cpu time is 25 seconds but elapsed time is 43 seconds, does that > mean the CM spent 18 seconds doing something other than executing on CPU? That's correct. Typically, what might happen is that some amount of time may be spent for foreground GC (scavenge) work, or waiting for locks, during which time the marking threads may not be running. Also this time is an upper limit on the actual time that they may have been executing on cpu because they are calculated by means of bracketing hi-res timers, rather than as virtual cpu time. > I'm wondering if some condition is extending the CM execution time because > it cannot obtain adequate cpu resources. One would have to look at the complete logs to understand. You'd first want to total up all of the foreground GC time spent in between and see how much of a balance is left from those 18 seconds. There may be other STW operations (such as bulk bias revocation) that might also interrupt the concurrent marking, as well as maybe (but I am not certain without checking the code) direct allocations into the old generation or class loading which can cause allocation into the perm gen.
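(Editor's illustration of the arithmetic behind that interpretation; the class and method names below are illustrative, not part of any JVM API. The ratio user/real is the average number of CPUs the whole process kept busy during the phase.)

```java
// Sketch: extract the user/real fields from a HotSpot [Times: ...] record
// and compute user/real, the average number of CPUs the whole JVM process
// kept busy during the phase. For the concurrent-mark record quoted above,
// 283.46 / 43.44 is roughly 6.5 -- far more than the 2 CMS marking threads,
// which is only consistent with [Times: ...] covering the entire process.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CmsTimes {
    private static final Pattern TIMES = Pattern.compile(
            "user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+) secs");

    // Average CPUs busy over the phase: virtual user time / wall-clock time.
    static double impliedCpus(String logLine) {
        Matcher m = TIMES.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("no [Times: ...] field: " + logLine);
        }
        double user = Double.parseDouble(m.group(1));
        double real = Double.parseDouble(m.group(3));
        return user / real;
    }

    public static void main(String[] args) {
        String mark = "[CMS-concurrent-mark: 25.221/43.448 secs] "
                + "[Times: user=283.46 sys=8.05, real=43.44 secs]";
        System.out.printf("implied busy CPUs: %.1f%n", impliedCpus(mark));
    }
}
```

Running this against the two records quoted above gives about 6.5 busy CPUs for the concurrent mark and about 4.2 for the preclean, both well above the 2 CMS worker threads.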
-- ramki > > > GC Threads: > "Gang worker#0 (Parallel GC Threads)" prio=10 tid=0x000000005f598000 > nid=0x2ab5 runnable > "Gang worker#1 (Parallel GC Threads)" prio=10 tid=0x000000005f59a000 > nid=0x2ab6 runnable > "Gang worker#2 (Parallel GC Threads)" prio=10 tid=0x000000005f59c000 > nid=0x2ab7 runnable > "Gang worker#3 (Parallel GC Threads)" prio=10 tid=0x000000005f59d800 > nid=0x2ab8 runnable > "Gang worker#4 (Parallel GC Threads)" prio=10 tid=0x000000005f59f800 > nid=0x2ab9 runnable > "Gang worker#5 (Parallel GC Threads)" prio=10 tid=0x000000005f5a1800 > nid=0x2aba runnable > "Gang worker#6 (Parallel GC Threads)" prio=10 tid=0x000000005f5a3000 > nid=0x2abb runnable > "Gang worker#7 (Parallel GC Threads)" prio=10 tid=0x000000005f5a5000 > nid=0x2abc runnable > > "Concurrent Mark-Sweep GC Thread" prio=10 tid=0x000000005f698800 nid=0x2ac3 > runnable > "Gang worker#0 (Parallel CMS Threads)" prio=10 tid=0x000000005f694800 > nid=0x2ac1 runnable > "Gang worker#1 (Parallel CMS Threads)" prio=10 tid=0x000000005f696800 > nid=0x2ac2 runnable > > > Thanks! > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From angarita.rafael at gmail.com Wed Apr 13 10:25:56 2011 From: angarita.rafael at gmail.com (Rafael Angarita) Date: Wed, 13 Apr 2011 12:55:56 -0430 Subject: Java heap space, GC, and Promotion Failed In-Reply-To: References: Message-ID: Hello everybody, I'm building a code generation application as an Eclipse plug-in, and one of my test projects contains around 15000 source files. My application started having memory problems, so after doing some optimizations specific to the framework I'm using to develop my DSL, I started learning about GC, but I think I'm still lost. I have tried with different JVM options for the GC with no success.
Currently, I'm trying: -Xms2000m -Xmx2000m -verbosegc -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing -XX:CMSInitiatingOccupancyFraction=5 -XX:MaxTenuringThreshold=300 -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSIncrementalDutyCycleMin=1 but this is just one of the several things I have tried. At first everything seems to go fine, but after a while I get "promotion failed" and everything gets really slow, and finally the application crashes with java.lang.OutOfMemoryError: Java heap space. CMS-concurrent-abortable-preclean: 0.070/0.587 secs] [Times: user=0.66 sys=0.02, real=0.58 secs] [GC[YG occupancy: 28574 K (29504 K)][Rescan (parallel) , 0.0198420 secs][weak refs processing, 0.0015200 secs] [1 CMS-remark: 1961760K(2015232K)] 1990335K(2044736K), 0.0215890 secs] [Times: user=0.03 sys=0.00, real=0.03 secs] [GC [ParNew: 29096K->2087K(29504K), 0.0270900 secs] 1990735K->1965523K(2044736K) icms_dc=100 , 0.0271650 secs] [Times: user=0.05 sys=0.00, real=0.03 secs] [GC [ParNew: 28327K->3264K(29504K), 0.0430410 secs] 1990448K->1969326K(2044736K) icms_dc=100 , 0.0431180 secs] [Times: user=0.07 sys=0.01, real=0.04 secs] [GC [ParNew: 29504K->3264K(29504K), 0.0658260 secs] 1995091K->1975795K(2044736K) icms_dc=100 , 0.0659090 secs] [Times: user=0.11 sys=0.00, real=0.07 secs] [GC [ParNew: 29504K->3264K(29504K), 0.0630250 secs] 2001944K->1982760K(2044736K) icms_dc=100 , 0.0631060 secs] [Times: user=0.11 sys=0.00, real=0.06 secs] [GC [ParNew: 29504K->3263K(29504K), 0.0435130 secs] 2008711K->1985752K(2044736K) icms_dc=100 , 0.0436310 secs] [Times: user=0.07 sys=0.00, real=0.04 secs] [CMS-concurrent-sweep: 1.813/2.058 secs] [Times: user=3.76 sys=0.02, real=2.05 secs] [CMS-concurrent-reset: 0.035/0.035 secs] [Times: user=0.06 sys=0.00, real=0.04 secs] [GC [ParNew (promotion failed): 29503K->29504K(29504K), 0.5729750 secs][CMS[Unloading class sun.reflect.GeneratedConstructorAccessor6] [Unloading class
sun.reflect.GeneratedConstructorAccessor26] [Unloading class sun.reflect.GeneratedMethodAccessor9] [Unloading class sun.reflect.GeneratedConstructorAccessor17] [Unloading class sun.reflect.GeneratedConstructorAccessor20] [Unloading class sun.reflect.GeneratedMethodAccessor4] [Unloading class sun.reflect.GeneratedMethodAccessor8] [Unloading class sun.reflect.GeneratedConstructorAccessor25] [Unloading class sun.reflect.GeneratedMethodAccessor18] [Unloading class sun.reflect.GeneratedMethodAccessor17] [Unloading class sun.reflect.GeneratedConstructorAccessor27] [Unloading class sun.reflect.GeneratedConstructorAccessor19] [Unloading class sun.reflect.GeneratedConstructorAccessor12] [Unloading class sun.reflect.GeneratedMethodAccessor2] [Unloading class sun.reflect.GeneratedConstructorAccessor14] [Unloading class sun.reflect.GeneratedConstructorAccessor28] [Unloading class sun.reflect.GeneratedConstructorAccessor5] [Unloading class sun.reflect.GeneratedMethodAccessor16] [Unloading class sun.reflect.GeneratedMethodAccessor19] [Unloading class sun.reflect.GeneratedConstructorAccessor9] [Unloading class sun.reflect.GeneratedConstructorAccessor11] [Unloading class sun.reflect.GeneratedConstructorAccessor8] [Unloading class sun.reflect.GeneratedConstructorAccessor29] [Unloading class sun.reflect.GeneratedMethodAccessor3] [Unloading class sun.reflect.GeneratedConstructorAccessor24] [Unloading class sun.reflect.GeneratedConstructorAccessor18] [Unloading class sun.reflect.GeneratedMethodAccessor15] [Unloading class sun.reflect.GeneratedConstructorAccessor10] [Unloading class sun.reflect.GeneratedConstructorAccessor16] [Unloading class sun.reflect.GeneratedConstructorAccessor15] [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: user=10.95 sys=0.02, real=9.00 secs] (concurrent mode failure): 2008891K->2014802K(2015232K), 24.2053380 secs] 2038395K->2014802K(2044736K), [CMS Perm : 50779K->50779K(86244K)] icms_dc=100 , 24.2054320 secs] [Times: user=24.16 sys=0.03, 
real=24.20 secs] [GC [1 CMS-initial-mark: 2014802K(2015232K)] 2015335K(2044736K), 0.0023250 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] This is the end of the output: Heap par new generation total 29504K, used 23591K [0x2e8b0000, 0x308b0000, 0x308b0000) eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, 0x30250000) from space 3264K, 100% used [0x30580000, 0x308b0000, 0x308b0000) to space 3264K, 0% used [0x30250000, 0x30250000, 0x30580000) concurrent mark-sweep generation total 2015232K, used 61766K [0x308b0000, 0xab8b0000, 0xab8b0000) concurrent-mark-sweep perm gen total 87828K, used 52696K [0xab8b0000, 0xb0e75000, 0xb38b0000) I would appreciate it if anybody could give me some advice about this. Thank you very much for your help. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110413/0572df06/attachment.html From y.s.ramakrishna at oracle.com Wed Apr 13 12:28:31 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Wed, 13 Apr 2011 12:28:31 -0700 Subject: Java heap space, GC, and Promotion Failed In-Reply-To: References: Message-ID: <4DA5F95F.1010101@oracle.com> Hi Rafael -- Looks like you need more heap: size your -Xmx bigger to accommodate all of the objects that your Eclipse project creates. Here's the state of the old gen in the penultimate display:- >> [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: user=10.95 >> sys=0.02, real=9.00 secs] >> (concurrent mode failure): 2008891K->2014802K(2015232K), 24.2053380 >> secs] 2038395K->2014802K(2044736K), [CMS Perm : 50779K->50779K(86244K)] >> icms_dc=100 , 24.2054320 secs] [Times: user=24.16 sys=0.03, real=24.20 >> secs] >> [GC [1 CMS-initial-mark: 2014802K(2015232K)] 2015335K(2044736K), >> 0.0023250 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] The last line shows that the old gen has:- 2015232 - 2014802 = 430 KB of free space.
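(Editor's illustration: that subtraction reads straight off the "occupancy-before->occupancy-after(capacity)" shape of HotSpot's generation records; a minimal sketch with illustrative names, not any JVM API.)

```java
// Sketch: parse a HotSpot generation record of the form
// "before->after(capacity)" (all sizes in KB) and compute the free space
// left after the collection: capacity - occupancy-after.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GenFree {
    private static final Pattern REC =
            Pattern.compile("(\\d+)K->(\\d+)K\\((\\d+)K\\)");

    static long freeAfterKB(String record) {
        Matcher m = REC.matcher(record);
        if (!m.find()) {
            throw new IllegalArgumentException("no occupancy record: " + record);
        }
        long after = Long.parseLong(m.group(2));
        long capacity = Long.parseLong(m.group(3));
        return capacity - after;
    }

    public static void main(String[] args) {
        // The concurrent-mode-failure record quoted above: 2015232 - 2014802
        System.out.println(freeAfterKB("2008891K->2014802K(2015232K)") + " KB free");
    }
}
```

Applied to the concurrent-mode-failure record above, this yields the 430 KB figure: with so little headroom in the old generation, any promotion of even a modest object can fail.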
Perhaps you were trying to allocate an object bigger than that. I'd suggest running with a larger heap (possibly using a 64-bit JVM if you need more Java heap). However, the end of your message does not show the heap to be too full. Perhaps Eclipse catches the OOM, and drops all of the objects before it exits, so you see the heap as not full in the final display:- (Eclipse experts on the list might want to weigh in.) >> This is the end of the output: >> >> Heap >> par new generation total 29504K, used 23591K [0x2e8b0000, 0x308b0000, >> 0x308b0000) >> eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, 0x30250000) >> from space 3264K, 100% used [0x30580000, 0x308b0000, 0x308b0000) >> to space 3264K, 0% used [0x30250000, 0x30250000, 0x30580000) >> concurrent mark-sweep generation total 2015232K, used 61766K >> [0x308b0000, 0xab8b0000, 0xab8b0000) >> concurrent-mark-sweep perm gen total 87828K, used 52696K [0xab8b0000, >> 0xb0e75000, 0xb38b0000) >> Asides:- Never, never use values for MaxTenuringThreshold exceeding 15, unless you are sure you want that kind of behaviour. I'd suggest just leave that option out unless you know how to tune for it (there's lots of experience on this alias with tuning that though, should you need to tune that for performance in the future). More asides (specific to CMS):- Depending on what your platform is, if it has anything more than 2 cores, I'd advise dropping the -XX:+CMSIncrementalMode option. (You'd then want to drop other options starting with "CMSIncremental".) CMS does not unload classes by default. With Eclipse etc. you would want to unload classes concurrently so as not to get OOM's: use -XX:+CMSClassUnloadingEnabled (and if on older JVM's -XX:+CMSPermGenSweepingEnabled). Bottom line: looks like you need more Java heap. -- ramki On 04/13/11 10:25, Rafael Angarita wrote: > Hello everybody, > > I'm building a code generation application as an Eclipse and one of my > test projects contains around 15000 source files.
My application started > having memory problems, so after doing some optimizations especific to > the framework I'm using to develope my DSL, I started learning about GC, > but I think I'm still lost. > > I have tried with different JVM options for the GC with no success. > Currently, I'm trying: > > -Xms2000m -Xmx2000m -verbosegc -XX:+PrintGCDetails > -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC > -XX:+CMSIncrementalMode > -XX:+CMSIncrementalPacing -XX:CMSInitiatingOccupancyFraction=5 > -XX:MaxTenuringThreshold=300 -XX:+UseCMSInitiatingOccupancyOnly > -XX:CMSIncrementalDutyCycleMin=1 > > but this is just one of the several things I have tried. > > At first everything seems to go fine, but after awhile I get "promotion > failed" and everything gets really slow, and finally the application > crash with java.lang.OutOfMemoryError: Java heap space. > > > CMS-concurrent-abortable-preclean: 0.070/0.587 secs] [Times: user=0.66 > sys=0.02, real=0.58 secs] > [GC[YG occupancy: 28574 K (29504 K)][Rescan (parallel) , 0.0198420 > secs][weak refs processing, 0.0015200 secs] [1 CMS-remark: > 1961760K(2015232K)] 1990335K(2044736K), 0.0215890 secs] [Times: > user=0.03 sys=0.00, real=0.03 secs] > [GC [ParNew: 29096K->2087K(29504K), 0.0270900 secs] > 1990735K->1965523K(2044736K) icms_dc=100 , 0.0271650 secs] [Times: > user=0.05 sys=0.00, real=0.03 secs] > [GC [ParNew: 28327K->3264K(29504K), 0.0430410 secs] > 1990448K->1969326K(2044736K) icms_dc=100 , 0.0431180 secs] [Times: > user=0.07 sys=0.01, real=0.04 secs] > [GC [ParNew: 29504K->3264K(29504K), 0.0658260 secs] > 1995091K->1975795K(2044736K) icms_dc=100 , 0.0659090 secs] [Times: > user=0.11 sys=0.00, real=0.07 secs] > [GC [ParNew: 29504K->3264K(29504K), 0.0630250 secs] > 2001944K->1982760K(2044736K) icms_dc=100 , 0.0631060 secs] [Times: > user=0.11 sys=0.00, real=0.06 secs] > [GC [ParNew: 29504K->3263K(29504K), 0.0435130 secs] > 2008711K->1985752K(2044736K) icms_dc=100 , 0.0436310 secs] [Times: > user=0.07 sys=0.00, 
real=0.04 secs] > [CMS-concurrent-sweep: 1.813/2.058 secs] [Times: user=3.76 sys=0.02, > real=2.05 secs] > [CMS-concurrent-reset: 0.035/0.035 secs] [Times: user=0.06 sys=0.00, > real=0.04 secs] > [GC [ParNew (promotion failed): 29503K->29504K(29504K), 0.5729750 > secs][CMS[Unloading class sun.reflect.GeneratedConstructorAccessor6] > [Unloading class sun.reflect.GeneratedConstructorAccessor26] > [Unloading class sun.reflect.GeneratedMethodAccessor9] > [Unloading class sun.reflect.GeneratedConstructorAccessor17] > [Unloading class sun.reflect.GeneratedConstructorAccessor20] > [Unloading class sun.reflect.GeneratedMethodAccessor4] > [Unloading class sun.reflect.GeneratedMethodAccessor8] > [Unloading class sun.reflect.GeneratedConstructorAccessor25] > [Unloading class sun.reflect.GeneratedMethodAccessor18] > [Unloading class sun.reflect.GeneratedMethodAccessor17] > [Unloading class sun.reflect.GeneratedConstructorAccessor27] > [Unloading class sun.reflect.GeneratedConstructorAccessor19] > [Unloading class sun.reflect.GeneratedConstructorAccessor12] > [Unloading class sun.reflect.GeneratedMethodAccessor2] > [Unloading class sun.reflect.GeneratedConstructorAccessor14] > [Unloading class sun.reflect.GeneratedConstructorAccessor28] > [Unloading class sun.reflect.GeneratedConstructorAccessor5] > [Unloading class sun.reflect.GeneratedMethodAccessor16] > [Unloading class sun.reflect.GeneratedMethodAccessor19] > [Unloading class sun.reflect.GeneratedConstructorAccessor9] > [Unloading class sun.reflect.GeneratedConstructorAccessor11] > [Unloading class sun.reflect.GeneratedConstructorAccessor8] > [Unloading class sun.reflect.GeneratedConstructorAccessor29] > [Unloading class sun.reflect.GeneratedMethodAccessor3] > [Unloading class sun.reflect.GeneratedConstructorAccessor24] > [Unloading class sun.reflect.GeneratedConstructorAccessor18] > [Unloading class sun.reflect.GeneratedMethodAccessor15] > [Unloading class sun.reflect.GeneratedConstructorAccessor10] > [Unloading class 
sun.reflect.GeneratedConstructorAccessor16] > [Unloading class sun.reflect.GeneratedConstructorAccessor15] > > [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: user=10.95 > sys=0.02, real=9.00 secs] > (concurrent mode failure): 2008891K->2014802K(2015232K), 24.2053380 > secs] 2038395K->2014802K(2044736K), [CMS Perm : 50779K->50779K(86244K)] > icms_dc=100 , 24.2054320 secs] [Times: user=24.16 sys=0.03, real=24.20 > secs] > [GC [1 CMS-initial-mark: 2014802K(2015232K)] 2015335K(2044736K), > 0.0023250 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] > > This is the end of the output: > > Heap > par new generation total 29504K, used 23591K [0x2e8b0000, 0x308b0000, > 0x308b0000) > eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, 0x30250000) > from space 3264K, 100% used [0x30580000, 0x308b0000, 0x308b0000) > to space 3264K, 0% used [0x30250000, 0x30250000, 0x30580000) > concurrent mark-sweep generation total 2015232K, used 61766K > [0x308b0000, 0xab8b0000, 0xab8b0000) > concurrent-mark-sweep perm gen total 87828K, used 52696K [0xab8b0000, > 0xb0e75000, 0xb38b0000) > > > I would appreciate if anybody can give me an advise about this. > > Thank you very much for your help. > > > ------------------------------------------------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From angarita.rafael at gmail.com Thu Apr 14 07:17:45 2011 From: angarita.rafael at gmail.com (Rafael Angarita) Date: Thu, 14 Apr 2011 09:47:45 -0430 Subject: Java heap space, GC, and Promotion Failed In-Reply-To: <4DA5F95F.1010101@oracle.com> References: <4DA5F95F.1010101@oracle.com> Message-ID: Thank you very much! I took your advice about the JVM GC parameters and removed some of them. I used -Xmx2500m.
My application gets further with the processing it needs to do, but the whole computer gets really slow and my application crashes anyway. I'm trying to get in touch with the developers of the framework I'm using for my DSL. If any of you guys have more ideas, I'm here to listen and learn. Thank you very much. On 13 April 2011 14:58, Y. S. Ramakrishna wrote: > Hi Rafael -- > > Looks like you need more heap: size your -Xmx bigger to > accommodate all of the objects that your Eclipse project creates. > Here's the state of the old gen in the penultimate display:- > > > [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: user=10.95 >>> sys=0.02, real=9.00 secs] (concurrent mode failure): >>> 2008891K->2014802K(2015232K), 24.2053380 secs] 2038395K->2014802K(2044736K), >>> [CMS Perm : 50779K->50779K(86244K)] icms_dc=100 , 24.2054320 secs] [Times: >>> user=24.16 sys=0.03, real=24.20 secs] [GC [1 CMS-initial-mark: >>> 2014802K(2015232K)] 2015335K(2044736K), 0.0023250 secs] [Times: user=0.00 >>> sys=0.00, real=0.00 secs] >>> >> > The last line shows that the old gen has:- > 2015232 - 2014802 = 430 KB > of free space. Perhaps you were trying to allocate an object bigger than > that. > I'd suggest running with a larger heap (possibly using a 64-bit JVM if > you need more Java heap). > > However, the end of your message does not show the heap to be too full. > Perhaps Eclipse catches the OOM, and drops all of the objects before it > exits, > so you see the heap as not full in the final display:- > (Eclipse experts on the list might want to weigh in.)
> > > This is the end of the output: >>> >>> Heap >>> par new generation total 29504K, used 23591K [0x2e8b0000, 0x308b0000, >>> 0x308b0000) >>> eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, 0x30250000) >>> from space 3264K, 100% used [0x30580000, 0x308b0000, 0x308b0000) >>> to space 3264K, 0% used [0x30250000, 0x30250000, 0x30580000) >>> concurrent mark-sweep generation total 2015232K, used 61766K >>> [0x308b0000, 0xab8b0000, 0xab8b0000) >>> concurrent-mark-sweep perm gen total 87828K, used 52696K [0xab8b0000, >>> 0xb0e75000, 0xb38b0000) >>> >> > > Asides:- > Never, never use values for MaxTenuringThreshold exceeding 15, unless > you are sure you want that kind of behaviour. I'd suggest > just leave that option out unless you know how to tune for it (there's > lots of experience on this alias with tuning that though, should > you need to tune that for performance in the future). > > More asides (specific to CMS):- > Depending on what your platform is, if it has anything more than 2 cores, > i'd advise dropping the -XX:+CMSIncrementalMode option. (You'd then > want to drop other options starting with "CMSIncremental". > CMS does not unload classes by default. With Eclipse etc. you would > want to unload classes concurrently so as not to get OOM's: > use -XX:+CMSClassUnloadingEnabled (and if on older JVM's > -XX:+CMSPermGenSweepingEnabled). > > Bottom line: looks like you need more Java heap. > -- ramki > > > On 04/13/11 10:25, Rafael Angarita wrote: > >> Hello everybody, >> >> I'm building a code generation application as an Eclipse and one of my >> test projects contains around 15000 source files. My application started >> having memory problems, so after doing some optimizations especific to the >> framework I'm using to develope my DSL, I started learning about GC, but I >> think I'm still lost. >> >> I have tried with different JVM options for the GC with no success. 
>> Currently, I'm trying: >> >> -Xms2000m -Xmx2000m -verbosegc -XX:+PrintGCDetails >> -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC >> -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing >> -XX:CMSInitiatingOccupancyFraction=5 -XX:MaxTenuringThreshold=300 >> -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSIncrementalDutyCycleMin=1 >> >> but this is just one of the several things I have tried. >> >> At first everything seems to go fine, but after awhile I get "promotion >> failed" and everything gets really slow, and finally the application crash >> with java.lang.OutOfMemoryError: Java heap space. >> >> >> CMS-concurrent-abortable-preclean: 0.070/0.587 secs] [Times: user=0.66 >> sys=0.02, real=0.58 secs] [GC[YG occupancy: 28574 K (29504 K)][Rescan >> (parallel) , 0.0198420 secs][weak refs processing, 0.0015200 secs] [1 >> CMS-remark: 1961760K(2015232K)] 1990335K(2044736K), 0.0215890 secs] [Times: >> user=0.03 sys=0.00, real=0.03 secs] [GC [ParNew: 29096K->2087K(29504K), >> 0.0270900 secs] 1990735K->1965523K(2044736K) icms_dc=100 , 0.0271650 secs] >> [Times: user=0.05 sys=0.00, real=0.03 secs] [GC [ParNew: >> 28327K->3264K(29504K), 0.0430410 secs] 1990448K->1969326K(2044736K) >> icms_dc=100 , 0.0431180 secs] [Times: user=0.07 sys=0.01, real=0.04 secs] >> [GC [ParNew: 29504K->3264K(29504K), 0.0658260 secs] >> 1995091K->1975795K(2044736K) icms_dc=100 , 0.0659090 secs] [Times: user=0.11 >> sys=0.00, real=0.07 secs] [GC [ParNew: 29504K->3264K(29504K), 0.0630250 >> secs] 2001944K->1982760K(2044736K) icms_dc=100 , 0.0631060 secs] [Times: >> user=0.11 sys=0.00, real=0.06 secs] [GC [ParNew: 29504K->3263K(29504K), >> 0.0435130 secs] 2008711K->1985752K(2044736K) icms_dc=100 , 0.0436310 secs] >> [Times: user=0.07 sys=0.00, real=0.04 secs] [CMS-concurrent-sweep: >> 1.813/2.058 secs] [Times: user=3.76 sys=0.02, real=2.05 secs] >> [CMS-concurrent-reset: 0.035/0.035 secs] [Times: user=0.06 sys=0.00, >> real=0.04 secs] [GC [ParNew (promotion failed): 29503K->29504K(29504K), >> 
0.5729750 secs][CMS[Unloading class >> sun.reflect.GeneratedConstructorAccessor6] >> [Unloading class sun.reflect.GeneratedConstructorAccessor26] >> [Unloading class sun.reflect.GeneratedMethodAccessor9] >> [Unloading class sun.reflect.GeneratedConstructorAccessor17] >> [Unloading class sun.reflect.GeneratedConstructorAccessor20] >> [Unloading class sun.reflect.GeneratedMethodAccessor4] >> [Unloading class sun.reflect.GeneratedMethodAccessor8] >> [Unloading class sun.reflect.GeneratedConstructorAccessor25] >> [Unloading class sun.reflect.GeneratedMethodAccessor18] >> [Unloading class sun.reflect.GeneratedMethodAccessor17] >> [Unloading class sun.reflect.GeneratedConstructorAccessor27] >> [Unloading class sun.reflect.GeneratedConstructorAccessor19] >> [Unloading class sun.reflect.GeneratedConstructorAccessor12] >> [Unloading class sun.reflect.GeneratedMethodAccessor2] >> [Unloading class sun.reflect.GeneratedConstructorAccessor14] >> [Unloading class sun.reflect.GeneratedConstructorAccessor28] >> [Unloading class sun.reflect.GeneratedConstructorAccessor5] >> [Unloading class sun.reflect.GeneratedMethodAccessor16] >> [Unloading class sun.reflect.GeneratedMethodAccessor19] >> [Unloading class sun.reflect.GeneratedConstructorAccessor9] >> [Unloading class sun.reflect.GeneratedConstructorAccessor11] >> [Unloading class sun.reflect.GeneratedConstructorAccessor8] >> [Unloading class sun.reflect.GeneratedConstructorAccessor29] >> [Unloading class sun.reflect.GeneratedMethodAccessor3] >> [Unloading class sun.reflect.GeneratedConstructorAccessor24] >> [Unloading class sun.reflect.GeneratedConstructorAccessor18] >> [Unloading class sun.reflect.GeneratedMethodAccessor15] >> [Unloading class sun.reflect.GeneratedConstructorAccessor10] >> [Unloading class sun.reflect.GeneratedConstructorAccessor16] >> [Unloading class sun.reflect.GeneratedConstructorAccessor15] >> >> [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: user=10.95 >> sys=0.02, real=9.00 secs] (concurrent 
mode failure): >> 2008891K->2014802K(2015232K), 24.2053380 secs] 2038395K->2014802K(2044736K), >> [CMS Perm : 50779K->50779K(86244K)] icms_dc=100 , 24.2054320 secs] [Times: >> user=24.16 sys=0.03, real=24.20 secs] [GC [1 CMS-initial-mark: >> 2014802K(2015232K)] 2015335K(2044736K), 0.0023250 secs] [Times: user=0.00 >> sys=0.00, real=0.00 secs] >> This is the end of the output: >> >> Heap >> par new generation total 29504K, used 23591K [0x2e8b0000, 0x308b0000, >> 0x308b0000) >> eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, 0x30250000) >> from space 3264K, 100% used [0x30580000, 0x308b0000, 0x308b0000) >> to space 3264K, 0% used [0x30250000, 0x30250000, 0x30580000) >> concurrent mark-sweep generation total 2015232K, used 61766K [0x308b0000, >> 0xab8b0000, 0xab8b0000) >> concurrent-mark-sweep perm gen total 87828K, used 52696K [0xab8b0000, >> 0xb0e75000, 0xb38b0000) >> >> >> I would appreciate if anybody can give me an advise about this. >> >> Thank you very much for your help. >> >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110414/0c4d983a/attachment-0001.html From chkwok at digibites.nl Thu Apr 14 08:47:20 2011 From: chkwok at digibites.nl (Chi Ho Kwok) Date: Thu, 14 Apr 2011 17:47:20 +0200 Subject: Java heap space, GC, and Promotion Failed In-Reply-To: References: <4DA5F95F.1010101@oracle.com> Message-ID: Well, just use a profiler to see why the app is using so much memory. It's either leaking or it just requires that much RAM to process everything. Don't allocate more memory than you have. I doubt this problem has anything to do with GC at all, so remove all the flags except for -Xmx. 
On Thu, Apr 14, 2011 at 4:17 PM, Rafael Angarita wrote: > Thank you very much! > > I took your advise about the JVM GC parameters and removed some of them. > > I used -Xmx2500m. My application gets further with the proccesing it needs > to do, but the whole computer gets really slow and my application crash > anyway. > > I'm trying to get the developers of the framework I'm using for my DSL. > > If any of you guys have more ideas, I'm here to listen and learn. > > Thank you very much. > > On 13 April 2011 14:58, Y. S. Ramakrishna wrote: > >> Hi Rafael -- >> >> Looks like you need more heap: size your -Xmx bigger to >> accommodate all of the objects that your Eclipse project creates. >> Here's the state of the old gen in the penultimate display:- >> >> >> [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: user=10.95 >>>> sys=0.02, real=9.00 secs] (concurrent mode failure): >>>> 2008891K->2014802K(2015232K), 24.2053380 secs] 2038395K->2014802K(2044736K), >>>> [CMS Perm : 50779K->50779K(86244K)] icms_dc=100 , 24.2054320 secs] [Times: >>>> user=24.16 sys=0.03, real=24.20 secs] [GC [1 CMS-initial-mark: >>>> 2014802K(2015232K)] 2015335K(2044736K), 0.0023250 secs] [Times: user=0.00 >>>> sys=0.00, real=0.00 secs] >>>> >>> >> The last line shows that the old gen has:- >> 2015232 - 2014802 = 430 KB >> of free space. Perhaps you were trying to allocate an object bigger than >> that. >> I'd suggest running with a larger heap (possibly using a 64-bit JVM if >> you need more Java heap). >> >> However, the end of your message does not show the heap to be too full. >> Perhaps Eclipse catches the OOM, and drops all of the objects before it >> exits, >> so you see the heap as not full in the final display:- >> (Eclipse experts on the list might want to weigh in.) 
>> >> >> This is the end of the output: >>>> >>>> Heap >>>> par new generation total 29504K, used 23591K [0x2e8b0000, 0x308b0000, >>>> 0x308b0000) >>>> eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, 0x30250000) >>>> from space 3264K, 100% used [0x30580000, 0x308b0000, 0x308b0000) >>>> to space 3264K, 0% used [0x30250000, 0x30250000, 0x30580000) >>>> concurrent mark-sweep generation total 2015232K, used 61766K >>>> [0x308b0000, 0xab8b0000, 0xab8b0000) >>>> concurrent-mark-sweep perm gen total 87828K, used 52696K [0xab8b0000, >>>> 0xb0e75000, 0xb38b0000) >>>> >>> >> >> Asides:- >> Never, never use values for MaxTenuringThreshold exceeding 15, unless >> you are sure you want that kind of behaviour. I'd suggest >> just leave that option out unless you know how to tune for it (there's >> lots of experience on this alias with tuning that though, should >> you need to tune that for performance in the future). >> >> More asides (specific to CMS):- >> Depending on what your platform is, if it has anything more than 2 cores, >> i'd advise dropping the -XX:+CMSIncrementalMode option. (You'd then >> want to drop other options starting with "CMSIncremental". >> CMS does not unload classes by default. With Eclipse etc. you would >> want to unload classes concurrently so as not to get OOM's: >> use -XX:+CMSClassUnloadingEnabled (and if on older JVM's >> -XX:+CMSPermGenSweepingEnabled). >> >> Bottom line: looks like you need more Java heap. >> -- ramki >> >> >> On 04/13/11 10:25, Rafael Angarita wrote: >> >>> Hello everybody, >>> >>> I'm building a code generation application as an Eclipse and one of my >>> test projects contains around 15000 source files. My application started >>> having memory problems, so after doing some optimizations especific to the >>> framework I'm using to develope my DSL, I started learning about GC, but I >>> think I'm still lost. >>> >>> I have tried with different JVM options for the GC with no success. 
>>> Currently, I'm trying: >>> >>> -Xms2000m -Xmx2000m -verbosegc -XX:+PrintGCDetails >>> -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC >>> -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing >>> -XX:CMSInitiatingOccupancyFraction=5 -XX:MaxTenuringThreshold=300 >>> -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSIncrementalDutyCycleMin=1 >>> >>> but this is just one of the several things I have tried. >>> >>> At first everything seems to go fine, but after awhile I get "promotion >>> failed" and everything gets really slow, and finally the application crash >>> with java.lang.OutOfMemoryError: Java heap space. >>> >>> >>> CMS-concurrent-abortable-preclean: 0.070/0.587 secs] [Times: user=0.66 >>> sys=0.02, real=0.58 secs] [GC[YG occupancy: 28574 K (29504 K)][Rescan >>> (parallel) , 0.0198420 secs][weak refs processing, 0.0015200 secs] [1 >>> CMS-remark: 1961760K(2015232K)] 1990335K(2044736K), 0.0215890 secs] [Times: >>> user=0.03 sys=0.00, real=0.03 secs] [GC [ParNew: 29096K->2087K(29504K), >>> 0.0270900 secs] 1990735K->1965523K(2044736K) icms_dc=100 , 0.0271650 secs] >>> [Times: user=0.05 sys=0.00, real=0.03 secs] [GC [ParNew: >>> 28327K->3264K(29504K), 0.0430410 secs] 1990448K->1969326K(2044736K) >>> icms_dc=100 , 0.0431180 secs] [Times: user=0.07 sys=0.01, real=0.04 secs] >>> [GC [ParNew: 29504K->3264K(29504K), 0.0658260 secs] >>> 1995091K->1975795K(2044736K) icms_dc=100 , 0.0659090 secs] [Times: user=0.11 >>> sys=0.00, real=0.07 secs] [GC [ParNew: 29504K->3264K(29504K), 0.0630250 >>> secs] 2001944K->1982760K(2044736K) icms_dc=100 , 0.0631060 secs] [Times: >>> user=0.11 sys=0.00, real=0.06 secs] [GC [ParNew: 29504K->3263K(29504K), >>> 0.0435130 secs] 2008711K->1985752K(2044736K) icms_dc=100 , 0.0436310 secs] >>> [Times: user=0.07 sys=0.00, real=0.04 secs] [CMS-concurrent-sweep: >>> 1.813/2.058 secs] [Times: user=3.76 sys=0.02, real=2.05 secs] >>> [CMS-concurrent-reset: 0.035/0.035 secs] [Times: user=0.06 sys=0.00, >>> real=0.04 secs] [GC [ParNew (promotion 
failed): 29503K->29504K(29504K), >>> 0.5729750 secs][CMS[Unloading class >>> sun.reflect.GeneratedConstructorAccessor6] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor26] >>> [Unloading class sun.reflect.GeneratedMethodAccessor9] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor17] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor20] >>> [Unloading class sun.reflect.GeneratedMethodAccessor4] >>> [Unloading class sun.reflect.GeneratedMethodAccessor8] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor25] >>> [Unloading class sun.reflect.GeneratedMethodAccessor18] >>> [Unloading class sun.reflect.GeneratedMethodAccessor17] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor27] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor19] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor12] >>> [Unloading class sun.reflect.GeneratedMethodAccessor2] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor14] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor28] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor5] >>> [Unloading class sun.reflect.GeneratedMethodAccessor16] >>> [Unloading class sun.reflect.GeneratedMethodAccessor19] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor9] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor11] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor8] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor29] >>> [Unloading class sun.reflect.GeneratedMethodAccessor3] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor24] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor18] >>> [Unloading class sun.reflect.GeneratedMethodAccessor15] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor10] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor16] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor15] >>> >>> [Full GC [CMS[CMS-concurrent-mark: 
8.811/9.001 secs] [Times: user=10.95 >>> sys=0.02, real=9.00 secs] (concurrent mode failure): >>> 2008891K->2014802K(2015232K), 24.2053380 secs] 2038395K->2014802K(2044736K), >>> [CMS Perm : 50779K->50779K(86244K)] icms_dc=100 , 24.2054320 secs] [Times: >>> user=24.16 sys=0.03, real=24.20 secs] [GC [1 CMS-initial-mark: >>> 2014802K(2015232K)] 2015335K(2044736K), 0.0023250 secs] [Times: user=0.00 >>> sys=0.00, real=0.00 secs] >>> This is the end of the output: >>> >>> Heap >>> par new generation total 29504K, used 23591K [0x2e8b0000, 0x308b0000, >>> 0x308b0000) >>> eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, 0x30250000) >>> from space 3264K, 100% used [0x30580000, 0x308b0000, 0x308b0000) >>> to space 3264K, 0% used [0x30250000, 0x30250000, 0x30580000) >>> concurrent mark-sweep generation total 2015232K, used 61766K >>> [0x308b0000, 0xab8b0000, 0xab8b0000) >>> concurrent-mark-sweep perm gen total 87828K, used 52696K [0xab8b0000, >>> 0xb0e75000, 0xb38b0000) >>> >>> >>> I would appreciate if anybody can give me an advise about this. >>> >>> Thank you very much for your help. >>> >>> >>> ------------------------------------------------------------------------ >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >> > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110414/f2e20b49/attachment.html From y.s.ramakrishna at oracle.com Thu Apr 14 09:03:09 2011 From: y.s.ramakrishna at oracle.com (Y. S. 
Ramakrishna) Date: Thu, 14 Apr 2011 09:03:09 -0700 Subject: Java heap space, GC, and Promotion Failed In-Reply-To: References: <4DA5F95F.1010101@oracle.com> Message-ID: <4DA71ABD.1050804@oracle.com> May be double the heap using -d64 -Xms5g -Xmx5g (assuming your machine has enough RAM so you are not swapping), and see what happens. If that size of heap usage seems excessive, try and use a heap profiling tool to see why your application is holding on to so much. all the best. -- ramki On 04/14/11 07:17, Rafael Angarita wrote: > Thank you very much! > > I took your advise about the JVM GC parameters and removed some of them. > > I used -Xmx2500m. My application gets further with the proccesing it > needs to do, but the whole computer gets really slow and my application > crash anyway. > > I'm trying to get the developers of the framework I'm using for my DSL. > > If any of you guys have more ideas, I'm here to listen and learn. > > Thank you very much. > > On 13 April 2011 14:58, Y. S. Ramakrishna > wrote: > > Hi Rafael -- > > Looks like you need more heap: size your -Xmx bigger to > accommodate all of the objects that your Eclipse project creates. > Here's the state of the old gen in the penultimate display:- > > > [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: > user=10.95 sys=0.02, real=9.00 secs] (concurrent mode > failure): 2008891K->2014802K(2015232K), 24.2053380 secs] > 2038395K->2014802K(2044736K), [CMS Perm : > 50779K->50779K(86244K)] icms_dc=100 , 24.2054320 secs] > [Times: user=24.16 sys=0.03, real=24.20 secs] [GC [1 > CMS-initial-mark: 2014802K(2015232K)] 2015335K(2044736K), > 0.0023250 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] > > > The last line shows that the old gen has:- > 2015232 - 2014802 = 430 KB > of free space. Perhaps you were trying to allocate an object bigger > than that. > I'd suggest running with a larger heap (possibly using a 64-bit JVM if > you need more Java heap). 
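The headroom arithmetic above (2015232K - 2014802K = 430K free in the old gen after the full GC) can also be read off a running JVM programmatically. Below is a minimal, hedged sketch using the standard java.lang.management API — it is an illustration, not something from the thread, and the pool-name lookup is by substring because names differ across collectors ("CMS Old Gen", "Tenured Gen", "G1 Old Gen", ...):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class OldGenHeadroom {
    // Free bytes in the first pool whose name suggests the old generation,
    // or -1 if no such pool is found in this JVM.
    static long oldGenFreeBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            if (name.contains("Old") || name.contains("Tenured")) {
                MemoryUsage u = pool.getUsage();
                // getMax() is -1 when undefined; fall back to committed size.
                long max = u.getMax() >= 0 ? u.getMax() : u.getCommitted();
                return max - u.getUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        // The numbers from the log in this thread: total 2015232K, used 2014802K.
        long logFreeKB = 2015232L - 2014802L;
        System.out.println("free per log: " + logFreeKB + "K");
        System.out.println("free in this JVM: " + oldGenFreeBytes() + " bytes");
    }
}
```

With only about 430K of headroom, any promotion larger than that must fail, which matches the "promotion failed" and concurrent-mode-failure entries in the log.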
> > However, the end of your message does not show the heap to be too full. > Perhaps Eclipse catches the OOM, and drops all of the objects before > it exits, > so you see the heap as not full in the final display:- > (Eclipse experts on the list might want to weigh in.) > > > This is the end of the output: > > Heap > par new generation total 29504K, used 23591K [0x2e8b0000, > 0x308b0000, 0x308b0000) > eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, > 0x30250000) > from space 3264K, 100% used [0x30580000, 0x308b0000, > 0x308b0000) > to space 3264K, 0% used [0x30250000, 0x30250000, > 0x30580000) > concurrent mark-sweep generation total 2015232K, used > 61766K [0x308b0000, 0xab8b0000, 0xab8b0000) > concurrent-mark-sweep perm gen total 87828K, used 52696K > [0xab8b0000, 0xb0e75000, 0xb38b0000) > > > > Asides:- > Never, never use values for MaxTenuringThreshold exceeding 15, unless > you are sure you want that kind of behaviour. I'd suggest > just leave that option out unless you know how to tune for it (there's > lots of experience on this alias with tuning that though, should > you need to tune that for performance in the future). > > More asides (specific to CMS):- > Depending on what your platform is, if it has anything more than 2 > cores, > i'd advise dropping the -XX:+CMSIncrementalMode option. (You'd then > want to drop other options starting with "CMSIncremental". > CMS does not unload classes by default. With Eclipse etc. you would > want to unload classes concurrently so as not to get OOM's: > use -XX:+CMSClassUnloadingEnabled (and if on older JVM's > -XX:+CMSPermGenSweepingEnabled). > > Bottom line: looks like you need more Java heap. > -- ramki > > > On 04/13/11 10:25, Rafael Angarita wrote: > > Hello everybody, > > I'm building a code generation application as an Eclipse and one > of my test projects contains around 15000 source files. 
My > application started having memory problems, so after doing some > optimizations especific to the framework I'm using to develope > my DSL, I started learning about GC, but I think I'm still lost. > > I have tried with different JVM options for the GC with no > success. Currently, I'm trying: > > -Xms2000m -Xmx2000m -verbosegc -XX:+PrintGCDetails > -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC > -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing > -XX:CMSInitiatingOccupancyFraction=5 > -XX:MaxTenuringThreshold=300 -XX:+UseCMSInitiatingOccupancyOnly > -XX:CMSIncrementalDutyCycleMin=1 > > but this is just one of the several things I have tried. > > At first everything seems to go fine, but after awhile I get > "promotion failed" and everything gets really slow, and finally > the application crash with java.lang.OutOfMemoryError: Java heap > space. > > > CMS-concurrent-abortable-preclean: 0.070/0.587 secs] [Times: > user=0.66 sys=0.02, real=0.58 secs] [GC[YG occupancy: 28574 K > (29504 K)][Rescan (parallel) , 0.0198420 secs][weak refs > processing, 0.0015200 secs] [1 CMS-remark: 1961760K(2015232K)] > 1990335K(2044736K), 0.0215890 secs] [Times: user=0.03 sys=0.00, > real=0.03 secs] [GC [ParNew: 29096K->2087K(29504K), 0.0270900 > secs] 1990735K->1965523K(2044736K) icms_dc=100 , 0.0271650 secs] > [Times: user=0.05 sys=0.00, real=0.03 secs] [GC [ParNew: > 28327K->3264K(29504K), 0.0430410 secs] > 1990448K->1969326K(2044736K) icms_dc=100 , 0.0431180 secs] > [Times: user=0.07 sys=0.01, real=0.04 secs] [GC [ParNew: > 29504K->3264K(29504K), 0.0658260 secs] > 1995091K->1975795K(2044736K) icms_dc=100 , 0.0659090 secs] > [Times: user=0.11 sys=0.00, real=0.07 secs] [GC [ParNew: > 29504K->3264K(29504K), 0.0630250 secs] > 2001944K->1982760K(2044736K) icms_dc=100 , 0.0631060 secs] > [Times: user=0.11 sys=0.00, real=0.06 secs] [GC [ParNew: > 29504K->3263K(29504K), 0.0435130 secs] > 2008711K->1985752K(2044736K) icms_dc=100 , 0.0436310 secs] > [Times: user=0.07 sys=0.00, 
real=0.04 secs] > [CMS-concurrent-sweep: 1.813/2.058 secs] [Times: user=3.76 > sys=0.02, real=2.05 secs] [CMS-concurrent-reset: 0.035/0.035 > secs] [Times: user=0.06 sys=0.00, real=0.04 secs] [GC [ParNew > (promotion failed): 29503K->29504K(29504K), 0.5729750 > secs][CMS[Unloading class sun.reflect.GeneratedConstructorAccessor6] > [Unloading class sun.reflect.GeneratedConstructorAccessor26] > [Unloading class sun.reflect.GeneratedMethodAccessor9] > [Unloading class sun.reflect.GeneratedConstructorAccessor17] > [Unloading class sun.reflect.GeneratedConstructorAccessor20] > [Unloading class sun.reflect.GeneratedMethodAccessor4] > [Unloading class sun.reflect.GeneratedMethodAccessor8] > [Unloading class sun.reflect.GeneratedConstructorAccessor25] > [Unloading class sun.reflect.GeneratedMethodAccessor18] > [Unloading class sun.reflect.GeneratedMethodAccessor17] > [Unloading class sun.reflect.GeneratedConstructorAccessor27] > [Unloading class sun.reflect.GeneratedConstructorAccessor19] > [Unloading class sun.reflect.GeneratedConstructorAccessor12] > [Unloading class sun.reflect.GeneratedMethodAccessor2] > [Unloading class sun.reflect.GeneratedConstructorAccessor14] > [Unloading class sun.reflect.GeneratedConstructorAccessor28] > [Unloading class sun.reflect.GeneratedConstructorAccessor5] > [Unloading class sun.reflect.GeneratedMethodAccessor16] > [Unloading class sun.reflect.GeneratedMethodAccessor19] > [Unloading class sun.reflect.GeneratedConstructorAccessor9] > [Unloading class sun.reflect.GeneratedConstructorAccessor11] > [Unloading class sun.reflect.GeneratedConstructorAccessor8] > [Unloading class sun.reflect.GeneratedConstructorAccessor29] > [Unloading class sun.reflect.GeneratedMethodAccessor3] > [Unloading class sun.reflect.GeneratedConstructorAccessor24] > [Unloading class sun.reflect.GeneratedConstructorAccessor18] > [Unloading class sun.reflect.GeneratedMethodAccessor15] > [Unloading class sun.reflect.GeneratedConstructorAccessor10] > [Unloading class 
sun.reflect.GeneratedConstructorAccessor16] > [Unloading class sun.reflect.GeneratedConstructorAccessor15] > > [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: > user=10.95 sys=0.02, real=9.00 secs] (concurrent mode failure): > 2008891K->2014802K(2015232K), 24.2053380 secs] > 2038395K->2014802K(2044736K), [CMS Perm : > 50779K->50779K(86244K)] icms_dc=100 , 24.2054320 secs] [Times: > user=24.16 sys=0.03, real=24.20 secs] [GC [1 CMS-initial-mark: > 2014802K(2015232K)] 2015335K(2044736K), 0.0023250 secs] [Times: > user=0.00 sys=0.00, real=0.00 secs] > This is the end of the output: > > Heap > par new generation total 29504K, used 23591K [0x2e8b0000, > 0x308b0000, 0x308b0000) > eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, 0x30250000) > from space 3264K, 100% used [0x30580000, 0x308b0000, 0x308b0000) > to space 3264K, 0% used [0x30250000, 0x30250000, 0x30580000) > concurrent mark-sweep generation total 2015232K, used 61766K > [0x308b0000, 0xab8b0000, 0xab8b0000) > concurrent-mark-sweep perm gen total 87828K, used 52696K > [0xab8b0000, 0xb0e75000, 0xb38b0000) > > > I would appreciate if anybody can give me an advise about this. > > Thank you very much for your help. 
> > > ------------------------------------------------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From bharathwork at yahoo.com Fri Apr 15 07:12:46 2011 From: bharathwork at yahoo.com (Bharath Mundlapudi) Date: Fri, 15 Apr 2011 07:12:46 -0700 (PDT) Subject: Is CMS cycle can collect finalize objects Message-ID: <522727.98997.qm@web110708.mail.gq1.yahoo.com>

We have tuned our server to not run into Full GC with the CMS collector. One thing we noted recently was java.lang.ref.Finalizer objects accumulating with load. Due to this, the CMS cycle threshold was reached and CMS went into a loop and ran continuously.

To verify whether CMS is cleaning up these Finalizer objects, I have tested on another setup. I have noticed that Finalizer objects are not getting cleaned, but when I force a full GC, these objects are garbage collected.

I have the following questions:
1. Is there a way (a JVM command-line option) to tell CMS to clean up Finalizer objects when CMS runs, rather than via Full GC?
2. I see that there is a System.runFinalization() method to ask the GC to clean up the finalizer queue. Is this a better approach for server-side applications?
3. Is there any JMX API to invoke finalization from an external process?

Versions verified: We are using JDK 1.6.0 update 21/23 on Redhat 5.4.

Thanks in anticipation, Bharath -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110415/92d2f600/attachment.html From y.s.ramakrishna at oracle.com Fri Apr 15 09:33:24 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Fri, 15 Apr 2011 09:33:24 -0700 Subject: Is CMS cycle can collect finalize objects In-Reply-To: <522727.98997.qm@web110708.mail.gq1.yahoo.com> References: <522727.98997.qm@web110708.mail.gq1.yahoo.com> Message-ID: <4DA87354.7050206@oracle.com> Hi Bharath -- On 4/15/2011 7:12 AM, Bharath Mundlapudi wrote: > We have tuned our server to not run into Full GC with CMS collector. One thing, we noted recently was - java.lang.ref.Finalizer objects getting incremented with load. Due to this, CMS cycle threshold was reached and CMS went into loop and run continuously. > > To verify if CMS is cleaning up these Finalizer objects, I have tested on another setup. I have noticed that Finalizer objects are not getting cleaned but when i force full gc, these objects are getting garbage collected. > > > I have the following questions: > 1. Is there a way (JVM cmd option) to tell CMS to cleanup Finalizer objects when CMS runs rather than via Full GC? CMS does process finalizable objects without the need for a full STW gc. Once an object with a finalizer is determined by the CMS collector to be unreachable, it will be placed on the finalizable queue, whence the finalizer thread will pull those objects and finalize them. At the next CMS cycle the space used by those objects will become available for new allocation. The only catch is that the CMS collector will only detect such objects in the old generation (and if you have enabled class unloading, in the perm generation). (That is not to say that those in the younger gen will not be finalized; they will be if both the FinalRef and the referent object are in the young gen at the time that the referent became unreachable. 
If they happen to be split between the two generations (which is unlikely to happen in practice but is not impossible), then we'll need to wait until the object that is in the younger generation migrates to the older generation and then they will be discovered at the next CMS cycle. (And then there will need to be another CMS cycle to actually reclaim the space used by them following finalization.) In your case, what is the symptom? Is the finalizer thread's "to be finalized" queue of objects growing, or are you saying that the CMS collector does not detect and enqueue unreachable objects into the finalizer's queue? What does jmap -finalizerinfo on your process show? What does -XX:+PrintClassHistogram show as accumulating in the heap? (Are they one specific type of Finalizer objects or all varieties?) > 2. I see that, we can there is System.runFinalization() method to notify GC to cleanup the finalizer queue. Is this better approach for server-side applications? runFinalization() will only cause the objects in the finalizer queue to get finalized in a new thread. If objects are not already on the queue nothing will happen. > 3. Is there any JMX API to invoke finalization from an external process? Don't know. > > > Versions verified: > > We are using JDK 1.6.0 update 21/23 on Redhat 5.4. Did the problem start in 6u21? Or are those the only versions you tested and found that there was an issue? Do you have a test case that demonstrates the issue you encounter? If so, could you send it in and open a suitable bug report? 
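The pipeline described above (unreachable finalizable object, discovered by a collection, enqueued for finalization, finalized by the finalizer thread, space reclaimed by a later cycle) can be exercised with a small sketch. This is an illustration only: the `Tracked` class is made up, and neither System.gc() nor System.runFinalization() carries hard guarantees (cf. the later question in this digest about finalize() possibly never being called), which is why the sketch retries a few times:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class FinalizeDemo {
    static final AtomicBoolean finalized = new AtomicBoolean(false);

    static class Tracked {
        // Runs on the finalizer thread once the collector has discovered
        // the object as unreachable and enqueued it.
        @Override protected void finalize() { finalized.set(true); }
    }

    static void makeGarbage() {
        new Tracked(); // unreachable as soon as this frame returns
    }

    public static void main(String[] args) throws InterruptedException {
        makeGarbage();
        // Request discovery and drain the finalizer queue; retry because
        // the spec makes no promptness promises.
        for (int i = 0; i < 50 && !finalized.get(); i++) {
            System.gc();
            System.runFinalization();
            Thread.sleep(10);
        }
        System.out.println("finalize ran: " + finalized.get());
    }
}
```

Note that no space is freed at the moment finalize() runs; as explained above, the object's memory only becomes available at a subsequent collection of the generation it lives in.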
-- ramki > > > Thanks in anticipation, > Bharath > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From doug.jones at internet.co.nz Fri Apr 15 15:19:16 2011 From: doug.jones at internet.co.nz (Doug Jones) Date: Sat, 16 Apr 2011 10:19:16 +1200 Subject: Java heap space, GC, and Promotion Failed References: <4DA5F95F.1010101@oracle.com> Message-ID: <5469D08D9C3B4BB3A0183CA434E3AA69@hp6926682b5457>

Hi Rafael, Perhaps two other observations:

You don't appear to have a young generation size specified (-XX:NewSize=, or -Xmn). From your logs the YG is just 29.5MB, which is probably too small with a 2GB heap and a lot of objects being created. The idea is to get as many objects as possible to die and be collected in the YG, provided ParNew times are OK (I think the adaptive collector tries to keep these around 10-20ms, but most apps can cope with longer ParNew pauses than that). This puts less pressure on Tenured. With a 2GB heap size, you might try a NewSize of 128m. If ParNew pauses are taking too long, then reduce it (it depends on whether your app is producing lots of small objects or larger objects).

The other curiosity is that you have -XX:CMSInitiatingOccupancyFraction set to only 5, so CMS starts a new cycle as soon as Tenured is just 5% occupied, which keeps the collector running almost continuously. It can be helpful to set this parameter (and its companion -XX:+UseCMSInitiatingOccupancyOnly) to avoid conc-mode failures, but again, with a small YG size and a relatively large heap, it should probably be up around 80% at least.

Maybe try these, and if you are still having problems send some more logs showing the details. Doug.

----- Original Message ----- From: Rafael Angarita To: Y.S.Ramakrishna at oracle.com Cc: hotspot-gc-use at openjdk.java.net Sent: Friday, April 15, 2011 2:17 AM Subject: Re: Java heap space, GC, and Promotion Failed

Thank you very much!
I took your advise about the JVM GC parameters and removed some of them. I used -Xmx2500m. My application gets further with the proccesing it needs to do, but the whole computer gets really slow and my application crash anyway. I'm trying to get the developers of the framework I'm using for my DSL. If any of you guys have more ideas, I'm here to listen and learn. Thank you very much. On 13 April 2011 14:58, Y. S. Ramakrishna wrote: Hi Rafael -- Looks like you need more heap: size your -Xmx bigger to accommodate all of the objects that your Eclipse project creates. Here's the state of the old gen in the penultimate display:- [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: user=10.95 sys=0.02, real=9.00 secs] (concurrent mode failure): 2008891K->2014802K(2015232K), 24.2053380 secs] 2038395K->2014802K(2044736K), [CMS Perm : 50779K->50779K(86244K)] icms_dc=100 , 24.2054320 secs] [Times: user=24.16 sys=0.03, real=24.20 secs] [GC [1 CMS-initial-mark: 2014802K(2015232K)] 2015335K(2044736K), 0.0023250 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] The last line shows that the old gen has:- 2015232 - 2014802 = 430 KB of free space. Perhaps you were trying to allocate an object bigger than that. I'd suggest running with a larger heap (possibly using a 64-bit JVM if you need more Java heap). However, the end of your message does not show the heap to be too full. Perhaps Eclipse catches the OOM, and drops all of the objects before it exits, so you see the heap as not full in the final display:- (Eclipse experts on the list might want to weigh in.) 
This is the end of the output: Heap par new generation total 29504K, used 23591K [0x2e8b0000, 0x308b0000, 0x308b0000) eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, 0x30250000) from space 3264K, 100% used [0x30580000, 0x308b0000, 0x308b0000) to space 3264K, 0% used [0x30250000, 0x30250000, 0x30580000) concurrent mark-sweep generation total 2015232K, used 61766K [0x308b0000, 0xab8b0000, 0xab8b0000) concurrent-mark-sweep perm gen total 87828K, used 52696K [0xab8b0000, 0xb0e75000, 0xb38b0000) Asides:- Never, never use values for MaxTenuringThreshold exceeding 15, unless you are sure you want that kind of behaviour. I'd suggest just leave that option out unless you know how to tune for it (there's lots of experience on this alias with tuning that though, should you need to tune that for performance in the future). More asides (specific to CMS):- Depending on what your platform is, if it has anything more than 2 cores, i'd advise dropping the -XX:+CMSIncrementalMode option. (You'd then want to drop other options starting with "CMSIncremental". CMS does not unload classes by default. With Eclipse etc. you would want to unload classes concurrently so as not to get OOM's: use -XX:+CMSClassUnloadingEnabled (and if on older JVM's -XX:+CMSPermGenSweepingEnabled). Bottom line: looks like you need more Java heap. -- ramki On 04/13/11 10:25, Rafael Angarita wrote: Hello everybody, I'm building a code generation application as an Eclipse and one of my test projects contains around 15000 source files. My application started having memory problems, so after doing some optimizations especific to the framework I'm using to develope my DSL, I started learning about GC, but I think I'm still lost. I have tried with different JVM options for the GC with no success. 
Currently, I'm trying: -Xms2000m -Xmx2000m -verbosegc -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing -XX:CMSInitiatingOccupancyFraction=5 -XX:MaxTenuringThreshold=300 -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSIncrementalDutyCycleMin=1 but this is just one of the several things I have tried. At first everything seems to go fine, but after awhile I get "promotion failed" and everything gets really slow, and finally the application crash with java.lang.OutOfMemoryError: Java heap space. CMS-concurrent-abortable-preclean: 0.070/0.587 secs] [Times: user=0.66 sys=0.02, real=0.58 secs] [GC[YG occupancy: 28574 K (29504 K)][Rescan (parallel) , 0.0198420 secs][weak refs processing, 0.0015200 secs] [1 CMS-remark: 1961760K(2015232K)] 1990335K(2044736K), 0.0215890 secs] [Times: user=0.03 sys=0.00, real=0.03 secs] [GC [ParNew: 29096K->2087K(29504K), 0.0270900 secs] 1990735K->1965523K(2044736K) icms_dc=100 , 0.0271650 secs] [Times: user=0.05 sys=0.00, real=0.03 secs] [GC [ParNew: 28327K->3264K(29504K), 0.0430410 secs] 1990448K->1969326K(2044736K) icms_dc=100 , 0.0431180 secs] [Times: user=0.07 sys=0.01, real=0.04 secs] [GC [ParNew: 29504K->3264K(29504K), 0.0658260 secs] 1995091K->1975795K(2044736K) icms_dc=100 , 0.0659090 secs] [Times: user=0.11 sys=0.00, real=0.07 secs] [GC [ParNew: 29504K->3264K(29504K), 0.0630250 secs] 2001944K->1982760K(2044736K) icms_dc=100 , 0.0631060 secs] [Times: user=0.11 sys=0.00, real=0.06 secs] [GC [ParNew: 29504K->3263K(29504K), 0.0435130 secs] 2008711K->1985752K(2044736K) icms_dc=100 , 0.0436310 secs] [Times: user=0.07 sys=0.00, real=0.04 secs] [CMS-concurrent-sweep: 1.813/2.058 secs] [Times: user=3.76 sys=0.02, real=2.05 secs] [CMS-concurrent-reset: 0.035/0.035 secs] [Times: user=0.06 sys=0.00, real=0.04 secs] [GC [ParNew (promotion failed): 29503K->29504K(29504K), 0.5729750 secs][CMS[Unloading class sun.reflect.GeneratedConstructorAccessor6] [Unloading class 
sun.reflect.GeneratedConstructorAccessor26] [Unloading class sun.reflect.GeneratedMethodAccessor9] [Unloading class sun.reflect.GeneratedConstructorAccessor17] [Unloading class sun.reflect.GeneratedConstructorAccessor20] [Unloading class sun.reflect.GeneratedMethodAccessor4] [Unloading class sun.reflect.GeneratedMethodAccessor8] [Unloading class sun.reflect.GeneratedConstructorAccessor25] [Unloading class sun.reflect.GeneratedMethodAccessor18] [Unloading class sun.reflect.GeneratedMethodAccessor17] [Unloading class sun.reflect.GeneratedConstructorAccessor27] [Unloading class sun.reflect.GeneratedConstructorAccessor19] [Unloading class sun.reflect.GeneratedConstructorAccessor12] [Unloading class sun.reflect.GeneratedMethodAccessor2] [Unloading class sun.reflect.GeneratedConstructorAccessor14] [Unloading class sun.reflect.GeneratedConstructorAccessor28] [Unloading class sun.reflect.GeneratedConstructorAccessor5] [Unloading class sun.reflect.GeneratedMethodAccessor16] [Unloading class sun.reflect.GeneratedMethodAccessor19] [Unloading class sun.reflect.GeneratedConstructorAccessor9] [Unloading class sun.reflect.GeneratedConstructorAccessor11] [Unloading class sun.reflect.GeneratedConstructorAccessor8] [Unloading class sun.reflect.GeneratedConstructorAccessor29] [Unloading class sun.reflect.GeneratedMethodAccessor3] [Unloading class sun.reflect.GeneratedConstructorAccessor24] [Unloading class sun.reflect.GeneratedConstructorAccessor18] [Unloading class sun.reflect.GeneratedMethodAccessor15] [Unloading class sun.reflect.GeneratedConstructorAccessor10] [Unloading class sun.reflect.GeneratedConstructorAccessor16] [Unloading class sun.reflect.GeneratedConstructorAccessor15] [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: user=10.95 sys=0.02, real=9.00 secs] (concurrent mode failure): 2008891K->2014802K(2015232K), 24.2053380 secs] 2038395K->2014802K(2044736K), [CMS Perm : 50779K->50779K(86244K)] icms_dc=100 , 24.2054320 secs] [Times: user=24.16 sys=0.03, 
real=24.20 secs] [GC [1 CMS-initial-mark: 2014802K(2015232K)] 2015335K(2044736K), 0.0023250 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] This is the end of the output: Heap par new generation total 29504K, used 23591K [0x2e8b0000, 0x308b0000, 0x308b0000) eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, 0x30250000) from space 3264K, 100% used [0x30580000, 0x308b0000, 0x308b0000) to space 3264K, 0% used [0x30250000, 0x30250000, 0x30580000) concurrent mark-sweep generation total 2015232K, used 61766K [0x308b0000, 0xab8b0000, 0xab8b0000) concurrent-mark-sweep perm gen total 87828K, used 52696K [0xab8b0000, 0xb0e75000, 0xb38b0000) I would appreciate if anybody can give me an advise about this. Thank you very much for your help. ------------------------------------------------------------------------ _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use ------------------------------------------------------------------------------ _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110416/15fb53b3/attachment.html From yuzhihong at gmail.com Sun Apr 17 20:55:55 2011 From: yuzhihong at gmail.com (Ted Yu) Date: Sun, 17 Apr 2011 20:55:55 -0700 Subject: question on finalize method Message-ID: Is this statement true for Java 1.6 and beyond ( http://forums.whirlpool.net.au/archive/754353) ? In fact, it is perfectly permissible for a Java VM to *never* call it. Thanks -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110417/a506f138/attachment.html

From y.s.ramakrishna at oracle.com Sun Apr 17 23:49:21 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Sun, 17 Apr 2011 23:49:21 -0700 Subject: question on finalize method In-Reply-To: References: Message-ID: <4DABDEF1.3050007@oracle.com>

Yes; indeed, the spec is deliberately loose because it is difficult in practice to implement any hard promptness guarantees in general. -- ramki

On 4/17/2011 8:55 PM, Ted Yu wrote: > Is this statement true for Java 1.6 and beyond ( > http://forums.whirlpool.net.au/archive/754353) ? > In fact, it is perfectly permissible for a Java VM to *never* call it. > > Thanks > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From bharathwork at yahoo.com Fri Apr 22 16:00:24 2011 From: bharathwork at yahoo.com (Bharath Mundlapudi) Date: Fri, 22 Apr 2011 16:00:24 -0700 (PDT) Subject: Is CMS cycle can collect finalize objects In-Reply-To: <4DA87354.7050206@oracle.com> References: <522727.98997.qm@web110708.mail.gq1.yahoo.com> <4DA87354.7050206@oracle.com> Message-ID: <774851.44950.qm@web110705.mail.gq1.yahoo.com>

Hi Ramki, Thanks for the detailed explanation. I ran some tests to answer your questions. Here are the answers to some of them.

>>What are the symptoms?

java.net.SocksSocketImpl objects are not getting cleaned up after a CMS cycle. I see a direct correlation with java.lang.ref.Finalizer objects. Over time, this fills up the old generation, and CMS goes into a loop, occupying one complete core. But when we trigger a Full GC, these objects are garbage collected. You mentioned that a CMS cycle does clean up these objects provided we enable class unloading. Are you suggesting -XX:+ClassUnloading or -XX:+CMSClassUnloadingEnabled?
I have tried the latter and didn't succeed. Our perm gen is relatively constant; by enabling this, are we introducing performance overhead? We have room for CPU cycles and the perm gen is relatively small, so this may be fine. We just want to see these objects GC'ed in the CMS cycle. Do you have any suggestion about which flags I should be using to trigger this?

>> What does jmap -finalizerinfo on your process show? >> What does -XX:+PrintClassHistogram show as accumulating in the heap? >> (Are they one specific type of Finalizer objects or all varieties?)

jmap -histo shows the above class keeps accumulating. In fact, -finalizerinfo doesn't show any objects on this process.

>>Did the problem start in 6u21? Or are those the only versions >>you tested and found that there was an issue?

We have seen this problem in 6u21. We were on 6u12 earlier and didn't run into this problem. But I can't say this is particular to a build, since lots of things have changed.

Thanks in anticipation, -Bharath

________________________________ From: Y. Srinivas Ramakrishna To: Bharath Mundlapudi Cc: hotspot-gc-use at openjdk.java.net Sent: Friday, April 15, 2011 9:33 AM Subject: Re: Is CMS cycle can collect finalize objects

Hi Bharath --

On 4/15/2011 7:12 AM, Bharath Mundlapudi wrote: > We have tuned our server to not run into Full GC with CMS collector. One thing, we noted recently was - java.lang.ref.Finalizer objects getting incremented with load. Due to this, CMS cycle threshold was reached and CMS went into loop and run continuously. > > To verify if CMS is cleaning up these Finalizer objects, I have tested on another setup. I have noticed that Finalizer objects are not getting cleaned but when i force full gc, these objects are getting garbage collected. > > > I have the following questions: > 1. Is there a way (JVM cmd option) to tell CMS to cleanup Finalizer objects when CMS runs rather than via Full GC?

CMS does process finalizable objects without the need for a full STW gc.
Once an object with a finalizer is determined by the CMS collector to be unreachable, it will be placed on the finalizable queue, whence the finalizer thread will pull those objects and finalize them. At the next CMS cycle the space used by those objects will become available for new allocation. The only catch is that the CMS collector will only detect such objects in the old generation (and, if you have enabled class unloading, in the perm generation). That is not to say that those in the younger gen will not be finalized; they will be, if both the FinalRef and the referent object are in the young gen at the time that the referent became unreachable. If they happen to be split between the two generations (which is unlikely to happen in practice but is not impossible), then we'll need to wait until the object that is in the younger generation migrates to the older generation, and then they will be discovered at the next CMS cycle. (And then there will need to be another CMS cycle to actually reclaim the space used by them following finalization.) In your case, what is the symptom? Is the finalizer thread's "to be finalized" queue of objects growing, or are you saying that the CMS collector does not detect and enqueue unreachable objects into the finalizer's queue? What does jmap -finalizerinfo on your process show? What does -XX:+PrintClassHistogram show as accumulating in the heap? (Are they one specific type of Finalizer objects or all varieties?) > 2. I see that there is a System.runFinalization() method to notify the GC to clean up the finalizer queue. Is this a better approach for server-side applications? runFinalization() will only cause the objects already in the finalizer queue to get finalized in a new thread. If objects are not already on the queue, nothing will happen. > 3. Is there any JMX API to invoke finalization from an external process? Don't know. > > > Versions verified: > > We are using JDK 1.6.0 update 21/23 on Redhat 5.4. Did the problem start in 6u21?
Or are those the only versions you tested and found that there was an issue? Do you have a test case that demonstrates the issue you encounter? If so, could you send it in and open a suitable bug report? -- ramki > > > Thanks in anticipation, > Bharath > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From y.s.ramakrishna at oracle.com Mon Apr 25 09:48:08 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Mon, 25 Apr 2011 09:48:08 -0700 Subject: Is CMS cycle can collect finalize objects Message-ID: <4DB5A5C8.7070307@oracle.com> Forgot to cc the alias; response attached. -------------- next part -------------- An embedded message was scrubbed... From: "Y. Srinivas Ramakrishna" Subject: Re: Is CMS cycle can collect finalize objects Date: Mon, 25 Apr 2011 09:37:30 -0700 Size: 4853 Url: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110425/9d895eee/attachment.eml From fancyerii at gmail.com Sun Apr 24 20:21:21 2011 From: fancyerii at gmail.com (Li Li) Date: Mon, 25 Apr 2011 11:21:21 +0800 Subject: is there any resource about gc details of hotspot? Message-ID: hi all, I'd like to learn the details of each garbage collector, such as Serial GC, Parallel GC, and G1 GC -- the basic ideas of these algorithms (I don't want to read the OpenJDK code now because it's hard to understand): how they do marking and sweeping, and why some of them need to stop the world while others can run concurrently with the Java application. http://www.oracle.com/technetwork/java/javase/tech/index-jsp-140228.html is the official document, but I need something more detailed. Thank you. From do.chuan at gmail.com Mon Apr 25 18:52:19 2011 From: do.chuan at gmail.com (dochuan) Date: Tue, 26 Apr 2011 09:52:19 +0800 Subject: is there any resource about gc details of hotspot?
In-Reply-To: References: Message-ID: <4DB62553.704@gmail.com> book: Garbage Collection: Algorithms for Automatic Dynamic Memory Management and http://www.hpl.hp.com/personal/Hans_Boehm/ On 11-4-25 11:21, Li Li wrote: > hi all, > I'd like to learn the detail of each garbage collector such as > Serial GC, Parallel GC, G1 GC. the basic idea of these algorithm(I > don't want to read the codes of open jdk now because it's hard to > understand). such as how they do marking and sweeping, why some of > them need stopping the world while others can run concurrently with > java application. > http://www.oracle.com/technetwork/java/javase/tech/index-jsp-140228.html > is the official document. but I need something more detailed. thank > you. > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From hjohn at xs4all.nl Tue Apr 26 07:59:01 2011 From: hjohn at xs4all.nl (John Hendrikx) Date: Tue, 26 Apr 2011 16:59:01 +0200 Subject: 1.7 G1GC significantly slower than 1.6 Mark and Sweep? Message-ID: <4DB6DDB5.4040804@xs4all.nl> Hi list, I've been testing Java 1.6 performance vs Java 1.7 performance with a timing-critical application -- it's essential that garbage collection pauses are very short. What I've found is that Java 1.6 seems to perform significantly better than 1.7 (b137) in this respect, although with certain settings 1.6 will also fail catastrophically. I've used the following options: For 1.6.0_22: -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC For 1.7.0b137: -Xms256M -Xmx256M -XX:+UseG1GC -XX:MaxGCPauseMillis=2 The amount of garbage created is roughly 150 MB/sec. The application demands a response time of about 20 ms and uses half a dozen threads which deal with buffering and decoding of information.
With the above settings, the 1.6 VM will meet this goal over a 2 minute period >99% of the time (with an average CPU consumption of 65% per CPU core for two cores) -- from verbosegc I gather that the pause times are around 0.01-0.02 seconds:

[GC 187752K->187559K(258880K), 0.0148198 secs]
[GC 192156K(258880K), 0.0008281 secs]
[GC 144561K->144372K(258880K), 0.0153497 secs]
[GC 148965K(258880K), 0.0008028 secs]
[GC 166187K->165969K(258880K), 0.0146546 secs]
[GC 187935K->187754K(258880K), 0.0150638 secs]
[GC 192344K(258880K), 0.0008422 secs]

Giving the 1.6 VM more RAM (-Xms1G -Xmx1G) increases these times a bit. It can also introduce OutOfMemory conditions and other catastrophic failures (one time the GC took 10 seconds after the application had only been running 20 seconds). How stably 1.6 will perform with the initial settings remains to be seen; the results with more RAM worry me somewhat. The 1.7 VM however performs significantly worse. Here is some of its output (over roughly a one second period):

[GC concurrent-mark-end, 0.0197681 sec]
[GC remark, 0.0030323 secs]
[GC concurrent-count-start]
[GC concurrent-count-end, 0.0060561]
[GC cleanup 177M->103M(256M), 0.0005319 secs]
[GC concurrent-cleanup-start]
[GC concurrent-cleanup-end, 0.0000676]
[GC pause (partial) 136M->136M(256M), 0.0046206 secs]
[GC pause (partial) 139M->139M(256M), 0.0039039 secs]
[GC pause (partial) (initial-mark) 158M->157M(256M), 0.0039424 secs]
[GC concurrent-mark-start]
[GC concurrent-mark-end, 0.0152915 sec]
[GC remark, 0.0033085 secs]
[GC concurrent-count-start]
[GC concurrent-count-end, 0.0085232]
[GC cleanup 163M->129M(256M), 0.0004847 secs]
[GC concurrent-cleanup-start]
[GC concurrent-cleanup-end, 0.0000363]

From the above output one would not expect the performance to be worse; however, the application fails to meet its goals 10-20% of the time. The amount of garbage created is the same. CPU time however is hovering around 90-95%, which is likely the cause of the poor performance.
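John's 20 ms budget can be checked against a -verbose:gc log mechanically rather than by eyeballing; a small sketch (the regex assumes only the simple line shapes shown in the excerpts above, where each pause ends in ", <seconds> secs]" or ", <seconds> sec]"; the 20 ms figure is the budget from the preceding paragraph):

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PauseBudget {
    // Matches the trailing ", <seconds> secs]" (or " sec]") of a GC log line,
    // e.g. "[GC 187752K->187559K(258880K), 0.0148198 secs]".
    static final Pattern PAUSE = Pattern.compile(", ([0-9.]+) secs?\\]");

    /** Longest pause, in milliseconds, seen across the given log lines. */
    static double maxPauseMillis(List<String> lines) {
        double max = 0.0;
        for (String line : lines) {
            Matcher m = PAUSE.matcher(line);
            while (m.find()) {
                max = Math.max(max, Double.parseDouble(m.group(1)) * 1000.0);
            }
        }
        return max;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "[GC 187752K->187559K(258880K), 0.0148198 secs]",
            "[GC 192156K(258880K), 0.0008281 secs]",
            "[GC 166187K->165969K(258880K), 0.0146546 secs]");
        double worst = maxPauseMillis(sample);
        System.out.println("within 20 ms budget: " + (worst < 20.0));
    }
}
```

This only measures reported pause durations; as the rest of the thread shows, the reported time does not always account for the full wall-clock stall.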
The GC seems to take a significantly larger amount of time to do its work causing these stalls in my test application. I've experimented with memory sizes and max pause times with the 1.7 VM, and although it seemed to be doing better with more RAM, it never comes even close to the performance observed with the 1.6 VM. I'm not sure if there are other useful options I can try to see if I can tune the 1.7 VM performance a bit better. I can provide more information, although not any (useful) source code at this time due to external dependencies (JNA/JNI) of this application. I'm wondering if I'm missing something as it seems strange to me that 1.7 is actually underperforming for me when in general most seem to agree that the G1GC is a huge improvement. --John From shane.cox at gmail.com Tue Apr 26 10:36:37 2011 From: shane.cox at gmail.com (Shane Cox) Date: Tue, 26 Apr 2011 13:36:37 -0400 Subject: Periodic long minor GC pauses Message-ID: Periodically, our Java app on Linux experiences a long Minor GC pause that cannot be accounted for by the GC time in the log file. Instead, the pause is captured as "real" (wall clock) time and is observable in our application logs. An example is below. The GC completed in 56ms, but the application was paused for 2.45 seconds. 
2011-04-26T12:50:41.722-0400: 2117.157: [GC 2117.157: [ParNew: 943439K->104832K(943744K), 0.0481790 secs] 4909998K->4086751K(25060992K), 0.0485110 secs] [Times: user=0.34 sys=0.03, real=0.04 secs]
2011-04-26T12:50:43.882-0400: 2119.317: [GC 2119.317: [ParNew: 942852K->104832K(943744K), 0.0738000 secs] 4924772K->4150899K(25060992K), 0.0740980 secs] [Times: user=0.45 sys=0.12, real=0.07 secs]
2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: 943744K->79296K(943744K), 0.0559560 secs] 4989811K->4187520K(25060992K), 0.0563970 secs] [Times: user=0.31 sys=0.09, *real=2.45 secs]*
2011-04-26T12:50:48.493-0400: 2123.928: [GC 2123.928: [ParNew: 918208K->81040K(943744K), 0.0396620 secs] 5026432K->4189265K(25060992K), 0.0400030 secs] [Times: user=0.32 sys=0.00, real=0.04 secs]
2011-04-26T12:50:51.010-0400: 2126.445: [GC 2126.445: [ParNew: 919952K->104832K(943744K), 0.0845070 secs] 5028177K->4268050K(25060992K), 0.0848300 secs] [Times: user=0.52 sys=0.11, real=0.09 secs]

Initially I suspected swapping, but according to the free command, 0 bytes of swap are in use.

>free -m
             total       used       free     shared    buffers     cached
Mem:         32168      28118       4050          0        824      12652
-/+ buffers/cache:      14641      17527
Swap:         8191          0       8191

Next, I read about a problem relating to mprotect() on Linux that can be worked around with -XX:+UseMembar. I tried that, but I still see the same unexplainable pauses.

Any suggestions/ideas? We've upgraded to the latest JDK, but no luck.
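The signature of the anomaly Shane describes -- wall-clock `real` far exceeding the CPU time in `user`+`sys` -- can be flagged automatically when scanning a log; a sketch (the 2x factor and the 0.5 s floor are arbitrary thresholds chosen for illustration, not anything from the JVM):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HiddenPause {
    // Matches the "[Times: user=U sys=S, real=R secs]" suffix that
    // -XX:+PrintGCDetails appends to each collection line.
    static final Pattern TIMES = Pattern.compile(
        "\\[Times: user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+) secs\\]");

    /** True when wall-clock time dwarfs CPU time spent in GC, i.e. the
     *  pause went somewhere other than actual collection work. */
    static boolean suspicious(String line) {
        Matcher m = TIMES.matcher(line);
        if (!m.find()) return false;
        double user = Double.parseDouble(m.group(1));
        double sys  = Double.parseDouble(m.group(2));
        double real = Double.parseDouble(m.group(3));
        return real > 0.5 && real > 2.0 * (user + sys);
    }

    public static void main(String[] args) {
        System.out.println(suspicious(
            "[Times: user=0.31 sys=0.09, real=2.45 secs]"));  // the stalled line
        System.out.println(suspicious(
            "[Times: user=0.34 sys=0.03, real=0.04 secs]"));  // a normal line
    }
}
```

(Note that with a parallel collector `real` is normally *smaller* than `user`, since several GC threads accumulate CPU time concurrently, which makes the inverted ratio above all the more telling.)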
Thanks, Shane java version "1.6.0_25" Java(TM) SE Runtime Environment (build 1.6.0_25-b06) Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode) Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009 x86_64 x86_64 x86_64 GNU/Linux -verbose:gc -Xms24g -Xmx24g -Xmn1g -Xss256k -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintTenuringDistribution -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCompressedStrings -XX:+UseMembar -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110426/f9ba1dd4/attachment.html From y.s.ramakrishna at oracle.com Tue Apr 26 10:45:55 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Tue, 26 Apr 2011 10:45:55 -0700 Subject: Periodic long minor GC pauses In-Reply-To: References: Message-ID: <4DB704D3.20600@oracle.com> The pause is definitely in the beginning, before GC collection code itself runs; witness the timestamps:- 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: 943744K->79296K(943744K), 0.0559560 secs] 4989811K->4187520K(25060992K), 0.0563970 secs] [Times: user=0.31 sys=0.09, real=2.45 secs] The first timestamp is 2120.686 and the next one is 2123.075, so we have about 2.389 s between those two. If you add to that the GC time of 0.056 s, you get 2.445 which is close enough to the 2.45 s reported. So we need to figure out what happens in the JVM between those two time-stamps and we can at least bound the culprit. -- ramki On 04/26/11 10:36, Shane Cox wrote: > Periodically, our Java app on Linux experiences a long Minor GC pause > that cannot be accounted for by the GC time in the log file. Instead, > the pause is captured as "real" (wall clock) time and is observable in > our application logs. An example is below. 
The GC completed in 56ms, > but the application was paused for 2.45 seconds. > > 2011-04-26T12:50:41.722-0400: 2117.157: [GC 2117.157: [ParNew: > 943439K->104832K(943744K), 0.0481790 secs] > 4909998K->4086751K(25060992K), 0.0485110 secs] [Times: user=0.34 > sys=0.03, real=0.04 secs] > 2011-04-26T12:50:43.882-0400: 2119.317: [GC 2119.317: [ParNew: > 942852K->104832K(943744K), 0.0738000 secs] > 4924772K->4150899K(25060992K), 0.0740980 secs] [Times: user=0.45 > sys=0.12, real=0.07 secs] > 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: > 943744K->79296K(943744K), 0.0559560 secs] 4989811K->4187520K(25060992K), > 0.0563970 secs] [Times: user=0.31 sys=0.09, *real=2.45 secs]* > 2011-04-26T12:50:48.493-0400: 2123.928: [GC 2123.928: [ParNew: > 918208K->81040K(943744K), 0.0396620 secs] 5026432K->4189265K(25060992K), > 0.0400030 secs] [Times: user=0.32 sys=0.00, real=0.04 secs] > 2011-04-26T12:50:51.010-0400: 2126.445: [GC 2126.445: [ParNew: > 919952K->104832K(943744K), 0.0845070 secs] > 5028177K->4268050K(25060992K), 0.0848300 secs] [Times: user=0.52 > sys=0.11, real=0.09 secs] > > > Initially I suspected swapping, but according to the free command, 0 > bytes of swap are in use. > >free -m > total used free shared buffers cached > Mem: 32168 28118 4050 0 824 12652 > -/+ buffers/cache: 14641 17527 > Swap: 8191 0 8191 > > > Next, I read about a problem relating to mprotect() on Linux that can be > worked around with -XX:+UseMember. I tried that, but I still see the > same unexplainable pauses. > > > Any suggestions/ideas? We've upgraded to the latest JDK, but no luck. 
> > Thanks, > Shane > > > java version "1.6.0_25" > Java(TM) SE Runtime Environment (build 1.6.0_25-b06) > Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode) > > > Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009 x86_64 x86_64 > x86_64 GNU/Linux > > > -verbose:gc -Xms24g -Xmx24g -Xmn1g -Xss256k -XX:PermSize=256m > -XX:MaxPermSize=256m -XX:+PrintTenuringDistribution > -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled > -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSClassUnloadingEnabled > -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC > -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCompressedStrings -XX:+UseMembar > > > ------------------------------------------------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From y.s.ramakrishna at oracle.com Tue Apr 26 11:17:46 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Tue, 26 Apr 2011 11:17:46 -0700 Subject: Periodic long minor GC pauses In-Reply-To: <4DB704D3.20600@oracle.com> References: <4DB704D3.20600@oracle.com> Message-ID: <4DB70C4A.9090203@oracle.com> I had a quick look and all I could find was the GC prologue code (although I didn't look all that carefully). Basically, GC is invoked, it prints this timestamp, does a bit of global book-keeping and some initialization, and then goes over each generation in the heap and says "I am going to do a collection, do whatever you need to do before I do the collection", and the generations each do a bit of book-keeping and any relevant initialization. The only thing I can see in the gc prologues other than a bit of lightweight book-keeping is some reporting code that could potentially be heavyweight. But you do not have any of those enabled in your option set, so there should not be anything obviously heavyweight going on.
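Ramki's timestamp arithmetic from the earlier message (2123.075 - 2120.686 = about 2.389 s, plus the 0.056 s of collection gives about the reported 2.45 s) can be applied to any line carrying this double stamp; a sketch, assuming the `<uptime>: [GC <uptime>:` pattern visible in Shane's log:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrologueGap {
    // The uptime stamp printed when GC is invoked, followed by the stamp
    // printed when collection work begins: "... 2120.686: [GC 2123.075: ..."
    static final Pattern STAMPS =
        Pattern.compile("(\\d+\\.\\d+): \\[GC (\\d+\\.\\d+):");

    /** Seconds lost between GC invocation and the start of collection work;
     *  0.0 if the line does not carry the double timestamp. */
    static double prologueGap(String line) {
        Matcher m = STAMPS.matcher(line);
        if (!m.find()) return 0.0;
        return Double.parseDouble(m.group(2)) - Double.parseDouble(m.group(1));
    }

    public static void main(String[] args) {
        String line =
            "2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: ...";
        System.out.printf("unaccounted gap: %.3f s%n", prologueGap(line));
    }
}
```

On a healthy line the two stamps are equal (e.g. "2117.157: [GC 2117.157:"), so the gap is zero; a large gap marks a line whose stall happened before the collection itself.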
I'd suggest filing a bug under the category of jvm/hotspot/garbage_collector so someone in support can work with you to get this diagnosed... Three questions when you file the bug: (1) have you seen this start happening recently? (version?) (2) can you check if the longer pauses are "random" or do they always happen "during" CMS concurrent cycles or always outside of such cycles? (3) test set-up. -- ramki On 04/26/11 10:45, Y. S. Ramakrishna wrote: > The pause is definitely in the beginning, before GC collection code > itself runs; witness the timestamps:- > > 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: 943744K->79296K(943744K), 0.0559560 secs] 4989811K->4187520K(25060992K), 0.0563970 secs] [Times: user=0.31 sys=0.09, real=2.45 secs] > > The first timestamp is 2120.686 and the next one is 2123.075, so we have > about 2.389 s between those two. If you add to that the GC time of 0.056 s, > you get 2.445 which is close enough to the 2.45 s reported. > > So we need to figure out what happens in the JVM between those two > time-stamps and we can at least bound the culprit. > > -- ramki > > On 04/26/11 10:36, Shane Cox wrote: >> Periodically, our Java app on Linux experiences a long Minor GC pause >> that cannot be accounted for by the GC time in the log file. Instead, >> the pause is captured as "real" (wall clock) time and is observable in >> our application logs. An example is below. The GC completed in 56ms, >> but the application was paused for 2.45 seconds. 
>> >> 2011-04-26T12:50:41.722-0400: 2117.157: [GC 2117.157: [ParNew: >> 943439K->104832K(943744K), 0.0481790 secs] >> 4909998K->4086751K(25060992K), 0.0485110 secs] [Times: user=0.34 >> sys=0.03, real=0.04 secs] >> 2011-04-26T12:50:43.882-0400: 2119.317: [GC 2119.317: [ParNew: >> 942852K->104832K(943744K), 0.0738000 secs] >> 4924772K->4150899K(25060992K), 0.0740980 secs] [Times: user=0.45 >> sys=0.12, real=0.07 secs] >> 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: >> 943744K->79296K(943744K), 0.0559560 secs] 4989811K->4187520K(25060992K), >> 0.0563970 secs] [Times: user=0.31 sys=0.09, *real=2.45 secs]* >> 2011-04-26T12:50:48.493-0400: 2123.928: [GC 2123.928: [ParNew: >> 918208K->81040K(943744K), 0.0396620 secs] 5026432K->4189265K(25060992K), >> 0.0400030 secs] [Times: user=0.32 sys=0.00, real=0.04 secs] >> 2011-04-26T12:50:51.010-0400: 2126.445: [GC 2126.445: [ParNew: >> 919952K->104832K(943744K), 0.0845070 secs] >> 5028177K->4268050K(25060992K), 0.0848300 secs] [Times: user=0.52 >> sys=0.11, real=0.09 secs] >> >> >> Initially I suspected swapping, but according to the free command, 0 >> bytes of swap are in use. >> >free -m >> total used free shared buffers cached >> Mem: 32168 28118 4050 0 824 12652 >> -/+ buffers/cache: 14641 17527 >> Swap: 8191 0 8191 >> >> >> Next, I read about a problem relating to mprotect() on Linux that can be >> worked around with -XX:+UseMember. I tried that, but I still see the >> same unexplainable pauses. >> >> >> Any suggestions/ideas? We've upgraded to the latest JDK, but no luck. 
>> >> Thanks, >> Shane >> >> >> java version "1.6.0_25" >> Java(TM) SE Runtime Environment (build 1.6.0_25-b06) >> Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode) >> >> >> Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009 x86_64 x86_64 >> x86_64 GNU/Linux >> >> >> -verbose:gc -Xms24g -Xmx24g -Xmn1g -Xss256k -XX:PermSize=256m >> -XX:MaxPermSize=256m -XX:+PrintTenuringDistribution >> -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled >> -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSClassUnloadingEnabled >> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC >> -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCompressedStrings -XX:+UseMembar >> >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From shane.cox at gmail.com Tue Apr 26 11:29:42 2011 From: shane.cox at gmail.com (Shane Cox) Date: Tue, 26 Apr 2011 14:29:42 -0400 Subject: Periodic long minor GC pauses In-Reply-To: <4DB70C4A.9090203@oracle.com> References: <4DB704D3.20600@oracle.com> <4DB70C4A.9090203@oracle.com> Message-ID: Below is an example from a Remark. Of the total 1.3 seconds of elapsed time, 1.2 seconds is found between the first two timestamps. However, I'm not savvy enough to know whether this is the same problem or simply the result of a long scavenge that occurs as part of the Remark. Is there any way to tell? 
2011-04-25T14:38:40.215-0400: 9466.139: [GC[YG occupancy: 712500 K (943744 K)]9467.353: [Rescan (parallel) , 0.0106370 secs]9467.374: [weak refs processing, 0.0159250 secs]9467.390: [class unloading, 0.0180420 secs]9467.408: [scrub symbol & string tables, 0.0458500 secs] [1 CMS-remark: 12520949K(24117248K)] 13233450K(25060992K), 0.1052950 secs] [Times: user=0.13 sys=0.01, real=1.32 secs] On Tue, Apr 26, 2011 at 2:17 PM, Y. S. Ramakrishna < y.s.ramakrishna at oracle.com> wrote: > I had a quick look and all i could find was the GC prologue > code (although i didn't look all that carefully). > Bascially, GC is invoked, it prints this timestamp, > does a bit of global book-keeping and some initialization, > and then goes over each generation in the heap and > says "i am going to do a collection, do whatever you need > to do before i do the collection", and the generations each do a bit of > book-keeping and any relevant initialization. > > The only thing i can see in the gc prologues other than a bit > of lightweight book-keeping is some reporting code that could > potentially be heavyweight. But you do not have any of those > enabled in your option set, so there should not be anything > obviously heavyweight going on. > > I'd suggest filing a bug under the category of > jvm/hotspot/garbage_collector > so someone in support can work with you to get this diagnosed... > > Three questions when you file the bug: > (1) have you seen this start happening recently? (version?) > (2) can you check if the longer pauses are "random" or do > they always happen "during" CMS concurrent cycles or > always outside of such cycles? > (3) test set-up. > > -- ramki > > > On 04/26/11 10:45, Y. S. 
Ramakrishna wrote: > >> The pause is definitely in the beginning, before GC collection code >> itself runs; witness the timestamps:- >> >> 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: >> 943744K->79296K(943744K), 0.0559560 secs] 4989811K->4187520K(25060992K), >> 0.0563970 secs] [Times: user=0.31 sys=0.09, real=2.45 secs] >> >> The first timestamp is 2120.686 and the next one is 2123.075, so we have >> about 2.389 s between those two. If you add to that the GC time of 0.056 >> s, >> you get 2.445 which is close enough to the 2.45 s reported. >> >> So we need to figure out what happens in the JVM between those two >> time-stamps and we can at least bound the culprit. >> >> -- ramki >> >> On 04/26/11 10:36, Shane Cox wrote: >> >>> Periodically, our Java app on Linux experiences a long Minor GC pause >>> that cannot be accounted for by the GC time in the log file. Instead, the >>> pause is captured as "real" (wall clock) time and is observable in our >>> application logs. An example is below. The GC completed in 56ms, but the >>> application was paused for 2.45 seconds. 
>>> >>> 2011-04-26T12:50:41.722-0400: 2117.157: [GC 2117.157: [ParNew: >>> 943439K->104832K(943744K), 0.0481790 secs] 4909998K->4086751K(25060992K), >>> 0.0485110 secs] [Times: user=0.34 sys=0.03, real=0.04 secs] >>> 2011-04-26T12:50:43.882-0400: 2119.317: [GC 2119.317: [ParNew: >>> 942852K->104832K(943744K), 0.0738000 secs] 4924772K->4150899K(25060992K), >>> 0.0740980 secs] [Times: user=0.45 sys=0.12, real=0.07 secs] >>> 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: >>> 943744K->79296K(943744K), 0.0559560 secs] 4989811K->4187520K(25060992K), >>> 0.0563970 secs] [Times: user=0.31 sys=0.09, *real=2.45 secs]* >>> 2011-04-26T12:50:48.493-0400: 2123.928: [GC 2123.928: [ParNew: >>> 918208K->81040K(943744K), 0.0396620 secs] 5026432K->4189265K(25060992K), >>> 0.0400030 secs] [Times: user=0.32 sys=0.00, real=0.04 secs] >>> 2011-04-26T12:50:51.010-0400: 2126.445: [GC 2126.445: [ParNew: >>> 919952K->104832K(943744K), 0.0845070 secs] 5028177K->4268050K(25060992K), >>> 0.0848300 secs] [Times: user=0.52 sys=0.11, real=0.09 secs] >>> >>> >>> Initially I suspected swapping, but according to the free command, 0 >>> bytes of swap are in use. >>> >free -m >>> total used free shared buffers cached >>> Mem: 32168 28118 4050 0 824 12652 >>> -/+ buffers/cache: 14641 17527 >>> Swap: 8191 0 8191 >>> >>> >>> Next, I read about a problem relating to mprotect() on Linux that can be >>> worked around with -XX:+UseMember. I tried that, but I still see the same >>> unexplainable pauses. >>> >>> >>> Any suggestions/ideas? We've upgraded to the latest JDK, but no luck. 
>>> >>> Thanks, >>> Shane >>> >>> >>> java version "1.6.0_25" >>> Java(TM) SE Runtime Environment (build 1.6.0_25-b06) >>> Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode) >>> >>> >>> Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009 x86_64 x86_64 >>> x86_64 GNU/Linux >>> >>> >>> -verbose:gc -Xms24g -Xmx24g -Xmn1g -Xss256k -XX:PermSize=256m >>> -XX:MaxPermSize=256m -XX:+PrintTenuringDistribution -XX:+UseConcMarkSweepGC >>> -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=70 >>> -XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails -XX:+PrintGCDateStamps >>> -XX:+PrintHeapAtGC -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCompressedStrings >>> -XX:+UseMembar >>> >>> >>> ------------------------------------------------------------------------ >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110426/4acfabde/attachment-0001.html From y.s.ramakrishna at oracle.com Tue Apr 26 12:40:29 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Tue, 26 Apr 2011 12:40:29 -0700 Subject: Periodic long minor GC pauses In-Reply-To: References: <4DB704D3.20600@oracle.com> <4DB70C4A.9090203@oracle.com> Message-ID: <4DB71FAD.3050905@oracle.com> Well-spotted; it's a version of the same problem as near as i can tell. Please make sure to include a sizable GC log with your bug report (starting from VM start-up, so we can see if there is any clue in when the problem first starts during the life of the VM). thanks. 
-- ramki On 04/26/11 11:29, Shane Cox wrote: > Below is an example from a Remark. Of the total 1.3 seconds of elapsed > time, 1.2 seconds is found between the first two timestamps. However, > I'm not savvy enough to know whether this is the same problem or simply > the result of a long scavenge that occurs as part of the Remark. Is > there any way to tell? > > 2011-04-25T14:38:40.215-0400: 9466.139: [GC[YG occupancy: 712500 K > (943744 K)]9467.353: [Rescan (parallel) , 0.0106370 secs]9467.374: [weak > refs processing, 0.0159250 secs]9467.390: [class unloading, 0.0180420 > secs]9467.408: [scrub symbol & string tables, 0.0458500 secs] [1 > CMS-remark: 12520949K(24117248K)] 13233450K(25060992K), 0.1052950 secs] > [Times: user=0.13 sys=0.01, real=1.32 secs] > > > On Tue, Apr 26, 2011 at 2:17 PM, Y. S. Ramakrishna > > wrote: > > I had a quick look and all i could find was the GC prologue > code (although i didn't look all that carefully). > Bascially, GC is invoked, it prints this timestamp, > does a bit of global book-keeping and some initialization, > and then goes over each generation in the heap and > says "i am going to do a collection, do whatever you need > to do before i do the collection", and the generations each do a bit of > book-keeping and any relevant initialization. > > The only thing i can see in the gc prologues other than a bit > of lightweight book-keeping is some reporting code that could > potentially be heavyweight. But you do not have any of those > enabled in your option set, so there should not be anything > obviously heavyweight going on. > > I'd suggest filing a bug under the category of > jvm/hotspot/garbage_collector > so someone in support can work with you to get this diagnosed... > > Three questions when you file the bug: > (1) have you seen this start happening recently? (version?) > (2) can you check if the longer pauses are "random" or do > they always happen "during" CMS concurrent cycles or > always outside of such cycles? 
> (3) test set-up. > > -- ramki > > > On 04/26/11 10:45, Y. S. Ramakrishna wrote: > > The pause is definitely in the beginning, before GC collection code > itself runs; witness the timestamps:- > > 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: > 943744K->79296K(943744K), 0.0559560 secs] > 4989811K->4187520K(25060992K), 0.0563970 secs] [Times: user=0.31 > sys=0.09, real=2.45 secs] > > The first timestamp is 2120.686 and the next one is 2123.075, so > we have > about 2.389 s between those two. If you add to that the GC time > of 0.056 s, > you get 2.445 which is close enough to the 2.45 s reported. > > So we need to figure out what happens in the JVM between those two > time-stamps and we can at least bound the culprit. > > -- ramki > > On 04/26/11 10:36, Shane Cox wrote: > > Periodically, our Java app on Linux experiences a long Minor > GC pause that cannot be accounted for by the GC time in the > log file. Instead, the pause is captured as "real" (wall > clock) time and is observable in our application logs. An > example is below. The GC completed in 56ms, but the > application was paused for 2.45 seconds. 
> > 2011-04-26T12:50:41.722-0400: 2117.157: [GC 2117.157: > [ParNew: 943439K->104832K(943744K), 0.0481790 secs] > 4909998K->4086751K(25060992K), 0.0485110 secs] [Times: > user=0.34 sys=0.03, real=0.04 secs] > 2011-04-26T12:50:43.882-0400: 2119.317: [GC 2119.317: > [ParNew: 942852K->104832K(943744K), 0.0738000 secs] > 4924772K->4150899K(25060992K), 0.0740980 secs] [Times: > user=0.45 sys=0.12, real=0.07 secs] > 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: > [ParNew: 943744K->79296K(943744K), 0.0559560 secs] > 4989811K->4187520K(25060992K), 0.0563970 secs] [Times: > user=0.31 sys=0.09, *real=2.45 secs]* > 2011-04-26T12:50:48.493-0400: 2123.928: [GC 2123.928: > [ParNew: 918208K->81040K(943744K), 0.0396620 secs] > 5026432K->4189265K(25060992K), 0.0400030 secs] [Times: > user=0.32 sys=0.00, real=0.04 secs] > 2011-04-26T12:50:51.010-0400: 2126.445: [GC 2126.445: > [ParNew: 919952K->104832K(943744K), 0.0845070 secs] > 5028177K->4268050K(25060992K), 0.0848300 secs] [Times: > user=0.52 sys=0.11, real=0.09 secs] > > > Initially I suspected swapping, but according to the free > command, 0 bytes of swap are in use. > >free -m > total used free shared > buffers cached > Mem: 32168 28118 4050 0 > 824 12652 > -/+ buffers/cache: 14641 17527 > Swap: 8191 0 8191 > > > Next, I read about a problem relating to mprotect() on Linux > that can be worked around with -XX:+UseMember. I tried > that, but I still see the same unexplainable pauses. > > > Any suggestions/ideas? We've upgraded to the latest JDK, > but no luck. 
> > Thanks, > Shane > > > java version "1.6.0_25" > Java(TM) SE Runtime Environment (build 1.6.0_25-b06) > Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode) > > > Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009 > x86_64 x86_64 x86_64 GNU/Linux > > > -verbose:gc -Xms24g -Xmx24g -Xmn1g -Xss256k > -XX:PermSize=256m -XX:MaxPermSize=256m > -XX:+PrintTenuringDistribution -XX:+UseConcMarkSweepGC > -XX:+CMSParallelRemarkEnabled > -XX:CMSInitiatingOccupancyFraction=70 > -XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails > -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC > -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCompressedStrings > -XX:+UseMembar > > > ------------------------------------------------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > From jon.masamitsu at oracle.com Tue Apr 26 21:34:09 2011 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 26 Apr 2011 21:34:09 -0700 Subject: Periodic long minor GC pauses In-Reply-To: <4DB71FAD.3050905@oracle.com> References: <4DB704D3.20600@oracle.com> <4DB70C4A.9090203@oracle.com> <4DB71FAD.3050905@oracle.com> Message-ID: <4DB79CC1.8040707@oracle.com> Shane, Have you tried running with -XX:+AlwaysPreTouch ? We've occasionally seen intermittent long pauses as the heap grows into newly committed pages. This flag causes pages to be touched as they are committed. I don't know how this fits into Ramki's observation but it might be worth a shot. Jon On 4/26/2011 12:40 PM, Y. S. Ramakrishna wrote: > Well-spotted; it's a version of the same problem as near as > i can tell. 
Please make sure to include a sizable GC log with > your bug report (starting from VM start-up, so we can see if > there is any clue in when the problem first starts during > the life of the VM). > > thanks. > -- ramki > > On 04/26/11 11:29, Shane Cox wrote: >> Below is an example from a Remark. Of the total 1.3 seconds of elapsed >> time, 1.2 seconds is found between the first two timestamps. However, >> I'm not savvy enough to know whether this is the same problem or simply >> the result of a long scavenge that occurs as part of the Remark. Is >> there any way to tell? >> >> 2011-04-25T14:38:40.215-0400: 9466.139: [GC[YG occupancy: 712500 K >> (943744 K)]9467.353: [Rescan (parallel) , 0.0106370 secs]9467.374: [weak >> refs processing, 0.0159250 secs]9467.390: [class unloading, 0.0180420 >> secs]9467.408: [scrub symbol & string tables, 0.0458500 secs] [1 >> CMS-remark: 12520949K(24117248K)] 13233450K(25060992K), 0.1052950 secs] >> [Times: user=0.13 sys=0.01, real=1.32 secs] >> >> >> On Tue, Apr 26, 2011 at 2:17 PM, Y. S. Ramakrishna >> > wrote: >> >> I had a quick look and all i could find was the GC prologue >> code (although i didn't look all that carefully). >> Basically, GC is invoked, it prints this timestamp, >> does a bit of global book-keeping and some initialization, >> and then goes over each generation in the heap and >> says "i am going to do a collection, do whatever you need >> to do before i do the collection", and the generations each do a bit of >> book-keeping and any relevant initialization. >> >> The only thing i can see in the gc prologues other than a bit >> of lightweight book-keeping is some reporting code that could >> potentially be heavyweight. But you do not have any of those >> enabled in your option set, so there should not be anything >> obviously heavyweight going on. >> >> I'd suggest filing a bug under the category of >> jvm/hotspot/garbage_collector >> so someone in support can work with you to get this diagnosed... 
>> >> Three questions when you file the bug: >> (1) have you seen this start happening recently? (version?) >> (2) can you check if the longer pauses are "random" or do >> they always happen "during" CMS concurrent cycles or >> always outside of such cycles? >> (3) test set-up. >> >> -- ramki >> >> >> On 04/26/11 10:45, Y. S. Ramakrishna wrote: >> >> The pause is definitely in the beginning, before GC collection code >> itself runs; witness the timestamps:- >> >> 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: >> 943744K->79296K(943744K), 0.0559560 secs] >> 4989811K->4187520K(25060992K), 0.0563970 secs] [Times: user=0.31 >> sys=0.09, real=2.45 secs] >> >> The first timestamp is 2120.686 and the next one is 2123.075, so >> we have >> about 2.389 s between those two. If you add to that the GC time >> of 0.056 s, >> you get 2.445 which is close enough to the 2.45 s reported. >> >> So we need to figure out what happens in the JVM between those two >> time-stamps and we can at least bound the culprit. >> >> -- ramki >> >> On 04/26/11 10:36, Shane Cox wrote: >> >> Periodically, our Java app on Linux experiences a long Minor >> GC pause that cannot be accounted for by the GC time in the >> log file. Instead, the pause is captured as "real" (wall >> clock) time and is observable in our application logs. An >> example is below. The GC completed in 56ms, but the >> application was paused for 2.45 seconds. 
>> >> 2011-04-26T12:50:41.722-0400: 2117.157: [GC 2117.157: >> [ParNew: 943439K->104832K(943744K), 0.0481790 secs] >> 4909998K->4086751K(25060992K), 0.0485110 secs] [Times: >> user=0.34 sys=0.03, real=0.04 secs] >> 2011-04-26T12:50:43.882-0400: 2119.317: [GC 2119.317: >> [ParNew: 942852K->104832K(943744K), 0.0738000 secs] >> 4924772K->4150899K(25060992K), 0.0740980 secs] [Times: >> user=0.45 sys=0.12, real=0.07 secs] >> 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: >> [ParNew: 943744K->79296K(943744K), 0.0559560 secs] >> 4989811K->4187520K(25060992K), 0.0563970 secs] [Times: >> user=0.31 sys=0.09, *real=2.45 secs]* >> 2011-04-26T12:50:48.493-0400: 2123.928: [GC 2123.928: >> [ParNew: 918208K->81040K(943744K), 0.0396620 secs] >> 5026432K->4189265K(25060992K), 0.0400030 secs] [Times: >> user=0.32 sys=0.00, real=0.04 secs] >> 2011-04-26T12:50:51.010-0400: 2126.445: [GC 2126.445: >> [ParNew: 919952K->104832K(943744K), 0.0845070 secs] >> 5028177K->4268050K(25060992K), 0.0848300 secs] [Times: >> user=0.52 sys=0.11, real=0.09 secs] >> >> >> Initially I suspected swapping, but according to the free >> command, 0 bytes of swap are in use. >> >free -m >> total used free shared >> buffers cached >> Mem: 32168 28118 4050 0 >> 824 12652 >> -/+ buffers/cache: 14641 17527 >> Swap: 8191 0 8191 >> >> >> Next, I read about a problem relating to mprotect() on Linux >> that can be worked around with -XX:+UseMember. I tried >> that, but I still see the same unexplainable pauses. >> >> >> Any suggestions/ideas? We've upgraded to the latest JDK, >> but no luck. 
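Ramki's arithmetic above (2123.075 - 2120.686 gives about 2.389 s of hidden time; adding the 0.056 s collection yields the reported real=2.45 s) can be applied to a whole log mechanically. A minimal sketch, assuming the two-timestamp `outer: [GC inner: [ParNew ...` line shape seen in the excerpts; the function name is mine, not from the thread:

```python
import re

# A ParNew record carries two timestamps when work happened between the GC
# request and the collection itself:
#     2120.686: [GC 2123.075: [ParNew: ...
# The difference is time spent inside the JVM before collection started.
TWO_STAMPS = re.compile(r'(\d+\.\d+): \[GC (\d+\.\d+): \[ParNew')

def pre_gc_gaps(log_lines, threshold=0.5):
    """Return (outer_timestamp, gap_seconds) for every line whose hidden
    pre-collection time exceeds `threshold` seconds."""
    hits = []
    for line in log_lines:
        m = TWO_STAMPS.search(line)
        if m:
            outer, inner = float(m.group(1)), float(m.group(2))
            if inner - outer > threshold:
                hits.append((outer, inner - outer))
    return hits
```

Fed the five ParNew lines above, only the 2120.686 record is flagged, with a gap of about 2.389 s, matching the hand calculation.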
>> >> Thanks, >> Shane >> >> >> java version "1.6.0_25" >> Java(TM) SE Runtime Environment (build 1.6.0_25-b06) >> Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode) >> >> >> Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009 >> x86_64 x86_64 x86_64 GNU/Linux >> >> >> -verbose:gc -Xms24g -Xmx24g -Xmn1g -Xss256k >> -XX:PermSize=256m -XX:MaxPermSize=256m >> -XX:+PrintTenuringDistribution -XX:+UseConcMarkSweepGC >> -XX:+CMSParallelRemarkEnabled >> -XX:CMSInitiatingOccupancyFraction=70 >> -XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails >> -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC >> -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCompressedStrings >> -XX:+UseMembar >> >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From jon.masamitsu at oracle.com Thu Apr 28 06:25:53 2011 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 28 Apr 2011 06:25:53 -0700 Subject: 1.7 G1GC significantly slower than 1.6 Mark and Sweep? In-Reply-To: <4DB6DDB5.4040804@xs4all.nl> References: <4DB6DDB5.4040804@xs4all.nl> Message-ID: <4DB96AE1.2020202@oracle.com> John, You're telling G1 (UseG1GC) to limit pauses to 2ms. (-XX:MaxGCPauseMillis=2) but seemed to have tuned CMS (UseConcMarkSweepGC) toward a 20ms goal. G1 is trying to do very short collections and needs to do many of them to keep up with the allocation rate. Did you mean you are setting MaxGCPauseMillis to 20? 
Jon On 4/26/2011 7:59 AM, John Hendrikx wrote: > Hi list, > > I've been testing Java 1.6 performance vs Java 1.7 performance with a > timing critical application -- it's essential that garbage collection > pauses are very short. What I've found is that Java 1.6 seems to > perform significantly better than 1.7 (b137) in this respect, although > with certain settings 1.6 will also fail catastrophically. I've used > the following options: > > For 1.6.0_22: -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC > For 1.7.0b137: -Xms256M -Xmx256M -XX:+UseG1GC -XX:MaxGCPauseMillis=2 > > The amount of garbage created is roughly 150 MB/sec. The application > demands a response time of about 20 ms and uses half a dozen threads > which deal with buffering and decoding of information. > > With the above settings, the 1.6 VM will meet this goal over a 2 minute > period>99% of the time (with an average CPU consumption of 65% per CPU > core for two cores) -- from verbosegc I gather that the pause times are > around 0.01-0.02 seconds: > > [GC 187752K->187559K(258880K), 0.0148198 secs] > [GC 192156K(258880K), 0.0008281 secs] > [GC 144561K->144372K(258880K), 0.0153497 secs] > [GC 148965K(258880K), 0.0008028 secs] > [GC 166187K->165969K(258880K), 0.0146546 secs] > [GC 187935K->187754K(258880K), 0.0150638 secs] > [GC 192344K(258880K), 0.0008422 secs] > > Giving the 1.6 VM more RAM (-Xms1G -Xmx1G) increases these times a bit. > It can also introduce OutOfMemory conditions and other catastrophic > failures (one time the GC took 10 seconds after the application had only > been running 20 seconds). How stable 1.6 will perform with the initial > settings remains to be seen; the results with more RAM worry me somewhat. > > The 1.7 VM however performs significantly worse. 
Here is some of its > output (over roughtly a one second period): > > [GC concurrent-mark-end, 0.0197681 sec] > [GC remark, 0.0030323 secs] > [GC concurrent-count-start] > [GC concurrent-count-end, 0.0060561] > [GC cleanup 177M->103M(256M), 0.0005319 secs] > [GC concurrent-cleanup-start] > [GC concurrent-cleanup-end, 0.0000676] > [GC pause (partial) 136M->136M(256M), 0.0046206 secs] > [GC pause (partial) 139M->139M(256M), 0.0039039 secs] > [GC pause (partial) (initial-mark) 158M->157M(256M), 0.0039424 secs] > [GC concurrent-mark-start] > [GC concurrent-mark-end, 0.0152915 sec] > [GC remark, 0.0033085 secs] > [GC concurrent-count-start] > [GC concurrent-count-end, 0.0085232] > [GC cleanup 163M->129M(256M), 0.0004847 secs] > [GC concurrent-cleanup-start] > [GC concurrent-cleanup-end, 0.0000363] > > From the above output one would not expect the performance to be worse, > however, the application fails to meet its goals 10-20% of the time. > The amount of garbage created is the same. CPU time however is hovering > around 90-95%, which is likely the cause of the poor performance. The > GC seems to take a significantly larger amount of time to do its work > causing these stalls in my test application. > > I've experimented with memory sizes and max pause times with the 1.7 VM, > and although it seemed to be doing better with more RAM, it never comes > even close to the performance observed with the 1.6 VM. > > I'm not sure if there are other useful options I can try to see if I can > tune the 1.7 VM performance a bit better. I can provide more > information, although not any (useful) source code at this time due to > external dependencies (JNA/JNI) of this application. > > I'm wondering if I'm missing something as it seems strange to me that > 1.7 is actually underperforming for me when in general most seem to > agree that the G1GC is a huge improvement. 
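John's miss rates above (>99% of responses within goal under 1.6, 10-20% misses under 1.7) can be recomputed from a `-verbose:gc` log by comparing each reported stop-the-world pause against the 20 ms budget. A rough sketch, not from the thread; it keys on the `, N.NNNNNNN secs]` suffix, which in the excerpts above appears only on lines that report a pause duration:

```python
import re

# Matches durations like ", 0.0148198 secs]" at the end of a GC record.
PAUSE = re.compile(r', (\d+\.\d+) secs\]')

def goal_miss_rate(log_lines, budget=0.020):
    """Fraction of reported GC pauses longer than `budget` seconds."""
    pauses = [float(m.group(1))
              for line in log_lines
              for m in PAUSE.finditer(line)]
    if not pauses:
        return 0.0
    return sum(p > budget for p in pauses) / len(pauses)
```

On the seven 1.6/CMS lines above every pause is under 20 ms, so the rate is 0; the G1 `concurrent-*` lines end in `sec]` or a bare `]` and are skipped by the pattern.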
> > --John

From hjohn at xs4all.nl Thu Apr 28 23:10:08 2011 From: hjohn at xs4all.nl (John Hendrikx) Date: Fri, 29 Apr 2011 08:10:08 +0200 Subject: 1.7 G1GC significantly slower than 1.6 Mark and Sweep? In-Reply-To: <4DB96AE1.2020202@oracle.com> References: <4DB6DDB5.4040804@xs4all.nl> <4DB96AE1.2020202@oracle.com> Message-ID: <4DBA5640.9080203@xs4all.nl> I tried many -XX:MaxGCPauseMillis settings, including not setting it at all, 20, 10, 5, 2. The results were similar each time -- it didn't really have much of an effect. In retrospect you might say that the total CPU use is what is causing the problems, not necessarily the length of the pauses -- whether this extra CPU use is caused by the collector or because of some other change in Java 7 I do not know; the program is the same. Is there perhaps another collector that I could try to see if this lowers CPU use? Or settings (even non-GC related) that could lower CPU use? Java 6's CMS I didn't need to tune. After determining that the length of GC pauses was causing problems in the application, I tried turning CMS on and it resolved the problems. What I observe is that even though with Java 7 the pauses seem (are?) very short, the CPU use is a lot higher (from 65% under Java 6 to 95% with 7). This could be related to other causes (perhaps threading overhead, debug code in Java 7, etc) but I doubt it is in any specific Java code that I wrote as most of the heavy lifting is happening in native methods. It could for example be that several ByteBuffers being used are being copied under Java 7 while under 6 direct access was possible. John. Jon Masamitsu wrote: > John, > > You're telling G1 (UseG1GC) to limit pauses to 2ms. > (-XX:MaxGCPauseMillis=2) but seemed to have tuned > CMS (UseConcMarkSweepGC) toward a 20ms goal. 
> G1 is trying to do very short collections and needs to do many > of them to keep up with the allocation rate. Did you > mean you are setting MaxGCPauseMillis to 20? > > Jon > > On 4/26/2011 7:59 AM, John Hendrikx wrote: > >> Hi list, >> >> I've been testing Java 1.6 performance vs Java 1.7 performance with a >> timing critical application -- it's essential that garbage collection >> pauses are very short. What I've found is that Java 1.6 seems to >> perform significantly better than 1.7 (b137) in this respect, although >> with certain settings 1.6 will also fail catastrophically. I've used >> the following options: >> >> For 1.6.0_22: -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC >> For 1.7.0b137: -Xms256M -Xmx256M -XX:+UseG1GC -XX:MaxGCPauseMillis=2 >> >> The amount of garbage created is roughly 150 MB/sec. The application >> demands a response time of about 20 ms and uses half a dozen threads >> which deal with buffering and decoding of information. >> >> With the above settings, the 1.6 VM will meet this goal over a 2 minute >> period>99% of the time (with an average CPU consumption of 65% per CPU >> core for two cores) -- from verbosegc I gather that the pause times are >> around 0.01-0.02 seconds: >> >> [GC 187752K->187559K(258880K), 0.0148198 secs] >> [GC 192156K(258880K), 0.0008281 secs] >> [GC 144561K->144372K(258880K), 0.0153497 secs] >> [GC 148965K(258880K), 0.0008028 secs] >> [GC 166187K->165969K(258880K), 0.0146546 secs] >> [GC 187935K->187754K(258880K), 0.0150638 secs] >> [GC 192344K(258880K), 0.0008422 secs] >> >> Giving the 1.6 VM more RAM (-Xms1G -Xmx1G) increases these times a bit. >> It can also introduce OutOfMemory conditions and other catastrophic >> failures (one time the GC took 10 seconds after the application had only >> been running 20 seconds). How stable 1.6 will perform with the initial >> settings remains to be seen; the results with more RAM worry me somewhat. >> >> The 1.7 VM however performs significantly worse. 
Here is some of its >> output (over roughtly a one second period): >> >> [GC concurrent-mark-end, 0.0197681 sec] >> [GC remark, 0.0030323 secs] >> [GC concurrent-count-start] >> [GC concurrent-count-end, 0.0060561] >> [GC cleanup 177M->103M(256M), 0.0005319 secs] >> [GC concurrent-cleanup-start] >> [GC concurrent-cleanup-end, 0.0000676] >> [GC pause (partial) 136M->136M(256M), 0.0046206 secs] >> [GC pause (partial) 139M->139M(256M), 0.0039039 secs] >> [GC pause (partial) (initial-mark) 158M->157M(256M), 0.0039424 secs] >> [GC concurrent-mark-start] >> [GC concurrent-mark-end, 0.0152915 sec] >> [GC remark, 0.0033085 secs] >> [GC concurrent-count-start] >> [GC concurrent-count-end, 0.0085232] >> [GC cleanup 163M->129M(256M), 0.0004847 secs] >> [GC concurrent-cleanup-start] >> [GC concurrent-cleanup-end, 0.0000363] >> >> From the above output one would not expect the performance to be worse, >> however, the application fails to meet its goals 10-20% of the time. >> The amount of garbage created is the same. CPU time however is hovering >> around 90-95%, which is likely the cause of the poor performance. The >> GC seems to take a significantly larger amount of time to do its work >> causing these stalls in my test application. >> >> I've experimented with memory sizes and max pause times with the 1.7 VM, >> and although it seemed to be doing better with more RAM, it never comes >> even close to the performance observed with the 1.6 VM. >> >> I'm not sure if there are other useful options I can try to see if I can >> tune the 1.7 VM performance a bit better. I can provide more >> information, although not any (useful) source code at this time due to >> external dependencies (JNA/JNI) of this application. >> >> I'm wondering if I'm missing something as it seems strange to me that >> 1.7 is actually underperforming for me when in general most seem to >> agree that the G1GC is a huge improvement. 
>> >> --John >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > From y.s.ramakrishna at oracle.com Thu Apr 28 23:16:28 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Thu, 28 Apr 2011 23:16:28 -0700 Subject: 1.7 G1GC significantly slower than 1.6 Mark and Sweep? In-Reply-To: <4DBA5640.9080203@xs4all.nl> References: <4DB6DDB5.4040804@xs4all.nl> <4DB96AE1.2020202@oracle.com> <4DBA5640.9080203@xs4all.nl> Message-ID: <4DBA57BC.40604@oracle.com> John -- How about posting performance/times of each of the 4 combinations from the following cartesian product:- {JDK7, JDK6} X {CMS, G1} Perhaps you are conflating JDK changes with GC changes, because of changing both axes/dimensions at the same time? -- ramki On 4/28/2011 11:10 PM, John Hendrikx wrote: > I tried many -XX:MaxGCPauseMillis settings, including not setting it at > all, 20, 10, 5, 2. The results were similar each time -- it didn't > really have much of an effect. In retrospect you might say that the > total CPU use is what is causing the problems, not necessarily the > length of the pauses -- whether this extra CPU use is caused by the > collector or because of some other change in Java 7 I donot know; the > program is the same. Is there perhaps another collector that I could > try to see if this lowers CPU use? Or settings (even non-GC related) > that could lower CPU use? > > Java 6's CMS I didn't need to tune. After determining that the length > of GC pauses was causing problems in the application, I tried turning > CMS on and it resolved the problems. > > What I observe is that even though with Java 7 the pauses seem (are?) 
> very short, the CPU use is a lot higher (from 65% under Java 6 to 95% > with 7). This could be related to other causes (perhaps threading > overhead, debug code in Java 7, etc) but I doubt it is in any specific > Java code that I wrote as most of the heavy lifting is happening in > native methods. It could for example be that several ByteBuffers being > used are being copied under Java 7 while under 6 direct access was possible. > > John. > > Jon Masamitsu wrote: >> John, >> >> You're telling G1 (UseG1GC) to limit pauses to 2ms. >> (-XX:MaxGCPauseMillis=2) but seemed to have tuned >> CMS (UseConcMarkSweepGC) toward a 20ms goal. >> G1 is trying to do very short collections and needs to do many >> of them to keep up with the allocation rate. Did you >> mean you are setting MaxGCPauseMillis to 20? >> >> Jon >> >> On 4/26/2011 7:59 AM, John Hendrikx wrote: >> >>> Hi list, >>> >>> I've been testing Java 1.6 performance vs Java 1.7 performance with a >>> timing critical application -- it's essential that garbage collection >>> pauses are very short. What I've found is that Java 1.6 seems to >>> perform significantly better than 1.7 (b137) in this respect, although >>> with certain settings 1.6 will also fail catastrophically. I've used >>> the following options: >>> >>> For 1.6.0_22: -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC >>> For 1.7.0b137: -Xms256M -Xmx256M -XX:+UseG1GC -XX:MaxGCPauseMillis=2 >>> >>> The amount of garbage created is roughly 150 MB/sec. The application >>> demands a response time of about 20 ms and uses half a dozen threads >>> which deal with buffering and decoding of information. 
>>> >>> With the above settings, the 1.6 VM will meet this goal over a 2 minute >>> period>99% of the time (with an average CPU consumption of 65% per CPU >>> core for two cores) -- from verbosegc I gather that the pause times are >>> around 0.01-0.02 seconds: >>> >>> [GC 187752K->187559K(258880K), 0.0148198 secs] >>> [GC 192156K(258880K), 0.0008281 secs] >>> [GC 144561K->144372K(258880K), 0.0153497 secs] >>> [GC 148965K(258880K), 0.0008028 secs] >>> [GC 166187K->165969K(258880K), 0.0146546 secs] >>> [GC 187935K->187754K(258880K), 0.0150638 secs] >>> [GC 192344K(258880K), 0.0008422 secs] >>> >>> Giving the 1.6 VM more RAM (-Xms1G -Xmx1G) increases these times a bit. >>> It can also introduce OutOfMemory conditions and other catastrophic >>> failures (one time the GC took 10 seconds after the application had only >>> been running 20 seconds). How stable 1.6 will perform with the initial >>> settings remains to be seen; the results with more RAM worry me somewhat. >>> >>> The 1.7 VM however performs significantly worse. 
Here is some of its >>> output (over roughtly a one second period): >>> >>> [GC concurrent-mark-end, 0.0197681 sec] >>> [GC remark, 0.0030323 secs] >>> [GC concurrent-count-start] >>> [GC concurrent-count-end, 0.0060561] >>> [GC cleanup 177M->103M(256M), 0.0005319 secs] >>> [GC concurrent-cleanup-start] >>> [GC concurrent-cleanup-end, 0.0000676] >>> [GC pause (partial) 136M->136M(256M), 0.0046206 secs] >>> [GC pause (partial) 139M->139M(256M), 0.0039039 secs] >>> [GC pause (partial) (initial-mark) 158M->157M(256M), 0.0039424 secs] >>> [GC concurrent-mark-start] >>> [GC concurrent-mark-end, 0.0152915 sec] >>> [GC remark, 0.0033085 secs] >>> [GC concurrent-count-start] >>> [GC concurrent-count-end, 0.0085232] >>> [GC cleanup 163M->129M(256M), 0.0004847 secs] >>> [GC concurrent-cleanup-start] >>> [GC concurrent-cleanup-end, 0.0000363] >>> >>> From the above output one would not expect the performance to be worse, >>> however, the application fails to meet its goals 10-20% of the time. >>> The amount of garbage created is the same. CPU time however is hovering >>> around 90-95%, which is likely the cause of the poor performance. The >>> GC seems to take a significantly larger amount of time to do its work >>> causing these stalls in my test application. >>> >>> I've experimented with memory sizes and max pause times with the 1.7 VM, >>> and although it seemed to be doing better with more RAM, it never comes >>> even close to the performance observed with the 1.6 VM. >>> >>> I'm not sure if there are other useful options I can try to see if I can >>> tune the 1.7 VM performance a bit better. I can provide more >>> information, although not any (useful) source code at this time due to >>> external dependencies (JNA/JNI) of this application. >>> >>> I'm wondering if I'm missing something as it seems strange to me that >>> 1.7 is actually underperforming for me when in general most seem to >>> agree that the G1GC is a huge improvement. 
>>> >>> --John

From hjohn at xs4all.nl Fri Apr 29 00:23:11 2011 From: hjohn at xs4all.nl (John Hendrikx) Date: Fri, 29 Apr 2011 09:23:11 +0200 Subject: 1.7 G1GC significantly slower than 1.6 Mark and Sweep? In-Reply-To: <4DBA57BC.40604@oracle.com> References: <4DB6DDB5.4040804@xs4all.nl> <4DB96AE1.2020202@oracle.com> <4DBA5640.9080203@xs4all.nl> <4DBA57BC.40604@oracle.com> Message-ID: <4DBA675F.70801@xs4all.nl> That's a good idea; I did the runs with G1GC under Java 6 (1.6.0_22), and the main thing is that the CPU usage is a lot lower there, although I still cannot get results that are as good as with the old CMS gc. Below some attempts. I don't know how to activate CMS for Java 7, it ignores the option that I'd use for Java 6. Note that for many of the runs that had a high percentage of long pauses or high CPU the program eventually halted (this can be a timing/threading issue in my code, but it proves hard to locate if that is the case). For the runs where the long pause percentage is 10% or higher, this usually means that every GC resulted in a pause longer than 40ms. Collections happen about once or twice per second, so that's a lot of too long pauses. 
Java7: -ea -Xms256M -Xmx256M
    Run 1: CPU: ~90%   Long Pauses: >45%

Java7: -ea -Xms256M -Xmx256M -XX:+UseG1GC
    Run 1: CPU: ~90%   Long Pauses: >30%

Java7: -ea -Xms256M -Xmx256M -XX:+UseG1GC -XX:MaxGCPauseMillis=20
    Run 1: CPU: >95%   Long Pauses: >25%
    * when CPU approaches 100% significantly more long pauses are observed

Java7: -ea -Xms256M -Xmx256M -XX:+UseG1GC -XX:MaxGCPauseMillis=2
    Run 1: CPU: >95%   Long Pauses: >30%
    * when CPU approaches 100% significantly more long pauses are observed

Java6: -ea -Xms256M -Xmx256M
    Run 1: CPU: ~50%   Long Pauses: >30%

Java6: -ea -Xms256M -Xmx256M -XX:UseConcMarkSweepGC -verbose:gc
    Run 1: CPU: ~65%   Long Pauses: <1%

Java6: -ea -Xms256M -Xmx256M -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -verbose:gc
    Run 1: CPU: ~55%   Long Pauses: ~9%

Java6: -ea -Xms256M -Xmx256M -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -verbose:gc
    Run 1: CPU: ~55%   Long Pauses: ~11%
    * Pause times were almost always >0.05 seconds, option is ignored?

Java6: -ea -Xms256M -Xmx256M -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -XX:MaxGCPauseMillis=10 -verbose:gc
    Run 1: java.lang.OutOfMemoryError: Java heap space (gets stuck in a Full GC loop, prints 100+ times [Full GC 214M->210M(256M), 0.0673243 secs])
    Run 2: CPU: ~55%   Long Pauses: ~18%
    * Pause times were almost always >0.05 seconds, option is ignored?

I'm happy to try any other suggested options/combinations. --John Y. Srinivas Ramakrishna wrote: > John -- > > How about posting performance/times of each of the 4 combinations > from the following cartesian product:- > > {JDK7, JDK6} X {CMS, G1} > > Perhaps you are conflating JDK changes with GC changes, because > of changing both axes/dimensions at the same time? > > -- ramki > > On 4/28/2011 11:10 PM, John Hendrikx wrote: >> I tried many -XX:MaxGCPauseMillis settings, including not setting it at >> all, 20, 10, 5, 2. The results were similar each time -- it didn't >> really have much of an effect. 
In retrospect you might say that the >> total CPU use is what is causing the problems, not necessarily the >> length of the pauses -- whether this extra CPU use is caused by the >> collector or because of some other change in Java 7 I donot know; the >> program is the same. Is there perhaps another collector that I could >> try to see if this lowers CPU use? Or settings (even non-GC related) >> that could lower CPU use? >> >> Java 6's CMS I didn't need to tune. After determining that the length >> of GC pauses was causing problems in the application, I tried turning >> CMS on and it resolved the problems. >> >> What I observe is that even though with Java 7 the pauses seem (are?) >> very short, the CPU use is a lot higher (from 65% under Java 6 to 95% >> with 7). This could be related to other causes (perhaps threading >> overhead, debug code in Java 7, etc) but I doubt it is in any specific >> Java code that I wrote as most of the heavy lifting is happening in >> native methods. It could for example be that several ByteBuffers being >> used are being copied under Java 7 while under 6 direct access was >> possible. >> >> John. >> >> Jon Masamitsu wrote: >>> John, >>> >>> You're telling G1 (UseG1GC) to limit pauses to 2ms. >>> (-XX:MaxGCPauseMillis=2) but seemed to have tuned >>> CMS (UseConcMarkSweepGC) toward a 20ms goal. >>> G1 is trying to do very short collections and needs to do many >>> of them to keep up with the allocation rate. Did you >>> mean you are setting MaxGCPauseMillis to 20? >>> >>> Jon >>> >>> On 4/26/2011 7:59 AM, John Hendrikx wrote: >>> >>>> Hi list, >>>> >>>> I've been testing Java 1.6 performance vs Java 1.7 performance with a >>>> timing critical application -- it's essential that garbage collection >>>> pauses are very short. What I've found is that Java 1.6 seems to >>>> perform significantly better than 1.7 (b137) in this respect, although >>>> with certain settings 1.6 will also fail catastrophically. 
I've used >>>> the following options: >>>> >>>> For 1.6.0_22: -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC >>>> For 1.7.0b137: -Xms256M -Xmx256M -XX:+UseG1GC -XX:MaxGCPauseMillis=2 >>>> >>>> The amount of garbage created is roughly 150 MB/sec. The application >>>> demands a response time of about 20 ms and uses half a dozen threads >>>> which deal with buffering and decoding of information. >>>> >>>> With the above settings, the 1.6 VM will meet this goal over a 2 >>>> minute >>>> period>99% of the time (with an average CPU consumption of 65% per CPU >>>> core for two cores) -- from verbosegc I gather that the pause times >>>> are >>>> around 0.01-0.02 seconds: >>>> >>>> [GC 187752K->187559K(258880K), 0.0148198 secs] >>>> [GC 192156K(258880K), 0.0008281 secs] >>>> [GC 144561K->144372K(258880K), 0.0153497 secs] >>>> [GC 148965K(258880K), 0.0008028 secs] >>>> [GC 166187K->165969K(258880K), 0.0146546 secs] >>>> [GC 187935K->187754K(258880K), 0.0150638 secs] >>>> [GC 192344K(258880K), 0.0008422 secs] >>>> >>>> Giving the 1.6 VM more RAM (-Xms1G -Xmx1G) increases these times a >>>> bit. >>>> It can also introduce OutOfMemory conditions and other catastrophic >>>> failures (one time the GC took 10 seconds after the application had >>>> only >>>> been running 20 seconds). How stable 1.6 will perform with the >>>> initial >>>> settings remains to be seen; the results with more RAM worry me >>>> somewhat. >>>> >>>> The 1.7 VM however performs significantly worse. 
Here is some of its >>>> output (over roughtly a one second period): >>>> >>>> [GC concurrent-mark-end, 0.0197681 sec] >>>> [GC remark, 0.0030323 secs] >>>> [GC concurrent-count-start] >>>> [GC concurrent-count-end, 0.0060561] >>>> [GC cleanup 177M->103M(256M), 0.0005319 secs] >>>> [GC concurrent-cleanup-start] >>>> [GC concurrent-cleanup-end, 0.0000676] >>>> [GC pause (partial) 136M->136M(256M), 0.0046206 secs] >>>> [GC pause (partial) 139M->139M(256M), 0.0039039 secs] >>>> [GC pause (partial) (initial-mark) 158M->157M(256M), 0.0039424 secs] >>>> [GC concurrent-mark-start] >>>> [GC concurrent-mark-end, 0.0152915 sec] >>>> [GC remark, 0.0033085 secs] >>>> [GC concurrent-count-start] >>>> [GC concurrent-count-end, 0.0085232] >>>> [GC cleanup 163M->129M(256M), 0.0004847 secs] >>>> [GC concurrent-cleanup-start] >>>> [GC concurrent-cleanup-end, 0.0000363] >>>> >>>> From the above output one would not expect the performance to be >>>> worse, >>>> however, the application fails to meet its goals 10-20% of the time. >>>> The amount of garbage created is the same. CPU time however is >>>> hovering >>>> around 90-95%, which is likely the cause of the poor performance. The >>>> GC seems to take a significantly larger amount of time to do its work >>>> causing these stalls in my test application. >>>> >>>> I've experimented with memory sizes and max pause times with the >>>> 1.7 VM, >>>> and although it seemed to be doing better with more RAM, it never >>>> comes >>>> even close to the performance observed with the 1.6 VM. >>>> >>>> I'm not sure if there are other useful options I can try to see if >>>> I can >>>> tune the 1.7 VM performance a bit better. I can provide more >>>> information, although not any (useful) source code at this time due to >>>> external dependencies (JNA/JNI) of this application. 
>>>>
>>>> I'm wondering if I'm missing something as it seems strange to me that
>>>> 1.7 is actually underperforming for me when in general most seem to
>>>> agree that the G1GC is a huge improvement.
>>>>
>>>> --John
>>>>
>>>> _______________________________________________
>>>> hotspot-gc-use mailing list
>>>> hotspot-gc-use at openjdk.java.net
>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From y.s.ramakrishna at oracle.com  Fri Apr 29 12:57:46 2011
From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna)
Date: Fri, 29 Apr 2011 12:57:46 -0700
Subject: CMS option for Java7 ignored (was Re: 1.7 G1GC significantly slower than 1.6 Mark and Sweep?)
In-Reply-To: <4DBA675F.70801@xs4all.nl>
References: <4DB6DDB5.4040804@xs4all.nl> <4DB96AE1.2020202@oracle.com> <4DBA5640.9080203@xs4all.nl> <4DBA57BC.40604@oracle.com> <4DBA675F.70801@xs4all.nl>
Message-ID: <4DBB183A.6090608@oracle.com>

John,

You brought up several related but somewhat orthogonal issues, so it's best
to deal with each in a separate sub-thread of the main thread.

On 4/29/2011 12:23 AM, John Hendrikx wrote:
> .... I don't know how to activate CMS for Java 7, it ignores the option that I'd use
> for Java 6. ...
> Java6: -ea -Xms256M -Xmx256M -XX:UseConcMarkSweepGC -verbose:gc ...
>>>>> For 1.6.0_22: -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC

Are you saying that if you do:

  -Xms256M -Xmx256M -XX:UseConcMarkSweepGC -verbose:gc

you do not get CMS? If not, what does the GC log say?
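[One way to see which collectors the VM actually selected, independent of the GC log, is to query the GC MXBeans from inside the process. A minimal sketch using the standard java.lang.management API -- the bean names in the comments are what typical HotSpot builds report, and the class name is made up:]

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCheck {
    public static void main(String[] args) {
        // Each active collector registers one MXBean. With CMS enabled the
        // names are typically "ParNew" and "ConcurrentMarkSweep"; the serial
        // collector reports "Copy"/"MarkSweepCompact", the parallel collector
        // "PS Scavenge"/"PS MarkSweep", and G1 "G1 Young Generation"/
        // "G1 Old Generation".
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
    }
}
```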
Can you provide the following details:

  % java -version

and also:

  % jinfo <pid>

as well as:

  % jinfo -flag UseConcMarkSweepGC <pid>

where <pid> is your JVM process. The main issue will be dealt with in the
original thread. Sorry for the digression.

-- ramki

From jon.masamitsu at oracle.com  Fri Apr 29 13:21:55 2011
From: jon.masamitsu at oracle.com (Jon Masamitsu)
Date: Fri, 29 Apr 2011 13:21:55 -0700
Subject: 1.7 G1GC significantly slower than 1.6 Mark and Sweep?
In-Reply-To: <4DBA5640.9080203@xs4all.nl>
References: <4DB6DDB5.4040804@xs4all.nl> <4DB96AE1.2020202@oracle.com> <4DBA5640.9080203@xs4all.nl>
Message-ID: <4DBB1DE3.9070304@oracle.com>

John,

If you do additional runs to look at this issue, please add
-XX:+PrintGCTimeStamps. Helps with plotting vs. time. Thanks.

Jon

On 4/28/2011 11:10 PM, John Hendrikx wrote:
> I tried many -XX:MaxGCPauseMillis settings, including not setting it
> at all, 20, 10, 5, 2. The results were similar each time -- it didn't
> really have much of an effect. In retrospect you might say that the
> total CPU use is what is causing the problems, not necessarily the
> length of the pauses -- whether this extra CPU use is caused by the
> collector or because of some other change in Java 7 I do not know; the
> program is the same. Is there perhaps another collector that I could
> try to see if this lowers CPU use? Or settings (even non-GC related)
> that could lower CPU use?
>
> Java 6's CMS I didn't need to tune. After determining that the length
> of GC pauses was causing problems in the application, I tried turning
> CMS on and it resolved the problems.
>
> What I observe is that even though with Java 7 the pauses seem (are?)
> very short, the CPU use is a lot higher (from 65% under Java 6 to 95%
> with 7). This could be related to other causes (perhaps threading
> overhead, debug code in Java 7, etc) but I doubt it is in any specific
> Java code that I wrote, as most of the heavy lifting is happening in
> native methods.
> It could for example be that several ByteBuffers
> being used are being copied under Java 7, while under 6 direct access
> was possible.
>
> John.
>
> Jon Masamitsu wrote:
>> John,
>>
>> You're telling G1 (UseG1GC) to limit pauses to 2ms
>> (-XX:MaxGCPauseMillis=2) but seem to have tuned
>> CMS (UseConcMarkSweepGC) toward a 20ms goal.
>> G1 is trying to do very short collections and needs to do many
>> of them to keep up with the allocation rate. Did you
>> mean you are setting MaxGCPauseMillis to 20?
>>
>> Jon
>>
>> On 4/26/2011 7:59 AM, John Hendrikx wrote:
>>> Hi list,
>>>
>>> I've been testing Java 1.6 performance vs Java 1.7 performance with a
>>> timing critical application -- it's essential that garbage collection
>>> pauses are very short. What I've found is that Java 1.6 seems to
>>> perform significantly better than 1.7 (b137) in this respect, although
>>> with certain settings 1.6 will also fail catastrophically. I've used
>>> the following options:
>>>
>>> For 1.6.0_22: -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC
>>> For 1.7.0b137: -Xms256M -Xmx256M -XX:+UseG1GC -XX:MaxGCPauseMillis=2
>>>
>>> The amount of garbage created is roughly 150 MB/sec. The application
>>> demands a response time of about 20 ms and uses half a dozen threads
>>> which deal with buffering and decoding of information.
>>>
>>> With the above settings, the 1.6 VM will meet this goal over a 2 minute
>>> period >99% of the time (with an average CPU consumption of 65% per CPU
>>> core for two cores) -- from verbosegc I gather that the pause times are
>>> around 0.01-0.02 seconds:
>>>
>>> [GC 187752K->187559K(258880K), 0.0148198 secs]
>>> [GC 192156K(258880K), 0.0008281 secs]
>>> [GC 144561K->144372K(258880K), 0.0153497 secs]
>>> [GC 148965K(258880K), 0.0008028 secs]
>>> [GC 166187K->165969K(258880K), 0.0146546 secs]
>>> [GC 187935K->187754K(258880K), 0.0150638 secs]
>>> [GC 192344K(258880K), 0.0008422 secs]
>>>
>>> Giving the 1.6 VM more RAM (-Xms1G -Xmx1G) increases these times a bit.
>>> It can also introduce OutOfMemory conditions and other catastrophic
>>> failures (one time the GC took 10 seconds after the application had only
>>> been running 20 seconds). How stable 1.6 will perform with the initial
>>> settings remains to be seen; the results with more RAM worry me somewhat.
>>>
>>> The 1.7 VM however performs significantly worse. Here is some of its
>>> output (over roughly a one second period):
>>>
>>> [GC concurrent-mark-end, 0.0197681 sec]
>>> [GC remark, 0.0030323 secs]
>>> [GC concurrent-count-start]
>>> [GC concurrent-count-end, 0.0060561]
>>> [GC cleanup 177M->103M(256M), 0.0005319 secs]
>>> [GC concurrent-cleanup-start]
>>> [GC concurrent-cleanup-end, 0.0000676]
>>> [GC pause (partial) 136M->136M(256M), 0.0046206 secs]
>>> [GC pause (partial) 139M->139M(256M), 0.0039039 secs]
>>> [GC pause (partial) (initial-mark) 158M->157M(256M), 0.0039424 secs]
>>> [GC concurrent-mark-start]
>>> [GC concurrent-mark-end, 0.0152915 sec]
>>> [GC remark, 0.0033085 secs]
>>> [GC concurrent-count-start]
>>> [GC concurrent-count-end, 0.0085232]
>>> [GC cleanup 163M->129M(256M), 0.0004847 secs]
>>> [GC concurrent-cleanup-start]
>>> [GC concurrent-cleanup-end, 0.0000363]
>>>
>>> From the above output one would not expect the performance to be worse;
>>> however, the application fails to meet its goals 10-20% of the time.
>>> The amount of garbage created is the same. CPU time however is hovering
>>> around 90-95%, which is likely the cause of the poor performance. The
>>> GC seems to take a significantly larger amount of time to do its work,
>>> causing these stalls in my test application.
>>>
>>> I've experimented with memory sizes and max pause times with the 1.7 VM,
>>> and although it seemed to be doing better with more RAM, it never comes
>>> even close to the performance observed with the 1.6 VM.
>>>
>>> I'm not sure if there are other useful options I can try to see if I can
>>> tune the 1.7 VM performance a bit better.
>>> I can provide more
>>> information, although not any (useful) source code at this time due to
>>> external dependencies (JNA/JNI) of this application.
>>>
>>> I'm wondering if I'm missing something as it seems strange to me that
>>> 1.7 is actually underperforming for me when in general most seem to
>>> agree that the G1GC is a huge improvement.
>>>
>>> --John
>
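[The workload described in the thread -- a steady ~150 MB/sec of short-lived garbage under a latency budget -- can be approximated with a small self-contained harness that paces its allocations and records the worst stall it observes. This is a hypothetical sketch with illustrative constants (chunk size, run length), not John's actual application:]

```java
import java.util.concurrent.TimeUnit;

// Allocation-pressure harness: allocate short-lived buffers at a fixed
// target rate and track the worst delay beyond the expected pacing,
// which is a rough proxy for GC-induced stalls.
public class AllocHarness {
    static final int CHUNK_BYTES = 64 * 1024;            // 64 KB per allocation
    static final long TARGET_BYTES_PER_SEC = 150L << 20; // ~150 MB/sec

    public static void main(String[] args) throws InterruptedException {
        long runNanos = TimeUnit.SECONDS.toNanos(2);
        // Sleep between chunks so the allocation rate matches the target.
        long sleepNanos = (long) (CHUNK_BYTES * 1e9 / TARGET_BYTES_PER_SEC);
        long worstStall = 0;
        long last = System.nanoTime();
        long deadline = last + runNanos;
        while (System.nanoTime() < deadline) {
            byte[] chunk = new byte[CHUNK_BYTES]; // becomes garbage immediately
            chunk[0] = 1;                         // touch it so it isn't optimized away
            TimeUnit.NANOSECONDS.sleep(sleepNanos);
            long now = System.nanoTime();
            long gap = now - last - sleepNanos;   // time beyond the expected pace
            if (gap > worstStall) worstStall = gap;
            last = now;
        }
        System.out.printf("worst stall beyond pacing: %.1f ms%n", worstStall / 1e6);
    }
}
```

Run under each collector (e.g. `-XX:+UseConcMarkSweepGC` vs `-XX:+UseG1GC`) with `-verbose:gc` to compare the reported worst stall against the GC log.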