From tony.printezis at oracle.com Thu May 6 12:32:30 2010
From: tony.printezis at oracle.com (Tony Printezis)
Date: Thu, 06 May 2010 15:32:30 -0400
Subject: Feedback requested: HotSpot GC logging improvements
Message-ID: <4BE3194E.902@oracle.com>

Hi all,

We would like your input on some changes to HotSpot's GC logging that we
have been discussing. We have been wanting to improve our GC logging for
some time. However we haven't had the resources to spend on it. We don't
know when we'll get to it, but we'd still like to get some feedback on
our plans.

The changes fall into two categories.


A. Unification and improvement of -verbosegc / -XX:+PrintGCDetails output.

I strongly believe that maintaining two GC log formats is
counter-productive, especially given that the current -verbosegc format
is unhelpful in many ways (i.e., lacks a lot of helpful information).
So, we would like to unify the two into one, with maybe
-XX:+PrintGCDetails generating a superset of what -verbosegc would
generate (so that a parser for the -XX:+PrintGCDetails output will also
be able to parse the -verbosegc output). The new output will not be what
-XX:+PrintGCDetails generates today but something that can be reliably
parsed and is also reasonably human-readable (so, no XML and no
space/tab-separated formats). Additionally, we're proposing to enable
-XX:+PrintGCTimeStamps by default (in fact, we'll probably deprecate and
ignore that option; I can't believe that users will really not want a
time stamp per GC log record). We'll leave -XX:+PrintGCDateStamps
optional though.

Specific questions:

- Is anyone really attached to the old -verbosegc output?
- Would anyone really hate having time stamps by default?
- I know that a lot of folks have their own parsers for our current GC
  log formats. Would you be happy if we provided you with a (reliable!)
  parser for the new format in Java that you can easily adapt?


B. Introducing "cyclic" GC logs.
This is something that a lot of folks have asked for given that they
were concerned with the GC logs getting very large (a 1TB disk is $85
these days, but anyway...). Given that each GC log record is of variable
size, we cannot easily cycle through the log using the same file (I'd
rather not have to overwrite existing records). Our current proposal is
for the user to specify a file number N and a size target S for each
file. For a given GC log -Xloggc:foo, HotSpot will generate

foo.00000001
foo.00000002
foo.00000003
etc.

(We'll create a new file as soon as the size of the one we are writing
to exceeds S, so each file will be slightly larger than S, but it will
be helpful not to split individual log records between two files.)

When we create a new file, if we have more than N files we'll delete the
oldest. So, in the above example, if N == 3, when we create foo.00000004
we'll delete foo.00000001.

Note that in the above scheme the logs are not really "cyclic"; instead,
we're pruning the oldest records every now and then, which has the same
effect.

Another (related) request has been to maybe append the GC log file name
with the pid of the JVM that's generating it. Maybe we don't want to do
this by default. But would people find it helpful if we provide a new
cmd line parameter to do that? So, for the above example and assuming
that the JVM's pid is 1234, the GC log file(s) will be either:

foo.1234

or

foo.1234.00000001
foo.1234.00000002
foo.1234.00000003
etc.

Specific questions:

- Would people really hate it if HotSpot starts appending the GC log
  file name with a (zero-padded) sequence number? Maybe if N == 1 (the
  default), HotSpot will skip the sequence number and ignore S, i.e.,
  behave as it does today.
- To the people who have been asking for cyclic GC logs: is the sequence
  number scheme above good enough?
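[Editorial illustration: the rolling-file policy proposed above (N files, size target S, zero-padded sequence suffixes, oldest file pruned) can be sketched in a few lines of Java. This is purely a sketch of the scheme as described in the email, not HotSpot code; the class name and structure are invented here.]

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayDeque;

/**
 * Sketch of the proposed rotating GC log: keep at most N files,
 * start a new file once the current one exceeds S bytes, and
 * delete the oldest file once more than N files exist.
 */
class RollingGcLog {
    private final Path base;        // e.g. -Xloggc:foo -> "foo"
    private final int maxFiles;     // N
    private final long sizeTarget;  // S, in bytes
    private final ArrayDeque<Path> files = new ArrayDeque<>();
    private int seq = 0;

    RollingGcLog(Path base, int maxFiles, long sizeTarget) throws IOException {
        this.base = base;
        this.maxFiles = maxFiles;
        this.sizeTarget = sizeTarget;
        roll(); // open foo.00000001
    }

    /** Append one complete log record; records are never split across files. */
    void write(String record) throws IOException {
        Path current = files.peekLast();
        Files.writeString(current, record + System.lineSeparator(),
                          StandardOpenOption.APPEND);
        // Roll only after the record is written: a file may slightly
        // exceed S, but it never ends with a partial record.
        if (Files.size(current) > sizeTarget) {
            roll();
        }
    }

    private void roll() throws IOException {
        seq++;
        // Zero-padded, 8-digit sequence suffix: foo.00000001, foo.00000002, ...
        Path next = base.resolveSibling(
                String.format("%s.%08d", base.getFileName(), seq));
        Files.writeString(next, "");            // create (or truncate) the new file
        files.addLast(next);
        if (files.size() > maxFiles) {          // prune the oldest file
            Files.deleteIfExists(files.removeFirst());
        }
    }
}
```

With N == 3 and a small S, a stream of records eventually leaves foo.00000002 through foo.00000004 on disk while foo.00000001 has been pruned, which is the "cyclic by pruning" behaviour the proposal describes.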
Thanks in advance for your feedback,

Tony, HotSpot GC Group

From ryanobjc at gmail.com Thu May 6 12:51:48 2010
From: ryanobjc at gmail.com (Ryan Rawson)
Date: Thu, 6 May 2010 12:51:48 -0700
Subject: Feedback requested: HotSpot GC logging improvements
In-Reply-To: <4BE3194E.902@oracle.com>
References: <4BE3194E.902@oracle.com>
Message-ID:

Hey,

I would say that PrintGCDateStamps should be the default - or at least
promoted heavily as the option to use. Since most other logs (e.g. log4j)
are in "normal time", correlating GC logs and server logs requires
determining when the VM started, then doing some on-the-fly math for
every log line you are interested in. Sometimes determining VM start time
is impossible because the first log line is offset from the VM start time
by X milliseconds, and in a tight debugging situation this could be all
the difference in the world.

On the logfile front, there are at least 2 problems:
- successive runs overwrite the previous log file. I ran into this
  problem and lost the ability to debug a problem.
- a logfile will grow without bound, although in my experience I have not
  had space problems with this.

While you are correct that _the cheapest disks_ you can buy run $85/TB,
these are not the kinds of disks many people are installing into
server-type systems. A serial-attached-SCSI (aka SAS) disk at 10k rpm is
a little bit more expensive than $85/TB.

-ryan

On Thu, May 6, 2010 at 12:32 PM, Tony Printezis wrote:
> Hi all,
>
> We would like your input on some changes to HotSpot's GC logging that we
> have been discussing. We have been wanting to improve our GC logging for
> some time. However we haven't had the resources to spend on it. We don't
> know when we'll get to it, but we'd still like to get some feedback on
> our plans.
>
> The changes fall into two categories.
>
>
> A. Unification and improvement of -verbosegc / -XX:+PrintGCDetails output.
> > I strongly believe that maintaining two GC log formats is > counter-productive, especially given that the current -verbosegc format > is unhelpful in many ways (i.e., lacks a lot of helpful information). > So, we would like to unify the two into one, with maybe > -XX:+PrintGCDetails generating a superset of what -verbosegc would > generate (so that a parser for the -XX:+PrintGCDetails output will also > be able to parse the -verbosegc output). The new output will not be what > -XX:+PrintGCDetails generates today but something that can be reliably > parsed and it is also reasonably human-readable (so, no xml and no > space/tab-separated formats). Additionally, we're proposing to enable > -XX:+PrintGCTimeStamps by default (in fact, we'll probably deprecate and > ignore that option, I can't believe that users will really not want a > time stamp per GC log record). We'll leave -XX:+PrintGCDateStamps to be > optional though. > > Specific questions: > > - Is anyone really attached to the old -verbosegc output? > - Would anyone really hate having time stamps by default? > - I know that a lot of folks have their own parsers for our current GC > log formats. Would you be happy if we provided you with a (reliable!) > parser for the new format in Java that you can easily adapt? > > > B. Introducing "cyclic" GC logs. > > This is something that a lot of folks have asked for given that they > were concerned with the GC logs getting very large (a 1TB disk is $85 > these days, but anyway...). Given that each GC log record is of variable > size, we cannot easily cycle through the log using the same file (I'd > rather not have to overwrite existing records). Our current proposal is > for the user to specify a file number N and a size target S for each > file. For a given GC log -Xloggc:foo, HotSpot will generate > > foo.00000001 > foo.00000002 > foo.00000003 > etc. 
> > (we'll create a new file as soon as the size of the one we are writing > to exceeds S, so each file will be slightly larger than S but it will be > helpful not to split individual log records between two files) > > When we create a new file, if we have more than N files we'll delete the > oldest. So, in the above example, if N == 3, when we create foo.00000004 > we'll delete foo.00000001. > > Note that in the above scheme, the logs are not really "cyclic" but, > instead, we're pruning the oldest records every now and then, which has > the same effect. > > Another (related) request has been to maybe append the GC log file name > with the pid of the JVM that's generating it. Maybe we don't want to do > this by default. But, would people find it helpful if we provide a new > cmd line parameter to do that? So, for the above example and assuming > that the JVM's pid is 1234, the GC log file(s) will be either: > > foo.1234 > > or > > foo.1234.00000001 > foo.1234.00000002 > foo.1234.00000003 > etc. > > Specific questions: > > - Would people really hate it if HotSpot starts appending the GC log > file name with a (zero-padded) sequence number? Maybe if N == 1 (the > default), HotSpot will skip the sequence number and ignore S, i.e., > behave as it does today. > - To the people who have been asking for cyclic GC logs: is the sequence > number scheme above good enough? 
> > > Thanks in advance for your feedback, > > Tony, HotSpot GC Group > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > From matt.khan at db.com Thu May 6 13:01:41 2010 From: matt.khan at db.com (Matt Khan) Date: Thu, 6 May 2010 21:01:41 +0100 Subject: Feedback requested: HotSpot GC logging improvements In-Reply-To: <4BE3194E.902@oracle.com> Message-ID: Evening we currently manage the log overwriting issue by mv'ing the last gc.log to gc.log., if you're going to roll the logs then I would prefer a meaningful suffix rather than just a counter. I second the idea that datestamps should be the default. I think a unified, easily parseable but still readable output would be great though wouldn't you still need a verbose output that is specific to each collector in order to provide a "debug" level of detail? Cheers Matt Matt Khan -------------------------------------------------- GFFX Auto Trading Deutsche Bank, London Tony Printezis Sent by: hotspot-gc-use-bounces at openjdk.java.net 06/05/2010 20:32 To hotspot-gc-use at openjdk.java.net cc Subject Feedback requested: HotSpot GC logging improvements Hi all, We would like your input on some changes to HotSpot's GC logging that we have been discussing. We have been wanting to improve our GC logging for some time. However we haven't had the resources to spend on it. We don't know when we'll get to it, but we'd still like to get some feedback on our plans. The changes fall into two categories. A. Unification and improvement of -verbosegc / -XX:+PrintGCDetails output. I strongly believe that maintaining two GC log formats is counter-productive, especially given that the current -verbosegc format is unhelpful in many ways (i.e., lacks a lot of helpful information). 
So, we would like to unify the two into one, with maybe -XX:+PrintGCDetails generating a superset of what -verbosegc would generate (so that a parser for the -XX:+PrintGCDetails output will also be able to parse the -verbosegc output). The new output will not be what -XX:+PrintGCDetails generates today but something that can be reliably parsed and it is also reasonably human-readable (so, no xml and no space/tab-separated formats). Additionally, we're proposing to enable -XX:+PrintGCTimeStamps by default (in fact, we'll probably deprecate and ignore that option, I can't believe that users will really not want a time stamp per GC log record). We'll leave -XX:+PrintGCDateStamps to be optional though. Specific questions: - Is anyone really attached to the old -verbosegc output? - Would anyone really hate having time stamps by default? - I know that a lot of folks have their own parsers for our current GC log formats. Would you be happy if we provided you with a (reliable!) parser for the new format in Java that you can easily adapt? B. Introducing "cyclic" GC logs. This is something that a lot of folks have asked for given that they were concerned with the GC logs getting very large (a 1TB disk is $85 these days, but anyway...). Given that each GC log record is of variable size, we cannot easily cycle through the log using the same file (I'd rather not have to overwrite existing records). Our current proposal is for the user to specify a file number N and a size target S for each file. For a given GC log -Xloggc:foo, HotSpot will generate foo.00000001 foo.00000002 foo.00000003 etc. (we'll create a new file as soon as the size of the one we are writing to exceeds S, so each file will be slightly larger than S but it will be helpful not to split individual log records between two files) When we create a new file, if we have more than N files we'll delete the oldest. So, in the above example, if N == 3, when we create foo.00000004 we'll delete foo.00000001. 
Note that in the above scheme, the logs are not really "cyclic" but, instead, we're pruning the oldest records every now and then, which has the same effect. Another (related) request has been to maybe append the GC log file name with the pid of the JVM that's generating it. Maybe we don't want to do this by default. But, would people find it helpful if we provide a new cmd line parameter to do that? So, for the above example and assuming that the JVM's pid is 1234, the GC log file(s) will be either: foo.1234 or foo.1234.00000001 foo.1234.00000002 foo.1234.00000003 etc. Specific questions: - Would people really hate it if HotSpot starts appending the GC log file name with a (zero-padded) sequence number? Maybe if N == 1 (the default), HotSpot will skip the sequence number and ignore S, i.e., behave as it does today. - To the people who have been asking for cyclic GC logs: is the sequence number scheme above good enough? Thanks in advance for your feedback, Tony, HotSpot GC Group _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use --- This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden. Please refer to http://www.db.com/en/content/eu_disclosures.htm for additional EU corporate and regulatory disclosures. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100506/c23a82d8/attachment.html From matt.fowles at gmail.com Thu May 6 13:05:46 2010 From: matt.fowles at gmail.com (Matt Fowles) Date: Thu, 6 May 2010 16:05:46 -0400 Subject: Feedback requested: HotSpot GC logging improvements In-Reply-To: References: <4BE3194E.902@oracle.com> Message-ID: Tony~ Definitely date stamps as the default. The math to correlate things is annoying and error prone. Matt On Thu, May 6, 2010 at 4:01 PM, Matt Khan wrote: > > Evening > > we currently manage the log overwriting issue by mv'ing the last gc.log to > gc.log., if you're going to roll the > logs then I would prefer a meaningful suffix rather than just a counter. > > I second the idea that datestamps should be the default. > > I think a unified, easily parseable but still readable output would be > great though wouldn't you still need a verbose output that is specific to > each collector in order to provide a "debug" level of detail? > > Cheers > Matt > > Matt Khan > -------------------------------------------------- > GFFX Auto Trading > Deutsche Bank, London > > > > *Tony Printezis * > Sent by: hotspot-gc-use-bounces at openjdk.java.net > > 06/05/2010 20:32 > To > hotspot-gc-use at openjdk.java.net > cc > Subject > Feedback requested: HotSpot GC logging improvements > > > > > Hi all, > > We would like your input on some changes to HotSpot's GC logging that we > have been discussing. We have been wanting to improve our GC logging for > some time. However we haven't had the resources to spend on it. We don't > know when we'll get to it, but we'd still like to get some feedback on > our plans. > > The changes fall into two categories. > > > A. Unification and improvement of -verbosegc / -XX:+PrintGCDetails output. 
> > I strongly believe that maintaining two GC log formats is > counter-productive, especially given that the current -verbosegc format > is unhelpful in many ways (i.e., lacks a lot of helpful information). > So, we would like to unify the two into one, with maybe > -XX:+PrintGCDetails generating a superset of what -verbosegc would > generate (so that a parser for the -XX:+PrintGCDetails output will also > be able to parse the -verbosegc output). The new output will not be what > -XX:+PrintGCDetails generates today but something that can be reliably > parsed and it is also reasonably human-readable (so, no xml and no > space/tab-separated formats). Additionally, we're proposing to enable > -XX:+PrintGCTimeStamps by default (in fact, we'll probably deprecate and > ignore that option, I can't believe that users will really not want a > time stamp per GC log record). We'll leave -XX:+PrintGCDateStamps to be > optional though. > > Specific questions: > > - Is anyone really attached to the old -verbosegc output? > - Would anyone really hate having time stamps by default? > - I know that a lot of folks have their own parsers for our current GC > log formats. Would you be happy if we provided you with a (reliable!) > parser for the new format in Java that you can easily adapt? > > > B. Introducing "cyclic" GC logs. > > This is something that a lot of folks have asked for given that they > were concerned with the GC logs getting very large (a 1TB disk is $85 > these days, but anyway...). Given that each GC log record is of variable > size, we cannot easily cycle through the log using the same file (I'd > rather not have to overwrite existing records). Our current proposal is > for the user to specify a file number N and a size target S for each > file. For a given GC log -Xloggc:foo, HotSpot will generate > > foo.00000001 > foo.00000002 > foo.00000003 > etc. 
> > (we'll create a new file as soon as the size of the one we are writing > to exceeds S, so each file will be slightly larger than S but it will be > helpful not to split individual log records between two files) > > When we create a new file, if we have more than N files we'll delete the > oldest. So, in the above example, if N == 3, when we create foo.00000004 > we'll delete foo.00000001. > > Note that in the above scheme, the logs are not really "cyclic" but, > instead, we're pruning the oldest records every now and then, which has > the same effect. > > Another (related) request has been to maybe append the GC log file name > with the pid of the JVM that's generating it. Maybe we don't want to do > this by default. But, would people find it helpful if we provide a new > cmd line parameter to do that? So, for the above example and assuming > that the JVM's pid is 1234, the GC log file(s) will be either: > > foo.1234 > > or > > foo.1234.00000001 > foo.1234.00000002 > foo.1234.00000003 > etc. > > Specific questions: > > - Would people really hate it if HotSpot starts appending the GC log > file name with a (zero-padded) sequence number? Maybe if N == 1 (the > default), HotSpot will skip the sequence number and ignore S, i.e., > behave as it does today. > - To the people who have been asking for cyclic GC logs: is the sequence > number scheme above good enough? > > > Thanks in advance for your feedback, > > Tony, HotSpot GC Group > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > --- > > This e-mail may contain confidential and/or privileged information. If you > are not the intended recipient (or have received this e-mail in error) > please notify the sender immediately and delete this e-mail. Any > unauthorized copying, disclosure or distribution of the material in this > e-mail is strictly forbidden. 
>
> Please refer to http://www.db.com/en/content/eu_disclosures.htm for
> additional EU corporate and regulatory disclosures.
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100506/6a805325/attachment-0001.html

From michael.finocchiaro at gmail.com Thu May 6 13:20:35 2010
From: michael.finocchiaro at gmail.com (Michael Finocchiaro)
Date: Thu, 6 May 2010 22:20:35 +0200
Subject: Feedback requested: HotSpot GC logging improvements
In-Reply-To:
References:
Message-ID:

Perhaps the logfile suffix could be parametrized (sequence, datestamp, etc.)?

GCTimeStamps should definitely be the default because, as previously
pointed out, post-mortem without 'em is painful.

For more debug detail per collector one could use PrintHeapAtGC or
similar, couldn't one?

I always had a weakness for the simplicity of HP's format - one line,
space-separated, with time, date, cause and sizes of each generation.
Super easy to parse and even eye-scan. That being said, now that we have
jVisualVM with snapshots and everything, the GCDetails output is complete
and practical as well.

Cheers,
Fini

Sent from Fino's iPhone 3GS

Michael Finocchiaro
Mobile +6 85 46 07 62
http://mfinocchiaro.wordpress.com

On 6 May 2010, at 22:01, Matt Khan wrote:

>
> Evening
>
> we currently manage the log overwriting issue by mv'ing the last
> gc.log to gc.log., if you're
> going to roll the logs then I would prefer a meaningful suffix
> rather than just a counter.
>
> I second the idea that datestamps should be the default.
> > I think a unified, easily parseable but still readable output would > be great though wouldn't you still need a verbose output that is > specific to each collector in order to provide a "debug" level of > detail? > > Cheers > Matt > > Matt Khan > -------------------------------------------------- > GFFX Auto Trading > Deutsche Bank, London > > > > Tony Printezis > Sent by: hotspot-gc-use-bounces at openjdk.java.net > 06/05/2010 20:32 > > To > hotspot-gc-use at openjdk.java.net > cc > Subject > Feedback requested: HotSpot GC logging improvements > > > > > > Hi all, > > We would like your input on some changes to HotSpot's GC logging > that we > have been discussing. We have been wanting to improve our GC logging > for > some time. However we haven't had the resources to spend on it. We > don't > know when we'll get to it, but we'd still like to get some feedback on > our plans. > > The changes fall into two categories. > > > A. Unification and improvement of -verbosegc / -XX:+PrintGCDetails > output. > > I strongly believe that maintaining two GC log formats is > counter-productive, especially given that the current -verbosegc > format > is unhelpful in many ways (i.e., lacks a lot of helpful information). > So, we would like to unify the two into one, with maybe > -XX:+PrintGCDetails generating a superset of what -verbosegc would > generate (so that a parser for the -XX:+PrintGCDetails output will > also > be able to parse the -verbosegc output). The new output will not be > what > -XX:+PrintGCDetails generates today but something that can be reliably > parsed and it is also reasonably human-readable (so, no xml and no > space/tab-separated formats). Additionally, we're proposing to enable > -XX:+PrintGCTimeStamps by default (in fact, we'll probably deprecate > and > ignore that option, I can't believe that users will really not want a > time stamp per GC log record). We'll leave -XX:+PrintGCDateStamps to > be > optional though. 
> > Specific questions: > > - Is anyone really attached to the old -verbosegc output? > - Would anyone really hate having time stamps by default? > - I know that a lot of folks have their own parsers for our current GC > log formats. Would you be happy if we provided you with a (reliable!) > parser for the new format in Java that you can easily adapt? > > > B. Introducing "cyclic" GC logs. > > This is something that a lot of folks have asked for given that they > were concerned with the GC logs getting very large (a 1TB disk is $85 > these days, but anyway...). Given that each GC log record is of > variable > size, we cannot easily cycle through the log using the same file (I'd > rather not have to overwrite existing records). Our current proposal > is > for the user to specify a file number N and a size target S for each > file. For a given GC log -Xloggc:foo, HotSpot will generate > > foo.00000001 > foo.00000002 > foo.00000003 > etc. > > (we'll create a new file as soon as the size of the one we are writing > to exceeds S, so each file will be slightly larger than S but it > will be > helpful not to split individual log records between two files) > > When we create a new file, if we have more than N files we'll delete > the > oldest. So, in the above example, if N == 3, when we create foo. > 00000004 > we'll delete foo.00000001. > > Note that in the above scheme, the logs are not really "cyclic" but, > instead, we're pruning the oldest records every now and then, which > has > the same effect. > > Another (related) request has been to maybe append the GC log file > name > with the pid of the JVM that's generating it. Maybe we don't want to > do > this by default. But, would people find it helpful if we provide a new > cmd line parameter to do that? So, for the above example and assuming > that the JVM's pid is 1234, the GC log file(s) will be either: > > foo.1234 > > or > > foo.1234.00000001 > foo.1234.00000002 > foo.1234.00000003 > etc. 
> > Specific questions: > > - Would people really hate it if HotSpot starts appending the GC log > file name with a (zero-padded) sequence number? Maybe if N == 1 (the > default), HotSpot will skip the sequence number and ignore S, i.e., > behave as it does today. > - To the people who have been asking for cyclic GC logs: is the > sequence > number scheme above good enough? > > > Thanks in advance for your feedback, > > Tony, HotSpot GC Group > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > --- > > This e-mail may contain confidential and/or privileged information. > If you are not the intended recipient (or have received this e-mail > in error) please notify the sender immediately and delete this e- > mail. Any unauthorized copying, disclosure or distribution of the > material in this e-mail is strictly forbidden. > > Please refer to http://www.db.com/en/content/eu_disclosures.htm for > additional EU corporate and regulatory disclosures. > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100506/7cd97d1c/attachment.html From adamh at basis.com Thu May 6 13:28:51 2010 From: adamh at basis.com (Adam Hawthorne) Date: Thu, 6 May 2010 16:28:51 -0400 Subject: Feedback requested: HotSpot GC logging improvements In-Reply-To: <4BE3194E.902@oracle.com> References: <4BE3194E.902@oracle.com> Message-ID: Tony, On Thu, May 6, 2010 at 15:32, Tony Printezis wrote: > Hi all, > > A. Unification and improvement of -verbosegc / -XX:+PrintGCDetails output. > [snip] > Specific questions: > > - Is anyone really attached to the old -verbosegc output? > We aren't. 
> - Would anyone really hate having time stamps by default?

I'm in agreement with other folks; timestamps are okay, but date stamps
by default would be better.

> - I know that a lot of folks have their own parsers for our current GC
> log formats. Would you be happy if we provided you with a (reliable!)
> parser for the new format in Java that you can easily adapt?

+1. Or +10.

> B. Introducing "cyclic" GC logs.
> Specific questions: [snip]
> - Would people really hate it if HotSpot starts appending the GC log
> file name with a (zero-padded) sequence number? Maybe if N == 1 (the
> default), HotSpot will skip the sequence number and ignore S, i.e.,
> behave as it does today.

How many digits in the sequence? Would that be configurable? Overall,
having this is better than not having it.

> - To the people who have been asking for cyclic GC logs: is the sequence
> number scheme above good enough?

Much better than nothing at all for disk-conscious customers.

Thanks,
Adam

--
Adam Hawthorne
Software Engineer
BASIS International Ltd.
www.basis.com
+1.505.345.5232 Phone

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100506/806fd8b4/attachment.html

From jeff.lloyd at algorithmics.com Thu May 6 13:27:58 2010
From: jeff.lloyd at algorithmics.com (jeff.lloyd at algorithmics.com)
Date: Thu, 6 May 2010 16:27:58 -0400
Subject: Feedback requested: HotSpot GC logging improvements
In-Reply-To: <4BE3194E.902@oracle.com>
References: <4BE3194E.902@oracle.com>
Message-ID: <0FCC438D62A5E643AA3F57D3417B220D0C6C5A2F@TORMAIL.algorithmics.com>

Hi Tony,

I'm in favour of item A. I'm not attached to the old format, and I'd
like to see time stamps as the default.

I also like item B. Can I make a suggestion for item B? Have you noticed
everyone who posts their GC log file additionally has to include in
their email the gc config options that they think they used for that
particular run?
In our application we prefix _every_ "cyclic" log file with the config options used to start the app. It makes reporting problems easier and we can see which options the client used by looking at the top of the log file instead of having to ask the client for what they thought they used. Jeff -----Original Message----- From: hotspot-gc-use-bounces at openjdk.java.net [mailto:hotspot-gc-use-bounces at openjdk.java.net] On Behalf Of Tony Printezis Sent: Thursday, May 06, 2010 3:33 PM To: hotspot-gc-use at openjdk.java.net Subject: Feedback requested: HotSpot GC logging improvements Hi all, We would like your input on some changes to HotSpot's GC logging that we have been discussing. We have been wanting to improve our GC logging for some time. However we haven't had the resources to spend on it. We don't know when we'll get to it, but we'd still like to get some feedback on our plans. The changes fall into two categories. A. Unification and improvement of -verbosegc / -XX:+PrintGCDetails output. I strongly believe that maintaining two GC log formats is counter-productive, especially given that the current -verbosegc format is unhelpful in many ways (i.e., lacks a lot of helpful information). So, we would like to unify the two into one, with maybe -XX:+PrintGCDetails generating a superset of what -verbosegc would generate (so that a parser for the -XX:+PrintGCDetails output will also be able to parse the -verbosegc output). The new output will not be what -XX:+PrintGCDetails generates today but something that can be reliably parsed and it is also reasonably human-readable (so, no xml and no space/tab-separated formats). Additionally, we're proposing to enable -XX:+PrintGCTimeStamps by default (in fact, we'll probably deprecate and ignore that option, I can't believe that users will really not want a time stamp per GC log record). We'll leave -XX:+PrintGCDateStamps to be optional though. Specific questions: - Is anyone really attached to the old -verbosegc output? 
- Would anyone really hate having time stamps by default? - I know that a lot of folks have their own parsers for our current GC log formats. Would you be happy if we provided you with a (reliable!) parser for the new format in Java that you can easily adapt? B. Introducing "cyclic" GC logs. This is something that a lot of folks have asked for given that they were concerned with the GC logs getting very large (a 1TB disk is $85 these days, but anyway...). Given that each GC log record is of variable size, we cannot easily cycle through the log using the same file (I'd rather not have to overwrite existing records). Our current proposal is for the user to specify a file number N and a size target S for each file. For a given GC log -Xloggc:foo, HotSpot will generate foo.00000001 foo.00000002 foo.00000003 etc. (we'll create a new file as soon as the size of the one we are writing to exceeds S, so each file will be slightly larger than S but it will be helpful not to split individual log records between two files) When we create a new file, if we have more than N files we'll delete the oldest. So, in the above example, if N == 3, when we create foo.00000004 we'll delete foo.00000001. Note that in the above scheme, the logs are not really "cyclic" but, instead, we're pruning the oldest records every now and then, which has the same effect. Another (related) request has been to maybe append the GC log file name with the pid of the JVM that's generating it. Maybe we don't want to do this by default. But, would people find it helpful if we provide a new cmd line parameter to do that? So, for the above example and assuming that the JVM's pid is 1234, the GC log file(s) will be either: foo.1234 or foo.1234.00000001 foo.1234.00000002 foo.1234.00000003 etc. Specific questions: - Would people really hate it if HotSpot starts appending the GC log file name with a (zero-padded) sequence number? 
Maybe if N == 1 (the default), HotSpot will skip the sequence number and ignore S, i.e., behave as it does today. - To the people who have been asking for cyclic GC logs: is the sequence number scheme above good enough? Thanks in advance for your feedback, Tony, HotSpot GC Group _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From ycraig at cysystems.com Thu May 6 13:47:17 2010 From: ycraig at cysystems.com (craig yeldell) Date: Thu, 6 May 2010 16:47:17 -0400 Subject: Feedback requested: HotSpot GC logging improvements In-Reply-To: <4BE3194E.902@oracle.com> References: <4BE3194E.902@oracle.com> Message-ID: A. - No, I am not attached to the old -verbosegc output. - I would also think that having the time stamps should be the default. - Having a reliable parser would definitely be a welcome addition. B. - We handle our own gc log rotation, but if you were to provide it I would prefer the zero-padded sequence number. - Not one of those folks. Regards, Craig On May 6, 2010, at 3:32 PM, Tony Printezis wrote: > Hi all, > > We would like your input on some changes to HotSpot's GC logging > that we > have been discussing. 
> [... full quote of the original message snipped ...] 
> > > Thanks in advance for your feedback, > > Tony, HotSpot GC Group > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From Matthew.H.Miller at Sun.COM Thu May 6 13:46:20 2010 From: Matthew.H.Miller at Sun.COM (Matthew Miller) Date: Thu, 06 May 2010 16:46:20 -0400 Subject: Feedback requested: HotSpot GC logging improvements In-Reply-To: <0FCC438D62A5E643AA3F57D3417B220D0C6C5A2F@TORMAIL.algorithmics.com> References: <4BE3194E.902@oracle.com> <0FCC438D62A5E643AA3F57D3417B220D0C6C5A2F@TORMAIL.algorithmics.com> Message-ID: <4BE32A9C.8010705@sun.com> I'd like to say +1 to the idea below - if it were somehow possible to (somewhat) easily tell all the GC options used from within the GC log itself, it would be very useful. (Especially to the support organization.) -Matt On 5/6/2010 4:27 PM, jeff.lloyd at algorithmics.com wrote: [... Snip ...] > Can I make a suggestion for item B? Have you noticed everyone who posts > their GC log file additionally has to include in their email the gc > config options that they think they used for that particular run? In > our application we prefix _every_ "cyclic" log file with the config > options used to start the app. It makes reporting problems easier and > we can see which options the client used by looking at the top of the > log file instead of having to ask the client what they thought they > used. > > Jeff > From rainer.jung at kippdata.de Thu May 6 13:04:31 2010 From: rainer.jung at kippdata.de (Rainer Jung) Date: Thu, 06 May 2010 22:04:31 +0200 Subject: Feedback requested: HotSpot GC logging improvements In-Reply-To: <4BE3194E.902@oracle.com> References: <4BE3194E.902@oracle.com> Message-ID: <4BE320CF.402@kippdata.de> Wonderful! Comments inline. On 06.05.2010 21:32, Tony Printezis wrote: > A. Unification and improvement of -verbosegc / -XX:+PrintGCDetails output. 
> [... quoted text of section A snipped ...] > > Specific questions: > > - Is anyone really attached to the old -verbosegc output? Not at all, if we have a chance to get something better. > - Would anyone really hate having time stamps by default? We had to do a lot of quirks to simulate GCDateStamps for years until they finally made it into Java 6. Having timestamps by default is a must; absolute timestamps should at least be optional. Personally I find the absolute timestamps more important than the ones relative to JVM start, but that depends on what you are doing. When gathering statistical data the relative ones are better, since you can do computations more easily; when tracking problems the absolute ones are sometimes easier, because you quickly want to know whether the log lines match the time of day when you observed the problem. > - I know that a lot of folks have their own parsers for our current GC > log formats. Would you be happy if we provided you with a (reliable!) > parser for the new format in Java that you can easily adapt? Of course. 
Although I can imagine folks would want to get it in different implementation techniques. I guess you plan to provide a parser written in Java? It would be great if it could be provided in a way that makes it easy for people to customize, so possibly Open Source with a nice license like the Apache Software License 2. > B. Introducing "cyclic" GC logs. > [... quoted text of section B snipped ...] There are a lot of options here. When you're doing log rotation, people who want to archive the logs might have regular jobs (cron and friends) fetching the old closed files and transferring them to another system. In that case it would be nice if the internal rotation and the external script did not conflict by operating on the same files: foo.00000001 might have been detected as old and copied to the remote host while, at the same time, the GC decides to reuse it. 
Of course people can increase the cycle length and so on, but I have always found it a bit problematic if a log rotation mechanism touches old files long after the rotation happened. That's why I personally find externally organized pruning better. Of course then it's not carefree out of the box. Another thing I often miss is the ability to combine size- and time-based rotation. I want to say: rotate whenever 10MB are full, so that the chunks I need to handle do not get too big, but please also rotate at midnight, so that I know I can grab the complete files of the day after midnight. So you specify a max size and a time pattern, and whichever criterion is fulfilled first triggers rotation. > Another (related) request has been to maybe append the GC log file name > with the pid of the JVM that's generating it. > [... rest of the quoted passage snipped ...] Some time ago I asked whether it would be possible to get the %p substitution (replace it with the process id), which is already available for some files, also for the GC log. I think it already exists in the JDK code for either the HeapDumpOnOutOfMemoryError file or the hotspot error file; I forget which. The code is extremely simple. Would foo.%p.%8N be too complex? Great initiative! Will you start another discussion about the data contents of the file? 
It could be interesting when people describe what kind of information they extract out of the GC logs. Not everything is straightforward in the sense of being based on individual lines. As an example, I always calculate the total stopped time per minute (summing up) as a percentage of wallclock time. Regards, Rainer From rainer.jung at kippdata.de Thu May 6 13:18:09 2010 From: rainer.jung at kippdata.de (Rainer Jung) Date: Thu, 06 May 2010 22:18:09 +0200 Subject: Feedback requested: HotSpot GC logging improvements In-Reply-To: <4BE3194E.902@oracle.com> References: <4BE3194E.902@oracle.com> Message-ID: <4BE32401.9070309@kippdata.de> Short addition to my previous post: On 06.05.2010 21:32, Tony Printezis wrote: > - Would people really hate it if HotSpot starts appending the GC log > file name with a (zero-padded) sequence number? Maybe if N == 1 (the > default), HotSpot will skip the sequence number and ignore S, i.e., > behave as it does today. > - To the people who have been asking for cyclic GC logs: is the sequence > number scheme above good enough? Another slight problem with the numbering scheme is that during archiving you'll overwrite old files. So your archive scripts need to intelligently rename the files. Not too easy. Maybe add a couple of substitution characters (%p=pid, %N roll number, %Y, ... the usual strftime characters for timestamp formatting). Regards, Rainer From Peter.B.Kessler at Oracle.COM Thu May 6 15:00:43 2010 From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler) Date: Thu, 06 May 2010 15:00:43 -0700 Subject: Feedback requested: HotSpot GC logging improvements In-Reply-To: References: Message-ID: <4BE33C0B.5060009@Oracle.COM> +1 on using date stamps in the file names if you have to split a GC log into several files. If you use the same format as is used for -XX:+PrintGCDateStamps, ISO 8601, then lexicographic order, e.g. from file listings, should also be in time sequence order, which would be convenient. 
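[Editor's note: the point about ISO 8601 date stamps can be demonstrated in a few lines — because every field is fixed-width and ordered from most to least significant, plain lexicographic sorting, which is what a directory listing gives you, is already chronological. The file names below are invented for illustration, with the pid placed after (less significant than) the time stamp as suggested.]

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Demonstrates that ISO 8601 date stamps in file names make lexicographic
// order coincide with time order, even across a midnight/day boundary.
public class Iso8601NameOrder {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList(
            "gc.2010-05-07T00:00:12.1234.log",
            "gc.2010-05-06T23:59:58.1234.log",
            "gc.2010-05-06T09:15:00.1234.log"));
        names.sort(null); // natural (lexicographic) String order
        for (String n : names) System.out.println(n); // comes out chronological
    }
}
```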
(Then, if only we could get PIDs to be monotonically increasing. :-) That might imply that you want PIDs after (less significant than) time stamps in the split file names, so you still get them in time sequence order. If you want just the logs from PID 1234, you can use "ls *.1234.*" to get just those, in time sequence order. Do you want an option to start a new log file in a sequence after some period of time? E.g., once a day? That might make it easier to line up events across long-running JVMs that are collecting (and therefore generating logs) at different rates. +1 on including the command line arguments (or maybe the settings those provoke inside the VM) in the log file. For settings that change over time, e.g., because of ergonomics, it would be good to have a way to see those, too. Maybe "a new log file format" allows that to happen. ... peter Matt Khan wrote: > > Evening > > we currently manage the log overwriting issue by mv'ing the last gc.log > to gc.log., if you're going to > roll the logs then I would prefer a meaningful suffix rather than just a > counter. > > I second the idea that datestamps should be the default. > > I think a unified, easily parseable but still readable output would be > great, though wouldn't you still need a verbose output that is specific > to each collector in order to provide a "debug" level of detail? > > Cheers > Matt > > Matt Khan > -------------------------------------------------- > GFFX Auto Trading > Deutsche Bank, London From doug.jones at internet.co.nz Thu May 6 17:45:59 2010 From: doug.jones at internet.co.nz (Doug Jones) Date: Fri, 7 May 2010 12:45:59 +1200 Subject: Feedback requested: HotSpot GC logging improvements References: <4BE3194E.902@oracle.com> Message-ID: <003001caed7e$c5c22e20$9011b9d2@userf9r7stx6j4> The biggest problem for us is that when the JVM is restarted the previous GC log file is overwritten. 
So I would like to suggest the following: 1) At midnight each day the GC log is cycled by appending the old day's date (in YYYYMMDD format). So the log currently being written to is always the one specified on gclog. 2) When the JVM is started, if a file of the name specified on gclog exists then it is renamed to the appropriate YYYYMMDD file (preferably taking the date from the file's last-written date, not the current date) and a fresh GC log started. If the YYYYMMDD file already exists (i.e. the JVM has already been restarted that day) then the last GC log would just be appended to that. This would seem to have a number of advantages: it overcomes the problem of the last GC log being overwritten on a restart; the GC log is automatically kept to a manageable size - most often when we are looking at a GC problem we want to see what is in the log over the last few hours; and it means that it is easy for each site to implement its own retention policy, e.g. deleting old logs after NN days. I don't really see that as being the responsibility of the JVM. I guess it also has the advantage that in the case that DateStamps are not turned on, the first entry in the log for a day will have the TimeStamp for the start of the day, so it becomes much easier to work out the time of day a subsequent event in the log occurred. For the other question our vote is: leave PrintGCDetails much as it is and deprecate verbosegc, turn on GCTimeStamps by default but agree not DateStamps (I suspect just getting GCTime is the lowest-overhead system call). Doug. ----- Original Message ----- From: "Tony Printezis" To: Sent: Friday, May 07, 2010 7:32 AM Subject: Feedback requested: HotSpot GC logging improvements > Hi all, > > We would like your input on some changes to HotSpot's GC logging that we > have been discussing. We have been wanting to improve our GC logging for > some time. However we haven't had the resources to spend on it. 
We don't > know when we'll get to it, but we'd still like to get some feedback on > our plans. > [... rest of the quoted original message snipped ...] From Johann.Loefflmann at Sun.COM Fri May 7 02:05:42 2010 From: Johann.Loefflmann at Sun.COM (Johann N. 
Loefflmann) Date: Fri, 07 May 2010 11:05:42 +0200 Subject: Feedback requested: HotSpot GC logging improvements In-Reply-To: <4BE3194E.902@oracle.com> References: <4BE3194E.902@oracle.com> Message-ID: <4BE3D7E6.7050503@sun.com> Tony, > B. Introducing "cyclic" GC logs. > For the Java Fatal Error Log we can specify %p, for example: -XX:ErrorFile=/var/log/java/java_error%p.log IMHO it would be great if we could use %p also for -Xloggc:/var/log/java/gc%p.log See also http://java.sun.com/javase/6/webnotes/trouble/TSG-VM/html/felog.html#gbwcy IMHO options to configure the format of the gc log filename would be more comfortable for a user than only hardcoding a particular scheme. Proposal: %p the process ID %t a timestamp (the format could be controlled by a new option, I suggest -XX:TimeStampFormatForLogFileNames) %n an index or sequential number I suggest a default for the timestamp format, something like -XX:TimeStampFormatForLogFileNames=yyyy-MM-dd_HH-mm-ss because digits, hyphens and underscores are file-system independent, the format is human-readable, and the default can be changed by specifying the option if it is not suitable. Furthermore, the option could be used by any log, not only the gc log (the fatal error log, for example). Pattern letters for the option could be borrowed from the SimpleDateFormat class. See also http://java.sun.com/javase/6/docs/api/java/text/SimpleDateFormat.html I'm sure customers would find it quite comfortable to specify something like -Xloggc:/var/log/java/gclog_pid%p_%n_%t.log -Johann (Software TSC Support engineer) From rainer.jung at kippdata.de Thu May 6 22:05:40 2010 From: rainer.jung at kippdata.de (Rainer Jung) Date: Fri, 07 May 2010 07:05:40 +0200 Subject: Feedback requested: HotSpot GC logging improvements In-Reply-To: <4BE33C0B.5060009@Oracle.COM> References: <4BE33C0B.5060009@Oracle.COM> Message-ID: <4BE39FA4.10005@kippdata.de> On 07.05.2010 00:00, Peter B. 
Kessler wrote: > +1 on including the command line arguments (or maybe the settings those provoke inside the VM) in the log file. For settings that change over time, e.g., because of ergonomics, it would be good to have a way to see those, too. Maybe "a new log file format" allows that to happen. +1 From chkwok at digibites.nl Fri May 7 07:54:45 2010 From: chkwok at digibites.nl (Chi Ho Kwok) Date: Fri, 7 May 2010 16:54:45 +0200 Subject: Feedback requested: HotSpot GC logging improvements In-Reply-To: <4BE39FA4.10005@kippdata.de> References: <4BE33C0B.5060009@Oracle.COM> <4BE39FA4.10005@kippdata.de> Message-ID: Talking about a new log format... Is it possible to send the log events to the java.util.logging system too, so we can handle the redirection, writing to a file, or integration with log4j via adapters ourselves, if we choose to? Just dump the string message in a LogRecord and put all parameters into setParameters(), so you can access either the string / formatted message or get a machine-readable version by calling getParameters(). Parameters are arrays like ["GCNew", time spent, prev usage, current usage, etc], where the first field defines the type and the other parameters' interpretation depends on the type. Okay, with this, you can't just add some command line flags to reconfigure GC logging, but it makes integrating GC-related things in apps much easier: you can write a { if the concurrent collector failed, send the admin an email that the app stopped for $x seconds, please fix } script in a minute, or a { if gc overhead > 10%, maybe increase the heap size } trigger. No more writing scripts that parse the output of the gc log to check for weird things. No more parsing at all. Chi Ho Kwok On Fri, May 7, 2010 at 7:05 AM, Rainer Jung wrote: > On 07.05.2010 00:00, Peter B. Kessler wrote: > > +1 on including the command line arguments (or maybe the settings those > provoke inside the VM) in the log file. 
For settings that change over time, > e.g., because of ergonomics, it would be good to have a way to see those, > too. Maybe "a new log file format" allows that to happen. > > +1 > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From tony.printezis at oracle.com Fri May 7 09:15:19 2010 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 07 May 2010 12:15:19 -0400 Subject: Feedback requested: HotSpot GC logging improvements In-Reply-To: <4BE3194E.902@oracle.com> References: <4BE3194E.902@oracle.com> Message-ID: <4BE43C97.6000805@oracle.com> Hi all, First, thank you for all the excellent feedback (which I read as mostly positive toward the proposals). We are glad that people still care about the GC logs. Instead of replying to individual e-mails, I'll consolidate my replies here. * "I would say that PrintGCDateStamps should be the default" (several folks brought this up) From the point of view of analyzing logs to just look at the GC's behavior, we only need time stamps. And this is the reason why I'd like to see them turned on by default (too many times we got a log without time stamps for which we said "damn, if it had time stamps we'd get a better idea of what was happening"). So, they are the minimum we need to get a good picture of how the GC behaved. Date stamps will increase the size of the log (which still seems to be an issue for some people) and be helpful in fewer places (i.e., when comparing application and GC events; but we generally do not do that). So, you'll have to turn them on yourselves. 
:-) * "successive runs overwrite the previous log file" (several folks brought this up) Don't you think that adding the JVM's pid to the log file name would eliminate this problem? * "I'm not attached to the old format" (several folks mentioned this) Oh, good. I'll be quoting you when I make the case to remove it. * "A serial-attached-scsi (aka SAS) disk at 10k rpm is a little bit more expensive than $85/TB" (Ryan) Point taken, but do you really need super duper 10k rpm disks to store GC log files? :-) * "if you're going to roll the logs then I would prefer a meaningful suffix rather than just a counter." A counter seems like a perfectly meaningful suffix to me. * "wouldn't you still need a verbose output that is specific to each collector in order to provide a "debug" level of detail?" (Matt) Very good point. The verbose output will be as unified as possible, but indeed with GC-specific extensions so that we don't lose that information. * "I guess you plan to provide a parser written in Java?" (Rainer) Java? We're HotSpot developers! We only work in C++, assembly, and awk! Just kidding... Yes, indeed in Java. * "so possibly Open Source with a nice license like Apache Software License 2" (Rainer) Maybe, and it's not up to me to decide. * "foo.00000001 might have been detected as old and copied to the remote host and during the same time GC decides to now reuse it... That's why I personally find externally organized pruning better. Another thing I often miss is the ability to combine size and time based rotation." (Rainer) The proposal never reuses log files. We'll never overwrite anything. Instead, we'll delete the oldest files as we create new ones. If we tell the users to prune the older log files themselves, I know what the first bug filed against the new policy will be. :-) Regarding rotating based on both size and time: most people care about size, so I think that's what we'll do. 
If you want more advanced management of the logs you'll have to set N to infinity (at least we'll need a way to say "never delete older files") so that HotSpot doesn't delete any files, and you'll be able to copy them and delete them yourself. But, seriously, this is excellent feedback. You guys are doing more wild stuff with our logs than I had imagined. :-) * "Will you start another discussion about the data contents of the file?" (Rainer) We'll do that separately, based most likely on a wiki. When we get to it. No promises though! * "For more debug detail per collector one could use PrintHeapAtGC" (Michael) Well, PrintHeapAtGC was supposed to be added for debugging purposes, i.e., to find out what the address range of each generation is. However, it has clearer information on how full each generation is, which is why people use it today (it's very space-inefficient though...). We are hoping to add that information to the standard GC log records to eliminate the need for PrintHeapAtGC. * "In our application we prefix _every_ "cyclic" log file with the config options used to start the app." (Jeff) Adding configuration / whatever information at the top of every log file fragment is an excellent suggestion. Thanks for bringing it up. * "How many digits in the sequence? Would that be configurable?" (Adam) 8 should be more than enough (do you really see the need for more than 99M log fragments?). Actually, even 6 will probably be enough. And if we go over that, we won't cycle the numbers, we'll just expand the number field. * "IMHO it would be great if we could use %p also for" (Johann) I was going to say that this would start getting over the top. But I was not aware that you can do that with the fatal error log. I'll need to investigate that further. So, we'll leave this (and additional custom formatting in the GC log name) as a "maybe". 
I'm not quite sure whether we'd want to use the same facility for the sequence numbers, though, given that they'd be needed if we split the log and not needed if we don't. For those, I'd vote to just add a suffix to the log file name when they are needed.

Thanks again for all the good points,

Tony, HotSpot GC Group
From ryanobjc at gmail.com Fri May 7 14:39:20 2010
From: ryanobjc at gmail.com (Ryan Rawson)
Date: Fri, 7 May 2010 14:39:20 -0700
Subject: Feedback requested: HotSpot GC logging improvements
In-Reply-To: <4BE43C97.6000805@oracle.com>
References: <4BE3194E.902@oracle.com> <4BE43C97.6000805@oracle.com>
Message-ID:

One last thing to keep in mind: as you push Java to extremes of performance (I am working on an open source database in Java), the primary factor eventually becomes GC. At this point in our dev cycle, GC considerations dominate all others in any performance-oriented decision. Eventually this ripples down into other areas - like how JNI is too slow and DirectByteBuffers are good, but potentially limited for fine-grained data access.

At this point I have a prod installation that takes 80ms GC pauses every second or more (the object allocation pattern doesn't match the generational hypothesis). Moving data out of the realm of GC into hand-managed memory is basically our next step. If DirectByteBuffers didn't exist, the next step would be writing a lot more JNI or porting away from Java (and all the pain that entails).

Thanks to the success of Hadoop, the next area for Java is medium-performance systems code. You just would not believe the number of people writing DB and large-data things in Java.

Thanks for the attention!

-ryan
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From vkleinschmidt at gmail.com Mon May 10 13:34:20 2010
From: vkleinschmidt at gmail.com (Volker Kleinschmidt)
Date: Mon, 10 May 2010 15:34:20 -0500
Subject: Feedback requested: HotSpot GC logging improvements
In-Reply-To: <4BE43C97.6000805@oracle.com>
References: <4BE3194E.902@oracle.com> <4BE43C97.6000805@oracle.com>
Message-ID:

Where is the advantage to using counter numbers in the log file names? If you take the sensible suggestion made by several others here to use ISO date-time stamps in the filenames, you get a natural sequence, no worries about name re-use, and easily automated log maintenance by those who want to keep these logs for a while. You could still implement auto-deletion of "older" logs for those who want it. Each log can then easily be identified, and an optional PID in the filename would be helpful too. But a counter? What info does that give you by itself, without additional context? None whatsoever. That's why others declared it "not meaningful".

We mainly need GC logs for post-mortem performance problem analysis, so the date/time stamps on the logs would be really handy for identifying which log to look at (we often don't get to look at the log on the client system, hence file dates don't help us, and they often don't have PrintGCDateStamps enabled). However, the core issue for us is prevention of log overwriting when -Xloggc specifies a fixed filename and the VM gets restarted by the service wrapper watchdog feature, i.e. when you really needed that GC log.
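To illustrate the kind of self-describing name I mean (a sketch; the helper and the exact format are just one possibility, not anything HotSpot has committed to):

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Sketch of a date-time-stamped log name: it sorts naturally, survives
// restarts without clobbering the previous log, and tells you at a
// glance when the run started. The pid component is the optional
// addition discussed earlier in the thread.
public class StampedLogNameSketch {
    public static String logName(String base, long pid, Date start) {
        // ':' avoided so the name is also legal on Windows
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd_HH-mm-ss");
        return base + "." + pid + "." + fmt.format(start);
    }

    public static void main(String[] args) {
        // e.g. gc.log.1234.2010-05-10_15-34-20
        System.out.println(logName("gc.log", 1234L, new Date()));
    }
}
```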
So any auto-log-rolling mechanism is much better than none and will make me yodel with joy :^)

--Volker
--
Volker Kleinschmidt
Senior Support Engineer
Blackboard Client Support

From tony.printezis at oracle.com Mon May 10 15:41:28 2010
From: tony.printezis at oracle.com (Tony Printezis)
Date: Mon, 10 May 2010 18:41:28 -0400
Subject: Feedback requested: HotSpot GC logging improvements
In-Reply-To:
References: <4BE3194E.902@oracle.com> <4BE43C97.6000805@oracle.com>
Message-ID: <4BE88B98.8090705@oracle.com>

Volker,

Volker Kleinschmidt wrote:
> Where is the advantage to using counter numbers in the log file names?
> If you take the sensible suggestion made by several others here to use
> ISO datetimestamps in the filenames, you have a natural sequence, no
> worries about name re-use, and easily automated log maintenance by
> those that want to keep these logs for a while. You could still
> implement an auto-deletion of "older" logs for those that want it.
> Each log can then easily be identified, and an optional PID

Well, even with date stamps, if you don't have the JVM pid in the file name you won't know which JVM each log came from either. And, if you only have date stamps in the file names, you might not know whether you're missing a file in between the two your customer sent you (you'd need to look at the contents to see whether the time stamps are contiguous or there's a potential hole). As with the time stamps vs. date stamps argument, the sequence number is the minimum you'd need, and some folks might want to also enable date stamps in addition to the sequence numbers.

BTW, something that I just thought of: if we do introduce a way for people to use %n, %d, whatever in the GC log file names, would it also be helpful to have %h for "host name"?

> in the
> filename would be helpful too. But a counter?
> What info does that give you by itself, without additional context?
> None whatsoever. That's why others declared it as "not meaningful".
>
> We mainly need gc logs for post-mortem performance problem analysis,
> so the date/time stamps on the logs would be really handy to identify
> which log to look at (we often don't get to look at the log on the
> client system, hence file dates don't help us,

Good point.

> and they often don't have PrintGCDateStamps enabled). However the core
> issue for us is prevention of log overwriting when -Xloggc specifies a
> fixed filename and the VM gets restarted by the service wrapper
> watchdog feature, i.e. when you really needed that GC log. So any
> auto-log-rolling mechanism is much better than none and will make me
> yodel with joy :^)

Yodel? This is almost a good reason to drop that proposal asap. ;-) ;-) ;-)

Tony
So, for the above example and >>> assuming that the JVM's pid is 1234, the GC log file(s) will be either: >>> >>> foo.1234 >>> >>> or >>> >>> foo.1234.00000001 >>> foo.1234.00000002 >>> foo.1234.00000003 >>> etc. >>> >>> Specific questions: >>> >>> - Would people really hate it if HotSpot starts appending the GC log >>> file name with a (zero-padded) sequence number? Maybe if N == 1 (the >>> default), HotSpot will skip the sequence number and ignore S, i.e., >>> behave as it does today. >>> - To the people who have been asking for cyclic GC logs: is the >>> sequence number scheme above good enough? >>> >>> >>> Thanks in advance for your feedback, >>> >>> Tony, HotSpot GC Group >>> >>> >>> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > > > > From martin at attivio.com Tue May 11 07:13:41 2010 From: martin at attivio.com (Martin Serrano) Date: Tue, 11 May 2010 10:13:41 -0400 Subject: Feedback requested: HotSpot GC logging improvements In-Reply-To: <4BE43C97.6000805@oracle.com> References: <4BE3194E.902@oracle.com> <4BE43C97.6000805@oracle.com> Message-ID: <9694A6C3D68A4249BD9E1A875B6BA81E105CCD27@bos0ex01.corp.attivio.com> Tony, Love the ideas. We start java via a wrapper process and we currently set the gc log file name using a timestamp. In upcoming releases we are planning on augmenting that with a meaningful application name. Specific comments on your response: > * "if you're going to roll the logs then I would prefer a meaningful > suffix rather than just a counter." > > A counter seems like a perfectly meaningful suffix to me. I would prefer to have a consistent suffix (like .log), in the filename. Perhaps you could support just the %d format for the counter in the generated log name. We'd also appreciate having startup information at the top of the gc log. 
Cheers, Martin -----Original Message----- From: hotspot-gc-use-bounces at openjdk.java.net [mailto:hotspot-gc-use-bounces at openjdk.java.net] On Behalf Of Tony Printezis Sent: Friday, May 07, 2010 12:15 PM To: hotspot-gc-use at openjdk.java.net Subject: Re: Feedback requested: HotSpot GC logging improvements Hi all, First, thank you for all the excellent feedback (which I see as mostly positive about the proposals). We are glad that people still care about the GC logs. Instead of replying to individual e-mails, I'll consolidate my replies here. * "I would say that PrintGCDateStamps should be the default" (several folks brought this up) From the point of view of analyzing logs to just look at the GC's behavior, we only need time stamps. And this is the reason why I'd like to see them turned on by default (too many times we got a log without time stamps for which we said "damn, if it had time stamps we'd get a better idea of what was happening"). So, they are the minimum we need to get a good picture of how the GC behaved. Date stamps will increase the size of the log (which still seems to be an issue for some people) and be helpful in fewer places (i.e., when comparing application and GC events; but we generally do not do that). So, you'll have to turn them on yourselves. :-) * "successive runs overwrite the previous log file" (several folks brought this up) Don't you think that adding the JVM's pid to the log file name would eliminate this problem? * "I'm not attached to the old format" (several folks mentioned this) Oh, good. I'll be quoting you when I make the case to remove it. * "A serial-attached-scsi (aka SAS) disk at 10k rpm is a little bit more expensive than $85/TB" (Ryan) Point taken, but do you really need super duper 10k rpm disks to store GC log files? :-) * "if you're going to roll the logs then I would prefer a meaningful suffix rather than just a counter." A counter seems like a perfectly meaningful suffix to me.
* "wouldn't you still need a verbose output that is specific to each collector in order to provide a "debug" level of detail?" (Matt) Very good point. The verbose output will be as unified as possible, but indeed with GC-specific extensions so that no collector-specific information is lost. * "I guess you plan to provide a parser written in Java?" (Rainer) Java? We're HotSpot developers! We only work in C++, assembly, and awk! Just kidding... Yes, indeed in Java. * "so possibly Open Source with a nice license like Apache Software License 2" (Rainer) Maybe, and not up to me to decide. * "f00.00000001 might have been detected as old and copied to the remote host and during the same time GC decides to now reuse it... That's why I personally find externally organized pruning better. Another thing I often miss is the ability to combine size and time based rotation." (Rainer) The proposal never reuses log files. We'll never overwrite anything. Instead, we'll delete the oldest files as we create new ones. If we tell the users to prune the older log files themselves, I know what the first bug filed against the new policy will be. :-) Regarding rotating based on both size and time: most people care about size so I think that's what we'll do. If you want more advanced management of the logs you'll have to set N to infinity (at least we'll need a way to say "never delete older files") so that HotSpot doesn't delete any files and you'll be able to copy them and delete them yourself. But, seriously, this is excellent feedback. You guys are doing more wild stuff with our logs than I had imagined. :-) * "Will you start another discussion about the data contents of the file?" (Rainer) We'll do that separately, based most likely on a wiki. When we get to it. No promises though! * "For more debug detail per collector one could use PrintHeapAtGC" (Michael) Well, PrintHeapAtGC was supposed to be added for debugging purposes, i.e., to find out what the address range of each generation is.
However, it has clearer information on how full each generation is, which is why people use it today (it's very space inefficient though...). We are hoping to add that information to the standard GC log records to eliminate the need for PrintHeapAtGC. * "In our application we prefix _every_ "cyclic" log file with the config options used to start the app." (Jeff) Adding configuration / whatever information at the top of every log file fragment is an excellent suggestion. Thanks for bringing it up. * "How many digits in the sequence? Would that be configurable?" (Adam) 8 should be more than enough (do you really see the need for more than 99M log fragments?). Actually, even 6 will probably be enough. And if we go over that, we won't cycle the numbers, we'll just expand the number field. * "IMHO it would be great if we could use %p also for" (Johann) I was going to say that this would start getting over the top. But I was not aware that you can do that with the fatal error log. I'll need to investigate that further. So, we'll leave this (and additional custom formatting in the GC log name) as a "maybe". :-) I'm not quite sure whether we'd want to use the same facility for the sequence numbers though, given that they'd be needed if we split the log and won't be needed if we don't. For those, I'd just add a suffix to the log file name when it is needed. Thanks again for all the good points, Tony, HotSpot GC Group On 5/6/2010 3:32 PM, Tony Printezis wrote: > Hi all, > > We would like your input on some changes to HotSpot's GC logging that > we have been discussing. We have been wanting to improve our GC > logging for some time. However we haven't had the resources to spend > on it. We don't know when we'll get to it, but we'd still like to get > some feedback on our plans. > > The changes fall into two categories. > > > A. Unification and improvement of -verbosegc / -XX:+PrintGCDetails > output.
> > I strongly believe that maintaining two GC log formats is > counter-productive, especially given that the current -verbosegc > format is unhelpful in many ways (i.e., lacks a lot of helpful > information). So, we would like to unify the two into one, with maybe > -XX:+PrintGCDetails generating a superset of what -verbosegc would > generate (so that a parser for the -XX:+PrintGCDetails output will > also be able to parse the -verbosegc output). The new output will not > be what -XX:+PrintGCDetails generates today but something that can be > reliably parsed and it is also reasonably human-readable (so, no xml > and no space/tab-separated formats). Additionally, we're proposing to > enable -XX:+PrintGCTimeStamps by default (in fact, we'll probably > deprecate and ignore that option, I can't believe that users will > really not want a time stamp per GC log record). We'll leave > -XX:+PrintGCDateStamps to be optional though. > > Specific questions: > > - Is anyone really attached to the old -verbosegc output? > - Would anyone really hate having time stamps by default? > - I know that a lot of folks have their own parsers for our current GC > log formats. Would you be happy if we provided you with a (reliable!) > parser for the new format in Java that you can easily adapt? > > > B. Introducing "cyclic" GC logs. > > This is something that a lot of folks have asked for given that they > were concerned with the GC logs getting very large (a 1TB disk is $85 > these days, but anyway...). Given that each GC log record is of > variable size, we cannot easily cycle through the log using the same > file (I'd rather not have to overwrite existing records). Our current > proposal is for the user to specify a file number N and a size target > S for each file. For a given GC log -Xloggc:foo, HotSpot will generate > > foo.00000001 > foo.00000002 > foo.00000003 > etc. 
> > (we'll create a new file as soon as the size of the one we are writing > to exceeds S, so each file will be slightly larger than S but it will > be helpful not to split individual log records between two files) > > When we create a new file, if we have more than N files we'll delete > the oldest. So, in the above example, if N == 3, when we create > foo.00000004 we'll delete foo.00000001. > > Note that in the above scheme, the logs are not really "cyclic" but, > instead, we're pruning the oldest records every now and then, which > has the same effect. > > Another (related) request has been to maybe append the GC log file > name with the pid of the JVM that's generating it. Maybe we don't want > to do this by default. But, would people find it helpful if we provide > a new cmd line parameter to do that? So, for the above example and > assuming that the JVM's pid is 1234, the GC log file(s) will be either: > > foo.1234 > > or > > foo.1234.00000001 > foo.1234.00000002 > foo.1234.00000003 > etc. > > Specific questions: > > - Would people really hate it if HotSpot starts appending the GC log > file name with a (zero-padded) sequence number? Maybe if N == 1 (the > default), HotSpot will skip the sequence number and ignore S, i.e., > behave as it does today. > - To the people who have been asking for cyclic GC logs: is the > sequence number scheme above good enough? 
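The rotation policy quoted above (a size target S, a file count N, a zero-padded sequence suffix, and pruning of the oldest fragment) can be sketched in Java roughly as follows. This is only an illustration of the proposed behavior, not HotSpot code (HotSpot would implement it in C++), and the class and parameter names are invented:

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch of the proposed policy: write to foo.00000001,
// foo.00000002, ..., start a new file once the current one exceeds S,
// and delete the oldest file once more than N files exist.
class RotatingGcLog {
    private final String baseName;   // e.g. "foo" from -Xloggc:foo
    private final long sizeTargetS;  // bytes
    private final int maxFilesN;
    private final Deque<File> files = new ArrayDeque<File>();
    private int seq = 0;
    private FileWriter out;

    RotatingGcLog(String baseName, long sizeTargetS, int maxFilesN) throws IOException {
        this.baseName = baseName;
        this.sizeTargetS = sizeTargetS;
        this.maxFilesN = maxFilesN;
        openNext();
    }

    // A log record is never split across files: the size is checked only
    // between records, so each file may end up slightly larger than S.
    void writeRecord(String record) throws IOException {
        out.write(record);
        out.write('\n');
        out.flush();
        if (files.peekLast().length() > sizeTargetS) {
            openNext();
        }
    }

    private void openNext() throws IOException {
        if (out != null) out.close();
        File next = new File(String.format("%s.%08d", baseName, ++seq));
        files.addLast(next);
        out = new FileWriter(next);
        if (files.size() > maxFilesN) {
            files.removeFirst().delete();  // prune the oldest fragment
        }
    }
}
```

With N == 3, writing past the fourth rotation deletes foo.00000001, which matches the example in the quoted message.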
> > > Thanks in advance for your feedback, > > Tony, HotSpot GC Group > > _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From tony.printezis at oracle.com Tue May 11 07:33:49 2010 From: tony.printezis at oracle.com (Tony Printezis) Date: Tue, 11 May 2010 10:33:49 -0400 Subject: Feedback requested: HotSpot GC logging improvements In-Reply-To: <9694A6C3D68A4249BD9E1A875B6BA81E105CCD27@bos0ex01.corp.attivio.com> References: <4BE3194E.902@oracle.com> <4BE43C97.6000805@oracle.com> <9694A6C3D68A4249BD9E1A875B6BA81E105CCD27@bos0ex01.corp.attivio.com> Message-ID: <4BE96ACD.2070805@oracle.com> Martin, Hi, thanks for the feedback. Martin Serrano wrote: > I would prefer to have a consistent suffix (like .log), in the filename. Perhaps you could support just the %d format for the counter in the generated log name. > Well, if you allow parameters in the log file name, like 'foo.%d.%n.log' then folks can give their own suffix. I don't think we want to start adding one... Tony From matt.fowles at gmail.com Wed May 12 15:19:30 2010 From: matt.fowles at gmail.com (Matt Fowles) Date: Wed, 12 May 2010 18:19:30 -0400 Subject: Growing GC Young Gen Times Message-ID: All~ I have a large app that produces ~4g of garbage every 30 seconds and am trying to reduce the size of gc outliers. About 99% of this data is garbage, but almost anything that survives one collection survives for an indeterminately long amount of time.
We are currently using the following VM and options:

java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)

-verbose:gc
-XX:+PrintGCTimeStamps
-XX:+PrintGCDetails
-XX:+PrintGCTaskTimeStamps
-XX:+PrintTenuringDistribution
-XX:+PrintCommandLineFlags
-XX:+PrintReferenceGC
-Xms32g -Xmx32g -Xmn4g
-XX:+UseParNewGC
-XX:ParallelGCThreads=4
-XX:+UseConcMarkSweepGC
-XX:ParallelCMSThreads=4
-XX:CMSInitiatingOccupancyFraction=60
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+CMSParallelRemarkEnabled
-XX:MaxGCPauseMillis=50
-Xloggc:gc.log

As you can see from the GC log, we never actually reach the point where the CMS kicks in (after app startup). But our young gens seem to take increasingly long to collect as time goes by. The steady state of the app is reached around 956.392 into the log with a collection that takes 0.106 seconds. Thereafter the survivor space remains roughly constantly filled and the amount promoted to old gen also remains constant, but the collection times increase to 2.855 seconds by the end of the 3.5 hour run. Has anyone seen this sort of behavior before? Are there more switches that I should try running with? Obviously, I am working to profile the app and reduce the garbage load in parallel. But if I still see this sort of problem, it is only a question of how long must the app run before I see unacceptable latency spikes. Matt -------------- next part -------------- A non-text attachment was scrubbed...
Name: gc.log.gz Type: application/x-gzip Size: 48564 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100512/df7b7e27/attachment-0001.bin From wmadams824 at comcast.net Wed May 12 16:02:47 2010 From: wmadams824 at comcast.net (wmadams824 at comcast.net) Date: Wed, 12 May 2010 23:02:47 +0000 (UTC) Subject: Growing GC Young Gen Times In-Reply-To: Message-ID: <67188202.18794611273705367501.JavaMail.root@sz0070a.emeryville.ca.mail.comcast.net> Hi, Matt: I don't have a solution but I can add more information, as I recently dealt with the same issue. My customer's app also generated a large amount of garbage per unit time. We were using CMS (with most of the same options you are using, but a smaller heap) and saw the same behavior of the young GC time increasing monotonically. Some additional information: their app did experience a number of CMS collections per day, and the young-GC collection time still continued to rise. We also arranged for a Full GC to occur in the middle of the night (to see if fragmentation in the old generation was impacting young-GC collection times), and still saw no change in the ever-increasing time for young GC. Printing FLS stats at level 2 showed no noticeable fragmentation in the old or perm gens. The only thing that caused the young GC times to go down was to bounce the JVM (at which point they began rising again, of course). While I was there, the server ran for a max of 3 or 4 days between bounces, and the young-GC collection time never leveled off. At least that's where they were when they threw me out of the office. :) Regards, Wayne -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100512/188f08f7/attachment.html From chkwok at digibites.nl Wed May 12 16:35:17 2010 From: chkwok at digibites.nl (Chi Ho Kwok) Date: Thu, 13 May 2010 01:35:17 +0200 Subject: Growing GC Young Gen Times In-Reply-To: References: Message-ID: Hi Matt, You're having a 1-2s break every ~30 seconds. It's more efficient to collect more things in 1 pass, but if you want smaller, more frequent delays, try lowering the heap sizes. As you never reach more than 8G used memory before the pauses get too insane, try the following setting: -Xmx8g -Xmn1g This way, the maximum amount of data the ParNew collector has to process is only 1/4 of the original, so the worst case delay is reduced by 75%. And unless you're leaking memory somewhere, the old generation heap should contain only long living objects, like data files that should stay loaded. The way it looks now is that the whole old generation is filled with garbage, but as you never reach 60% used, it's never collected. I'd drop the MaxGCPauseMillis too, the goal is unreachable anyway. You can't collect a heap of a few GB in 50ms. Our setup here is pretty similar, the current heap size at least, but our memory allocation pressure is much higher; plus, with a LRU cache as large as the java heap allows, we constantly need to promote things from young to old, and collect the old generation with CMS to free up space for more data. 
This is how we tuned it:

Basics: -Xmx32g -Xms32g -Xmn1500m
Voodoo: 4 threads, CMSInitiatingOccupancyFraction 76%, MaxTenuringThreshold 1 (yes, promote after 1 copy max, LRU means if it sticks around for 10s, it stays), SurvivorRatio=2 (prevents overflow directly into old gen)

We collect the young gen about once every 3 to 5 seconds during peaks, example line:

2010-05-12T23:05:55.253+0200: 1496100.414: [GC 1496100.414: [ParNew: 1077270K->318127K(1152000K), 0.3551790 secs] 20722984K->20150315K(33170432K), 0.3554220 secs] [Times: user=1.57 sys=0.02, real=0.36 secs]

So, to scan and promote about 1GB of data, we use 0.4s. If we used a 4G new generation, it could be as bad as your logs, yes; 0.4s x 4 = 1.6s. Too bad we can't get the minimum delay any lower; with our allocation rate, if we reduce the new generation size, there's a good chance that a lot of temporary data leaks through to the old generation, which is much, much more expensive to collect. With the current size, it already means that data held for more than 2x the collect delay, or ~8 seconds, leaks through to the old gen, even if it isn't supposed to be - only data in the LRU cache should be there. With your allocation rate of 4G/30s = 136M/sec, you can play with sizes as small as 512m and just let some objects with a ~10s+ lifetime leak through to the old gen - CMS does its work in the background anyway, so if you want to minimize pauses and have spare CPU cycles, go for a tiny new generation. Chi Ho On Thu, May 13, 2010 at 12:19 AM, Matt Fowles wrote: > [Matt's message quoted in full; snipped] -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100513/49a02491/attachment.html From y.s.ramakrishna at oracle.com Wed May 12 16:38:51 2010 From: y.s.ramakrishna at oracle.com (Y.
Srinivas Ramakrishna) Date: Wed, 12 May 2010 16:38:51 -0700 Subject: Growing GC Young Gen Times In-Reply-To: References: Message-ID: <4BEB3C0B.6040804@oracle.com> Try the jvm from hs18 (jdk 7) and let us know what you see. Or wait for JDK 6u21 which (I think) is slated for sometime next month. Or get an hs17 JVM with the fix 6631166 via your Java support and give it a try. Also add -XX:+UseLargePages -XX:+AlwaysPreTouch and if you have enough cores try increasing your ParallelGCThreads from your current setting of 4 (what was the default you got?). I have not looked at the log you sent, but can take a look when I get some time; but no promises, as I am drowned in other work at the moment. -- ramki On 05/12/10 15:19, Matt Fowles wrote: > [Matt's message quoted in full; snipped] From jon.masamitsu at oracle.com Thu May 13 09:23:18 2010 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 13 May 2010 09:23:18 -0700 Subject: Growing GC Young Gen Times In-Reply-To: References: Message-ID: <4BEC2776.8010609@oracle.com> Matt, As Ramki indicated fragmentation might be an issue. As the fragmentation in the old generation increases, it takes longer to find space in the old generation into which to promote objects from the young generation. This is apparently not the problem that Wayne is having but you still might be hitting it. If you can connect jconsole to the VM and force a full GC, that would tell us if it's fragmentation. There might be a scaling issue with the UseParNewGC. If you can use -XX:-UseParNewGC (turning off the parallel young generation collection) with -XX:+UseConcMarkSweepGC the pauses will be longer but may be more stable. That's not the solution but just part of the investigation. You could try just -XX:+UseParNewGC without -XX:+UseConcMarkSweepGC and if you don't see the growing young generation pause, that would indicate something specific about promotion into the CMS generation.
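Jon's suggestion of connecting jconsole and forcing a full GC can also be scripted over JMX. The sketch below is an illustration under assumptions: the host and port in the service URL are placeholders, and the target JVM must have been started with remote JMX enabled (e.g. -Dcom.sun.management.jmxremote.port=9999). It invokes the no-arg "gc" operation on the java.lang:type=Memory MBean, which is what jconsole's "Perform GC" button triggers:

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ForceFullGc {
    // Invokes the "gc" operation of the java.lang:type=Memory MBean,
    // the programmatic equivalent of jconsole's "Perform GC" button.
    static void forceFullGc(MBeanServerConnection conn) throws Exception {
        conn.invoke(new ObjectName("java.lang:type=Memory"), "gc", null, null);
    }

    public static void main(String[] args) throws Exception {
        // Placeholder address: the target JVM must be started with remote JMX
        // enabled, e.g. -Dcom.sun.management.jmxremote.port=9999 (plus the
        // authentication/SSL settings appropriate for your environment).
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            forceFullGc(connector.getMBeanServerConnection());
        } finally {
            connector.close();
        }
    }
}
```

If the young-gen pause drops back down right after the forced full collection, that points at old-gen fragmentation, per Jon's diagnosis.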
UseParallelGC is different from UseParNewGC in a number of ways and if you try UseParallelGC and still see the growing young generation pauses, I'd suspect something special about your application. If you can run these experiments hopefully they will tell us where to look next. Jon On 05/12/10 15:19, Matt Fowles wrote: > All~ > > I have a large app that produces ~4g of garbage every 30 seconds and > am trying to reduce the size of gc outliers. About 99% of this data > is garbage, but almost anything that survives one collection survives > for an indeterminately long amount of time. We are currently using > the following VM and options: > > java version "1.6.0_20" > Java(TM) SE Runtime Environment (build 1.6.0_20-b02) > Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) > > -verbose:gc > -XX:+PrintGCTimeStamps > -XX:+PrintGCDetails > -XX:+PrintGCTaskTimeStamps > -XX:+PrintTenuringDistribution > -XX:+PrintCommandLineFlags > -XX:+PrintReferenceGC > -Xms32g -Xmx32g -Xmn4g > -XX:+UseParNewGC > -XX:ParallelGCThreads=4 > -XX:+UseConcMarkSweepGC > -XX:ParallelCMSThreads=4 > -XX:CMSInitiatingOccupancyFraction=60 > -XX:+UseCMSInitiatingOccupancyOnly > -XX:+CMSParallelRemarkEnabled > -XX:MaxGCPauseMillis=50 > -Xloggc:gc.log > > > As you can see from the GC log, we never actually reach the point > where the CMS kicks in (after app startup). But our young gens seem > to take increasingly long to collect as time goes by. > > The steady state of the app is reached around 956.392 into the log > with a collection that takes 0.106 seconds. Thereafter the survivor > space remains roughly constantly as filled and the amount promoted to > old gen also remains constant, but the collection times increase to > 2.855 seconds by the end of the 3.5 hour run. > > Has anyone seen this sort of behavior before? Are there more switches > that I should try running with? > > Obviously, I am working to profile the app and reduce the garbage load > in parallel. 
But if I still see this sort of problem, it is only a > question of how long must the app run before I see unacceptable > latency spikes. > > Matt > ------------------------------------------------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100513/290423f6/attachment-0001.html From matt.fowles at gmail.com Thu May 13 10:50:32 2010 From: matt.fowles at gmail.com (Matt Fowles) Date: Thu, 13 May 2010 13:50:32 -0400 Subject: Growing GC Young Gen Times In-Reply-To: <4BEC2776.8010609@oracle.com> References: <4BEC2776.8010609@oracle.com> Message-ID: Jon~ This may sound naive, but how can fragmentation be an issue if the old gen has never been collected? I would think we are still in the space where we can just bump the old gen alloc pointer... Matt On Thu, May 13, 2010 at 12:23 PM, Jon Masamitsu wrote: > Matt, > > As Ramki indicated fragmentation might be an issue.? As the fragmentation > in the old generation increases, it takes longer to find space in the old > generation > into which to promote objects from the young generation.? This is apparently > not > the problem that Wayne is having but you still might be hitting it.? If you > can > connect jconsole to the VM and force a full GC, that would tell us if it's > fragmentation. > > There might be a scaling issue with the UseParNewGC.? If you can use > -XX:-UseParNewGC (turning off the parallel young > generation collection) with? -XX:+UseConcMarkSweepGC the pauses > will be longer but may be more stable.? That's not the solution but just > part > of the investigation. 
> > You could try just -XX:+UseParNewGC without -XX:+UseConcMarkSweepGC > and if you don't see the growing young generation pause, that would indicate > something specific about promotion into the CMS generation. > > UseParallelGC is different from UseParNewGC in a number of ways > and if you try UseParallelGC and still see the growing young generation > pauses, I'd suspect something special about your application. > > If you can run these experiments hopefully they will tell > us where to look next. > > Jon > > > On 05/12/10 15:19, Matt Fowles wrote: > > [...]
From y.s.ramakrishna at oracle.com Thu May 13 14:52:24 2010 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Thu, 13 May 2010 14:52:24 -0700 Subject: Growing GC Young Gen Times In-Reply-To: References: <4BEC2776.8010609@oracle.com> Message-ID: <4BEC7498.6030405@oracle.com>
On 05/13/10 10:50, Matt Fowles wrote: > Jon~ > > This may sound naive, but how can fragmentation be an issue if the old > gen has never been collected? I would think we are still in the space > where we can just bump the old gen alloc pointer...
Matt, The old gen allocator may fragment the space. Allocation is not exactly "bump a pointer". -- ramki
> > Matt > > On Thu, May 13, 2010 at 12:23 PM, Jon Masamitsu > wrote: >> [...]
From jon.masamitsu at oracle.com Thu May 13 15:29:33 2010 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 13 May 2010 15:29:33 -0700 Subject: Growing GC Young Gen Times In-Reply-To: <4BEC7498.6030405@oracle.com> References: <4BEC2776.8010609@oracle.com> <4BEC7498.6030405@oracle.com> Message-ID: <4BEC7D4D.2000905@oracle.com>
Matt, To amplify on Ramki's comment, the allocations out of the old generation are always from a free list. During a young generation collection each GC thread will get its own local free lists from the old generation so that it can copy objects to the old generation without synchronizing with the other GC threads (most of the time). Objects from a GC thread's local free lists are pushed to the global lists after the collection (as far as I recall). So there is some churn in the free lists. Jon
On 05/13/10 14:52, Y.
Srinivas Ramakrishna wrote: > [...]
From ryanobjc at gmail.com Thu May 13 16:54:34 2010 From: ryanobjc at gmail.com (Ryan Rawson) Date: Thu, 13 May 2010 16:54:34 -0700 Subject: Growing GC Young Gen Times In-Reply-To: <4BEC7D4D.2000905@oracle.com> References: <4BEC2776.8010609@oracle.com> <4BEC7498.6030405@oracle.com> <4BEC7D4D.2000905@oracle.com> Message-ID:
Hi, I have had a similar experience and arrived at a workable but unsatisfying solution... In my case, the young GC times kept getting longer and longer. Doing GC logging I saw something similar to what you saw - the ParNew keeps on growing and the amount of data being tenured was huge. Having an 800ms GC pause every 4 seconds was no good for me. I eventually added this to my java start: "-XX:NewSize=64m -XX:MaxNewSize=64m" A friend suggested this to me - he said that a young GC should be fast because the young gen should be ~ the size of the L3 cache. With this setting I see young GCs between 0.5-3 times a second and they last between 10-80ms or so. The CMS will prune massive amounts of garbage out when it runs, up to 2GB of RAM in my 6GB heap processes.
Now for a little theorycrafting... The root cause here is that my application breaks the Object Generational Hypothesis. The GC auto-tuning will grow the ParNew to reduce the amount of data it is tenuring, but it is never really able to reach a good steady state. At this point you are now tenuring 1-2GB of RAM. Tenuring = copying objects = time consuming. Once you are at this spot, you find out that every current shipping GC is just not good enough.
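Ryan's before-and-after pause numbers come from reading GC logs offline. The same trend can also be watched from inside the process with the standard management beans; below is a minimal sketch of that idea (the class name `GcSampler` is my own, not something from this thread):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/**
 * Minimal sketch: sample cumulative collection counts and times so a
 * rising average young-gen pause (like the one in Matt's log) can be
 * spotted by comparing successive snapshots in-process.
 */
public class GcSampler {
    public static String snapshot() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();   // -1 if unsupported
            long timeMs = gc.getCollectionTime();   // cumulative, ms
            // Average pause so far; compare snapshots over time to see growth.
            double avg = count > 0 ? (double) timeMs / count : 0.0;
            sb.append(String.format("%s: %d collections, %d ms total, %.1f ms avg%n",
                    gc.getName(), count, timeMs, avg));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(snapshot());
    }
}
```

Logging such a snapshot once a minute gives a coarse, low-overhead view of whether the average ParNew pause is drifting upward without parsing the GC log.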
My hope was to use G1, but considering how unstable it was for me (I have tried 12+ releases of Java7, a few releases of Java6) I am now shifting my approach. In my application one of the primary causes of allocation is a block cache for a database-type application. I am planning on testing a change where the block cache is maintained in massive DirectByteBuffers (think sizes from 2-15GB of RAM) and I will manage all the allocation by hand (in Java). If you have some ability to shift memory usage out of the domain of the GC I would highly suggest doing so.
At this point I can honestly say that if you are not Object Generational Hypothesis compliant (OGH (tm)) then Java for large heaps can be very, very painful. I think the choices are DirectByteBuffer, JNI, and not using Java. I'd like an alternative to those, but I'm not sure what it might be while avoiding that last option (and ideally avoiding JNI too). I feel this is the greatest weakness of Java - the memory management is one size fits all, and there are few great options. DirectByteBuffers have a limited interface and require copying data in and out to talk to the rest of Java. JNI has the same issue and has had a high invocation cost. Good luck out there, and stay OGH compliant! -ryan
On Thu, May 13, 2010 at 3:29 PM, Jon Masamitsu wrote: > [...]
From chkwok at digibites.nl Fri May 14 06:49:22 2010 From: chkwok at digibites.nl (Chi Ho Kwok) Date: Fri, 14 May 2010 15:49:22 +0200 Subject: Growing GC Young Gen Times In-Reply-To: References: <4BEC2776.8010609@oracle.com> <4BEC7498.6030405@oracle.com> <4BEC7D4D.2000905@oracle.com> Message-ID:
Good read. Yeah, it's the same story with every caching app: a least-recently-used cache means every object in it sticks around for [cache objects average lifetime], breaking the Object Generational Hypothesis. My solution: throw more hardware at it. Just lower the new generation size (-Xmn is a shortcut for MaxNewSize/NewSize) until the pauses are acceptable. The old gen gets collected by CMS all the time, removing expired cache items, but it doesn't introduce any pauses. The only cost is CPU time - which isn't really scarce in most memory cache apps. If you pick a limit high enough (1 pause per 2-3s), most temporary objects won't even make it to the old generation.
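Ryan's off-heap plan (keep the block cache in DirectByteBuffers and manage allocation by hand) can be sketched roughly as below. This is an illustrative toy, not his implementation: the class and method names are invented, a single `ByteBuffer` arena is capped at 2 GB by the API (his 2-15 GB sizes would need several arenas), and eviction is left to the caller:

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical sketch of an off-heap block cache: fixed-size blocks
 * carved out of one DirectByteBuffer, tracked by hand, so the cached
 * bytes never become work for the garbage collector.
 */
public class OffHeapBlockCache {
    private final ByteBuffer arena;   // off-heap backing store
    private final int blockSize;
    private final int numBlocks;
    private final boolean[] used;     // simple slot map; no coalescing

    public OffHeapBlockCache(int numBlocks, int blockSize) {
        this.numBlocks = numBlocks;
        this.blockSize = blockSize;
        this.arena = ByteBuffer.allocateDirect(numBlocks * blockSize);
        this.used = new boolean[numBlocks];
    }

    /** Copy a block in; returns the slot index, or -1 when full (caller evicts). */
    public int put(byte[] data) {
        if (data.length > blockSize) throw new IllegalArgumentException("block too large");
        for (int i = 0; i < numBlocks; i++) {
            if (!used[i]) {
                used[i] = true;
                ByteBuffer slot = arena.duplicate();  // shares the same memory
                slot.position(i * blockSize);
                slot.put(data);
                return i;
            }
        }
        return -1;
    }

    /** Copy a block back out onto the heap. */
    public byte[] get(int slot, int length) {
        ByteBuffer view = arena.duplicate();
        view.position(slot * blockSize);
        byte[] out = new byte[length];
        view.get(out);
        return out;
    }

    public void free(int slot) { used[slot] = false; }
}
```

The copy-in/copy-out in `put` and `get` is exactly the "limited interface" cost Ryan mentions: the off-heap bytes are invisible to the GC, but every access pays a copy to talk to the rest of Java.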
Going the JNI route can be doable for some apps (block / page cache), but in my case we'd have to serialize / unserialize huge object graphs all the time. Chi Ho
On Fri, May 14, 2010 at 1:54 AM, Ryan Rawson wrote: > [...]
From y.s.ramakrishna at oracle.com Fri May 14 09:58:09 2010 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Fri, 14 May 2010 09:58:09 -0700 Subject: Growing GC Young Gen Times In-Reply-To: References: Message-ID: <4BED8121.5000405@oracle.com>
Hi Matt -- i am computing some metrics from your log file and would like to know how many CPUs you have for the logs below? Also, as you noted, almost anything that survives a scavenge lives for a while. To reduce the overhead of unnecessary back-and-forth copying in the survivor spaces, just use MaxTenuringThreshold=1 (This suggestion was also made by several others in the thread, and is corroborated by your PrintTenuringDistribution data).
Since you have farily large survivor spaces configured now, (at least large enough to fit 4 age cohorts, which will be down to 1 age cohort if you use MTT=1), i'd suggest making your surviror spaces smaller, may be down to about 64 MB from the current 420 MB each, and give the excess to your Eden space. Then use 6u21 when it comes out (or ask your Java support to send you a 6u21 for a beta test), or drop in a JVM from JDK 7 into your 6u20 installation, and run with that. If you still see rising pause times let me know or file a bug, and send us the log file and JVM options along with full platform information. I'll run some metrics from yr log file if you send me the info re platform above, and that may perhaps reveal a few more secrets. later. -- ramki On 05/12/10 15:19, Matt Fowles wrote: > All~ > > I have a large app that produces ~4g of garbage every 30 seconds and > am trying to reduce the size of gc outliers. About 99% of this data > is garbage, but almost anything that survives one collection survives > for an indeterminately long amount of time. We are currently using > the following VM and options: > > java version "1.6.0_20" > Java(TM) SE Runtime Environment (build 1.6.0_20-b02) > Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) > > -verbose:gc > -XX:+PrintGCTimeStamps > -XX:+PrintGCDetails > -XX:+PrintGCTaskTimeStamps > -XX:+PrintTenuringDistribution > -XX:+PrintCommandLineFlags > -XX:+PrintReferenceGC > -Xms32g -Xmx32g -Xmn4g > -XX:+UseParNewGC > -XX:ParallelGCThreads=4 > -XX:+UseConcMarkSweepGC > -XX:ParallelCMSThreads=4 > -XX:CMSInitiatingOccupancyFraction=60 > -XX:+UseCMSInitiatingOccupancyOnly > -XX:+CMSParallelRemarkEnabled > -XX:MaxGCPauseMillis=50 > -Xloggc:gc.log > > > As you can see from the GC log, we never actually reach the point > where the CMS kicks in (after app startup). But our young gens seem > to take increasingly long to collect as time goes by. 
> > The steady state of the app is reached around 956.392 into the log > with a collection that takes 0.106 seconds. Thereafter the survivor > space remains roughly constantly filled and the amount promoted to > old gen also remains constant, but the collection times increase to > 2.855 seconds by the end of the 3.5 hour run. > > Has anyone seen this sort of behavior before? Are there more switches > that I should try running with? > > Obviously, I am working to profile the app and reduce the garbage load > in parallel. But if I still see this sort of problem, it is only a > question of how long must the app run before I see unacceptable > latency spikes. > > Matt > > > ------------------------------------------------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From matt.fowles at gmail.com Fri May 14 10:07:59 2010 From: matt.fowles at gmail.com (Matt Fowles) Date: Fri, 14 May 2010 13:07:59 -0400 Subject: Growing GC Young Gen Times In-Reply-To: <4BED8121.5000405@oracle.com> References: <4BED8121.5000405@oracle.com> Message-ID: Ramki~ The machine has 4 CPUs, each of which has 4 cores. I will adjust the survivor spaces as you suggest. Previously I had been running with MTT 0, but changed it to 4 at the suggestion of others. Running with the JDK7 version may take a bit of time, but I will pursue that as well. Matt On Fri, May 14, 2010 at 12:58 PM, Y. Srinivas Ramakrishna wrote: > Hi Matt -- I am computing some metrics from your log file > and would like to know how many CPUs you have for the logs below? > > Also, as you noted, almost anything that survives a scavenge > lives for a while.
To reduce the overhead of unnecessary > back-and-forth copying in the survivor spaces, just use > MaxTenuringThreshold=1 (This suggestion was also made by > several others in the thread, and is corroborated by your > PrintTenuringDistribution data). Since you have fairly large survivor > spaces configured now (at least large enough to fit 4 age cohorts, > which will be down to 1 age cohort if you use MTT=1), I'd > suggest making your survivor spaces smaller, maybe down to > about 64 MB from the current 420 MB each, and give the excess > to your Eden space. > > Then use 6u21 when it comes out (or ask your Java support to > send you a 6u21 for a beta test), or drop in a JVM from JDK 7 into > your 6u20 installation, and run with that. If you still see > rising pause times let me know or file a bug, and send us the > log file and JVM options along with full platform information. > > I'll run some metrics from your log file if you send me the info > re platform above, and that may perhaps reveal a few more secrets. > > later. > -- ramki > > On 05/12/10 15:19, Matt Fowles wrote: >> >> All~ >> >> I have a large app that produces ~4g of garbage every 30 seconds and >> am trying to reduce the size of gc outliers. About 99% of this data >> is garbage, but almost anything that survives one collection survives >> for an indeterminately long amount of time. We are currently using >> the following VM and options: >> >> java version "1.6.0_20" >> Java(TM) SE Runtime Environment (build 1.6.0_20-b02) >> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) >> >> -verbose:gc >> -XX:+PrintGCTimeStamps >> -XX:+PrintGCDetails >> -XX:+PrintGCTaskTimeStamps >> -XX:+PrintTenuringDistribution >> -XX:+PrintCommandLineFlags >> -XX:+PrintReferenceGC >> -Xms32g -Xmx32g -Xmn4g >> -XX:+UseParNewGC >> -XX:ParallelGCThreads=4 >> -XX:+UseConcMarkSweepGC >> -XX:ParallelCMSThreads=4 >> -XX:CMSInitiatingOccupancyFraction=60 >> -XX:+UseCMSInitiatingOccupancyOnly >> -XX:+CMSParallelRemarkEnabled >> -XX:MaxGCPauseMillis=50 >> -Xloggc:gc.log >> >> >> As you can see from the GC log, we never actually reach the point >> where the CMS kicks in (after app startup). But our young gens seem >> to take increasingly long to collect as time goes by. >> >> The steady state of the app is reached around 956.392 into the log >> with a collection that takes 0.106 seconds. Thereafter the survivor >> space remains roughly constantly filled and the amount promoted to >> old gen also remains constant, but the collection times increase to >> 2.855 seconds by the end of the 3.5 hour run. >> >> Has anyone seen this sort of behavior before? Are there more switches >> that I should try running with? >> >> Obviously, I am working to profile the app and reduce the garbage load >> in parallel. But if I still see this sort of problem, it is only a >> question of how long must the app run before I see unacceptable >> latency spikes. >> >> Matt >> >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > From y.s.ramakrishna at oracle.com Fri May 14 10:23:38 2010 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Fri, 14 May 2010 10:23:38 -0700 Subject: Growing GC Young Gen Times In-Reply-To: References: <4BED8121.5000405@oracle.com> Message-ID: <4BED871A.4010306@oracle.com> On 05/14/10 10:07, Matt Fowles wrote: > Ramki~ > > The machine has 4 CPUs, each of which has 4 cores. I will adjust the Great, thanks. I'd suggest making ParallelGCThreads=8. Also compare with -XX:-UseParNewGC.
If it's the kind of fragmentation that we believe may be the cause here, you'd see larger gc times in the latter case but they would not increase as they do now. But that is conjecture at this point. > survivor spaces as you suggest. Previously I had been running with > MTT 0, but changed it to 4 at the suggestion of others. MTT=0 can give very poor performance; as people said, MTT=4 would definitely be better here than MTT=0. You should use MTT=1 here, though. > > Running with the JDK7 version may take a bit of time, but I will > pursue that as well. All you should do is pull the libjvm.so that is in the JDK 7 installation (or bundle) and plonk it down into the appropriate directory of your existing JDK 6u20 installation. We just want to see the results with the latest JVM, which includes a fix for 6631166. I attached a very rough plot of some metrics extracted from your log, and this behaviour is definitely deserving of a bug, especially if it can be shown that it happens in the latest JVM. In the plot: red: scavenge durations; dark blue: promoted data per scavenge; pink: data in survivor space following scavenge; light blue: live data in old gen. As you can see, the scavenge duration clearly correlates with the occupancy of the old gen (as Jon and others indicated). Did you try Jon's suggestion of doing a manual GC at that point via jconsole, and seeing if the upward trend of scavenges continues beyond that? Did you use -XX:+UseLargePages and -XX:+AlwaysPreTouch? Do you have an easily used test case that you can share with us via your support channels? If/when you do so, please copy me and send them a reference to this thread on this mailing list. later, with your new data. -- ramki > > Matt > > > > On Fri, May 14, 2010 at 12:58 PM, Y. Srinivas Ramakrishna > wrote: >> Hi Matt -- I am computing some metrics from your log file >> and would like to know how many CPUs you have for the logs below? >> >> Also, as you noted, almost anything that survives a scavenge >> lives for a while.
To reduce the overhead of unnecessary >> back-and-forth copying in the survivor spaces, just use >> MaxTenuringThreshold=1 (This suggestion was also made by >> several others in the thread, and is corroborated by your >> PrintTenuringDistribution data). Since you have fairly large survivor >> spaces configured now (at least large enough to fit 4 age cohorts, >> which will be down to 1 age cohort if you use MTT=1), I'd >> suggest making your survivor spaces smaller, maybe down to >> about 64 MB from the current 420 MB each, and give the excess >> to your Eden space. >> >> Then use 6u21 when it comes out (or ask your Java support to >> send you a 6u21 for a beta test), or drop in a JVM from JDK 7 into >> your 6u20 installation, and run with that. If you still see >> rising pause times let me know or file a bug, and send us the >> log file and JVM options along with full platform information. >> >> I'll run some metrics from your log file if you send me the info >> re platform above, and that may perhaps reveal a few more secrets. >> >> later. >> -- ramki >> >> On 05/12/10 15:19, Matt Fowles wrote: >>> All~ >>> >>> I have a large app that produces ~4g of garbage every 30 seconds and >>> am trying to reduce the size of gc outliers. About 99% of this data >>> is garbage, but almost anything that survives one collection survives >>> for an indeterminately long amount of time.
We are currently using >>> the following VM and options: >>> >>> java version "1.6.0_20" >>> Java(TM) SE Runtime Environment (build 1.6.0_20-b02) >>> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) >>> >>> -verbose:gc >>> -XX:+PrintGCTimeStamps >>> -XX:+PrintGCDetails >>> -XX:+PrintGCTaskTimeStamps >>> -XX:+PrintTenuringDistribution >>> -XX:+PrintCommandLineFlags >>> -XX:+PrintReferenceGC >>> -Xms32g -Xmx32g -Xmn4g >>> -XX:+UseParNewGC >>> -XX:ParallelGCThreads=4 >>> -XX:+UseConcMarkSweepGC >>> -XX:ParallelCMSThreads=4 >>> -XX:CMSInitiatingOccupancyFraction=60 >>> -XX:+UseCMSInitiatingOccupancyOnly >>> -XX:+CMSParallelRemarkEnabled >>> -XX:MaxGCPauseMillis=50 >>> -Xloggc:gc.log >>> >>> >>> As you can see from the GC log, we never actually reach the point >>> where the CMS kicks in (after app startup). But our young gens seem >>> to take increasingly long to collect as time goes by. >>> >>> The steady state of the app is reached around 956.392 into the log >>> with a collection that takes 0.106 seconds. Thereafter the survivor >>> space remains roughly constantly as filled and the amount promoted to >>> old gen also remains constant, but the collection times increase to >>> 2.855 seconds by the end of the 3.5 hour run. >>> >>> Has anyone seen this sort of behavior before? Are there more switches >>> that I should try running with? >>> >>> Obviously, I am working to profile the app and reduce the garbage load >>> in parallel. But if I still see this sort of problem, it is only a >>> question of how long must the app run before I see unacceptable >>> latency spikes. >>> >>> Matt >>> >>> >>> ------------------------------------------------------------------------ >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> -------------- next part -------------- A non-text attachment was scrubbed... 
Name: rough_plot.gif Type: image/gif Size: 16547 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100514/acdad9eb/attachment-0001.gif From matt.fowles at gmail.com Fri May 14 10:24:07 2010 From: matt.fowles at gmail.com (Matt Fowles) Date: Fri, 14 May 2010 13:24:07 -0400 Subject: Growing GC Young Gen Times In-Reply-To: <4BEC7D4D.2000905@oracle.com> References: <4BEC2776.8010609@oracle.com> <4BEC7498.6030405@oracle.com> <4BEC7D4D.2000905@oracle.com> Message-ID: Jon~ That makes sense, but the fact is that the old gen *never* gets collected. So all the allocations happen from the giant empty space at the end of the free list. I thought fragmentation only occurred when the free lists are added to after freeing memory... Matt On Thu, May 13, 2010 at 6:29 PM, Jon Masamitsu wrote: > Matt, > > To amplify on Ramki's comment, the allocations out of the > old generation are always from a free list. During a young > generation collection each GC thread will get its own > local free lists from the old generation so that it can > copy objects to the old generation without synchronizing > with the other GC threads (most of the time). Objects from > a GC thread's local free lists are pushed to the global lists > after the collection (as far as I recall). So there is some > churn in the free lists. > > Jon > > On 05/13/10 14:52, Y. Srinivas Ramakrishna wrote: >> >> On 05/13/10 10:50, Matt Fowles wrote: >>> >>> Jon~ >>> >>> This may sound naive, but how can fragmentation be an issue if the old >>> gen has never been collected? I would think we are still in the space >>> where we can just bump the old gen alloc pointer... >> >> Matt, The old gen allocator may fragment the space. Allocation is not >> exactly "bump a pointer". >> >> -- ramki >> >>> >>> Matt >>> >>> On Thu, May 13, 2010 at 12:23 PM, Jon Masamitsu >>> wrote: >>>> >>>> Matt, >>>> >>>> As Ramki indicated fragmentation might be an issue.
As the >>>> fragmentation >>>> in the old generation increases, it takes longer to find space in the >>>> old >>>> generation >>>> into which to promote objects from the young generation. This is >>>> apparently >>>> not >>>> the problem that Wayne is having but you still might be hitting it. If >>>> you >>>> can >>>> connect jconsole to the VM and force a full GC, that would tell us if >>>> it's >>>> fragmentation. >>>> >>>> There might be a scaling issue with UseParNewGC. If you can use >>>> -XX:-UseParNewGC (turning off the parallel young >>>> generation collection) with -XX:+UseConcMarkSweepGC the pauses >>>> will be longer but may be more stable. That's not the solution but just >>>> part >>>> of the investigation. >>>> >>>> You could try just -XX:+UseParNewGC without -XX:+UseConcMarkSweepGC >>>> and if you don't see the growing young generation pause, that would >>>> indicate >>>> something specific about promotion into the CMS generation. >>>> >>>> UseParallelGC is different from UseParNewGC in a number of ways >>>> and if you try UseParallelGC and still see the growing young generation >>>> pauses, I'd suspect something special about your application. >>>> >>>> If you can run these experiments hopefully they will tell >>>> us where to look next. >>>> >>>> Jon >>>> >>>> >>>> On 05/12/10 15:19, Matt Fowles wrote: >>>> >>>> All~ >>>> >>>> I have a large app that produces ~4g of garbage every 30 seconds and >>>> am trying to reduce the size of gc outliers. About 99% of this data >>>> is garbage, but almost anything that survives one collection survives >>>> for an indeterminately long amount of time. We are currently using >>>> the following VM and options: >>>> >>>> java version "1.6.0_20" >>>> Java(TM) SE Runtime Environment (build 1.6.0_20-b02) >>>> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) >>>> >>>> -verbose:gc >>>> -XX:+PrintGCTimeStamps >>>> -XX:+PrintGCDetails >>>> -XX:+PrintGCTaskTimeStamps >>>> -XX:+PrintTenuringDistribution >>>> -XX:+PrintCommandLineFlags >>>> -XX:+PrintReferenceGC >>>> -Xms32g -Xmx32g -Xmn4g >>>> -XX:+UseParNewGC >>>> -XX:ParallelGCThreads=4 >>>> -XX:+UseConcMarkSweepGC >>>> -XX:ParallelCMSThreads=4 >>>> -XX:CMSInitiatingOccupancyFraction=60 >>>> -XX:+UseCMSInitiatingOccupancyOnly >>>> -XX:+CMSParallelRemarkEnabled >>>> -XX:MaxGCPauseMillis=50 >>>> -Xloggc:gc.log >>>> >>>> >>>> As you can see from the GC log, we never actually reach the point >>>> where the CMS kicks in (after app startup). But our young gens seem >>>> to take increasingly long to collect as time goes by. >>>> >>>> The steady state of the app is reached around 956.392 into the log >>>> with a collection that takes 0.106 seconds. Thereafter the survivor >>>> space remains roughly constantly filled and the amount promoted to >>>> old gen also remains constant, but the collection times increase to >>>> 2.855 seconds by the end of the 3.5 hour run. >>>> >>>> Has anyone seen this sort of behavior before? Are there more switches >>>> that I should try running with? >>>> >>>> Obviously, I am working to profile the app and reduce the garbage load >>>> in parallel. But if I still see this sort of problem, it is only a >>>> question of how long must the app run before I see unacceptable >>>> latency spikes.
>>>> >>>> Matt >>>> >>>> ________________________________ >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > From y.s.ramakrishna at oracle.com Fri May 14 10:36:23 2010 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Fri, 14 May 2010 10:36:23 -0700 Subject: Growing GC Young Gen Times In-Reply-To: References: <4BEC2776.8010609@oracle.com> <4BEC7498.6030405@oracle.com> <4BEC7D4D.2000905@oracle.com> Message-ID: <4BED8A17.9090208@oracle.com> On 05/14/10 10:24, Matt Fowles wrote: > Jon~ > > That makes sense, but the fact is that the old gen *never* gets > collected. So all the allocations happen from the giant empty space > at the end of the free list. I thought fragmentation only occurred > when the free lists are added to after freeing memory... As Jon indicated, allocation is done from free lists of blocks that are pre-carved on demand to avoid contention while allocating. The old heuristics for how large to make those lists and the inventory to hold in those lists were not working well as you scaled the number of workers. Following 6631166 we believe it works better and causes both less contention and less fragmentation than it did before, because we do not hold unnecessary excess inventory of free blocks. The fragmentation in turn causes card-scanning to suffer adversely, besides the issues with loss of spatial locality also increasing cache misses and TLB misses. (The large page option might help mitigate the latter a bit, especially since you have such a large heap and our fragmented allocation may be exacerbating the TLB pressure.)
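[Editor's note: the per-thread free-list scheme described in this thread can be sketched roughly as below. This is a toy illustration only, not HotSpot source code; the block sizes, batch count, and all names are invented for the example.]

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only -- NOT HotSpot code. It mimics the scheme
// described above: each GC worker pre-carves a small batch of blocks
// from a shared global free list into its own local list, allocates
// from the local list without synchronization, and pushes any leftover
// blocks back to the global list after the collection.
public class FreeListSketch {
    // Global pool of free block sizes (all 64 "words" here, arbitrarily).
    private static final Deque<Integer> global = new ArrayDeque<>();
    static {
        for (int i = 0; i < 16; i++) global.push(64);
    }

    public static synchronized void returnBlock(int b) { global.push(b); }
    public static synchronized int globalInventory() { return global.size(); }

    // Per-worker state: a local free list that needs no locking.
    public static final class Worker {
        private final Deque<Integer> local = new ArrayDeque<>();

        // Fast path is uncontended; only refilling touches the shared pool.
        public Integer allocate() {
            if (local.isEmpty()) refill(4);   // batch size is arbitrary here
            return local.poll();              // null once everything is exhausted
        }

        private void refill(int n) {
            synchronized (FreeListSketch.class) {  // the contended slow path
                for (int i = 0; i < n && !global.isEmpty(); i++)
                    local.push(global.poll());
            }
        }

        // "Objects from a GC thread's local free lists are pushed to the
        // global lists after the collection" -- return unused inventory.
        public void flush() {
            while (!local.isEmpty()) returnBlock(local.poll());
        }
    }

    public static void main(String[] args) {
        Worker w = new Worker();
        System.out.println("allocated block of size " + w.allocate()); // 64
        w.flush();  // leftover batch inventory goes back to the global pool
    }
}
```

The "churn" Jon mentions corresponds to blocks cycling between the local and global lists; inventory held back in local lists is one plausible source of the excess fragmentation that the 6631166 work reduced.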
-- ramki > > Matt > > On Thu, May 13, 2010 at 6:29 PM, Jon Masamitsu wrote: >> Matt, >> >> To amplify on Ramki's comment, the allocations out of the >> old generation are always from a free list. During a young >> generation collection each GC thread will get its own >> local free lists from the old generation so that it can >> copy objects to the old generation without synchronizing >> with the other GC thread (most of the time). Objects from >> a GC thread's local free lists are pushed to the globals lists >> after the collection (as far as I recall). So there is some >> churn in the free lists. >> >> Jon >> >> On 05/13/10 14:52, Y. Srinivas Ramakrishna wrote: >>> On 05/13/10 10:50, Matt Fowles wrote: >>>> Jon~ >>>> >>>> This may sound naive, but how can fragmentation be an issue if the old >>>> gen has never been collected? I would think we are still in the space >>>> where we can just bump the old gen alloc pointer... >>> Matt, The old gen allocator may fragment the space. Allocation is not >>> exactly "bump a pointer". >>> >>> -- ramki >>> >>>> Matt >>>> >>>> On Thu, May 13, 2010 at 12:23 PM, Jon Masamitsu >>>> wrote: >>>>> Matt, >>>>> >>>>> As Ramki indicated fragmentation might be an issue. As the >>>>> fragmentation >>>>> in the old generation increases, it takes longer to find space in the >>>>> old >>>>> generation >>>>> into which to promote objects from the young generation. This is >>>>> apparently >>>>> not >>>>> the problem that Wayne is having but you still might be hitting it. If >>>>> you >>>>> can >>>>> connect jconsole to the VM and force a full GC, that would tell us if >>>>> it's >>>>> fragmentation. >>>>> >>>>> There might be a scaling issue with the UseParNewGC. If you can use >>>>> -XX:-UseParNewGC (turning off the parallel young >>>>> generation collection) with -XX:+UseConcMarkSweepGC the pauses >>>>> will be longer but may be more stable. That's not the solution but just >>>>> part >>>>> of the investigation. 
>>>>> >>>>> You could try just -XX:+UseParNewGC without -XX:+UseConcMarkSweepGC >>>>> and if you don't see the growing young generation pause, that would >>>>> indicate >>>>> something specific about promotion into the CMS generation. >>>>> >>>>> UseParallelGC is different from UseParNewGC in a number of ways >>>>> and if you try UseParallelGC and still see the growing young generation >>>>> pauses, I'd suspect something special about your application. >>>>> >>>>> If you can run these experiments hopefully they will tell >>>>> us where to look next. >>>>> >>>>> Jon >>>>> >>>>> >>>>> On 05/12/10 15:19, Matt Fowles wrote: >>>>> >>>>> All~ >>>>> >>>>> I have a large app that produces ~4g of garbage every 30 seconds and >>>>> am trying to reduce the size of gc outliers. About 99% of this data >>>>> is garbage, but almost anything that survives one collection survives >>>>> for an indeterminately long amount of time. We are currently using >>>>> the following VM and options: >>>>> >>>>> java version "1.6.0_20" >>>>> Java(TM) SE Runtime Environment (build 1.6.0_20-b02) >>>>> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) >>>>> >>>>> -verbose:gc >>>>> -XX:+PrintGCTimeStamps >>>>> -XX:+PrintGCDetails >>>>> -XX:+PrintGCTaskTimeStamps >>>>> -XX:+PrintTenuringDistribution >>>>> -XX:+PrintCommandLineFlags >>>>> -XX:+PrintReferenceGC >>>>> -Xms32g -Xmx32g -Xmn4g >>>>> -XX:+UseParNewGC >>>>> -XX:ParallelGCThreads=4 >>>>> -XX:+UseConcMarkSweepGC >>>>> -XX:ParallelCMSThreads=4 >>>>> -XX:CMSInitiatingOccupancyFraction=60 >>>>> -XX:+UseCMSInitiatingOccupancyOnly >>>>> -XX:+CMSParallelRemarkEnabled >>>>> -XX:MaxGCPauseMillis=50 >>>>> -Xloggc:gc.log >>>>> >>>>> >>>>> As you can see from the GC log, we never actually reach the point >>>>> where the CMS kicks in (after app startup). But our young gens seem >>>>> to take increasingly long to collect as time goes by. 
>>>>> >>>>> The steady state of the app is reached around 956.392 into the log >>>>> with a collection that takes 0.106 seconds. Thereafter the survivor >>>>> space remains roughly constantly as filled and the amount promoted to >>>>> old gen also remains constant, but the collection times increase to >>>>> 2.855 seconds by the end of the 3.5 hour run. >>>>> >>>>> Has anyone seen this sort of behavior before? Are there more switches >>>>> that I should try running with? >>>>> >>>>> Obviously, I am working to profile the app and reduce the garbage load >>>>> in parallel. But if I still see this sort of problem, it is only a >>>>> question of how long must the app run before I see unacceptable >>>>> latency spikes. >>>>> >>>>> Matt >>>>> >>>>> ________________________________ >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing list >>>>> hotspot-gc-use at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From jon.masamitsu at oracle.com Fri May 14 10:39:50 2010 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Fri, 14 May 2010 10:39:50 -0700 Subject: Growing GC Young Gen Times In-Reply-To: References: <4BEC2776.8010609@oracle.com> <4BEC7498.6030405@oracle.com> <4BEC7D4D.2000905@oracle.com> Message-ID: <4BED8AE6.1020408@oracle.com> On 5/14/10 10:24 AM, Matt Fowles wrote: > Jon~ > > That makes, sense but the fact is that the old gen *never* get > collected. So all the allocations happen from the giant empty space > at the end of the free list. I thought fragmentation only occurred > when the free lists are added to after freeing memory... > Ok. You may be right. 
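[Editor's note: Jon's three isolation experiments, quoted above, boil down to three flag variations. A sketch of the corresponding command lines, reusing the heap settings Matt posted; `App`, the class name, and the log-file names are placeholders, and any of Matt's other flags would be carried over unchanged.]

```shell
# 1. CMS with the parallel young collector turned off: longer but more
#    stable pauses here would point at a ParNew scaling issue.
java -Xms32g -Xmx32g -Xmn4g -XX:-UseParNewGC -XX:+UseConcMarkSweepGC \
     -verbose:gc -XX:+PrintGCDetails -Xloggc:serial-young.log App

# 2. ParNew without CMS: no pause growth here would point at promotion
#    into the CMS generation specifically.
java -Xms32g -Xmx32g -Xmn4g -XX:+UseParNewGC \
     -verbose:gc -XX:+PrintGCDetails -Xloggc:parnew-only.log App

# 3. The throughput collector: growing pauses even here would suggest
#    something special about the application itself.
java -Xms32g -Xmx32g -Xmn4g -XX:+UseParallelGC \
     -verbose:gc -XX:+PrintGCDetails -Xloggc:parallel.log App
```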
> Matt > > On Thu, May 13, 2010 at 6:29 PM, Jon Masamitsu wrote: > >> Matt, >> >> To amplify on Ramki's comment, the allocations out of the >> old generation are always from a free list. During a young >> generation collection each GC thread will get its own >> local free lists from the old generation so that it can >> copy objects to the old generation without synchronizing >> with the other GC thread (most of the time). Objects from >> a GC thread's local free lists are pushed to the globals lists >> after the collection (as far as I recall). So there is some >> churn in the free lists. >> >> Jon >> >> On 05/13/10 14:52, Y. Srinivas Ramakrishna wrote: >> >>> On 05/13/10 10:50, Matt Fowles wrote: >>> >>>> Jon~ >>>> >>>> This may sound naive, but how can fragmentation be an issue if the old >>>> gen has never been collected? I would think we are still in the space >>>> where we can just bump the old gen alloc pointer... >>>> >>> Matt, The old gen allocator may fragment the space. Allocation is not >>> exactly "bump a pointer". >>> >>> -- ramki >>> >>> >>>> Matt >>>> >>>> On Thu, May 13, 2010 at 12:23 PM, Jon Masamitsu >>>> wrote: >>>> >>>>> Matt, >>>>> >>>>> As Ramki indicated fragmentation might be an issue. As the >>>>> fragmentation >>>>> in the old generation increases, it takes longer to find space in the >>>>> old >>>>> generation >>>>> into which to promote objects from the young generation. This is >>>>> apparently >>>>> not >>>>> the problem that Wayne is having but you still might be hitting it. If >>>>> you >>>>> can >>>>> connect jconsole to the VM and force a full GC, that would tell us if >>>>> it's >>>>> fragmentation. >>>>> >>>>> There might be a scaling issue with the UseParNewGC. If you can use >>>>> -XX:-UseParNewGC (turning off the parallel young >>>>> generation collection) with -XX:+UseConcMarkSweepGC the pauses >>>>> will be longer but may be more stable. That's not the solution but just >>>>> part >>>>> of the investigation. 
>>>>> >>>>> You could try just -XX:+UseParNewGC without -XX:+UseConcMarkSweepGC >>>>> and if you don't see the growing young generation pause, that would >>>>> indicate >>>>> something specific about promotion into the CMS generation. >>>>> >>>>> UseParallelGC is different from UseParNewGC in a number of ways >>>>> and if you try UseParallelGC and still see the growing young generation >>>>> pauses, I'd suspect something special about your application. >>>>> >>>>> If you can run these experiments hopefully they will tell >>>>> us where to look next. >>>>> >>>>> Jon >>>>> >>>>> >>>>> On 05/12/10 15:19, Matt Fowles wrote: >>>>> >>>>> All~ >>>>> >>>>> I have a large app that produces ~4g of garbage every 30 seconds and >>>>> am trying to reduce the size of gc outliers. About 99% of this data >>>>> is garbage, but almost anything that survives one collection survives >>>>> for an indeterminately long amount of time. We are currently using >>>>> the following VM and options: >>>>> >>>>> java version "1.6.0_20" >>>>> Java(TM) SE Runtime Environment (build 1.6.0_20-b02) >>>>> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) >>>>> >>>>> -verbose:gc >>>>> -XX:+PrintGCTimeStamps >>>>> -XX:+PrintGCDetails >>>>> -XX:+PrintGCTaskTimeStamps >>>>> -XX:+PrintTenuringDistribution >>>>> -XX:+PrintCommandLineFlags >>>>> -XX:+PrintReferenceGC >>>>> -Xms32g -Xmx32g -Xmn4g >>>>> -XX:+UseParNewGC >>>>> -XX:ParallelGCThreads=4 >>>>> -XX:+UseConcMarkSweepGC >>>>> -XX:ParallelCMSThreads=4 >>>>> -XX:CMSInitiatingOccupancyFraction=60 >>>>> -XX:+UseCMSInitiatingOccupancyOnly >>>>> -XX:+CMSParallelRemarkEnabled >>>>> -XX:MaxGCPauseMillis=50 >>>>> -Xloggc:gc.log >>>>> >>>>> >>>>> As you can see from the GC log, we never actually reach the point >>>>> where the CMS kicks in (after app startup). But our young gens seem >>>>> to take increasingly long to collect as time goes by. 
>>>>> >>>>> The steady state of the app is reached around 956.392 into the log >>>>> with a collection that takes 0.106 seconds. Thereafter the survivor >>>>> space remains roughly constantly as filled and the amount promoted to >>>>> old gen also remains constant, but the collection times increase to >>>>> 2.855 seconds by the end of the 3.5 hour run. >>>>> >>>>> Has anyone seen this sort of behavior before? Are there more switches >>>>> that I should try running with? >>>>> >>>>> Obviously, I am working to profile the app and reduce the garbage load >>>>> in parallel. But if I still see this sort of problem, it is only a >>>>> question of how long must the app run before I see unacceptable >>>>> latency spikes. >>>>> >>>>> Matt >>>>> >>>>> ________________________________ >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing list >>>>> hotspot-gc-use at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>> >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> >>> >> From y.s.ramakrishna at oracle.com Fri May 14 10:44:25 2010 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Fri, 14 May 2010 10:44:25 -0700 Subject: Growing GC Young Gen Times In-Reply-To: <4BED8A17.9090208@oracle.com> References: <4BEC2776.8010609@oracle.com> <4BEC7498.6030405@oracle.com> <4BEC7D4D.2000905@oracle.com> <4BED8A17.9090208@oracle.com> Message-ID: <4BED8BF9.7000803@oracle.com> On 05/14/10 10:36, Y. Srinivas Ramakrishna wrote: > On 05/14/10 10:24, Matt Fowles wrote: >> Jon~ >> >> That makes, sense but the fact is that the old gen *never* get >> collected. So all the allocations happen from the giant empty space >> at the end of the free list. I thought fragmentation only occurred >> when the free lists are added to after freeing memory... 
> > As Jon indicated, allocation is done from free lists of blocks > that are pre-carved on demand to avoid contention while allocating. > The old heuristics for how large to make those lists and the > inventory to hold in those lists were not working well as you > scaled the number of workers. Following 6631166 we believe it > works better and causes both less contention and less > fragmentation than it did before, because we do not hold > unnecessary excess inventory of free blocks. To see what the fragmentation is, try -XX:PrintFLSStatistics=2. This will slow down your scavenge pauses (perhaps by quite a bit for your 26 GB heap), but you will get a report of the number of blocks on free lists and how fragmented the space is on that account (for some appropriate notion of fragmentation). Don't use that flag in production though :-) -- ramki > > The fragmentation in turn causes card-scanning to suffer > adversely, besides the issues with loss of spatial locality also > increasing cache misses and TLB misses. (The large page > option might help mitigate the latter a bit, especially > since you have such a large heap and our fragmented > allocation may be exacerbating the TLB pressure.) > > -- ramki > >> Matt >> >> On Thu, May 13, 2010 at 6:29 PM, Jon Masamitsu wrote: >>> Matt, >>> >>> To amplify on Ramki's comment, the allocations out of the >>> old generation are always from a free list. During a young >>> generation collection each GC thread will get its own >>> local free lists from the old generation so that it can >>> copy objects to the old generation without synchronizing >>> with the other GC thread (most of the time). Objects from >>> a GC thread's local free lists are pushed to the global lists >>> after the collection (as far as I recall). So there is some >>> churn in the free lists. >>> >>> Jon >>> >>> On 05/13/10 14:52, Y.
Srinivas Ramakrishna wrote: >>>> On 05/13/10 10:50, Matt Fowles wrote: >>>>> Jon~ >>>>> >>>>> This may sound naive, but how can fragmentation be an issue if the old >>>>> gen has never been collected? I would think we are still in the space >>>>> where we can just bump the old gen alloc pointer... >>>> Matt, The old gen allocator may fragment the space. Allocation is not >>>> exactly "bump a pointer". >>>> >>>> -- ramki >>>> >>>>> Matt >>>>> >>>>> On Thu, May 13, 2010 at 12:23 PM, Jon Masamitsu >>>>> wrote: >>>>>> Matt, >>>>>> >>>>>> As Ramki indicated fragmentation might be an issue. As the >>>>>> fragmentation >>>>>> in the old generation increases, it takes longer to find space in the >>>>>> old >>>>>> generation >>>>>> into which to promote objects from the young generation. This is >>>>>> apparently >>>>>> not >>>>>> the problem that Wayne is having but you still might be hitting it. If >>>>>> you >>>>>> can >>>>>> connect jconsole to the VM and force a full GC, that would tell us if >>>>>> it's >>>>>> fragmentation. >>>>>> >>>>>> There might be a scaling issue with the UseParNewGC. If you can use >>>>>> -XX:-UseParNewGC (turning off the parallel young >>>>>> generation collection) with -XX:+UseConcMarkSweepGC the pauses >>>>>> will be longer but may be more stable. That's not the solution but just >>>>>> part >>>>>> of the investigation. >>>>>> >>>>>> You could try just -XX:+UseParNewGC without -XX:+UseConcMarkSweepGC >>>>>> and if you don't see the growing young generation pause, that would >>>>>> indicate >>>>>> something specific about promotion into the CMS generation. >>>>>> >>>>>> UseParallelGC is different from UseParNewGC in a number of ways >>>>>> and if you try UseParallelGC and still see the growing young generation >>>>>> pauses, I'd suspect something special about your application. >>>>>> >>>>>> If you can run these experiments hopefully they will tell >>>>>> us where to look next. 
>>>>>> >>>>>> Jon >>>>>> >>>>>> >>>>>> On 05/12/10 15:19, Matt Fowles wrote: >>>>>> >>>>>> All~ >>>>>> >>>>>> I have a large app that produces ~4g of garbage every 30 seconds and >>>>>> am trying to reduce the size of gc outliers. About 99% of this data >>>>>> is garbage, but almost anything that survives one collection survives >>>>>> for an indeterminately long amount of time. We are currently using >>>>>> the following VM and options: >>>>>> >>>>>> java version "1.6.0_20" >>>>>> Java(TM) SE Runtime Environment (build 1.6.0_20-b02) >>>>>> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) >>>>>> >>>>>> -verbose:gc >>>>>> -XX:+PrintGCTimeStamps >>>>>> -XX:+PrintGCDetails >>>>>> -XX:+PrintGCTaskTimeStamps >>>>>> -XX:+PrintTenuringDistribution >>>>>> -XX:+PrintCommandLineFlags >>>>>> -XX:+PrintReferenceGC >>>>>> -Xms32g -Xmx32g -Xmn4g >>>>>> -XX:+UseParNewGC >>>>>> -XX:ParallelGCThreads=4 >>>>>> -XX:+UseConcMarkSweepGC >>>>>> -XX:ParallelCMSThreads=4 >>>>>> -XX:CMSInitiatingOccupancyFraction=60 >>>>>> -XX:+UseCMSInitiatingOccupancyOnly >>>>>> -XX:+CMSParallelRemarkEnabled >>>>>> -XX:MaxGCPauseMillis=50 >>>>>> -Xloggc:gc.log >>>>>> >>>>>> >>>>>> As you can see from the GC log, we never actually reach the point >>>>>> where the CMS kicks in (after app startup). But our young gens seem >>>>>> to take increasingly long to collect as time goes by. >>>>>> >>>>>> The steady state of the app is reached around 956.392 into the log >>>>>> with a collection that takes 0.106 seconds. Thereafter the survivor >>>>>> space remains roughly constantly filled and the amount promoted to >>>>>> old gen also remains constant, but the collection times increase to >>>>>> 2.855 seconds by the end of the 3.5 hour run. >>>>>> >>>>>> Has anyone seen this sort of behavior before? Are there more switches >>>>>> that I should try running with? >>>>>> >>>>>> Obviously, I am working to profile the app and reduce the garbage load >>>>>> in parallel.
But if I still see this sort of problem, it is only a >>>>>> question of how long must the app run before I see unacceptable >>>>>> latency spikes. >>>>>> >>>>>> Matt >>>>>> >>>>>> ________________________________ >>>>>> _______________________________________________ >>>>>> hotspot-gc-use mailing list >>>>>> hotspot-gc-use at openjdk.java.net >>>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing list >>>>> hotspot-gc-use at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From matt.fowles at gmail.com Fri May 14 11:30:14 2010 From: matt.fowles at gmail.com (Matt Fowles) Date: Fri, 14 May 2010 14:30:14 -0400 Subject: Growing GC Young Gen Times In-Reply-To: References: <4BED8121.5000405@oracle.com> <4BED871A.4010306@oracle.com> Message-ID: Ramki~ File 2. Matt On Fri, May 14, 2010 at 2:29 PM, Matt Fowles wrote: > Ramki~ > > Attached are 3 different runs with slightly tweaked VM settings based > on suggestions from this list and others. > > All of them have reduced the MaxTenuringThreshold to 2. > > gc1.log reduces the young gen size to 1g and the old gen size to 7g > initially. ?As you can see from it, the young gen sweep speed does not > improve after the CMS sweep that occurs part way through the run. > gc2.log adds the -XX:+UseLargePages and -XX:+AlwaysPreTouch options to > the settings from gc1.log > gc3.log adds the -XX:+UseLargePages and -XX:+AlwaysPreTouch options to > a 4g young gen with the original 32g total heap. > > Due to the quirks of the infrastructure running these tests, data > volumes (and hence allocation rates) are NOT comparable between runs. 
> Because the tests take ~4 hours to run, we run several tests in > parallel against different sources of data. > > Matt > > PS - Due to attachment size restrictions I am going to send each file > in its own email. > > On Fri, May 14, 2010 at 1:23 PM, Y. Srinivas Ramakrishna > wrote: >> On 05/14/10 10:07, Matt Fowles wrote: >>> >>> Ramki~ >>> >>> The machine has 4 cpus each of which have 4 cores. I will adjust the >> >> Great, thanks. I'd suggest making ParallelGCThreads=8. Also compare with >> -XX:-UseParNewGC. If it's the kind of fragmentation that we >> believe may be the cause here, you'd see larger gc times in the >> latter case but they would not increase as they do now. But that >> is conjecture at this point. >> >>> survivor spaces as you suggest. Previously I had been running with >>> MTT 0, but changed it to 4 at the suggestion of others. >> >> MTT=0 can give very poor performance, as people said MTT=4 >> would definitely be better here than MTT=0. >> You should use MTT=1 here though. >> >>> >>> Running with the JDK7 version may take a bit of time, but I will >>> pursue that as well. >> >> All you should do is pull the libjvm.so that is in the JDK 7 installation >> (or bundle) and plonk it down into the appropriate directory of your >> existing JDK 6u20 installation. We just want to see the results with >> the latest JVM which includes a fix for 6631166. >> >> I attached a very rough plot of some metrics extracted from your log >> and this behaviour is definitely deserving of a bug, especially >> if it can be shown that it happens in the latest JVM. In the plot: >> >> red: scavenge durations >> dark blue: promoted data per scavenge >> pink: data in survivor space following scavenge >> light blue: live data in old gen >> >> As you can see the scavenge clearly correlates with the >> occupancy of the old gen (as Jon and others indicated).
>> Did you try Jon's suggestion of doing a manual GC at that >> point via jconsole, and seeing if the upward trend of >> scavenges continues beyond that? >> >> Did you use -XX:+UseLargePages and -XX:+AlwaysPreTouch? >> >> Do you have an easily used test case that you can share with us via >> your support channels? If/when you do so, please copy me and >> send them a reference to this thread on this mailing list. >> >> later, with your new data. >> -- ramki >> >>> >>> Matt >>> >>> >>> >>> On Fri, May 14, 2010 at 12:58 PM, Y. Srinivas Ramakrishna >>> wrote: >>>> >>>> Hi Matt -- i am computing some metrics from yr log file >>>> and would like to know how many cpu's you have for the logs below? >>>> >>>> Also, as you noted, almost anything that survives a scavenge >>>> lives for a while. To reduce the overhead of unnecessary >>>> back-and-forth copying in the survivor spaces, just use >>>> MaxTenuringThreshold=1 (This suggestion was also made by >>>> several others in the thread, and is corroborated by your >>>> PrintTenuringDistribution data). Since you have fairly large survivor >>>> spaces configured now, (at least large enough to fit 4 age cohorts, >>>> which will be down to 1 age cohort if you use MTT=1), i'd >>>> suggest making your survivor spaces smaller, maybe down to >>>> about 64 MB from the current 420 MB each, and give the excess >>>> to your Eden space. >>>> >>>> Then use 6u21 when it comes out (or ask your Java support to >>>> send you a 6u21 for a beta test), or drop in a JVM from JDK 7 into >>>> your 6u20 installation, and run with that. If you still see >>>> rising pause times let me know or file a bug, and send us the >>>> log file and JVM options along with full platform information. >>>> >>>> I'll run some metrics from yr log file if you send me the info >>>> re platform above, and that may perhaps reveal a few more secrets. >>>> >>>> later.
>>>> -- ramki >>>> >>>> On 05/12/10 15:19, Matt Fowles wrote: >>>>> >>>>> All~ >>>>> >>>>> I have a large app that produces ~4g of garbage every 30 seconds and >>>>> am trying to reduce the size of gc outliers. About 99% of this data >>>>> is garbage, but almost anything that survives one collection survives >>>>> for an indeterminately long amount of time. We are currently using >>>>> the following VM and options: >>>>> >>>>> java version "1.6.0_20" >>>>> Java(TM) SE Runtime Environment (build 1.6.0_20-b02) >>>>> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) >>>>> >>>>> -verbose:gc >>>>> -XX:+PrintGCTimeStamps >>>>> -XX:+PrintGCDetails >>>>> -XX:+PrintGCTaskTimeStamps >>>>> -XX:+PrintTenuringDistribution >>>>> -XX:+PrintCommandLineFlags >>>>> -XX:+PrintReferenceGC >>>>> -Xms32g -Xmx32g -Xmn4g >>>>> -XX:+UseParNewGC >>>>> -XX:ParallelGCThreads=4 >>>>> -XX:+UseConcMarkSweepGC >>>>> -XX:ParallelCMSThreads=4 >>>>> -XX:CMSInitiatingOccupancyFraction=60 >>>>> -XX:+UseCMSInitiatingOccupancyOnly >>>>> -XX:+CMSParallelRemarkEnabled >>>>> -XX:MaxGCPauseMillis=50 >>>>> -Xloggc:gc.log >>>>> >>>>> >>>>> As you can see from the GC log, we never actually reach the point >>>>> where the CMS kicks in (after app startup). But our young gens seem >>>>> to take increasingly long to collect as time goes by. >>>>> >>>>> The steady state of the app is reached around 956.392 into the log >>>>> with a collection that takes 0.106 seconds. Thereafter the survivor >>>>> space remains roughly constantly filled and the amount promoted to >>>>> old gen also remains constant, but the collection times increase to >>>>> 2.855 seconds by the end of the 3.5 hour run. >>>>> >>>>> Has anyone seen this sort of behavior before?
?Are there more switches >>>>> that I should try running with? >>>>> >>>>> Obviously, I am working to profile the app and reduce the garbage load >>>>> in parallel. ?But if I still see this sort of problem, it is only a >>>>> question of how long must the app run before I see unacceptable >>>>> latency spikes. >>>>> >>>>> Matt >>>>> >>>>> >>>>> ------------------------------------------------------------------------ >>>>> >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing list >>>>> hotspot-gc-use at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> >> >> > -------------- next part -------------- A non-text attachment was scrubbed... Name: r229-vm-tweaks-gc2.log.bz2 Type: application/x-bzip2 Size: 30955 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100514/2a476a93/attachment-0001.bin From matt.fowles at gmail.com Fri May 14 11:30:30 2010 From: matt.fowles at gmail.com (Matt Fowles) Date: Fri, 14 May 2010 14:30:30 -0400 Subject: Growing GC Young Gen Times In-Reply-To: References: <4BED8121.5000405@oracle.com> <4BED871A.4010306@oracle.com> Message-ID: Ramki~ File 3. Matt On Fri, May 14, 2010 at 2:30 PM, Matt Fowles wrote: > Ramki~ > > File 2. > > Matt > > On Fri, May 14, 2010 at 2:29 PM, Matt Fowles wrote: >> Ramki~ >> >> Attached are 3 different runs with slightly tweaked VM settings based >> on suggestions from this list and others. >> >> All of them have reduced the MaxTenuringThreshold to 2. >> >> gc1.log reduces the young gen size to 1g and the old gen size to 7g >> initially. ?As you can see from it, the young gen sweep speed does not >> improve after the CMS sweep that occurs part way through the run. >> gc2.log adds the -XX:+UseLargePages and -XX:+AlwaysPreTouch options to >> the settings from gc1.log >> gc3.log adds the -XX:+UseLargePages and -XX:+AlwaysPreTouch options to >> a 4g young gen with the original 32g total heap. 
>> >> Due to the quirks of the infrastructure running these tests, data >> volumes (and hence allocation rates) are NOT comparable between runs. >> Because the tests take ~4 hours to run, we run several tests in >> parallel against different sources of data. >> >> Matt >> >> PS - Due to attachment size restrictions I am going to send each file >> in its own email. >> >> On Fri, May 14, 2010 at 1:23 PM, Y. Srinivas Ramakrishna >> wrote: >>> On 05/14/10 10:07, Matt Fowles wrote: >>>> >>>> Ramki~ >>>> >>>> The machine has 4 cpus each of which have 4 cores. ?I will adjust the >>> >>> Great, thanks. I'd suggest make ParallelGCThreads=8. Also compare with >>> -XX:-UseParNewGC. if it's the kind of fragmentation that we >>> believe may be the cause here, you'd see larger gc times in the >>> latter case but they would not increase as they do now. But that >>> is conjecture at this point. >>> >>>> survivor spaces as you suggest. ?Previously I had been running with >>>> MTT 0, but change it to 4 at the suggestion of others. >>> >>> MTT=0 can give very poor performance, as people said MTT=4 >>> would definitely be better here than MTT=0. >>> You should use MTT=1 here though. >>> >>>> >>>> Running with the JDK7 version may take a bit of time, but I will >>>> pursue that as well. >>> >>> All you should do is pull the libjvm.so that is in the JDK 7 installation >>> (or bundle) and plonk it down into the appropriate directory of your >>> existing JDK 6u20 installation. We just want to see the results with >>> the latest JVM which includes a fix for 6631166. >>> >>> I attached a very rough plot of some metrics extracted from your log >>> and this behaviour is definitely deserving of a bug, especially >>> if it can be shown that it happens in the latest JVM. 
In the plot: >>> >>> ?red: scavenge durations >>> ?dark blue: promoted data per scavenge >>> ?pink: data in survivor space following scavenge >>> ?light blue: live data in old gen >>> >>> As you can see the scavenge clearly correlates with the >>> occupancy of the old gen (as Jon and others indicated). >>> Did you try Jon's suggestion of doing a manual GC at that >>> point via jconsole, and seeing if the upward trend of >>> scavenges continues beyond that? >>> >>> Did you use -XX:+UseLargePages and -XX:+AlwaysPreTouch? >>> >>> Do you have an easily used test case that you can share with us via >>> your support channels? If/when you do so, please copy me and >>> send them a reference to this thread on this mailing list. >>> >>> later, with your new data. >>> -- ramki >>> >>>> >>>> Matt >>>> >>>> >>>> >>>> On Fri, May 14, 2010 at 12:58 PM, Y. Srinivas Ramakrishna >>>> wrote: >>>>> >>>>> Hi Matt -- i am computing some metrics from yr log file >>>>> and would like to know how many cpu's you have for the logs below? >>>>> >>>>> Also, as you noted, almost anything that survives a scavenge >>>>> lives for a while. To reduce the overhead of unnecessary >>>>> back-and-forth copying in the survivor spaces, just use >>>>> MaxTenuringThreshold=1 (This suggestion was also made by >>>>> several others in the thread, and is corroborated by your >>>>> PrintTenuringDistribution data). Since you have farily large survivor >>>>> spaces configured now, (at least large enough to fit 4 age cohorts, >>>>> which will be down to 1 age cohort if you use MTT=1), i'd >>>>> suggest making your surviror spaces smaller, may be down to >>>>> about 64 MB from the current 420 MB each, and give the excess >>>>> to your Eden space. >>>>> >>>>> Then use 6u21 when it comes out (or ask your Java support to >>>>> send you a 6u21 for a beta test), or drop in a JVM from JDK 7 into >>>>> your 6u20 installation, and run with that. 
If you still see >>>>> rising pause times let me know or file a bug, and send us the >>>>> log file and JVM options along with full platform information. >>>>> >>>>> I'll run some metrics from yr log file if you send me the info >>>>> re platform above, and that may perhaps reveal a few more secrets. >>>>> >>>>> later. >>>>> -- ramki >>>>> >>>>> On 05/12/10 15:19, Matt Fowles wrote: >>>>>> >>>>>> All~ >>>>>> >>>>>> I have a large app that produces ~4g of garbage every 30 seconds and >>>>>> am trying to reduce the size of gc outliers. ?About 99% of this data >>>>>> is garbage, but almost anything that survives one collection survives >>>>>> for an indeterminately long amount of time. ?We are currently using >>>>>> the following VM and options: >>>>>> >>>>>> java version "1.6.0_20" >>>>>> Java(TM) SE Runtime Environment (build 1.6.0_20-b02) >>>>>> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) >>>>>> >>>>>> ? ? ? ? ? ? ?-verbose:gc >>>>>> ? ? ? ? ? ? ?-XX:+PrintGCTimeStamps >>>>>> ? ? ? ? ? ? ?-XX:+PrintGCDetails >>>>>> ? ? ? ? ? ? ?-XX:+PrintGCTaskTimeStamps >>>>>> ? ? ? ? ? ? ?-XX:+PrintTenuringDistribution >>>>>> ? ? ? ? ? ? ?-XX:+PrintCommandLineFlags >>>>>> ? ? ? ? ? ? ?-XX:+PrintReferenceGC >>>>>> ? ? ? ? ? ? ?-Xms32g -Xmx32g -Xmn4g >>>>>> ? ? ? ? ? ? ?-XX:+UseParNewGC >>>>>> ? ? ? ? ? ? ?-XX:ParallelGCThreads=4 >>>>>> ? ? ? ? ? ? ?-XX:+UseConcMarkSweepGC >>>>>> ? ? ? ? ? ? ?-XX:ParallelCMSThreads=4 >>>>>> ? ? ? ? ? ? ?-XX:CMSInitiatingOccupancyFraction=60 >>>>>> ? ? ? ? ? ? ?-XX:+UseCMSInitiatingOccupancyOnly >>>>>> ? ? ? ? ? ? ?-XX:+CMSParallelRemarkEnabled >>>>>> ? ? ? ? ? ? ?-XX:MaxGCPauseMillis=50 >>>>>> ? ? ? ? ? ? ?-Xloggc:gc.log >>>>>> >>>>>> >>>>>> As you can see from the GC log, we never actually reach the point >>>>>> where the CMS kicks in (after app startup). ?But our young gens seem >>>>>> to take increasingly long to collect as time goes by. 
>>>>>> >>>>>> The steady state of the app is reached around 956.392 into the log >>>>>> with a collection that takes 0.106 seconds. ?Thereafter the survivor >>>>>> space remains roughly constantly as filled and the amount promoted to >>>>>> old gen also remains constant, but the collection times increase to >>>>>> 2.855 seconds by the end of the 3.5 hour run. >>>>>> >>>>>> Has anyone seen this sort of behavior before? ?Are there more switches >>>>>> that I should try running with? >>>>>> >>>>>> Obviously, I am working to profile the app and reduce the garbage load >>>>>> in parallel. ?But if I still see this sort of problem, it is only a >>>>>> question of how long must the app run before I see unacceptable >>>>>> latency spikes. >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>> ------------------------------------------------------------------------ >>>>>> >>>>>> _______________________________________________ >>>>>> hotspot-gc-use mailing list >>>>>> hotspot-gc-use at openjdk.java.net >>>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>> >>> >>> >> > -------------- next part -------------- A non-text attachment was scrubbed... Name: r229-vm-tweaks-gc3.log.bz2 Type: application/x-bzip2 Size: 9269 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100514/47a15a46/attachment.bin From matt.fowles at gmail.com Fri May 14 11:29:55 2010 From: matt.fowles at gmail.com (Matt Fowles) Date: Fri, 14 May 2010 14:29:55 -0400 Subject: Growing GC Young Gen Times In-Reply-To: <4BED871A.4010306@oracle.com> References: <4BED8121.5000405@oracle.com> <4BED871A.4010306@oracle.com> Message-ID: Ramki~ Attached are 3 different runs with slightly tweaked VM settings based on suggestions from this list and others. All of them have reduced the MaxTenuringThreshold to 2. gc1.log reduces the young gen size to 1g and the old gen size to 7g initially. 
As you can see from it, the young gen sweep speed does not improve after the CMS sweep that occurs part way through the run. gc2.log adds the -XX:+UseLargePages and -XX:+AlwaysPreTouch options to the settings from gc1.log gc3.log adds the -XX:+UseLargePages and -XX:+AlwaysPreTouch options to a 4g young gen with the original 32g total heap. Due to the quirks of the infrastructure running these tests, data volumes (and hence allocation rates) are NOT comparable between runs. Because the tests take ~4 hours to run, we run several tests in parallel against different sources of data. Matt PS - Due to attachment size restrictions I am going to send each file in its own email. On Fri, May 14, 2010 at 1:23 PM, Y. Srinivas Ramakrishna wrote: > On 05/14/10 10:07, Matt Fowles wrote: >> >> Ramki~ >> >> The machine has 4 cpus each of which have 4 cores. ?I will adjust the > > Great, thanks. I'd suggest make ParallelGCThreads=8. Also compare with > -XX:-UseParNewGC. if it's the kind of fragmentation that we > believe may be the cause here, you'd see larger gc times in the > latter case but they would not increase as they do now. But that > is conjecture at this point. > >> survivor spaces as you suggest. ?Previously I had been running with >> MTT 0, but change it to 4 at the suggestion of others. > > MTT=0 can give very poor performance, as people said MTT=4 > would definitely be better here than MTT=0. > You should use MTT=1 here though. > >> >> Running with the JDK7 version may take a bit of time, but I will >> pursue that as well. > > All you should do is pull the libjvm.so that is in the JDK 7 installation > (or bundle) and plonk it down into the appropriate directory of your > existing JDK 6u20 installation. We just want to see the results with > the latest JVM which includes a fix for 6631166. 
> > I attached a very rough plot of some metrics extracted from your log > and this behaviour is definitely deserving of a bug, especially > if it can be shown that it happens in the latest JVM. In the plot: > > ?red: scavenge durations > ?dark blue: promoted data per scavenge > ?pink: data in survivor space following scavenge > ?light blue: live data in old gen > > As you can see the scavenge clearly correlates with the > occupancy of the old gen (as Jon and others indicated). > Did you try Jon's suggestion of doing a manual GC at that > point via jconsole, and seeing if the upward trend of > scavenges continues beyond that? > > Did you use -XX:+UseLargePages and -XX:+AlwaysPreTouch? > > Do you have an easily used test case that you can share with us via > your support channels? If/when you do so, please copy me and > send them a reference to this thread on this mailing list. > > later, with your new data. > -- ramki > >> >> Matt >> >> >> >> On Fri, May 14, 2010 at 12:58 PM, Y. Srinivas Ramakrishna >> wrote: >>> >>> Hi Matt -- i am computing some metrics from yr log file >>> and would like to know how many cpu's you have for the logs below? >>> >>> Also, as you noted, almost anything that survives a scavenge >>> lives for a while. To reduce the overhead of unnecessary >>> back-and-forth copying in the survivor spaces, just use >>> MaxTenuringThreshold=1 (This suggestion was also made by >>> several others in the thread, and is corroborated by your >>> PrintTenuringDistribution data). Since you have farily large survivor >>> spaces configured now, (at least large enough to fit 4 age cohorts, >>> which will be down to 1 age cohort if you use MTT=1), i'd >>> suggest making your surviror spaces smaller, may be down to >>> about 64 MB from the current 420 MB each, and give the excess >>> to your Eden space. 
>>> >>> Then use 6u21 when it comes out (or ask your Java support to >>> send you a 6u21 for a beta test), or drop in a JVM from JDK 7 into >>> your 6u20 installation, and run with that. If you still see >>> rising pause times let me know or file a bug, and send us the >>> log file and JVM options along with full platform information. >>> >>> I'll run some metrics from yr log file if you send me the info >>> re platform above, and that may perhaps reveal a few more secrets. >>> >>> later. >>> -- ramki >>> >>> On 05/12/10 15:19, Matt Fowles wrote: >>>> >>>> All~ >>>> >>>> I have a large app that produces ~4g of garbage every 30 seconds and >>>> am trying to reduce the size of gc outliers. ?About 99% of this data >>>> is garbage, but almost anything that survives one collection survives >>>> for an indeterminately long amount of time. ?We are currently using >>>> the following VM and options: >>>> >>>> java version "1.6.0_20" >>>> Java(TM) SE Runtime Environment (build 1.6.0_20-b02) >>>> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) >>>> >>>> ? ? ? ? ? ? ?-verbose:gc >>>> ? ? ? ? ? ? ?-XX:+PrintGCTimeStamps >>>> ? ? ? ? ? ? ?-XX:+PrintGCDetails >>>> ? ? ? ? ? ? ?-XX:+PrintGCTaskTimeStamps >>>> ? ? ? ? ? ? ?-XX:+PrintTenuringDistribution >>>> ? ? ? ? ? ? ?-XX:+PrintCommandLineFlags >>>> ? ? ? ? ? ? ?-XX:+PrintReferenceGC >>>> ? ? ? ? ? ? ?-Xms32g -Xmx32g -Xmn4g >>>> ? ? ? ? ? ? ?-XX:+UseParNewGC >>>> ? ? ? ? ? ? ?-XX:ParallelGCThreads=4 >>>> ? ? ? ? ? ? ?-XX:+UseConcMarkSweepGC >>>> ? ? ? ? ? ? ?-XX:ParallelCMSThreads=4 >>>> ? ? ? ? ? ? ?-XX:CMSInitiatingOccupancyFraction=60 >>>> ? ? ? ? ? ? ?-XX:+UseCMSInitiatingOccupancyOnly >>>> ? ? ? ? ? ? ?-XX:+CMSParallelRemarkEnabled >>>> ? ? ? ? ? ? ?-XX:MaxGCPauseMillis=50 >>>> ? ? ? ? ? ? ?-Xloggc:gc.log >>>> >>>> >>>> As you can see from the GC log, we never actually reach the point >>>> where the CMS kicks in (after app startup). 
?But our young gens seem >>>> to take increasingly long to collect as time goes by. >>>> >>>> The steady state of the app is reached around 956.392 into the log >>>> with a collection that takes 0.106 seconds. ?Thereafter the survivor >>>> space remains roughly constantly as filled and the amount promoted to >>>> old gen also remains constant, but the collection times increase to >>>> 2.855 seconds by the end of the 3.5 hour run. >>>> >>>> Has anyone seen this sort of behavior before? ?Are there more switches >>>> that I should try running with? >>>> >>>> Obviously, I am working to profile the app and reduce the garbage load >>>> in parallel. ?But if I still see this sort of problem, it is only a >>>> question of how long must the app run before I see unacceptable >>>> latency spikes. >>>> >>>> Matt >>>> >>>> >>>> ------------------------------------------------------------------------ >>>> >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> > > -------------- next part -------------- A non-text attachment was scrubbed... Name: r229-vm-tweaks-gc1.log.bz2 Type: application/x-bzip2 Size: 45158 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100514/34386d25/attachment-0001.bin From matt.fowles at gmail.com Fri May 14 12:24:11 2010 From: matt.fowles at gmail.com (Matt Fowles) Date: Fri, 14 May 2010 15:24:11 -0400 Subject: Growing GC Young Gen Times In-Reply-To: <4BED8BF9.7000803@oracle.com> References: <4BEC2776.8010609@oracle.com> <4BEC7498.6030405@oracle.com> <4BEC7D4D.2000905@oracle.com> <4BED8A17.9090208@oracle.com> <4BED8BF9.7000803@oracle.com> Message-ID: Ramki~ I am preparing the flags for the next 3 runs (which run in parallel) and wanted to check a few things with you. 
I believe that each of these is collecting a useful data point, Server 1 is running with 8 threads, reduced young gen, and MTT 1. Server 2 is running with 8 threads, reduced young gen, and MTT 1, ParNew, but NOT CMS. Server 3 is running with 8 threads, reduced young gen, and MTT 1, and PrintFLSStatistics. I can (additionally) run all of these tests on JDK7 (Java HotSpot(TM) 64-Bit Server VM (build 17.0-b05, mixed mode)). Server 1: -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+PrintGCTaskTimeStamps -XX:+PrintCommandLineFlags -Xms32g -Xmx32g -Xmn1g -XX:+UseParNewGC -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ParallelCMSThreads=8 -XX:MaxTenuringThreshold=1 -XX:SurvivorRatio=14 -XX:+CMSParallelRemarkEnabled -Xloggc:gc1.log -XX:+UseLargePages -XX:+AlwaysPreTouch Server 2: -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+PrintGCTaskTimeStamps -XX:+PrintCommandLineFlags -Xms32g -Xmx32g -Xmn1g -XX:+UseParNewGC -XX:ParallelGCThreads=8 -XX:MaxTenuringThreshold=1 -XX:SurvivorRatio=14 -Xloggc:gc2.log -XX:+UseLargePages -XX:+AlwaysPreTouch Server 3: -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+PrintGCTaskTimeStamps -XX:+PrintCommandLineFlags -Xms32g -Xmx32g -Xmn1g -XX:+UseParNewGC -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ParallelCMSThreads=8 -XX:MaxTenuringThreshold=1 -XX:SurvivorRatio=14 -XX:+CMSParallelRemarkEnabled -Xloggc:gc3.log -XX:PrintFLSStatistics=2 -XX:+UseLargePages -XX:+AlwaysPreTouch Matt On Fri, May 14, 2010 at 1:44 PM, Y. Srinivas Ramakrishna < y.s.ramakrishna at oracle.com> wrote: > On 05/14/10 10:36, Y. Srinivas Ramakrishna wrote: >> >> On 05/14/10 10:24, Matt Fowles wrote: >>> >>> Jon~ >>> >>> That makes, sense but the fact is that the old gen *never* get >>> collected. So all the allocations happen from the giant empty space >>> at the end of the free list. I thought fragmentation only occurred >>> when the free lists are added to after freeing memory... 
>> As Jon indicated allocation is done from free lists of blocks >> that are pre-carved on demand to avoid contention while allocating. >> The old heuristics for how large to make those lists and the >> inventory to hold in those lists was not working well as you >> scaled the number of workers. Following 6631166 we believe it >> works better and causes both less contention and less >> fragmentation than it did before, because we do not hold >> unnecessary excess inventory of free blocks. > > To see what the fragmentation is, try -XX:PrintFLSStatistics=2. > This will slow down your scavenge pauses (perhaps by quite a bit > for your 26 GB heap), but you will get a report of the number of > blocks on free lists and how fragmented the space is on that account > (for some appropriate notion of fragmentation). Don't use that > flag in production though :-) > > -- ramki > >> >> The fragmentation in turn causes card-scanning to suffer >> adversely, besides the issues with loss of spatial locality also >> increasing cache misses and TLB misses. (The large page >> option might help mitigate the latter a bit, especially >> since you have such a large heap and our fragmented >> allocation may be exacerbating the TLB pressure.) >> >> -- ramki >> >>> Matt >>> >>> On Thu, May 13, 2010 at 6:29 PM, Jon Masamitsu >>> wrote: >>>> >>>> Matt, >>>> >>>> To amplify on Ramki's comment, the allocations out of the >>>> old generation are always from a free list. During a young >>>> generation collection each GC thread will get its own >>>> local free lists from the old generation so that it can >>>> copy objects to the old generation without synchronizing >>>> with the other GC threads (most of the time). Objects from >>>> a GC thread's local free lists are pushed to the global lists >>>> after the collection (as far as I recall). So there is some >>>> churn in the free lists. >>>> >>>> Jon >>>> >>>> On 05/13/10 14:52, Y.
Srinivas Ramakrishna wrote: >>>>> >>>>> On 05/13/10 10:50, Matt Fowles wrote: >>>>>> >>>>>> Jon~ >>>>>> >>>>>> This may sound naive, but how can fragmentation be an issue if the old >>>>>> gen has never been collected? I would think we are still in the space >>>>>> where we can just bump the old gen alloc pointer... >>>>> >>>>> Matt, The old gen allocator may fragment the space. Allocation is not >>>>> exactly "bump a pointer". >>>>> >>>>> -- ramki >>>>> >>>>>> Matt >>>>>> >>>>>> On Thu, May 13, 2010 at 12:23 PM, Jon Masamitsu >>>>>> wrote: >>>>>>> >>>>>>> Matt, >>>>>>> >>>>>>> As Ramki indicated fragmentation might be an issue. As the >>>>>>> fragmentation >>>>>>> in the old generation increases, it takes longer to find space in the >>>>>>> old >>>>>>> generation >>>>>>> into which to promote objects from the young generation. This is >>>>>>> apparently >>>>>>> not >>>>>>> the problem that Wayne is having but you still might be hitting it. >>>>>>> If >>>>>>> you >>>>>>> can >>>>>>> connect jconsole to the VM and force a full GC, that would tell us if >>>>>>> it's >>>>>>> fragmentation. >>>>>>> >>>>>>> There might be a scaling issue with the UseParNewGC. If you can use >>>>>>> -XX:-UseParNewGC (turning off the parallel young >>>>>>> generation collection) with -XX:+UseConcMarkSweepGC the pauses >>>>>>> will be longer but may be more stable. That's not the solution but >>>>>>> just >>>>>>> part >>>>>>> of the investigation. >>>>>>> >>>>>>> You could try just -XX:+UseParNewGC without -XX:+UseConcMarkSweepGC >>>>>>> and if you don't see the growing young generation pause, that would >>>>>>> indicate >>>>>>> something specific about promotion into the CMS generation. >>>>>>> >>>>>>> UseParallelGC is different from UseParNewGC in a number of ways >>>>>>> and if you try UseParallelGC and still see the growing young >>>>>>> generation >>>>>>> pauses, I'd suspect something special about your application. 
>>>>>>> >>>>>>> If you can run these experiments hopefully they will tell >>>>>>> us where to look next. >>>>>>> >>>>>>> Jon >>>>>>> >>>>>>> >>>>>>> On 05/12/10 15:19, Matt Fowles wrote: >>>>>>> >>>>>>> All~ >>>>>>> >>>>>>> I have a large app that produces ~4g of garbage every 30 seconds and >>>>>>> am trying to reduce the size of gc outliers. About 99% of this data >>>>>>> is garbage, but almost anything that survives one collection survives >>>>>>> for an indeterminately long amount of time. We are currently using >>>>>>> the following VM and options: >>>>>>> >>>>>>> java version "1.6.0_20" >>>>>>> Java(TM) SE Runtime Environment (build 1.6.0_20-b02) >>>>>>> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) >>>>>>> >>>>>>> -verbose:gc >>>>>>> -XX:+PrintGCTimeStamps >>>>>>> -XX:+PrintGCDetails >>>>>>> -XX:+PrintGCTaskTimeStamps >>>>>>> -XX:+PrintTenuringDistribution >>>>>>> -XX:+PrintCommandLineFlags >>>>>>> -XX:+PrintReferenceGC >>>>>>> -Xms32g -Xmx32g -Xmn4g >>>>>>> -XX:+UseParNewGC >>>>>>> -XX:ParallelGCThreads=4 >>>>>>> -XX:+UseConcMarkSweepGC >>>>>>> -XX:ParallelCMSThreads=4 >>>>>>> -XX:CMSInitiatingOccupancyFraction=60 >>>>>>> -XX:+UseCMSInitiatingOccupancyOnly >>>>>>> -XX:+CMSParallelRemarkEnabled >>>>>>> -XX:MaxGCPauseMillis=50 >>>>>>> -Xloggc:gc.log >>>>>>> >>>>>>> >>>>>>> As you can see from the GC log, we never actually reach the point >>>>>>> where the CMS kicks in (after app startup). But our young gens seem >>>>>>> to take increasingly long to collect as time goes by. >>>>>>> >>>>>>> The steady state of the app is reached around 956.392 into the log >>>>>>> with a collection that takes 0.106 seconds. Thereafter the survivor >>>>>>> space occupancy remains roughly constant and the amount promoted to >>>>>>> old gen also remains constant, but the collection times increase to >>>>>>> 2.855 seconds by the end of the 3.5 hour run. >>>>>>> >>>>>>> Has anyone seen this sort of behavior before?
Are there more >>>>>>> switches >>>>>>> that I should try running with? >>>>>>> >>>>>>> Obviously, I am working to profile the app and reduce the garbage >>>>>>> load >>>>>>> in parallel. But if I still see this sort of problem, it is only a >>>>>>> question of how long must the app run before I see unacceptable >>>>>>> latency spikes. >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> ________________________________ >>>>>>> _______________________________________________ >>>>>>> hotspot-gc-use mailing list >>>>>>> hotspot-gc-use at openjdk.java.net >>>>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>>> >>>>>> _______________________________________________ >>>>>> hotspot-gc-use mailing list >>>>>> hotspot-gc-use at openjdk.java.net >>>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100514/7d593fc0/attachment.html From y.s.ramakrishna at oracle.com Fri May 14 12:32:45 2010 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Fri, 14 May 2010 12:32:45 -0700 Subject: Growing GC Young Gen Times In-Reply-To: References: <4BEC2776.8010609@oracle.com> <4BEC7498.6030405@oracle.com> <4BEC7D4D.2000905@oracle.com> <4BED8A17.9090208@oracle.com> <4BED8BF9.7000803@oracle.com> Message-ID: <4BEDA55D.5030703@oracle.com> Matt -- Yes, comparative data for all these for 6u20 and jdk 7 would be great. Naturally, server 1 is most immediately useful for determining if 6631166 addresses this at all, but others would be useful too if it turns out it doesn't (i.e. if jdk 7's server 1 turns out to be no better than 6u20's -- at which point we should get this into the right channel -- open a bug, and a support case). 
thanks. -- ramki On 05/14/10 12:24, Matt Fowles wrote: > Ramki~ > > I am preparing the flags for the next 3 runs (which run in parallel) and > wanted to check a few things with you. I believe that each of these is > collecting a useful data point, > > Server 1 is running with 8 threads, reduced young gen, and MTT 1. > Server 2 is running with 8 threads, reduced young gen, and MTT 1, > ParNew, but NOT CMS. > Server 3 is running with 8 threads, reduced young gen, and MTT 1, > and PrintFLSStatistics. > > I can (additionally) run all of these tests on JDK7 (Java HotSpot(TM) > 64-Bit Server VM (build 17.0-b05, mixed mode)). > > Server 1: > -verbose:gc > -XX:+PrintGCTimeStamps > -XX:+PrintGCDetails > -XX:+PrintGCTaskTimeStamps > -XX:+PrintCommandLineFlags > > -Xms32g -Xmx32g -Xmn1g > -XX:+UseParNewGC > -XX:ParallelGCThreads=8 > -XX:+UseConcMarkSweepGC > -XX:ParallelCMSThreads=8 > -XX:MaxTenuringThreshold=1 > -XX:SurvivorRatio=14 > -XX:+CMSParallelRemarkEnabled > -Xloggc:gc1.log > -XX:+UseLargePages > -XX:+AlwaysPreTouch > > Server 2: > -verbose:gc > -XX:+PrintGCTimeStamps > -XX:+PrintGCDetails > -XX:+PrintGCTaskTimeStamps > -XX:+PrintCommandLineFlags > > -Xms32g -Xmx32g -Xmn1g > -XX:+UseParNewGC > -XX:ParallelGCThreads=8 > -XX:MaxTenuringThreshold=1 > -XX:SurvivorRatio=14 > -Xloggc:gc2.log > -XX:+UseLargePages > -XX:+AlwaysPreTouch > > > Server 3: > -verbose:gc > -XX:+PrintGCTimeStamps > -XX:+PrintGCDetails > -XX:+PrintGCTaskTimeStamps > -XX:+PrintCommandLineFlags > > -Xms32g -Xmx32g -Xmn1g > -XX:+UseParNewGC > -XX:ParallelGCThreads=8 > -XX:+UseConcMarkSweepGC > -XX:ParallelCMSThreads=8 > -XX:MaxTenuringThreshold=1 > -XX:SurvivorRatio=14 > -XX:+CMSParallelRemarkEnabled > -Xloggc:gc3.log > > -XX:PrintFLSStatistics=2 > -XX:+UseLargePages > -XX:+AlwaysPreTouch > > Matt > > On Fri, May 14, 2010 at 1:44 PM, Y. Srinivas Ramakrishna > > wrote: > > On 05/14/10 10:36, Y. 
Srinivas Ramakrishna wrote: > >> > >> On 05/14/10 10:24, Matt Fowles wrote: > >>> > >>> Jon~ > >>> > >>> That makes, sense but the fact is that the old gen *never* get > >>> collected. So all the allocations happen from the giant empty space > >>> at the end of the free list. I thought fragmentation only occurred > >>> when the free lists are added to after freeing memory... > >> > >> As Jon indicated allocation is done from free lists of blocks > >> that are pre-carved on demand to avoid contention while allocating. > >> The old heuristics for how large to make those lists and the > >> inventory to hold in those lists was not working well as you > >> scaled the number of workers. Following 6631166 we believe it > >> works better and causes both less contention and less > >> fragmentation than it did before, because we do not hold > >> unnecessary excess inventory of free blocks. > > > > To see what the fragmentation is, try -XX:PrintFLSStatistics=2. > > This will slow down your scavenge pauses (perhaps by quite a bit > > for your 26 GB heap), but you will get a report of the number of > > blocks on free lists and how fragmented the space is on that ccount > > (for some appropriate notion of fragmentation). Don't use that > > flag in production though :-) > > > > -- ramki > > > >> > >> The fragmentation in turn causes card-scanning to suffer > >> adversely, besides the issues with loss of spatial locality also > >> increasing cache misses and TLB misses. (The large page > >> option might help mitigate the latter a bit, especially > >> since you have such a large heap and our fragmented > >> allocation may be exacerbating the TLB pressure.) > >> > >> -- ramki > >> > >>> Matt > >>> > >>> On Thu, May 13, 2010 at 6:29 PM, Jon Masamitsu > > > >>> wrote: > >>>> > >>>> Matt, > >>>> > >>>> To amplify on Ramki's comment, the allocations out of the > >>>> old generation are always from a free list. 
During a young > >>>> generation collection each GC thread will get its own > >>>> local free lists from the old generation so that it can > >>>> copy objects to the old generation without synchronizing > >>>> with the other GC thread (most of the time). Objects from > >>>> a GC thread's local free lists are pushed to the globals lists > >>>> after the collection (as far as I recall). So there is some > >>>> churn in the free lists. > >>>> > >>>> Jon > >>>> > >>>> On 05/13/10 14:52, Y. Srinivas Ramakrishna wrote: > >>>>> > >>>>> On 05/13/10 10:50, Matt Fowles wrote: > >>>>>> > >>>>>> Jon~ > >>>>>> > >>>>>> This may sound naive, but how can fragmentation be an issue if > the old > >>>>>> gen has never been collected? I would think we are still in the > space > >>>>>> where we can just bump the old gen alloc pointer... > >>>>> > >>>>> Matt, The old gen allocator may fragment the space. Allocation is not > >>>>> exactly "bump a pointer". > >>>>> > >>>>> -- ramki > >>>>> > >>>>>> Matt > >>>>>> > >>>>>> On Thu, May 13, 2010 at 12:23 PM, Jon Masamitsu > >>>>>> > wrote: > >>>>>>> > >>>>>>> Matt, > >>>>>>> > >>>>>>> As Ramki indicated fragmentation might be an issue. As the > >>>>>>> fragmentation > >>>>>>> in the old generation increases, it takes longer to find space > in the > >>>>>>> old > >>>>>>> generation > >>>>>>> into which to promote objects from the young generation. This is > >>>>>>> apparently > >>>>>>> not > >>>>>>> the problem that Wayne is having but you still might be hitting it. > >>>>>>> If > >>>>>>> you > >>>>>>> can > >>>>>>> connect jconsole to the VM and force a full GC, that would tell > us if > >>>>>>> it's > >>>>>>> fragmentation. > >>>>>>> > >>>>>>> There might be a scaling issue with the UseParNewGC. If you > can use > >>>>>>> -XX:-UseParNewGC (turning off the parallel young > >>>>>>> generation collection) with -XX:+UseConcMarkSweepGC the pauses > >>>>>>> will be longer but may be more stable. 
That's not the solution but > >>>>>>> just > >>>>>>> part > >>>>>>> of the investigation. > >>>>>>> > >>>>>>> You could try just -XX:+UseParNewGC without -XX:+UseConcMarkSweepGC > >>>>>>> and if you don't see the growing young generation pause, that would > >>>>>>> indicate > >>>>>>> something specific about promotion into the CMS generation. > >>>>>>> > >>>>>>> UseParallelGC is different from UseParNewGC in a number of ways > >>>>>>> and if you try UseParallelGC and still see the growing young > >>>>>>> generation > >>>>>>> pauses, I'd suspect something special about your application. > >>>>>>> > >>>>>>> If you can run these experiments hopefully they will tell > >>>>>>> us where to look next. > >>>>>>> > >>>>>>> Jon > >>>>>>> > >>>>>>> > >>>>>>> On 05/12/10 15:19, Matt Fowles wrote: > >>>>>>> > >>>>>>> All~ > >>>>>>> > >>>>>>> I have a large app that produces ~4g of garbage every 30 > seconds and > >>>>>>> am trying to reduce the size of gc outliers. About 99% of this > data > >>>>>>> is garbage, but almost anything that survives one collection > survives > >>>>>>> for an indeterminately long amount of time. 
We are currently using > >>>>>>> the following VM and options: > >>>>>>> > >>>>>>> java version "1.6.0_20" > >>>>>>> Java(TM) SE Runtime Environment (build 1.6.0_20-b02) > >>>>>>> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) > >>>>>>> > >>>>>>> -verbose:gc > >>>>>>> -XX:+PrintGCTimeStamps > >>>>>>> -XX:+PrintGCDetails > >>>>>>> -XX:+PrintGCTaskTimeStamps > >>>>>>> -XX:+PrintTenuringDistribution > >>>>>>> -XX:+PrintCommandLineFlags > >>>>>>> -XX:+PrintReferenceGC > >>>>>>> -Xms32g -Xmx32g -Xmn4g > >>>>>>> -XX:+UseParNewGC > >>>>>>> -XX:ParallelGCThreads=4 > >>>>>>> -XX:+UseConcMarkSweepGC > >>>>>>> -XX:ParallelCMSThreads=4 > >>>>>>> -XX:CMSInitiatingOccupancyFraction=60 > >>>>>>> -XX:+UseCMSInitiatingOccupancyOnly > >>>>>>> -XX:+CMSParallelRemarkEnabled > >>>>>>> -XX:MaxGCPauseMillis=50 > >>>>>>> -Xloggc:gc.log > >>>>>>> > >>>>>>> > >>>>>>> As you can see from the GC log, we never actually reach the point > >>>>>>> where the CMS kicks in (after app startup). But our young gens > seem > >>>>>>> to take increasingly long to collect as time goes by. > >>>>>>> > >>>>>>> The steady state of the app is reached around 956.392 into the log > >>>>>>> with a collection that takes 0.106 seconds. Thereafter the > survivor > >>>>>>> space remains roughly constantly as filled and the amount > promoted to > >>>>>>> old gen also remains constant, but the collection times increase to > >>>>>>> 2.855 seconds by the end of the 3.5 hour run. > >>>>>>> > >>>>>>> Has anyone seen this sort of behavior before? Are there more > >>>>>>> switches > >>>>>>> that I should try running with? > >>>>>>> > >>>>>>> Obviously, I am working to profile the app and reduce the garbage > >>>>>>> load > >>>>>>> in parallel. But if I still see this sort of problem, it is only a > >>>>>>> question of how long must the app run before I see unacceptable > >>>>>>> latency spikes. 
> >>>>>>> > >>>>>>> Matt > >>>>>>> > >>>>>>> ________________________________ > >>>>>>> _______________________________________________ > >>>>>>> hotspot-gc-use mailing list > >>>>>>> hotspot-gc-use at openjdk.java.net > > >>>>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > >>>>>> > >>>>>> _______________________________________________ > >>>>>> hotspot-gc-use mailing list > >>>>>> hotspot-gc-use at openjdk.java.net > > >>>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > >> > >> _______________________________________________ > >> hotspot-gc-use mailing list > >> hotspot-gc-use at openjdk.java.net > >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > From peter.schuller at infidyne.com Sat May 15 10:12:20 2010 From: peter.schuller at infidyne.com (Peter Schuller) Date: Sat, 15 May 2010 19:12:20 +0200 Subject: g1 not doing partial aggressively enough -> fallback to full gc In-Reply-To: References: Message-ID: >    HTTPGCTEST_LOGGC=gc.log HTTPGCTEST_COLLECTOR=g1 ./run.sh I forgot to mention that JAVA_HOME must be set or the script will fail without a user-friendly error. -- / Peter Schuller From peter.schuller at infidyne.com Sat May 15 10:10:52 2010 From: peter.schuller at infidyne.com (Peter Schuller) Date: Sat, 15 May 2010 19:10:52 +0200 Subject: g1 not doing partial aggressively enough -> fallback to full gc Message-ID: Hello, I have another utterly unscientific (but I think interesting) variation of an older test. The behavior that is interesting in this case is that the heap grows until it hits the maximum heap size and falls back to a full GC in spite of there being high-payoff old regions. Based on the heap size after the full GC, the live working set is roughly ~230 MB, and the maximum heap size is 4 GB.
This means that during the GCs right before the full GC, G1 is choosing to do young generation GCs even though the average live ratio in the older regions should be roughly 5% (low-hanging fruit for partial collections). Links to the test, the GC log file and an executable .jar file for convenience follow at the bottom of this e-mail.

A short description of roughly what the test is doing in my particular invocation and use of it: I have a loop running which repeatedly tells the httpgctest server to add 25000 "data items" to its in-memory set, followed by removing 10% (a ratio of 0.1) of all data items pseudo-randomly. The end result is a steady state of roughly 250000 data items that are aged pseudo-randomly (the removal is done by selecting pseudo-randomly from the set). Thus, dead objects will accumulate over time across all regions, though the data structure overhead of the set itself will tend to generate data that is shorter-lived on average (due to the use of clojure's immutable data structures).

Given this behavior, no individual region is likely to become completely empty, so regions need to be evacuated in a partial collection (rather than reclaimed as part of the 'cleanup' phase). The bulk of the data collected in the young generation is expected to be the immutable data structure's internal structure. There should be very little writing to older generations (again due to the use of clojure's immutable data structures).
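The steady-state behavior described above can be sanity-checked with a few lines of simulation. To be clear, this is only an illustrative sketch, not code from httpgctest (which is Clojure); the `churn` function and its parameter names are invented here. Each round adds a batch of items and then drops a 0.1 fraction of the *whole* set (matching the `dropdata?ratio=0.1` call used to drive the test), so the peak size converges on batch/ratio = 25000/0.1 = 250000 items:

```python
import random

def churn(iterations, batch=25000, drop_ratio=0.1, seed=42):
    """Each round adds `batch` new items, then pseudo-randomly drops
    a `drop_ratio` fraction of the whole set.  Returns the set size
    right after the final add (the peak, before that round's drop)."""
    rng = random.Random(seed)
    items = set()
    next_id = 0
    peak = 0
    for _ in range(iterations):
        for _ in range(batch):
            items.add(next_id)
            next_id += 1
        peak = len(items)
        # Drops hit old and new items at the same per-item rate, so
        # survivors are smeared across all regions and no region's
        # live ratio falls to zero on its own.
        doomed = rng.sample(sorted(items), int(len(items) * drop_ratio))
        items.difference_update(doomed)
    return peak

print(churn(50))  # approaches the batch/drop_ratio = 250000 steady state
```

Because deaths are spread uniformly over the whole set, no region empties by itself, which is exactly why the survivors have to be evacuated by partial collections rather than reclaimed in the cleanup phase.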
The JVM version is one built from a recently merged-from-main bsd-port:

changeset:   206:12f0c051d819
tag:         tip
parent:      202:44ad17a6ffea
parent:      205:b7b4797303cb
user:        Greg Lewis
date:        Sat May 08 10:53:55 2010 -0700
summary:     Merge from main OpenJDK repository

The JVM options used in the particular run that produced the log are (cut'n'paste from -XX:+PrintCommandLineFlags, but re-ordered for clarity):

-XX:+UnlockDiagnosticVMOptions
-XX:+UnlockExperimentalVMOptions
-XX:+UseG1GC
-XX:GCPauseIntervalMillis=15
-XX:MaxGCPauseMillis=10
-XX:+G1ParallelRSetScanningEnabled
-XX:+G1ParallelRSetUpdatingEnabled
-XX:InitialHeapSize=52428800
-XX:MaxHeapSize=4294967296
-XX:ThreadStackSize=256
-XX:+PrintCommandLineFlags
-XX:+PrintGC
-XX:+PrintGCTimeStamps
-XX:+TraceClassUnloading
-XX:+CITime

The log file shows lots of young collections, and a very select few partial collections after marking phases. Typically along the lines of (excerpt, because the full log is very long):

46.219: [GC pause (young) 2553M->2550M(4096M), 0.0071520 secs]
46.234: [GC pause (young) 2555M->2551M(4096M), 0.0052560 secs]
46.250: [GC pause (young) 2559M->2554M(4096M), 0.0074780 secs]
46.281: [GC pause (young) 2559M->2555M(4096M), 0.0058670 secs]
46.306: [GC pause (young) 2569M->2555M(4096M), 0.0039900 secs]
46.326: [GC pause (young) 2569M->2556M(4096M), 0.0056980 secs]
46.339: [GC concurrent-count-end, 0.4546580]
46.339: [GC cleanup 2562M->2502M(4096M), 0.0702940 secs]
46.410: [GC concurrent-cleanup-start]
46.414: [GC concurrent-cleanup-end, 0.0041480]
46.431: [GC pause (young) 2515M->2497M(4096M), 0.0069320 secs]
46.444: [GC pause (partial) 2501M->2495M(4096M), 0.0056890 secs]
46.469: [GC pause (partial) 2499M->2493M(4096M), 0.0065570 secs]
46.486: [GC pause (partial) 2497M->2493M(4096M), 0.0058280 secs]
46.497: [GC pause (partial) 2496M->2493M(4096M), 0.0108240 secs]
46.525: [GC pause (young) (initial-mark) 2507M->2494M(4096M)46.529: [GC concurrent-mark-start] , 0.0044130 secs]
46.544: [GC pause (young) 2508M->2494M(4096M), 0.0053780 secs]
46.574: [GC pause (young) 2513M->2495M(4096M), 0.0058290 secs]
46.727: [GC pause (young) 2512M->2499M(4096M), 0.0122760 secs]
46.761: [GC pause (young) 2507M->2501M(4096M), 0.0121180 secs]
46.799: [GC pause (young) 2512M->2505M(4096M), 0.0116450 secs]
46.826: [GC pause (young) 2513M->2507M(4096M), 0.0098290 secs]
46.852: [GC pause (young) 2515M->2510M(4096M), 0.0111450 secs]
46.873: [GC pause (young) 2517M->2512M(4096M), 0.0095310 secs]
46.893: [GC pause (young) 2519M->2514M(4096M), 0.0113990 secs]
46.918: [GC pause (young) 2519M->2516M(4096M), 0.0092540 secs]

At the very end we see the full GC and the resulting heap size:

74.777: [GC pause (young) 4090M->4074M(4096M), 0.0085890 secs]
74.796: [GC pause (young) 4082M->4075M(4096M), 0.0061880 secs]
74.833: [Full GC 4095M->227M(758M), 2.9635480 secs]
77.940: [GC pause (young) 253M->232M(758M), 0.0206140 secs]
77.970: [GC pause (young) 236M->233M(1426M), 0.0168640 secs]

Close to this there are only young collections in sight. The effect is lessened by providing less strict pause time demands on g1 (I tested 250/300 instead of the 10/15 used in this case), but the behavior *does* remain. It just takes longer to kick in (given the same heap size).
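The "roughly 5% live" estimate can be read straight off the full-GC record above: 227M live out of 4095M occupied is 227/4095, about 5.5%. A throwaway sketch of that arithmetic follows; the regex and the `live_ratio` helper are invented here for illustration, and they only handle the simple single-line records, not the interleaved initial-mark/concurrent-mark lines:

```python
import re

# Matches simple single-line records like
#   74.833: [Full GC 4095M->227M(758M), 2.9635480 secs]
#   46.219: [GC pause (young) 2553M->2550M(4096M), 0.0071520 secs]
LOG_RE = re.compile(r"\[(?:GC pause \((?:young|partial)\)|Full GC) "
                    r"(\d+)M->(\d+)M\((\d+)M\), ([\d.]+) secs\]")

def live_ratio(line):
    """Occupancy after the collection divided by occupancy before it."""
    m = LOG_RE.search(line)
    return int(m.group(2)) / int(m.group(1)) if m else None

r = live_ratio("74.833: [Full GC 4095M->227M(758M), 2.9635480 secs]")
print(round(r, 3))  # about 0.055, i.e. ~5.5% of the occupied heap was live
```

With only ~5.5% of the old regions live on average, nearly every old region is a high-payoff evacuation candidate, which is what makes the lack of partial collections before the full GC surprising.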
Test links/reproduction information:

The test is the httpgctest (as of version 751d8374810a497cf26e48211183db5dd0a73185):

http://github.com/scode/httpgctest

A GC log produced with:

   HTTPGCTEST_LOGGC=gc.log HTTPGCTEST_COLLECTOR=g1 ./run.sh

can be found here:

http://distfiles.scode.org/mlref/gctest/httpgctest-g1-fullgc-20100515/gc.log

An executable .jar file (product of 'lein uberjar') is here:

http://distfiles.scode.org/mlref/gctest/httpgctest-g1-fullgc-20100515/httpgctest-standalone.jar

For running the executable .jar with the same options, the direct link to run.sh of the correct version is:

http://github.com/scode/httpgctest/blob/751d8374810a497cf26e48211183db5dd0a73185/run.sh

The input to the test once running is the following little loop running concurrently:

while [ 1 ] ; do curl 'http://localhost:9191/gendata?amount=25000' ; curl 'http://localhost:9191/dropdata?ratio=0.1' ; sleep 0.1 ; done

-- / Peter Schuller From adamh at basis.com Tue May 18 11:19:51 2010 From: adamh at basis.com (Adam Hawthorne) Date: Tue, 18 May 2010 14:19:51 -0400 Subject: PrintGCStats In-Reply-To: References: Message-ID: I hacked this one up a few months ago when I couldn't find one. You might want to review it, I don't guarantee it's accurate, and I don't know awk very well, but it seemed to be working when I last used it. Let me know if it doesn't come through (PrintGCStats.tgz). Adam -- Adam Hawthorne Software Engineer BASIS International Ltd. www.basis.com +1.505.345.5232 Phone On Tue, May 18, 2010 at 14:12, Hiroshi Yamauchi wrote: > Hi, > > Does anyone have a version of the PrintGCStats script that works with the > recent Hotspot builds and that we can share in the community? > > It appears that an old version of it is available here: > > > http://java.sun.com/developer/technicalArticles/Programming/turbo/#PrintGCStats > > But it does not seem to be able to parse the output from a recent > Hotspot correctly (e.g. gc times always show zero.)
> > I think it's very convenient and almost a must to have a version that > we can share and standardize on. > > Thanks, > Hiroshi > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100518/9cb78ee3/attachment.html -------------- next part -------------- A non-text attachment was scrubbed... Name: PrintGCStats.tgz Type: application/x-gzip Size: 13188 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100518/9cb78ee3/attachment.bin From yamauchi at google.com Tue May 18 12:07:15 2010 From: yamauchi at google.com (Hiroshi Yamauchi) Date: Tue, 18 May 2010 12:07:15 -0700 Subject: PrintGCStats In-Reply-To: References: Message-ID: Hi Adam, Thanks for a quick response. I'll try to take a look at it at my next chance. Does anyone, who knows more about the script than I, feel like taking a look at it? Thanks, Hiroshi On Tue, May 18, 2010 at 11:19 AM, Adam Hawthorne wrote: > I hacked this one up a few months ago when I couldn't find one. You might > want to review it, I don't guarantee it's accurate, and I don't know awk > very well, but it seemed to be working when I last used it. > > Let me know if it doesn't come through (PrintGCStats.tgz). > Adam > > -- > Adam Hawthorne > Software Engineer > BASIS International Ltd. > www.basis.com > +1.505.345.5232 Phone > > > On Tue, May 18, 2010 at 14:12, Hiroshi Yamauchi wrote: >> >> Hi, >> >> Does anyone have a version of the PrintGCStats script that works with the >> recent Hotspot builds and that we can share in the community? >> >> It appears that an old version of it is available here: >> >> >> http://java.sun.com/developer/technicalArticles/Programming/turbo/#PrintGCStats >> >> But it does not seem to be able to parse the output from a recent >> Hotspot correctly (e.g. gc times always show zero.)
>> >> I think it's very convenient and almost a must to have a version that >> we can share and standardize on. >> >> Thanks, >> Hiroshi > >
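Until an up-to-date PrintGCStats exists, the core of what it computes for logs like the ones in this digest can be approximated in a few lines. This is only a sketch of the idea, not the real awk script: the `summarize` helper and its regex are invented here, and they only handle the plain `, <n> secs]` records shown earlier in the thread:

```python
import re

# Pull every "<n> secs" pause duration out of a -verbose:gc /
# -XX:+PrintGCTimeStamps log and report count, total and max.
PAUSE_RE = re.compile(r", ([\d.]+) secs\]")

def summarize(log_text):
    pauses = [float(s) for s in PAUSE_RE.findall(log_text)]
    return {"n": len(pauses),
            "total": sum(pauses),
            "max": max(pauses, default=0.0)}

sample = """46.219: [GC pause (young) 2553M->2550M(4096M), 0.0071520 secs]
46.339: [GC concurrent-count-end, 0.4546580]
74.833: [Full GC 4095M->227M(758M), 2.9635480 secs]"""
print(summarize(sample))  # 2 pauses; the concurrent-count-end line has no "secs" and is skipped
```

A shared script would of course need more than this (per-collector breakdowns, rates, alive-after statistics), but anchoring it on the `secs]` suffix is what makes it robust to the pause-type prefixes varying across collectors.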