<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
<br>
Hi John,<br>
<br>
Thanks for the review!<br>
<br>
Here is an updated webrev:<br>
<a class="moz-txt-link-freetext" href="http://cr.openjdk.java.net/~brutisso/7152954/webrev.01/">http://cr.openjdk.java.net/~brutisso/7152954/webrev.01/</a><br>
<br>
Some comments inline...<br>
<br>
On 2012-03-13 00:03, John Cuthbertson wrote:
<blockquote cite="mid:4F5E80DD.6030801@oracle.com" type="cite">
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
Hi Bengt,<br>
<br>
Looks good - but I do have a couple of comments:<br>
<br>
survRateGroup.cpp:<br>
<br>
This is not related to your changes - but I just noticed it:<br>
<br>
<pre> 108 _surv_rate_pred = NEW_C_HEAP_ARRAY(TruncatedSeq*, _region_num);
109 if (_surv_rate == NULL) {
110 vm_exit_out_of_memory(sizeof(TruncatedSeq*) * _region_num,
111 "Not enough space for surv rate pred array.");
112 }
</pre>
should be:<br>
<br>
<pre> 108 _surv_rate_pred = NEW_C_HEAP_ARRAY(TruncatedSeq*, _region_num);
109 if (<b>_surv_rate_pred</b> == NULL) {
110 vm_exit_out_of_memory(sizeof(TruncatedSeq*) * _region_num,
111 "Not enough space for surv rate pred array.");
112 }
</pre>
</blockquote>
<br>
You are correct, but NEW_C_HEAP_ARRAY will call
vm_exit_out_of_memory() if the allocation fails. So this is dead
code. I removed it instead of updating it.<br>
<br>
<blockquote cite="mid:4F5E80DD.6030801@oracle.com" type="cite">
My picture of SurvRateGroup::stop_adding_regions() has always been
that
it's just extending the stats arrays (i.e., a realloc operation).
So,
with that picture in mind, the following lines:<br>
<br>
<pre><span class="changed"> 117 for (size_t i = _stats_arrays_length; i < _prev_stats_arrays_length; ++i) {</span>
<span class="changed"> 118 delete old_surv_rate_pred[i];</span>
<span class="changed"> 119 }</span>
</pre>
really look like they belong in SurvRateGroup::reset().<br>
</blockquote>
<br>
Good point. I was struggling a bit with how this code was supposed
to be used. I moved the delete calls into reset(). Thanks for
pointing this out. <br>
<br>
<blockquote cite="mid:4F5E80DD.6030801@oracle.com" type="cite">
Excellent catch - what brought it to your attention?<br>
</blockquote>
<br>
It is kind of a funny story. Here is probably more information than
you want, so just stop reading if you find it too long ;-)<br>
<br>
I just wanted to confirm that a small test I found would eventually
throw an OOME. I ran it on JDK7 b147 over a weekend. When I checked
the test after the weekend it had gotten a native out of memory
instead. The test would mostly allocate humongous objects so I
figured we were leaking in this respect somehow.<br>
<br>
I instrumented a build from the tip of hotspot-gc. But to my
surprise the leak was gone. That's when I realized that in b147 we
were constantly doing full GCs to handle all the humongous objects.
But in the latest builds we have the fix I made to initiate
concurrent marking on humongous object allocations, so we never did
any full GCs. By inducing full GCs with System.gc() calls I could
show that the memory leak was in fact still present in the latest
builds as well.<br>
<br>
I tried out the prototype that Zhengyu has for native memory
tracking. Unfortunately I didn't get it to work properly on my 64
bit system. After I had already found the cause of the leak, Zhengyu
ran my reproducer on a 32 bit system and got this report:<br>
<br>
[0x54dd14fc] TruncatedSeq::TruncatedSeq+0x4c<br>
(malloc=162KB 143KB, #2071 1833)<br>
<br>
[0x54f43737] SurvRateGroup::reset+0x47<br>
(malloc=128KB 115KB, #2049 1833) <br>
<br>
That is really spot on (assuming
SurvRateGroup::stop_adding_regions() got inlined into
SurvRateGroup::reset()). So, it looks to me as though NMT will be a
really useful tool to find this type of issues in the future.<br>
<br>
Once I had the small reproducer that just calls System.gc() in a
loop I could manually do a binary search of the full GC code to find
out where we were leaking.<br>
<br>
Long story short: I found this by accident and NMT will be a really
good tool.<br>
<br>
Bengt<br>
<br>
<br>
<br>
<blockquote cite="mid:4F5E80DD.6030801@oracle.com" type="cite">
<br>
JohnC<br>
<br>
<br>
On 03/12/12 02:36, Bengt Rutisson wrote:
<blockquote cite="mid:4F5DC3A1.1050306@oracle.com" type="cite"><br>
Hi all, <br>
<br>
Could I have a couple of reviews for this fairly small change: <br>
<br>
<a moz-do-not-send="true" class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ebrutisso/7152954/webrev.00/">http://cr.openjdk.java.net/~brutisso/7152954/webrev.00/</a>
<br>
<br>
The CR will be publicly available soon here: <br>
<a moz-do-not-send="true" class="moz-txt-link-freetext"
href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7152954">http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7152954</a>
<br>
<br>
<br>
I choose to do a fairly small fix. To me it looks like the
SurvRateGroup class could benefit from some refactoring. Maybe
this
hole issue could go away with some simplifications to the code.
But I'd
like to keep the change as small as possible for now just in
case we
would like to push this into hs23. <br>
<br>
<br>
Background from the CR: <br>
<br>
There is a memory leak in the full GC code for G1. This can be
seen by
running this simple reproducer: <br>
<br>
public class SysGC { <br>
public static void main(String[] args) { <br>
while (true) { <br>
System.gc(); <br>
} <br>
} <br>
} <br>
<br>
I run it with this command line: <br>
<br>
java -XX:+UseG1GC -Xms16m -Xmx16m -XX:+PrintGC SysGC <br>
<br>
Watching the memory footprint for the java process shows that it
is
constantly using more memory. <br>
<br>
The leak comes from SurvRateGroup::stop_adding_regions() which
is
called from SurvRateGroup::reset(), which in turn is called from
G1CollectorPolicy::record_full_collection_end(). <br>
<br>
The problem with SurvRateGroup::stop_adding_regions() is that it
does: <br>
<br>
_surv_rate_pred[i] = new TruncatedSeq(10); <br>
<br>
in a loop every time it is called. But there is no corresponding
call
to delete. <br>
<br>
Adding a loop to call delete on the previously allocated
TruncatedSeq
objects is not enough to solve the problem since TruncatedSeq is
itself
allocating an array without freeing it. Adding a destructor to
TruncatedSeq that frees the allocated array solves the issue. <br>
<br>
With these two fixes the memory leak seems to go away. <br>
<br>
Thanks, <br>
Bengt <br>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>