<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <br>

    John,<br>

    <br>

    Thanks for the review!<br>

    <br>

    Bengt<br>

    <br>

    On 2012-03-13 17:50, John Cuthbertson wrote:

    <blockquote cite="mid:4F5F7AC5.8080406@oracle.com" type="cite">

      <meta content="text/html; charset=ISO-8859-1"

        http-equiv="Content-Type">

      Hi Bengt,<br>

      <br>

      Looks good to me. Thanks for making the changes. Thanks also for

      the

      explanation - very informative. NMT does sound like it will be

      useful.<br>

      <br>

      JohnC<br>

      <br>

      On 03/13/12 01:49, Bengt Rutisson wrote:

      <blockquote cite="mid:4F5F0A10.7050301@oracle.com" type="cite">

        <meta content="text/html; charset=ISO-8859-1"

          http-equiv="Content-Type">

        <br>

        Hi John,<br>

        <br>

        Thanks for the review!<br>

        <br>

        Here is an updated webrev:<br>

        <a moz-do-not-send="true" class="moz-txt-link-freetext"

          href="http://cr.openjdk.java.net/%7Ebrutisso/7152954/webrev.01/">http://cr.openjdk.java.net/~brutisso/7152954/webrev.01/</a><br>

        <br>

        Some comments inline...<br>

        <br>

        On 2012-03-13 00:03, John Cuthbertson wrote:

        <blockquote cite="mid:4F5E80DD.6030801@oracle.com" type="cite">

          <meta content="text/html; charset=ISO-8859-1"

            http-equiv="Content-Type">

          Hi Bengt,<br>

          <br>

          Looks good  - but I do have a couple of comments:<br>

          <br>

          survRateGroup.cpp:<br>

          <br>

          This is not related to your changes - but I just noticed it:<br>

          <br>

          <pre> 108     _surv_rate_pred = NEW_C_HEAP_ARRAY(TruncatedSeq*, _region_num);

 109     if (_surv_rate == NULL) {

 110       vm_exit_out_of_memory(sizeof(TruncatedSeq*) * _region_num,

 111                             "Not enough space for surv rate pred array.");

 112     }

    </pre>

          should be:<br>

          <br>

          <pre> 108     _surv_rate_pred = NEW_C_HEAP_ARRAY(TruncatedSeq*, _region_num);

 109     if (<b>_surv_rate_pred</b> == NULL) {

 110       vm_exit_out_of_memory(sizeof(TruncatedSeq*) * _region_num,

 111                             "Not enough space for surv rate pred array.");

 112     }

    </pre>

        </blockquote>

        <br>

        You are correct, but NEW_C_HEAP_ARRAY will call

        vm_exit_out_of_memory()

        if the allocation fails. So this is dead code. I removed it

        instead of

        updating it.<br>

        <br>

        <blockquote cite="mid:4F5E80DD.6030801@oracle.com" type="cite">

          My

          picture of SurvRateGroup::stop_adding_regions() has always

          been that

          it's just extending the stats arrays (i.e., a realloc

          operation). So,

          with that picture in mind, the following lines:<br>

          <br>

          <pre><span class="changed"> 117     for (size_t i = _stats_arrays_length; i < _prev_stats_arrays_length; ++i) {</span>

<span class="changed"> 118       delete old_surv_rate_pred[i];</span>

<span class="changed"> 119     }</span>

    </pre>

          really look like they belong in SurvRateGroup::reset().<br>

        </blockquote>

        <br>

        Good point. I was struggling a bit with how this code was

        supposed to

        be used. I moved the delete calls into reset(). Thanks for

        pointing

        this out. <br>

        <br>

        <blockquote cite="mid:4F5E80DD.6030801@oracle.com" type="cite">

          Excellent catch - what brought it to your attention?<br>

        </blockquote>

        <br>

        It is kind of a funny story. Here is probably more information

        than you

        want, so just stop reading if you find it too long ;-)<br>

        <br>

        I just wanted to confirm that a small test I found would

        eventually

        throw an OOME. I ran it on JDK7 b147 over a weekend. When I

        checked the

        test after the weekend it had gotten a native out of memory

        instead.

        The test would mostly allocate humongous objects so I figured we

        were

        leaking in this respect somehow.<br>

        <br>

        I instrumented a build from the tip of hotspot-gc. But to my

        surprise

        the leak was gone. That's when I realized that in b147 we were

        constantly doing full GCs to handle all the humongous objects.

        But in

        the latest builds we have the fix I made to initiate concurrent

        marking

        on humongous object allocations, so we never did any full GCs.

        By

        inducing full GCs with System.gc() calls I could show that the

        memory

        leak was in fact still present in the latest builds as well.<br>

        <br>

        I tried out the prototype that Zhengyu has for native memory

        tracking.

        Unfortunately I didn't get it to work properly on my 64 bit

        system.

        After I had already found the cause of the leak, Zhengyu ran my

        reproducer on a 32 bit system and got this report:<br>

        <br>

        [0x54dd14fc] TruncatedSeq::TruncatedSeq+0x4c<br>

                                    (malloc=162KB 143KB, #2071 1833)<br>

        <br>

        [0x54f43737] SurvRateGroup::reset+0x47<br>

                                    (malloc=128KB 115KB, #2049 1833) <br>

        <br>

        That is really spot on (assuming

        SurvRateGroup::stop_adding_regions()

        got inlined into SurvRateGroup::reset()). So, it looks to me as

        though

        NMT will be a really useful tool to find this type of issues in

        the

        future.<br>

        <br>

        Once I had the small reproducer that just calls System.gc() in a

        loop I

        could manually do a binary search of the full GC code to find

        out where

        we were leaking.<br>

        <br>

        Long story short: I found this by accident and NMT will be a

        really

        good tool.<br>

        <br>

        Bengt<br>

        <br>

        <br>

        <br>

        <blockquote cite="mid:4F5E80DD.6030801@oracle.com" type="cite">

          <br>

          JohnC<br>

          <br>

          <br>

          On 03/12/12 02:36, Bengt Rutisson wrote:

          <blockquote cite="mid:4F5DC3A1.1050306@oracle.com" type="cite"><br>

            Hi all, <br>

            <br>

            Could I have a couple of reviews for this fairly small

            change: <br>

            <br>

            <a moz-do-not-send="true" class="moz-txt-link-freetext"

              href="http://cr.openjdk.java.net/%7Ebrutisso/7152954/webrev.00/">http://cr.openjdk.java.net/~brutisso/7152954/webrev.00/</a>

            <br>

            <br>

            The CR will be publicly available soon here: <br>

            <a moz-do-not-send="true" class="moz-txt-link-freetext"

              href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7152954">http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7152954</a>

            <br>

            <br>

            <br>

            I choose to do a fairly small fix. To me it looks like the

            SurvRateGroup class could benefit from some refactoring.

            Maybe this

            hole issue could go away with some simplifications to the

            code. But I'd

            like to keep the change as small as possible for now just in

            case we

            would like to push this into hs23. <br>

            <br>

            <br>

            Background from the CR: <br>

            <br>

            There is a memory leak in the full GC code for G1. This can

            be seen by

            running this simple reproducer: <br>

            <br>

            public class SysGC { <br>

                public static void main(String[] args) { <br>

                    while (true) { <br>

                        System.gc(); <br>

                    } <br>

                } <br>

            } <br>

            <br>

            I run it with this command line: <br>

            <br>

            java -XX:+UseG1GC -Xms16m -Xmx16m -XX:+PrintGC SysGC <br>

            <br>

            Watching the memory footprint for the java process shows

            that it is

            constantly using more memory. <br>

            <br>

            The leak comes from SurvRateGroup::stop_adding_regions()

            which is

            called from SurvRateGroup::reset(), which in turn is called

            from

            G1CollectorPolicy::record_full_collection_end(). <br>

            <br>

            The problem with SurvRateGroup::stop_adding_regions() is

            that it does: <br>

            <br>

            _surv_rate_pred[i] = new TruncatedSeq(10); <br>

            <br>

            in a loop every time it is called. But there is no

            corresponding call

            to delete. <br>

            <br>

            Adding a loop to call delete on the previously allocated

            TruncatedSeq

            objects is not enough to solve the problem since

            TruncatedSeq is itself

            allocating an array without freeing it. Adding a destructor

            to

            TruncatedSeq that frees the allocated array solves the

            issue. <br>

            <br>

            With these two fixes the memory leak seems to go away. <br>

            <br>

            Thanks, <br>

            Bengt <br>

          </blockquote>

          <br>

        </blockquote>

        <br>

      </blockquote>

      <br>

    </blockquote>

    <br>

  </body>

</html>