<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    Ramki,<br>
    <br>
    The logarithmic BOT is taking big steps back to<br>
    find the start of a large object.  But something is odd.<br>
    <br>
    Is there a reason why the BOT entries would decrease for a while<br>
    and then increase again, without ever reaching a value that gives the start of an object?<br>
    <br>
    &_array->_offset_array[index] = 0xffffffff6550f600
    "DDDDDDDDDDDDDD ..."<br>
    <br>
    and then further down the road:<br>
    <br>
     print &_array->_offset_array[0x110600U],
    _array->_offset_array[0x10f600U]<br>
    &_array->_offset_array[0x110600U] = 0xffffffff65510600
    "CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC<br>
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCDDDDDDDDDDDDDDDD<br>
    DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD<br>
    DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD<br>
    DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD<br>
    DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD<br>
    DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD<br>
    DDDDDDDDDDDDDDDDDDDDDDDDDD<br>
    <br>
    'D' is 0x44<br>
    <br>
    N_words is 0x40<br>
    <br>
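    For reference, here is a rough sketch of how I understand the walk-back to
    decode those entries -- an illustration only, not the HotSpot sources --
    assuming the blockOffsetTable.hpp convention that an entry at or above
    N_words means "go back Base^(entry - N_words) cards", with LogBase = 4
    (Base = 16) and 512-byte cards:<br>
    <br>
    <pre>
// Illustrative sketch only -- not the actual HotSpot code.
// Entries 0..N_words-1 are direct word offsets back to the block start;
// an entry e at or above N_words encodes a back-skip of Base^(e - N_words) cards.
static const unsigned LogBase = 4;     // assumed, per blockOffsetTable.hpp
static const unsigned N_words = 0x40;  // 64, matching "N_words is 0x40" above

static unsigned long power_to_cards_back(unsigned i) {
  return 1UL << (LogBase * i);         // Base^i = 16^i cards
}

// Walk back from card 'index' until an entry is a direct offset (below N_words).
// Returns that card's index; *offset_words is how many words before that card's
// boundary the block really starts.  Each step moves 'index' strictly down, so a
// stretch where the codes decrease and then increase again without ever dropping
// below N_words (as in the dump above) is exactly what looks odd here.
unsigned long walk_back_to_block_start(const unsigned char* offset_array,
                                       unsigned long index,
                                       unsigned* offset_words) {
  unsigned char entry = offset_array[index];
  while (entry >= N_words) {
    // e.g. 'D' = 0x44 -> back 16^(0x44 - 0x40) = 65536 cards (32 MB at 512 bytes/card)
    index -= power_to_cards_back(entry - N_words);
    entry = offset_array[index];
  }
  *offset_words = entry;
  return index;
}
</pre>
    <br>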
    Jon<br>
    <br>
    <div class="moz-cite-prefix">On 05/12/2015 08:46 AM, Srinivas
      Ramakrishna wrote:<br>
    </div>
    <blockquote
      cite="mid:065F7E11-FCFB-4F1C-8977-59765A01A94C@gmail.com"
      type="cite">
      <meta http-equiv="content-type" content="text/html; charset=utf-8">
      <div>Thanks for that background, Jon; I'll try and look at the 8uxx
        code and run James's example when I get a few spare cycles
        (which won't be soon, unfortunately).</div>
      <div><br>
      </div>
      <div>-- ramki<br>
        <br>
        ysr1729</div>
      <div><br>
        On May 8, 2015, at 14:25, Jon Masamitsu <<a
          moz-do-not-send="true" href="mailto:jon.masamitsu@oracle.com">jon.masamitsu@oracle.com</a>>
        wrote:<br>
        <br>
      </div>
      <blockquote type="cite">
        <div>
          <meta content="text/html; charset=utf-8"
            http-equiv="Content-Type">
          Ramki,<br>
          <br>
          I compared the memory/blockOffsetTable.* files between 6u35
          and 7 and there <br>
          were essentially no differences.  The .cpp file had some
          changes in assertions.<br>
          The other two were the same except for comments.<br>
          <br>
          Between 6 and 6u35 (didn't pinpoint it), we changed from the
          rather<br>
          weak N_powers=3  :-) to N_powers=14.<br>
          <br>
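          (To put rough numbers on that -- assuming LogBase = 4, i.e. Base = 16,
          and 512-byte cards -- the largest back-skip a single BOT entry can
          encode is Base^(N_powers - 1) cards:)<br>
          <br>
          <pre>
// Rough illustration with assumed constants (LogBase = 4, 512-byte cards):
// the biggest single back-skip one BOT entry can encode.
static const unsigned LogBase = 4;
unsigned long long max_cards_back(unsigned n_powers) {
  // largest code is N_words + (n_powers - 1), i.e. Base^(n_powers - 1) cards back
  return 1ULL << (LogBase * (n_powers - 1));
}
// max_cards_back(3)  == 256 cards (128 KB): crossing a ~512 MB object takes many steps
// max_cards_back(14) == 16^13 cards: far larger than any real heap
</pre>
          <br>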
          There seem to be two problematic cases.<br>
          <br>
          1)  A spike in ParNew times that lasts for 2 (or a few) GCs<br>
          2) Continually growing ParNew times<br>
          <br>
          From just reading the CR<br>
          <br>
          <a moz-do-not-send="true" class="moz-txt-link-freetext"
            href="https://bugs.openjdk.java.net/browse/JDK-8079274">https://bugs.openjdk.java.net/browse/JDK-8079274</a><br>
          <br>
          the test case seems to reproduce 1).  The test case<br>
          has a large object that lives till the application exits.<br>
          If it were a broken logarithmic BOT, I would not expect<br>
          the ParNew pauses to shrink again.  <br>
          <br>
          Jon<br>
          <br>
          <br>
          <br>
          <div class="moz-cite-prefix">On 5/8/2015 11:49 AM, Srinivas
            Ramakrishna wrote:<br>
          </div>
          <blockquote
cite="mid:CABzyjykFe3LSdS1Y0XAFbS+OA=e3qge78=1q-0W_vm_ffVYTPg@mail.gmail.com"
            type="cite">
            <div dir="ltr">This is indeed interesting...
              <div><br>
              </div>
              <div>Jesper, did you figure out why (as James seems to
                note) the problem is seen in JDK 8 but not in JDK 6
                (he's using ParNew in both cases)? Also, isn't the BOT
                logarithmic, which should allow threads to "quickly"
                traverse back to the start of even a large object? Did
                something change in the BOT that might have led to what
                James sees as a regression, or did the range that he
                characterizes as a "blind spot" merely move from one
                place to another?</div>
              <div><br>
              </div>
              <div>Unfortunately, I have not been keeping up with the
                changes in this area and am not working with JDK 6 code
                much, so I can't offer any clues, but I am surprised that
                this would be a regression. (I would expect the same
                performance pathologies in this code to exist in 6 and
                forward -- has it been established conclusively that
                this is indeed a regression?)</div>
              <div><br>
              </div>
              <div>thanks!</div>
              <div>-- ramki</div>
            </div>
            <div class="gmail_extra"><br>
              <div class="gmail_quote">On Fri, May 8, 2015 at 6:40 AM,
                James Lent <span dir="ltr"><<a
                    moz-do-not-send="true" href="mailto:jlent@tivo.com"
                    target="_blank">jlent@tivo.com</a>></span> wrote:<br>
                <blockquote class="gmail_quote" style="margin:0 0 0
                  .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  <div bgcolor="#FFFFFF" text="#000000"> The explanation
                    of the problem added by Jesper Wilhelmsson on <span>
                      <span title="2015-05-06 08:59">2015-05-06 08:59 </span></span>to
                    JDK-8079274 is interesting, but not very
                    satisfying. <br>
                    <br>
                    1) The problem affects more than this one specific
                    breakpoint.  It impacts objects in at least the
                    range of about 480M to 512M.  The comment "A larger
                    object probably triggers some higher level code that
                    are able to skip scanning of this part of the
                    memory" does not inspire confidence. It makes me even
                    more concerned that we do not understand this issue
                    and that there could be other triggers.  It seems
                    like "the higher level code" has a 32M blind spot.
                    Further, I can create scenarios where the GC code
                    takes progressively longer to clear the issue.
                    Presumably it completes the lower block scan during
                    each GC cycle.  Why does it take longer and longer
                    each time (my guess is that it has something to do with
                    the number of objects moved to survivor space)?  If
                    I cache 1 in 100 instead of 1 in 10, then there are
                    11 progressively slower ParNew GCs (after several fast
                    ones) and the issue takes minutes to clear: <br>
                    <br>
                    $ /usr/lib/jvm/java-7-oracle/bin/java -verbose:gc
                    -XX:+PrintGCTimeStamps -XX:+PrintGCDetails
                    -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Xmx4G
                    -Xms4G LongParNewPause 536870384 100 100 <br>
                    0.308: [GC0.308: [ParNew:
                    272640K->3029K(306688K), 0.0113180 secs]
                    796927K->527316K(4160256K), 0.0114040 secs]
                    [Times: user=0.05 sys=0.00, real=0.02 secs] <br>
                    0.367: [GC0.367: [ParNew:
                    275669K->7766K(306688K), 0.0140840 secs]
                    799956K->532053K(4160256K), 0.0141630 secs]
                    [Times: user=0.04 sys=0.01, real=0.02 secs] <br>
                    0.430: [GC0.430: [ParNew:
                    280406K->10057K(306688K), 0.0120810 secs]
                    804693K->534344K(4160256K), 0.0121690 secs]
                    [Times: user=0.05 sys=0.00, real=0.02 secs] <br>
                    0.490: [GC0.490: [ParNew:
                    282697K->15383K(306688K), 0.0128540 secs]
                    806984K->539670K(4160256K), 0.0129390 secs]
                    [Times: user=0.05 sys=0.00, real=0.02 secs] <br>
                    0.551: [GC0.551: [ParNew:
                    288023K->21006K(306688K), 0.0146480 secs]
                    812310K->545294K(4160256K), 0.0147650 secs]
                    [Times: user=0.05 sys=0.01, real=0.01 secs] <br>
                    0.613: [GC0.613: [ParNew:
                    293646K->26805K(306688K), 0.0141250 secs]
                    817934K->551092K(4160256K), 0.0142050 secs]
                    [Times: user=0.05 sys=0.00, real=0.01 secs] <br>
                    0.675: [GC0.675: [ParNew:
                    299445K->24258K(306688K), 0.4816840 secs]
                    823732K->551436K(4160256K), 0.4817620 secs]
                    [Times: user=1.90 sys=0.00, real=0.48 secs] <br>
                    1.205: [GC1.205: [ParNew:
                    296898K->17914K(306688K), 3.4216550 secs]
                    824076K->547788K(4160256K), 3.4217180 secs]
                    [Times: user=13.50 sys=0.00, real=3.42 secs] <br>
                    4.674: [GC4.674: [ParNew:
                    290554K->17433K(306688K), 6.3406820 secs]
                    820428K->550031K(4160256K), 6.3407410 secs]
                    [Times: user=25.00 sys=0.02, real=6.34 secs] <br>
                    11.062: [GC11.062: [ParNew:
                    290073K->17315K(306688K), 10.9493130 secs]
                    822671K->552636K(4160256K), 10.9493890 secs]
                    [Times: user=42.98 sys=0.03, real=10.95 secs] <br>
                    22.059: [GC22.059: [ParNew:
                    289955K->17758K(306688K), 16.0539650 secs]
                    825276K->555798K(4160256K), 16.0540250 secs]
                    [Times: user=62.44 sys=0.09, real=16.05 secs] <br>
                    38.161: [GC38.161: [ParNew:
                    290398K->17745K(306688K), 20.6900820 secs]
                    828438K->558512K(4160256K), 20.6901420 secs]
                    [Times: user=80.79 sys=0.08, real=20.69 secs] <br>
                    58.898: [GC58.898: [ParNew:
                    290385K->21826K(306688K), 24.6781700 secs]
                    831152K->565310K(4160256K), 24.6782710 secs]
                    [Times: user=95.58 sys=0.05, real=24.68 secs] <br>
                    83.625: [GC83.625: [ParNew:
                    294466K->21834K(306688K), 30.3501800 secs]
                    837950K->568044K(4160256K), 30.3502610 secs]
                    [Times: user=115.80 sys=0.11, real=30.35 secs] <br>
                    114.022: [GC114.022: [ParNew:
                    294474K->21836K(306688K), 33.6648800 secs]
                    840684K->570772K(4160256K), 33.6649390 secs]
                    [Times: user=127.38 sys=0.14, real=33.67 secs] <br>
                    147.736: [GC147.736: [ParNew:
                    294476K->18454K(306688K), 38.9828430 secs]
                    843412K->570117K(4160256K), 38.9829040 secs]
                    [Times: user=149.14 sys=0.15, real=38.99 secs] <br>
                    186.766: [GC186.766: [ParNew:
                    291094K->26602K(306688K), 39.8726730 secs]
                    842757K->580977K(4160256K), 39.8727330 secs]
                    [Times: user=156.35 sys=0.11, real=39.88 secs] <br>
                    226.686: [GC226.686: [ParNew:
                    299242K->26612K(306688K), 0.0174470 secs]
                    853617K->583714K(4160256K), 0.0175300 secs]
                    [Times: user=0.06 sys=0.00, real=0.02 secs] <br>
                    <br>
                    2) The suggestion to use only 1 thread is not
                    acceptable to me.  Use parallel GC with one thread?
                    Further, it did not work for me (in fact it made
                    things worse).  "user" time now equals "real"
                    time (and the new generation size is 1/4 as big), so I
                    think the parameter took effect: <br>
                    <br>
                    $ /usr/lib/jvm/java-8-oracle/bin/java -verbose:gc
                    -XX:+PrintGCTimeStamps -XX:+PrintGCDetails
                    -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Xmx4G
                    -Xms4G -XX:ParallelGCThreads=1 LongParNewPause
                    536870384 100 10 <br>
                    0.226: [GC (Allocation Failure) 0.226: [ParNew:
                    68160K->6980K(76672K), 0.0489756 secs]
                    592447K->531267K(4185792K), 0.0490718 secs]
                    [Times: user=0.05 sys=0.00, real=0.05 secs] <br>
                    0.291: [GC (Allocation Failure) 0.291: [ParNew:
                    75140K->8512K(76672K), 29.8707281 secs]
                    599427K->539479K(4185792K), 29.8707954 secs]
                    [Times: user=29.80 sys=0.00, real=29.88 secs] <br>
                    30.180: [GC (Allocation Failure) 30.180: [ParNew:
                    76672K->8093K(76672K), 30.6357699 secs]
                    607639K->545772K(4185792K), 30.6358257 secs]
                    [Times: user=30.57 sys=0.00, real=30.63 secs] <br>
                    60.837: [GC (Allocation Failure) 60.837: [ParNew:
                    76253K->6830K(76672K), 69.2115255 secs]
                    613932K->552169K(4185792K), 69.2115783 secs]
                    [Times: user=69.05 sys=0.01, real=69.22 secs] <br>
                    130.068: [GC (Allocation Failure) 130.068: [ParNew:
                    74990K->8226K(76672K), 132.5118903 secs]
                    620329K->560381K(4185792K), 132.5119459 secs]
                    [Times: user=132.21 sys=0.00, real=132.52 secs] <br>
                    262.603: [GC (Allocation Failure) 262.603: [ParNew:
                    76386K->8512K(76672K), 0.0400138 secs]
                    628541K->568748K(4185792K), 0.0400687 secs]
                    [Times: user=0.04 sys=0.00, real=0.04 secs] <br>
                    <br>
                    3) Applications should not be expected to tune the
                    GC settings to account for the size of one static
                    object.  The connection between the size of this
                    object and the GC performance was not even easy to
                    find.  Further, the parameter
                    ParGCCardsPerStrideChunk is pretty low level and not
                    one I think many application developers are familiar
                    with.<br>
                    <br>
                    Error: VM option 'ParGCCardsPerStrideChunk' is
                    diagnostic and must be enabled via
                    -XX:+UnlockDiagnosticVMOptions.<span class=""><br>
                      Error: Could not create the Java Virtual Machine.<br>
                      Error: A fatal exception has occurred. Program
                      will exit.<br>
                      <br>
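                      (For reference, the flag only takes effect once
                      diagnostic options are unlocked, with
                      -XX:+UnlockDiagnosticVMOptions given before it; the
                      stride value below is just an arbitrary example:)<br>
                      <br>
                      $ /usr/lib/jvm/java-8-oracle/bin/java
                      -XX:+UnlockDiagnosticVMOptions
                      -XX:ParGCCardsPerStrideChunk=4096 -verbose:gc
                      -XX:+PrintGCTimeStamps -XX:+PrintGCDetails
                      -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Xmx4G -Xms4G
                      LongParNewPause 536870384 100 10 <br>
                      <br>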
                    </span> 4) I cannot reproduce this issue in Oracle
                    JDK 6.  This appears to be a regression.  How can fixing
                    a regression be called an enhancement request? <br>
 
                  </div>
                </blockquote>
              </div>
              <br>
            </div>
          </blockquote>
          <br>
        </div>
      </blockquote>
    </blockquote>
    <br>
  </body>
</html>