<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#ffffff" text="#000000">

    Hi John -- i looked at the webrev again and the new changes look

    fine to me.<br>

    I did have some comments regarding one issue which I had previously<br>

    convinced myself was correct, but I have forgotten the reasoning, so

    perhaps<br>

    you can help jog my memory...<br>

    <br>

    On 9/12/2011 4:42 PM, John Cuthbertson wrote:<br>

    <blockquote cite="mid:4E6E98DD.7000706@oracle.com" type="cite">On

      09/12/11 06:22, Bengt Rutisson wrote:

      <br>

      ...<br>

      <blockquote type="cite">- Why is it necessary to go through the

        references found during CM in

        G1CollectedHeap::g1_process_strong_roots?

        <br>

            if

        (!_process_strong_tasks->is_task_claimed(G1H_PS_refProcessor_oops_do))

        {

        <br>

              // We need to treat the discovered reference lists of the

        <br>

              // concurrent mark ref processor as roots and keep entries

        <br>

              // (which are added by the marking threads) on them live

        <br>

              // until they can be processed at the end of marking.

        <br>

        ref_processor_cm()->weak_oops_do(&buf_scan_non_heap_roots);

        <br>

            }

        <br>

        <br>

      </blockquote>

      Reference objects, currently, can only be discovered by one

      reference processor and there is currently no mechanism for moving

      a reference from one RP to another. References discovered by the

      CM reference processor are processed during remark. So if we have

      a reference object that has already been discovered by the CM

      reference processor, which is also in the collection set, then it

      will not be 'discovered' by the STW reference processor. But the

      routine ReferenceProcessor::discover_reference() will return false

      (meaning that the calling code does not need to treat the

      reference object as a regular oop) and the strongly reachable

      graph from the reference object will not be walked during the

      evacuation clause.  So we have to ensure that the reference and

      it's referent are preserved until marking completes; the CM ref

      processor will then process the reference. If we had some way of

      encoding the reference processor instance into the discovered

      reference object and then also check against that in

      ReferenceProcessor::discover_reference() then we could do without

      this code.

      <br>

    </blockquote>

    <br>

    I agree with the reasoning you provide above, but i wonder now if

    your<br>

    code does do so today in its current form. Consider the following

    scenario:-<br>

    <br>

    The CM reference processor has in its discovered list 3 reference

    objects A->B->C->D.<br>

    Let us say that A, B and D are not in the collection set and C

    happens to be<br>

    in the collection set of a partially young collection.<br>

    <br>

    Now the code above will check the head of this particular discovered

    list and<br>

    find that it's pointing to an object not in the discovered list and

    will (if i understand<br>

    the code correctly) immediately stop. C, which is a Reference object

    that was copied and<br>

    assumed to have been discovered, and thus did not have its referent

    and next fields<br>

    scanned can end up with dangling pointers if those objects were also

    unlucky enough to<br>

    be in the collection set.<br>

    <br>

    Am I missing something here that prevents this scenario (or allows

    one to scan<br>

    and preserve the objects that C refers to by some other means)?<br>

    <br>

    I am pretty sure we had discussed this before and I had convinced

    myself that this worked at<br>

    that time, but I am having a hard time remembering what the

    reasoning was. (If nothing<br>

    else, that reasoning should be written into the comment here for a

    future time when we<br>

    would have forgotten this email thread and your response to my

    question :-)<br>

    <br>

    Now, if my fears are correct that the above code would sometimes

    fail to preserve objects that<br>

    we need to, we'd instead need to walk down the<b> entire</b> length

    of each discovered list<br>

    preserving their contents, and for very large old generations that

    has the potential<br>

    of becoming a serious scaling bottleneck (although clearly no worse

    asymptotically<br>

    than the remark pause which would need to do work that is

    proportional to the same value)<br>

    -- i just fear that one is a partially young pause of which there

    may be potentially<br>

    many for each remark pause in a concurrent cycle.<br>

    <br>

    What if, instead, one realized that this is necessary only for

    partially young collections<br>

    and one then treated any Reference object coming from a non-young

    region<br>

    normally. That is, just like other collectors' (young) reference

    processors do,<br>

    G1's STW young reference processor would never discover an object

    that was not<br>

    in the young gen. In that case, it would just copy all of (in my

    example above)<br>

    C's followers at the time it copied C, and we would not need to do

    anything special<br>

    to preserve C's followers.<br>

    This also seems more naturally analogous to the "span" containment

    checks<br>

    that appear in the other collectors. So, while the exact mechanics

    of how to<br>

    embed this containment predicate into the reference processor

    (replacing or<br>

    generalizing the current role of span without compromising

    performance for<br>

    all the other collectors) can be worked out, it appears as though

    conceptually<br>

    this may lead to more uniform (wrt other existing collectors) and

    simpler code.<br>

    <br>

    And a comment regarding ...<br>

    <blockquote cite="mid:4E6E98DD.7000706@oracle.com" type="cite">

      <blockquote type="cite">- Did you do any tests with

        -XX:RefDiscoveryPolicy=1?

        <br>

        <br>

      </blockquote>

      No. Ramki has said that it has bit-rotted and he wishes to remove

      it.

      <br>

    </blockquote>

    <br>

    I'll file a CR to remove it, if one does not already exist.<br>

    I'll check on the hotspot-gc-use list that no one uses it (if they

    do,<br>

    they must be running buggy code;  I do seriously believe this has

    organically bit-rotted...)<br>

    We might have to do the full CCC monty on this one, though, before

    we can<br>

    remove it (or we must fix it). I'll look into it and update.<br>

    <br>

    -- ramki<br>

    <br>

    <br>

  </body>

</html>