RFR(M/L): 6484982: G1: process references during evacuation pauses

Tue Sep 13 10:36:20 UTC 2011

Hi John,

Comments inline...

On 2011-09-13 01:42, John Cuthbertson wrote:
> Hi Bengt,
>
> Thanks for the review. Please see the replies inline....
>
> On 09/12/11 06:22, Bengt Rutisson wrote:
>>
>> Hi John,
>>
>> Stefan and I have been looking at this webrev today. Overall it looks 
>> good! This will be a great improvement for G1.
>>
>> We have not quite gotten through all the code yet, but I hope that 
>> we've covered most of the important changes.
>>
>> Here are some comments:
>>
>> referenceProcessor.cpp
>> - Not strictly related to your change, but it would be nice to change 
>> the name of _discoveredSoftRefs to _discoveredRefs and introduce a 
>> separate _discoveredSoftRefs. It is quite confusing that 
>> _discoveredSoftRefs is being used for all reference types.
>>
>
> I like this suggestion but I would rather do the renaming as a 
> separate RFE because the change is already ~1600 lines. We also would 
> need to check every reference to _discoveredSoftRefs and determine if 
> we mean the soft ref lists only or the handle to the entire array 
> list. Most are obvious though.

Sounds fine. Will you file a CR?

>> - Why does iter.make_active() need the pre-barrier? We do 
>> iter.make_referent_alive() just after. That will mark and push the 
>> referent object, right?
>
> make_active() operates on the reference object - not the referent. 
> Reference objects are linked through their _next field and I think we 
> may have to keep the next pending reference object alive. The only way 
> we may have to do this is through the SATB barrier.

OK. We missed that. Thanks for explaining that.

A related question. Do we really need to call make_active() at all? It 
seems like we only enqueue references on the discovered lists if next == 
NULL (see the second if statement in 
ReferenceProcessor::discover_reference()). The next field is not set 
until we put it on the pending list.

>>
>> - ReferenceProcessor::balance_queues() got convoluted by the extra 
>> levels of if statements. A suggestion is to make a  set_discovered 
>> method like this:
>>
>>   set_discovered(oop ref, oop value) {
>>     if (_discovered_list_needs_barrier) {
>>       java_lang_ref_Reference::set_discovered(ref, value);
>>     } else {
>>       java_lang_ref_Reference::set_discovered_raw(ref, value);
>>     }
>>   }
>>
>>   and then us it as:
>>
>>   if (ref_lists[to_idx].head() == NULL) {
>>     // to list is empty. Make a loop at the end.
>>     set_discovered(move_tail, move_tail);
>>   } else {
>>     set_discovered(move_tail, ref_lists[to_idx].head());
>>   }
>>
>>
>
> Good suggestion. Done.

Great! Thanks!

>> g1CollectedHeap.cpp
>>
>> - What does this comment mean? 
>> instanceRefKlass::process_discovered_references does not exist.
>>     // We want to discover references, but not process them yet.
>>     // This mode is disabled in
>>     // instanceRefKlass::process_discovered_references if the
>>     // generation does some collection work, or
>>     // instanceRefKlass::enqueue_discovered_references if the
>>     // generation returns without doing any work.
>>
>
> I think the comment is out of date. It has been in the G1 sources 
> since I as long as I can remember. Removed.
>
>> - The setup of the STW reference processor uses an is_alive_closure. 
>> This seems to be correct and efficient, but why do the other full and 
>> STW reference processors in HotSpot skip this?
>
> The is_alive_non_header is an optimization for collectors that use an 
> external data structure (bitmap) rather than the object's header to 
> record liveness information. Only G1 and CMS have this external data 
> structure. So when we try to 'discover' a reference we can see if its 
> referent is already alive and, if so, skip the 'discovery' and treat 
> the reference as a regular oop.What this buys us is less reference 
> objects to process. The other collectors use an object's mark word to 
> record liveness info during the GC. With these other GCs you can't 
> normally tell if the referent is alive while walking the heap marking 
> (and doing discovery) until the reachable object graph has been walked.

This was a good explanation. Now we understand why it is called 
is_alive_non_header. However, considering that we misunderstood it and 
were not sure if it was required for completeness we think it would be 
good to rename it and/or add comments that this is an optimization.

BTW, wouldn't ParallelOld be able to use this as well? And shouldn't 
young generational collectors be able to filter all references out that 
points to referents outside the collection set?

>
>>
>> - Why is it necessary to go through the references found during CM in 
>> G1CollectedHeap::g1_process_strong_roots?
>>     if 
>> (!_process_strong_tasks->is_task_claimed(G1H_PS_refProcessor_oops_do)) {
>>       // We need to treat the discovered reference lists of the
>>       // concurrent mark ref processor as roots and keep entries
>>       // (which are added by the marking threads) on them live
>>       // until they can be processed at the end of marking.
>>       ref_processor_cm()->weak_oops_do(&buf_scan_non_heap_roots);
>>     }
>>
> Reference objects, currently, can only be discovered by one reference 
> processor and there is currently no mechanism for moving a reference 
> from one RP to another. References discovered by the CM reference 
> processor are processed during remark. So if we have a reference 
> object that has already been discovered by the CM reference processor, 
> which is also in the collection set, then it will not be 'discovered' 
> by the STW reference processor. But the routine 
> ReferenceProcessor::discover_reference() will return false (meaning 
> that the calling code does not need to treat the reference object as a 
> regular oop) and the strongly reachable graph from the reference 
> object will not be walked during the evacuation clause.  So we have to 
> ensure that the reference and it's referent are preserved until 
> marking completes; the CM ref processor will then process the 
> reference. If we had some way of encoding the reference processor 
> instance into the discovered reference object and then also check 
> against that in ReferenceProcessor::discover_reference() then we could 
> do without this code.

Right. The lists will contain references that are in the collection set 
as well as outside. weak_oops_do() will handle the head of the lists and 
all directly reachable references in the collection set. We assume that 
a reference inside the collection set that is linked to from a reference 
outside will be found through the remember set.

>> - The G1_DEBUG define makes the code really difficult to read. Could 
>> this be removed?
>
> OK. It was really helpful while debugging but I suppose it's served 
> it's purpose.
>>
>> - Have not looked in detail at the large new block of code in 
>> g1CollectedHeap.cpp. One question though, would it be possible to 
>> somehow simplify the setup in G1ParTask, G1STWRefProcTaskProxy, 
>> G1ParPreserveCMReferentsTask and process_discovered_references()?
>>
>
> The problem I see is that these are all stack allocated objects so I 
> think the only way to simplify this would be to have a stack allocated 
> wrapper object - G1ParCopyInitializer(.....). Can you think of an 
> alternative way? I'm not sure a consensus could be reached about this.

We think it would be worth trying a wrapper object out. Just to see if 
it looks better.

>> - Did you do any tests with -XX:RefDiscoveryPolicy=1?
>>
> No. Ramki has said that it has bit-rotted and he wishes to remove it.

Interesting. Is there a CR to remove the ReferentBasedDiscovery policy?

>>
>> Some minor things:
>>
>> referenceProcessor.hpp
>> - Would be nice if subclasses_of_ref() could be called something with 
>> "number_of" or similar. That indicates better how it is being used.
>
> OK. Done.
>
>>
>> thread.cpp
>> - only bracket changes. Don't include in thins change.
>
> I had debugging code in here and so left the 'correctly' formatted 
> code after removing the debugging statements. Since I've already been 
> thanked for doing this - I would leave it.

Sorry for being persistent but we think this is not a good enough reason 
to leave it in.

>>
>> referenceProcessor.cpp
>> - The code that is commented out in 
>> ReferenceProcessor::weak_oops_do() and 
>> ReferenceProcessor::clean_up_discovered_references() could probably 
>> be removed by now.
>
> Possibly - but again I would leave it in. I think it nicely explains 
> the format of _discoveredSoftRefs and a possible alternative (and 
> correct) implementation if a particular C++ compiler is able to unroll 
> the nested loop.

Of the 7 places that use "_max_num_q * subclasses_of_ref" in for loops 
there are only two that have the comment. It would probably be better to 
add a comment where the datastructure is set up.

The comment is aslo confusing since it says "Should this instead be". 
Which we interpret as a question regarding the quality of the code below it.

>>
>> - You removed inline from void DiscoveredListIterator::remove(), will 
>> that impact performance?
>
> Possibly. The include dependencies are such that I've been unable to 
> move these routines into referenceProcessor.hpp or a 
> referenceProcessor.inline.hpp as compilation errors occur when I 
> include javaClasses.hpp and systemDictionary.hpp.

OK.

>>
>> g1CollectedHeap.cpp
>> - What does "ser" stand for in "stw_rp_disc_ser"?
>
> Serial. We've made discovery for the STW reference processor non MT.
>
>>
>> - Would be nice to move these two lines into enable_discovery(). (May 
>> need a parameter to disable the check for NoRefDiscovery to work 
>> properly.)
>>
>>   assert(!ref_processor_stw()->discovery_enabled(), "Precondition");
>>   ref_processor_stw()->verify_no_references_recorded();
>>
> OK. Again I would rather this was a separate change for precisely the 
> reason you mention.

We think it fits well into this change. But if you don't want to do it 
now, could you please file a CR to make sure that we don't forget to do it?

>> - Are these asserts really necessary?
>>   void G1ParScanThreadState::trim_queue() {
>>     assert(_evac_cl != NULL, "not set");
>>     assert(_evac_failure_cl != NULL, "not set");
>>     assert(_partial_scan_cl != NULL, "not set");
>
> Ensuring that the PSS was fully initialized before calling these 
> closures was very helpful. Are they necessary? No. Are they defensive? 
> Yes. I vote to leave them in.

Ok, but if you add the G1ParCopyInitializer(.....) discussed above the 
risk of having uninitialized closures would be minimal. Since the 
closures are not being explicitly used in trim_queue() it seems strange 
to assert their values here.

>> - Maybe G1KeepAliveClosure should be called something else. It does 
>> not keep objects alive, it just adjusts the pointer to the forwarded 
>> location.
>
> I like the "keep alive" part of the name as it it ties in with being 
> used as the "keep_alive" parameter. G1UpdatePtrKeepAliveClosure is 
> perhaps a more accuarate, if verbose, name. Unless you have a strong 
> objection (or a really good alternative name) I vote to leave it.
>>
>>
>> g1CollectedHeap.hpp:
>> - This comment is a bit confusing:
>>   //   Will enqueue any non-live discovered references on the
>>   //    STW ref processor's discovered lists.
>>
>>  Maybe "Will enqueue any non-live discovered reference that the STW 
>> ref processor discovered.
>
> OK. Changed:
>
>  // At the end of a full GC we:
>  //  * Enqueue any reference objects discovered by the STW ref processor
>  //    that have non-live referents. This has the side-effect of
>  //    making the STW ref processor inactive by disabling discovery.

Great. Thanks!

>>
>> heapRegion.hpp
>> - I guess #if WHASSUP should be removed...
>>
>>
>
> Are you referring to line 593? I thought I had removed it.I can't see 
> it in my workspace.

Ah. Yes sorry. Got a bit confused at the end of the review. Thanks for 
removing this!

Stefan and Bengt