RFR(s): 8029255: G1: Reference processing should not enqueue references on the shared SATB queue

Per Liden per.liden at oracle.com
Mon Dec 2 13:44:14 UTC 2013


Hi Thomas, thanks for looking at this.

Comments inline below.

On 11/29/2013 12:37 PM, Thomas Schatzl wrote:
> Hi,
>
>    some comments:
>
> On Thu, 2013-11-28 at 17:50 +0100, Per Liden wrote:
>> Summary: A side effect of fixing JDK-8029162 (G1: Shared SATB queue
>> never enabled) is that the reference processor will start to add
>> references to the shared SATB queue when enqueuing discovered
>> references. This will later cause ConcurrentMark::verify_no_cset_oops()
>> to fail because we now have enqueued references in the SATB qeueue which
>> points into the collection set (which will be empty after the
>> collection). This patch makes sure we avoid doing pre-barriers, by
>> instead doing raw stores followed by a post-barrier.
>>
>> This patch also removes an unused constructor and the unnecessary
>> caching of the barrier set in the ReferenceProcessor. So use of _bs was
>> replaced by oopDesc::bs(). Further, the cached barrier set was only
>
> Not using the cached values may cause performance problems if multiple
> threads access it and there is a frequently modified global in the same
> cache line.
> I do not think/hope it is a problem here though.

Agreed. Since oopDesc::bs() is used inside oop_store() so we're already 
using it in the reference processor in a number of places. Also note 
that the cached barrier set could also suffer from false sharing as 
nothing prevents ReferenceProcessor::_bs from living in a cacheline 
which isn't also heavily modified by other threads.

>
>> set/used if discovered_list_needs_barrier was true, but with this change
>> we will need the barrier set in other cases too.
>>
>> Testing done: jprt, kitchensink (10 hours), gcbasher (10 hours)
>>
>> http://cr.openjdk.java.net/~pliden/8029255/webrev.0/
>>
>> https://bugs.openjdk.java.net/browse/JDK-8029255
>>
>
> I think I understand the reason for the change, but I do not think the
> change is good.
>
> Background (for others too, it took me a while to understand the
> reference processing code):
> The problem why the code cannot use the normal barrier code in all these
> cases is that collection uses the next and discovered fields of
> java.lang.Reference for its own purpose, putting references from the
> cset into the queues.
>
> In this process we loose the previous values too, that's why the code
> cannot just apply the normal barrier - before activating the shared
> reference queue in 8029162 it was not a problem because the "old"
> references were dropped immediately when trying to add.
>
> It seems that this is not an issue in regards to SATB as these fields
> are handled somewhat specially always anyway.
>
> I would prefer if setting the "next" field of the reference were
> encapsulated into a method like ReferenceProcessor::set_discovered (e.g.
> ReferenceProcessor::set_next()), and used instead of inlining
> set_raw/apply-post-barrier code all the time.

That would be nice, but as far as I can see all three cases where some 
sort of set_next is done all have different barrier needs, so it I'm not 
sure it would be a simplification in the end. Same is true for 
ReferenceProcessor::set_discovered, it's only used in two places but we 
do set the discovered field in many other paths, but those paths have 
different needs (set by oopDesc::atomic_exchange_oop, with/without 
barriers, etc). I'd almost suggest that we remove 
ReferenceProcessor::set_discovered to avoid any confusion. But actually, 
I'd rather touch as little as possible of this code at this stage, as I 
have the feeling I might not understand of all aspects of it and 
cleaning it up is a fairly substantial project in itself. The changes 
I've done should only affect G1 as it's only a matter of removing some 
pre-barriers.

>
> There is already a missing call to the post-barrier in
> referenceProcessor.cpp:366  when storing the self-loop value (That is
> benign since it does not add a new object to the object graph, but if
> that is kept, the reason should be documented).

Yes, will add a comment to clarifying that. The post-barrier was left 
out intentionally as it would just end up being a noop.

>
> Other issues:
> - (this is more a question than an issue)
> DiscoveredListIterator::make_active() has some special handling for
> setting the next field of java.lang.Reference. In particular it does not
> call the post-barrier because this will fail CT verification in G1.
> What's the difference between that field and the discovered field in
> that respect? Also the original code executed the post-barrier on the
> next field (e.g. referenceProcessor.cpp:366), so if there were a
> problem, it should already occur, shouldn't it?

Good question. Looking at InstanceRefKlass::update_nonstatic_oop_maps() 
where the oop map is set to ignore all fields except "queue", so I'm not 
sure I understand why any post-barriers would be needed as the dirty 
card would basically be ignored anyway, no?

>
> - there are places where the code for
> ReferenceProcessor::set_discovered() is inlined for no apparent reason
> (e.g. at least referenceProcessor.cpp:374, and
> referenceProcessor.cpp:1256). Could this change (or the one proposed
> below) also clean this up?

Note that ReferenceProcessor::set_discovered() has slightly different 
semantics as it only does a barrier if _discovered_list_needs_barrier is 
enabled, which it is only for G1's CM reference processor.

>
> - with the change in 8029162 I do not think the code that is enabled by
> setting _pending_list_uses_discovered_field to false works any more (if
> it ever worked with G1, see the very first thought in this list). It
> will trigger the same assertion that this fix wants to fix.
> If the "old behavior" is not needed any more (I think so), please remove
> that code first in another CR. The alternative is to fix it I guess.

Yes, I concluded that this code is never used anymore, so I ignored it 
in this change and added it to the list of "things to cleanup/remove". I 
can register a separate CR for cleaning that out.

/Per




More information about the hotspot-gc-dev mailing list