RFR (S): 8182703: Correct G1 barrier queue lock orderings

Mon Jul 17 00:33:38 UTC 2017

> On Jul 11, 2017, at 8:07 AM, Erik Österlund <erik.osterlund at oracle.com> wrote:
>> This suggests a potential (though seemingly hard to avoid) fragility
>> resulting from the lowered lock rank.
> 
> Note that this does not matter for JavaThreads (including compiler threads), for concurrent refinement threads or concurrent marking threads, nor does it matter for any thread when marking is not active.
> 
> So it seems to me that the worst consequence of this is possibly worse latency for operations coinciding in time with concurrent marking, that have large amounts of mutations or resurrections, and are not performed by JavaThreads (including compiler threads) or GC threads (that are performing the concurrent marking) or concurrent refinement threads (that have nothing to do with SATB), that are running concurrently with each other.
> 
> That does not seem to be a huge problem in my book. If it was, and an unknown bunch of non-JavaThreads are heavily mutating or resurrecting objects concurrent to marking, such that contention is inflicted on the shared queue lock for the shared SATB queue, then the right solution for that seems to be to give such threads their own local queue, rather than to reduce the time spent under the surprisingly hot shared queue lock.

I think this part of the reply misses my point, though later
discussion is on the right track.

The rank for any locks in the filtering or mutator assist code can be
anything not higher than the CBL lock ranks, since filtering and
mutator assist are invoked in related contexts.  Any locks in the
filtering code must be lower than the shared queue lock ranks.

Reducing the CBL and shared queue ranks to allow them to be locked in
more contexts implicitly imposes additional requirements on the
filtering and mutator assist code, especially the latter, which is not
presently invoked while holding the shared queue lock.  Code which
would have been "easily" safe before this change may now be not so
easy, or may even be broken.  In this discussion we've already
identified two places that require further repair before we can start
taking advantage of these reduced lock ranks.  And future changes in
those areas may be more difficult than with the old lock ranks.

But since I agree with the rationale for reducing the ranks of these
locks, it seems we need to accept these additional costs (some known
additional work needed, and restrictions on future changes).  But we
should remember these costs exist (RFEs for the additional work, maybe
some comments on the filtering and mutator assist API functions
discussing the issue).

>> The present SATB filtering doesn't seem to acquire any locks, but it's
>> a non-trivial amount of code spread over multiple files, so would be
>> easy to miss something or break it in that respect.  Reducing the lock
>> ranks requires being very careful with the SATB filtering code.
> 
> IMO, adding any lock into the SATB barrier which is used all over hotspot in some very shady places arguably requires being very careful regardless of my changes. So I am going to assume whoever does that for whatever reason is going to be careful.
> 
>> The "mutator" help for dirty card queue processing is not presently
>> done for the shared queue, but I think could be today.  I'm less sure
>> about that with lowered queue lock ranks; I *think* there aren't any
>> relevant locks there (other than the very rare shared queue lock in
>> refine_card_concurrently), but that's a substantially larger and more
>> complex amount of code than SATB queue filtering.
> 
> As discussed with Thomas earlier in this thread, there are indeed locks blocking this. The HeapRegionRemSet::_m lock is currently a leaf lock. If collaborative refinement was to be performed on non-Java threads (and non-concurrent refinement threads), then this lock would have to decrease to the access rank first. But we concluded that warrants a new RFE with separate analysis.
> 
> As with the SATB queues though, I do not know what threads would be causing such trouble? It is not JavaThreads (including compiler threads), concurrent refinement threads, concurrent marking threads. That does not leave us with a whole lot of threads to cause that contention on the shared queue lock. And as with the SATB queues, if there are such threads that cause such contention on the shared queue lock, then the right fix seems to be to give them their own local queue and stop taking the shared queue lock in the first place.

A native thread copying a jweak to a (strong) jobject uses the shared
queue.  I don't think we're going to fix that by giving native threads
their own queues.

A Java thread calls into C++, takes a low-rank lock, and while holding
that lock touches a queue.  Everything in the queue touching needs to
be ranked lower than that lock, including filter and mutator assist
code.  That this isn't permitted today is beside the point; this seems
to me to be exactly the sort of situation this change is intended to
permit.

Since I think the rank reductions are a necessary (though not sufficient)
step, call it Reviewed.