RFR (S): 8182703: Correct G1 barrier queue lock orderings

Mon Jul 17 08:49:45 UTC 2017

Hi Kim,

Thank you for the review!
I have some comments though...

On 2017-07-17 02:33, Kim Barrett wrote:
>> On Jul 11, 2017, at 8:07 AM, Erik Österlund <erik.osterlund at oracle.com> wrote:
>>> This suggests a potential (though seemingly hard to avoid) fragility
>>> resulting from the lowered lock rank.
>> Note that this does not matter for JavaThreads (including compiler threads), for concurrent refinement threads or concurrent marking threads, nor does it matter for any thread when marking is not active.
>>
>> So it seems to me that the worst consequence of this is possibly worse latency for operations coinciding in time with concurrent marking, that have large amounts of mutations or resurrections, and are not performed by JavaThreads (including compiler threads) or GC threads (that are performing the concurrent marking) or concurrent refinement threads (that have nothing to do with SATB), that are running concurrently with each other.
>>
>> That does not seem to be a huge problem in my book. If it was, and an unknown bunch of non-JavaThreads are heavily mutating or resurrecting objects concurrent to marking, such that contention is inflicted on the shared queue lock for the shared SATB queue, then the right solution for that seems to be to give such threads their own local queue, rather than to reduce the time spent under the surprisingly hot shared queue lock.
> I think this part of the reply misses my point, though later
> discussion is on the right track.
>
> The rank for any locks in the filtering or mutator assist code can be
> anything not higher than the CBL lock ranks, since filtering and
> mutator assist are invoked in related contexts.  Any locks in the
> filtering code must be lower than the shared queue lock ranks.
>
> Reducing the CBL and shared queue ranks to allow them to be locked in
> more contexts implicitly imposes additional requirements on the
> filtering and mutator assist code, especially the latter, which is not
> presently invoked while holding the shared queue lock.  Code which
> would have been "easily" safe before this change may now be not so
> easy, or may even be broken.  In this discussion we've already
> identified two places that require further repair before we can start
> taking advantage of these reduced lock ranks.  And future changes in
> those areas may be more difficult than with the old lock ranks.

1) I agree - more work is needed to free the unethically caged heap oop 
store. Some constraints have been removed, but there are a few more.
2) I disagree that we can not already take immediate advantage of this. 
My main problem is the SATB queues required for the weak oop load 
barriers in hotspot. They are now free, and therefore I can take 
immediate advantage of these changes.
3) I think that to the greatest extent possible, lock ranks should 
follow the way we intend to lock, rather than letting the ranks affect 
the way we lock. The deadlock detection system was designed to have 
false positives. Therefore we should first figure out if we have a true 
possible deadlock or a false positive. In case of false positives, I 
think we should try pretty hard not to compromise solid locking schemes 
in order to fight false positives of the deadlock detection system. I 
understand this is sometimes difficult, but I think it is a good idea in 
general.

> But since I agree with the rationale for reducing the ranks of these
> locks, it seems we need to accept these additional costs (some known
> additional work needed, and restrictions on future changes).  But we
> should remember these costs exist (RFEs for the additional work, maybe
> some comments on the filtering and mutator assist API functions
> discussing the issue).

I am glad we agree here. I will file RFEs.

>>> The present SATB filtering doesn't seem to acquire any locks, but it's
>>> a non-trivial amount of code spread over multiple files, so would be
>>> easy to miss something or break it in that respect.  Reducing the lock
>>> ranks requires being very careful with the SATB filtering code.
>> IMO, adding any lock into the SATB barrier which is used all over hotspot in some very shady places arguably requires being very careful regardless of my changes. So I am going to assume whoever does that for whatever reason is going to be careful.
>>
>>> The "mutator" help for dirty card queue processing is not presently
>>> done for the shared queue, but I think could be today.  I'm less sure
>>> about that with lowered queue lock ranks; I *think* there aren't any
>>> relevant locks there (other than the very rare shared queue lock in
>>> refine_card_concurrently), but that's a substantially larger and more
>>> complex amount of code than SATB queue filtering.
>> As discussed with Thomas earlier in this thread, there are indeed locks blocking this. The HeapRegionRemSet::_m lock is currently a leaf lock. If collaborative refinement was to be performed on non-Java threads (and non-concurrent refinement threads), then this lock would have to decrease to the access rank first. But we concluded that warrants a new RFE with separate analysis.
>>
>> As with the SATB queues though, I do not know what threads would be causing such trouble? It is not JavaThreads (including compiler threads), concurrent refinement threads, concurrent marking threads. That does not leave us with a whole lot of threads to cause that contention on the shared queue lock. And as with the SATB queues, if there are such threads that cause such contention on the shared queue lock, then the right fix seems to be to give them their own local queue and stop taking the shared queue lock in the first place.
> A native thread copying a jweak to a (strong) jobject uses the shared
> queue.  I don't think we're going to fix that by giving native threads
> their own queues.

I am not sure what threads you are referring to here. But I guess that 
is okay.

> A Java thread calls into C++, takes a low-rank lock, and while holding
> that lock touches a queue.  Everything in the queue touching needs to
> be ranked lower than that lock, including filter and mutator assist
> code.  That this isn't permitted today is beside the point; this seems
> to me to be exactly the sort of situation this change is intended to
> permit.

As mentioned earlier, I specifically need the SATB enqueue barriers to 
be free. I want the heap oop store to be free too, but that is not 
blocking me.

> Since I think the rank reductions are a necessary (though not sufficient)
> step, call it Reviewed.

Thank you for the review.

/Erik