RFR (L): 8230706: Waiting on completion of strong nmethod processing causes long pause times with G1
    Thomas Schatzl 
    thomas.schatzl at oracle.com
       
    Sat Oct 19 13:06:09 UTC 2019
    
    
  
Hi all,
  there is a new webrev at
http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2/ (full only;
there is no point in providing a diff). I like this solution a lot
because it removes a lot of additional post-processing.
Testing has been a bit of a headache: interference between strong and
weak processing is extremely rare, so I had to make it pretty common by
1) having only a single thread do strong processing
2) moving the weak processing stage right after the root processing so
that the two overlap with a much higher probability
hs-tier 1-5 passes with and without these changes, with a noticeable
amount of overlap according to additional log messages. That change is
available at
http://cr.openjdk.java.net/~tschatzl/8230706/webrev.2.testing/ .
Obviously I am not going to push this.
Surprisingly, no changes to Shenandoah were necessary as it does not
use the claim mechanism changed here, implementing something else.
Shenandoah also passed vmTestbase/gc with these changes with no
problems.
Below this email is a copy of Kim's suggestion about the state machine
again for reference. I also added documentation about why and how the
code is supposed to work.
Thanks,
  Thomas
On Wed, 2019-10-09 at 17:23 -0400, Kim Barrett wrote:
> > On Oct 8, 2019, at 7:48 PM, Kim Barrett <kim.barrett at oracle.com>
> > wrote:
> > src/hotspot/share/gc/g1/g1CollectedHeap.cpp
> > 3874   if (collector_state()->in_initial_mark_gc()) {
> > 3875     remark_strong_nmethods(per_thread_states);
> > 3876   }
> > 
> > I think this additional task and the associated pending strong
> > nmethod sets in the pss can be eliminated by using a 2-bit tag and
> > a more complex state machine earlier.
> 
> I thought about this some more and have some improvements to the
> previous pseudo-code, including eliminating the loop in
> strong_processor.  More careful consideration of the possible states
> showed them to be more limited than I'd previously thought they were.
> I hadn't noticed the benefit from delaying weak_processor's push onto
> the global list and combining it with the transition to the "weak
> done" state.
> 
> States, encoded in the link member of nmethod N:
> - unclaimed: NULL
> - weak: N, tag 00
> - weak done: NEXT, tag 01
> - weak, need strong: N, tag 10
> - strong: NEXT, tag 11
> 
> where NEXT is the next nmethod in the global list, or N if it is the
> last entry, e.g. self-loop indicates end of list.
> 
> weak_processor(n):
>     if n->link != NULL:
>         # already claimed; nothing to do here.
>         return
>     elif not replace_if_null(tagged(n, 0), &n->link):
>         # just claimed by another thread; nothing to do here.
>         return
>     # successfully claimed for weak processing.
>     assert n->link == tagged(n, 0)
>     do_weak_processing(n)
>     # push onto global list.  self-loop end of list to avoid
>     # tagged NULL.
>     # not pushing onto global list until ready to mark weak
>     # processing done significantly simplifies the set of states.
>     next = xchg(n, &_list_head) 
>     if next == NULL: next = n 
>     # try to install end of list + weak done tag.
>     if cmpxchg(tagged(next, 1), &n->link, tagged(n, 0)) == tagged(n, 0):
>         return
>     # failed, which means some other thread added strong request.
>     assert n->link == tagged(n, 2)
>     # do deferred strong processing.
>     n->link = tagged(next, 3)
>     do_strong_processing(n)
> 
> strong_processor(n):
>     raw_next = cmpxchg(tagged(n, 3), &n->link, NULL)
>     if raw_next == NULL:
>         # successfully claimed for strong processing.
>         do_strong_processing(n)
>         # push onto global list.  self-loop end of list to avoid
>         # tagged NULL.
>         next = xchg(n, &_list_head)
>         if next == NULL: next = n
>         n->link = tagged(next, 3)
>         return
>     # claim failed.  figure out why and handle it.
>     next = strip_tag(raw_next)
>     if raw_next == next:          # (raw_next - next) == 0
>         # claim failed because being weak processed
>         # (state == "weak").
>         # try to request deferred strong processing.
>         assert next == tagged(n, 0)
>         raw_next = cmpxchg(tagged(n, 2), &n->link, next)
>         if (raw_next == next):
>             # successfully requested deferred strong processing.
>             return
>         # failed because of a concurrent transition.
>         # no longer in "weak" state.
>         next = strip_tag(raw_next)
>     if (raw_next - next) >= 2:
>         # already claimed for strong processing or requested for
>         # such.
>         return
>     # weak processing is complete.
>     # raw_next: tag == 1, next == next list entry or n
>     if cmpxchg(tagged(next, 3), &n->link, raw_next) == raw_next:
>         # claimed "weak done" to "strong".
>         do_strong_processing(n)
>     # if claim failed then some other thread got it.
> 
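
For readers who want to experiment with the state machine quoted above,
here is a minimal standalone C++ sketch of it. This is not the code in
the webrev: it uses std::atomic with the default sequentially consistent
ordering instead of HotSpot's Atomic, and Node, Tag, list_head, push,
tag_of and the weak_done/strong_done counters are hypothetical names
that stand in for nmethod, the tag values and the actual processing.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an nmethod; only the link word matters here.
struct Node {
    std::atomic<uintptr_t> link{0};  // 0 == unclaimed
    int weak_done = 0;               // stands in for do_weak_processing
    int strong_done = 0;             // stands in for do_strong_processing
};
static_assert(alignof(Node) >= 4, "the low two bits must be free for the tag");

// Tag values as listed in the quoted state table.
enum Tag : uintptr_t { Weak = 0, WeakDone = 1, WeakNeedStrong = 2, Strong = 3 };

static std::atomic<Node*> list_head{nullptr};

static uintptr_t tagged(Node* n, uintptr_t tag) {
    return reinterpret_cast<uintptr_t>(n) | tag;
}
static uintptr_t tag_of(uintptr_t raw) { return raw & 3; }
static Node* strip_tag(uintptr_t raw) {
    return reinterpret_cast<Node*>(raw & ~uintptr_t(3));
}

// Push n onto the global list. Returns the previous head, or n itself
// if the list was empty (self-loop marks the end of the list).
static Node* push(Node* n) {
    Node* next = list_head.exchange(n);
    return next == nullptr ? n : next;
}

void weak_processor(Node* n) {
    uintptr_t expected = 0;
    if (!n->link.compare_exchange_strong(expected, tagged(n, Weak))) {
        return;  // already claimed by some thread
    }
    n->weak_done++;
    Node* next = push(n);
    expected = tagged(n, Weak);
    if (n->link.compare_exchange_strong(expected, tagged(next, WeakDone))) {
        return;  // "weak" -> "weak done"
    }
    // Another thread requested deferred strong processing meanwhile.
    assert(expected == tagged(n, WeakNeedStrong));
    n->link.store(tagged(next, Strong));
    n->strong_done++;
}

void strong_processor(Node* n) {
    uintptr_t raw = 0;
    if (n->link.compare_exchange_strong(raw, tagged(n, Strong))) {
        // "unclaimed" -> "strong"
        n->strong_done++;
        n->link.store(tagged(push(n), Strong));
        return;
    }
    // Claim failed; raw now holds the link value that was observed.
    if (tag_of(raw) == Weak) {
        // Weak processing in progress: request deferred strong processing.
        uintptr_t expected = raw;
        if (n->link.compare_exchange_strong(expected, tagged(n, WeakNeedStrong))) {
            return;
        }
        raw = expected;  // concurrent transition, no longer "weak"
    }
    if (tag_of(raw) >= WeakNeedStrong) {
        return;  // strong processing already claimed or requested
    }
    // "weak done": try to upgrade to "strong", keeping the list link.
    if (n->link.compare_exchange_strong(raw, tagged(strip_tag(raw), Strong))) {
        n->strong_done++;
    }
    // If this last claim failed, some other thread got it.
}
```

Calling weak_processor and then strong_processor on the same node
exercises the "weak done" to "strong" upgrade; storing tagged(&n, Weak)
into a fresh node's link before calling strong_processor simulates a
weak processor that is still running and leaves the node in the "weak,
need strong" state.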
    
    