RFR: 8221102: Allow GC threads to participate in threads claiming protocol

Roman Kennke rkennke at redhat.com
Thu Mar 21 22:55:46 UTC 2019


Hi Kim,

> Please review this change to the thread claiming protocol for parallel
> iteration over threads. Previously, the protocol used a 1bit claim
> counter plus zero as a "never claimed" marker. The new protocol uses a
> uintx counter, while retaining zero as a "never claimed" marker.
> 
> JDK-8219613 changed use use of the claiming protocol, allowing all
> threads to be included and claimed by a parallel iteration.  Doing so
> introduced a bug.
> 
> Not all parallel iteration paths visit every thread. In particular,
> possibly_parallel_threads_do (PPTD) only visits Java threads and the
> VM thread. Using both all-thread parallel iterations and PPTD could
> lead to some threads being improperly skipped, because of the 1bit
> claim counter. For example, if an all-thread iteration is followed by
> an even number of PPTD then another all-thread iteration, the second
> all-thread iteration will incorrectly skip NonJavaThreads other than
> the VMThread.
> 
> (Testing these changes with an instrumented claim function that
> detected cases that would be in trouble with the old 1bit counter
> found *many* occurrences.  It is somewhat surprising that this doesn't
> seem to have produced lots of failures since JDK-8219613.)
> 
> We address this by expanding the claim counter to a uintx-sized
> counter. There is still a theoretical possibility of collision, which
> we deal with by resetting the claim value of all threads back to zero
> when the global counter overflows. (On a 32bit platform, overflow of
> the claim counter probably takes months. And even that is likely more
> often than needed, since a collision needs MAX_UINTX-1 PPTD iterations
> between all-thread iterations, which isn't how iterations are used.)
> 
> Some renaming was done to generalize from the old "parity"
> nomenclature, and because it's not just about "oops_do" (and wasn't
> before this change). Hopefully there won't be *too* much bike-shedding
> on the names.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8221102
> 
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8221102/open.00/
> 
> Testing:
> mach5 tier1-5.

This would work.

I am going back and forth with this in my head. I believe the simpler, 
more usable and perhaps less surprising approach might be to simply 
always iterate all threads, and do not even let other code use the 
claiming protocol, and care about its idiosyncrasies:

http://cr.openjdk.java.net/~rkennke/JDK-8221102/webrev.01/

This way the claim token would always go in lock-step. The overhead of 
iterating a few more non-Java-threads (somewhat likely doing nothing) 
seems not very significant.

This could probably be taken a little further and hiding more of the 
claiming protocol. Thread::claim_oops_do() doesn't exactly need to be 
public anymore. Thread::thread_claim_parity() doesn't appear to be used 
anywhere outside Thread, except for an assert which could be done 
somehow else, and change_thread_claim_parity() could be done via friend 
classes or so.

I don't have a strong preference for your vs my solution, except that I 
want this bug fixed. :-) Just wanted to throw it out so that others can 
chime into the discussion.

Roman


More information about the hotspot-dev mailing list