RFR: JDK-8220671: Initialization race for non-JavaThread PtrQueues

Roman Kennke rkennke at redhat.com
Mon Mar 25 13:26:27 UTC 2019


This thread went a bit off. May I propose this for review:

http://cr.openjdk.java.net/~rkennke/JDK-8220671/webrev.07/

It passes tier1 tests locally, and I submitted it to jdk/submit, but that
seems to have other hiccups.

WDYT?

Roman

> In Shenandoah testing we discovered an initialization race: A non-Java
> GC thread (we have observed it on the StringDedupThread) may be
> initialized concurrently while Java and GC are already up and running,
> but does not (yet) participate in safepointing.
> 
> BS::on_thread_attach() usually propagates global GC state to
> thread-local GC state, in this case the SATB active flag.
> 
> When doing this concurrently, while not participating in safepointing,
> this may propagate the wrong state, and subsequently lead to heap
> corruption (e.g. because we missed some SATB updates).
> 
> This is related to JDK-8219613 because before that change,
> non-Java-threads would simply use a shared SATB queue instead.
> 
> The bug appeared in Shenandoah testing, but I don't see why it wouldn't
> affect G1 too. G1 is probably not run with aggressive enough tests to
> make it happen. (We run Shenandoah in aggressive mode, which starts
> continuous GCing right at the start.)
> 
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8220671
> Webrev:
> http://cr.openjdk.java.net/~rkennke/JDK-8220671/webrev.03/
> 
> The problem seems specific to the StringDedupThread (for now), and so is
> the solution: in the StringDedupThread's pre_run() and post_run(), take
> the STSJoiner. This ensures that the thread doesn't accidentally cross
> safepoints while initializing or exiting, and thus lose SATB updates.
> 
> I tried a couple of other approaches like:
> http://cr.openjdk.java.net/~rkennke/JDK-8220671/webrev.02/
> 
> But we also need to protect the addition of the thread to, and its
> removal from, the global NonJavaThread list.
> 
> Testing: Running the offending test (TestStringDedupStress.java) 20x in
> a row. It used to fail in ~1 of 5 runs before. Now all runs pass. Also,
> hotspot_gc_shenandoah passes. tier1 is fine too. Will push it through
> jdk-submit next.
> 
> Can I please get reviews?
> 
> Thanks,
> Roman
> 
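
For readers who want to see the shape of the race and of the joiner-based
fix described in the quoted message, here is a minimal toy model in
standard C++. It is a sketch under stated assumptions, not the code in the
webrev: ToySuspendibleSet, ToyJoiner, ToyThread and count_stale_threads are
hypothetical names made up for this illustration, and the toy set only
mimics the one property that matters here (a "safepoint" cannot begin while
a thread is joined, and a thread cannot join while a safepoint is in
progress). It assumes a single thread that requests safepoints.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Toy stand-in for a suspendible thread set (NOT the HotSpot class).
// Assumes only one thread ever calls begin_safepoint()/end_safepoint().
class ToySuspendibleSet {
  std::mutex lock_;
  std::condition_variable cv_;
  int joined_ = 0;
  bool safepoint_requested_ = false;
public:
  void join() {   // blocks while a safepoint is requested or in progress
    std::unique_lock<std::mutex> l(lock_);
    cv_.wait(l, [this] { return !safepoint_requested_; });
    ++joined_;
  }
  void leave() {
    std::lock_guard<std::mutex> l(lock_);
    assert(joined_ > 0);
    --joined_;
    cv_.notify_all();
  }
  void begin_safepoint() {   // blocks until every joined thread has left
    std::unique_lock<std::mutex> l(lock_);
    safepoint_requested_ = true;
    cv_.wait(l, [this] { return joined_ == 0; });
  }
  void end_safepoint() {
    std::lock_guard<std::mutex> l(lock_);
    safepoint_requested_ = false;
    cv_.notify_all();
  }
};

// RAII helper in the spirit of a joiner: join on entry, leave on exit.
struct ToyJoiner {
  ToySuspendibleSet& set;
  explicit ToyJoiner(ToySuspendibleSet& s) : set(s) { set.join(); }
  ~ToyJoiner() { set.leave(); }
};

// Hypothetical per-thread state: a thread-local copy of the global flag.
struct ToyThread { bool satb_active = false; };

// Attaches `workers` threads while a "VM thread" flips the global flag
// `flips` times inside safepoints. Returns how many threads end up with a
// thread-local flag that differs from the global one.
int count_stale_threads(int workers, int flips) {
  ToySuspendibleSet sts;
  std::mutex list_lock;                 // protects the attached-thread list
  std::vector<ToyThread*> attached;
  bool global_satb_active = false;      // only written inside a safepoint
  std::vector<ToyThread> threads(workers);

  std::thread vm([&] {
    for (int i = 0; i < flips; ++i) {
      sts.begin_safepoint();
      global_satb_active = !global_satb_active;
      {
        std::lock_guard<std::mutex> g(list_lock);
        for (ToyThread* t : attached) t->satb_active = global_satb_active;
      }
      sts.end_safepoint();
    }
  });

  std::vector<std::thread> pool;
  for (int i = 0; i < workers; ++i) {
    pool.emplace_back([&, i] {
      // Copying the flag and registering the thread happen under the
      // joiner, so no safepoint can slip in between the two steps.
      ToyJoiner joiner(sts);
      threads[i].satb_active = global_satb_active;
      std::lock_guard<std::mutex> g(list_lock);
      attached.push_back(&threads[i]);
    });
  }
  for (auto& t : pool) t.join();
  vm.join();

  int stale = 0;
  for (const ToyThread& t : threads)
    if (t.satb_active != global_satb_active) ++stale;
  return stale;
}
```

Calling count_stale_threads(8, 200) from a main() should return 0 in this
model. Without the ToyJoiner line, a flip could land between the flag copy
and the list insertion, leaving that thread with a stale flag (and the
unsynchronized read would also be a data race in C++ terms), which is the
toy analogue of the missed SATB updates described above.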



More information about the hotspot-gc-dev mailing list