RFR: 8205577: parallel/TestPrintGCDetailsVerbose.java fails assertion

David Holmes david.holmes at oracle.com
Wed Jun 27 05:44:36 UTC 2018


Hi Kim,

Not printing seems extremely reasonable.

Thanks,
David

On 27/06/2018 3:03 PM, Kim Barrett wrote:
> Please review this fix of an assertion failure during ParallelGC young
> collections.
> 
> We have a task (thread) walking the code cache, copying and forwarding
> oop references in those cached nmethods containing scavengable
> (e.g. young) oops.  After the oops in an nmethod are processed, a
> second pass is made to determine whether any of the oops are still
> scavengable, removing the nmethod from the list containing scavengable
> oops.  That speeds up future collection cycles.
> 
> That second pass check logs the values of any scavengable oops
> referred to by the nmethod.  This is where things go wrong.  If the
> object being printed is a j.l.Class, deep in the guts of the printing
> there is an assert that the mirror of the klass of the object is that
> object.  But if the klass mirror has not yet been forwarded, we can be
> comparing a forward object with the not yet forward version, and the
> assertion fails.
> 
> There is another task that is running in parallel with the code cache
> walk that is processing the CLD graph, but it may not have reached the
> relevant klass yet, so the klass's mirror hasn't been forwarded yet.
> 
> But there's another complication that makes this crash even more
> difficult to trigger.  The ParallelGC young collection's code cache
> walk promotes nmethod-referenced oops, and that promotion makes the
> object no longer scavengable.  So to trigger the crash, some third
> thread must have copied a young class to survivor space, then the code
> cache walk processes an nmethod referring to that class and prints the
> still young class, all before the CLD graph walk has forwarded the
> klass's mirror.
> 
> This problem has been lurking all along, but JDK-8203837 changed the
> printing from only occurring when TraceScavenge was true to printing
> when gc+nmethod=trace logging is enabled.  As there were no tests that
> enabled TraceScavenge, we didn't previously notice the problem.  But
> there are tests that enable gc+nmethod=trace, enabling the problematic
> printing.
> 
> The solution being taken here is to change that logging to no longer
> attempt to print the object.  Printing an object while in the middle
> of a copying GC, especially one that is parallel, seems quite risky,
> so let's just not do that.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8205577
> 
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8205577/open.00/
> 
> Testing:
> mach5 tier1,2,3
> 
> I haven't been able to reproduce the failure; as discussed above, it's
> very timing sensitive.  But the scenario described above accounts for
> both the problem and the timing of its appearance.
> 
> 


More information about the hotspot-dev mailing list