[9] RFR(S): 8074553: Crash with assert(!is_unloaded()) failed: should not call follow on unloaded nmethod

Tobias Hartmann tobias.hartmann at oracle.com
Fri Apr 1 09:35:28 UTC 2016


Hi Mikael,

On 01.04.2016 10:24, Mikael Gerdin wrote:
> Hi Tobias,
> 
> On 2016-04-01 09:28, Tobias Hartmann wrote:
>> Hi Mikael,
>>
>> On 31.03.2016 18:13, Mikael Gerdin wrote:
>>> Hi,
>>>
>>> On 2016-03-31 17:49, Vladimir Kozlov wrote:
>>>> On 3/31/16 6:22 AM, Tobias Hartmann wrote:
>>>>> Hi,
>>>>>
>>>>> please review the following patch:
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8074553
>>>>> http://cr.openjdk.java.net/~thartmann/8074553/webrev.00/
>>>>>
>>>>> While the code cache sweeper processes a nmethod in
>>>>> NMethodSweeper::process_nmethod(), safepoints may happen and the GC
>>>>> may unload the currently processed nmethod. To prevent this, the
>>>>> sweeper uses a NMethodMarker which saves the nmethod in
>>>>> CodeCacheSweeperThread::_scanned_nmethod. The nmethod is then passed
>>>>> to the GC through a CodeBlobClosure in
>>>>> CodeCacheSweeperThread::oops_do() to keep it alive when the GC
>>>>> iterates over all threads.
>>>>>
>>>>> The problem is that G1 calls nmethods_do() on all threads in the
>>>>> remark phase (see G1RemarkThreadsClosure::do_thread()) which is not
>>>>> overwritten by the sweeper thread. Since the currently processed
>>>>> nmethod is not passed through nmethods_do() by any thread, it is
>>>>> unloaded and we later hit the assert when encountering the nmethod
>>>>> through oops_do().
>>>>>
>>>>> Mikael Gerdin and Stefan Karlsson (thanks again!) suggested to
>>>>> overwrite nmethods_do() as well in CodeCacheSweeperThread and pass
>>>>> _scanned_nmethod to the closure. I also modified
>>>>> Threads::nmethods_do() to ignore the sweeper thread because we want to
>>>>> avoid marking the _scanned_nmethod as seen on the stack when scanning
>>>>> stacks from the sweeper (the nmethod may already be zombie and only
>>>>> referenced by the sweeper).
>>>>
>>>> I did not get this. If you exclude CodeCacheSweeperThread in
>>>> Threads::nmethods_do() then CodeCacheSweeperThread::nmethods_do() will
>>>> not be called. What is the point?
>>>
>>> The GC code in question iterates over the threads and calls nmethods_do on each JavaThread and the VMThread after claiming them with atomic operations to achieve parallelism.
>>>
>>> There is still something a bit fishy here though.
>>> Thread::nmethods_do is not virtual, so one must be careful to downcast one's Thread* to JavaThread before calling it. And since Tobias' change does not make that or JavaThread::nmethods_do (which actually shadows and does not override the methods) virtual we can't reach the new code unless the GC code is changed to downcast to CodeCacheSweeperThread before calling nmethods_do.
>>
>> Thanks for taking a look. Of course, you are right. I just assumed that Thread::nmethods_do() is virtual and missed that it's not.
>>
>>> I still believe that what Tobias is attempting to do is a necessary fix but this only shows how hard this is to reproduce.
>>>
>>> Perhaps Thread::nmethods_do should simply be removed (along with any calls to it) and JavaThread::nmethods_do should then be made virtual.
>>
>> Yes, I agree. Here is the new webrev:
>> http://cr.openjdk.java.net/~thartmann/8074553/webrev.01/
>>
>> I removed the empty Thread::nmethods_do() and all the (static) calls to it.
> 
> I kind of agree with Vladimir that it's unfortunate to have to check for the sweeper thread in Threads::nmethods_do() but I don't have a good suggestion for an easy fix.
> 
> From my point of view this fix is ready to go in but perhaps it would be a good idea to also think about if this could be cleaned up somehow.
> 
> For me - as a GC person - it's hard to see why the sweeper thread should be a JavaThread at all, it seems like more of a VM-only thread to me.

Yes, I agree that it is kind of counterintuitive that the sweeper thread is a JavaThread. I think this is because before JDK-8046809 [1] sweeping was done by the CompilerThreads which are JavaThreads as well.

We could make the CodeCacheSweeperThread a subclass of NamedThread similar to ConcurrentGCThread but the problem is that the GC calls oops_do() only on the JavaThreads and the VMThread (see Threads::oops_do() or ThreadRootsMarkingTask::do_it()). That code would have to be modified to include the sweeper thread as well which seems a bit hacky to me.

What do you think?

Thanks,
Tobias

[1] https://bugs.openjdk.java.net/browse/JDK-8046809

> 
> /Mikael
> 
>>
>> Thanks,
>> Tobias
>>
>>>
>>> /Mikael
>>>
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>>>
>>>>> Unfortunately, this bug is extremely hard to reproduce (it showed up
>>>>> 18 times since early 2015). I was able to reproduce it only once after
>>>>> thousands of runs and therefore not often enough to verify the fix.
>>>>> However, I'm very confident that this solves the problem.
>>>>>
>>>>> Tested with JPRT and RBT (running).
>>>>>
>>>>> Thanks,
>>>>> Tobias
>>>>>


More information about the hotspot-dev mailing list