Preliminary review for 8028107: Kitchensink crashed with EAV

Thu Nov 28 17:45:29 PST 2013

Cool! Wow, that’s a very nice find. I wonder if that the only place this happens…

igor

On Nov 27, 2013, at 5:51 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:

> https://bugs.openjdk.java.net/browse/JDK-8028107
> 
> It is very rare case. Last time it was 1 year ago:
> 
> https://bugs.openjdk.java.net/browse/JDK-7199505
> 
> nmethod unloading (nmethod::make_unloaded()) happens only during safepoint when it contains pointers (embedded or in oop map, relocation info) to dead java objects (see nmethod::can_unload()).
> 
> It can only happens when a nmethod is NOT on call stack. GCs have path over nmethods on stack and they are treated as strong roots. Igor and I looked on GC code and verified it.
> 
> I instrumented nmethod::make_unloaded() to see if we may unloading nmethods on stack and ran kitchensink for few day. No failures.
> 
> The only place left is compiled inline caches. We DO clean up them at the end of safepoint if unloading happens:
> 
>  if (is_in_use()) {
>    // Transitioning directly from live to unloaded -- so
>    // we need to force a cache clean-up; remember this
>    // for later on.
>    CodeCache::set_needs_cache_clean(true);
>  }
> 
> But there is only place where we may create new compiledIC after safepoint when we are resolving call from compiled code: SharedRuntime::resolve_sub_helper().
> 
> In this method we record nmethod in local var:
> 
>  CompiledICInfo virtual_call_info;
> 
>    CompiledIC::compute_monomorphic_entry(callee_method, h_klass,
>                     is_optimized, static_bound, virtual_call_info,
>                     CHECK_(methodHandle()));
> 
> then we take a lock (which is safepoint):
> 
>  // grab lock, check for deoptimization and potentially patch caller
>  {
>    MutexLocker ml_patch(CompiledIC_lock);
> 
> and then create compiledIC based on info in CompiledICInfo:
> 
>        CompiledIC* inline_cache = CompiledIC_before(caller_nm, caller_frame.pc());
>        if (inline_cache->is_clean()) {
>          inline_cache->set_to_monomorphic(virtual_call_info);
>        }
> 
> at this point nmethod could be unloaded already during above lock/safepoint. But call site is patched and will be used to call this nmethod.
> 
> Note, we can't compute entry point under the same lock because:
> 
>  // Compute entry points. This might require generation of C2I converter
>  // frames, so we cannot be holding any locks here. Furthermore, the
>  // computation of the entry points is independent of patching the call.
> 
> 
> Please, comment or find any flaws in my logic. The bug is impossible to reproduce.
> 
> I am suggesting next fix.
> 
> src/share/vm/runtime/sharedRuntime.cpp
> @@ -1258,12 +1258,14 @@
> 
>     // Now that we are ready to patch if the Method* was redefined then
>     // don't update call site and let the caller retry.
> -
> -    if (!callee_method->is_old()) {
> +    // Don't update call site if nmethod was unloaded during safepoint
> +    // which may happened during locking above.
> +    if (!callee_method->is_old() && (callee_method->code() == callee_nm)) {
> #ifdef ASSERT
>       // We must not try to patch to jump to an already unloaded method.
>       if (dest_entry_point != 0) {
> -        assert(CodeCache::find_blob(dest_entry_point) != NULL,
> +        CodeBlob* cb = CodeCache::find_blob(dest_entry_point);
> +        assert((cb != NULL) && cb->is_nmethod() && ((nmethod*)cb)->is_in_use(),
>                "should not unload nmethod while locked");
>       }
> #endif
> 
> 
> Thanks,
> Vladimir