Preliminary review for 8028107: Kitchensink crashed with EAV

Wed Nov 27 17:51:28 PST 2013

https://bugs.openjdk.java.net/browse/JDK-8028107

It is very rare case. Last time it was 1 year ago:

https://bugs.openjdk.java.net/browse/JDK-7199505

nmethod unloading (nmethod::make_unloaded()) happens only during 
safepoint when it contains pointers (embedded or in oop map, relocation 
info) to dead java objects (see nmethod::can_unload()).

It can only happens when a nmethod is NOT on call stack. GCs have path 
over nmethods on stack and they are treated as strong roots. Igor and I 
looked on GC code and verified it.

I instrumented nmethod::make_unloaded() to see if we may unloading 
nmethods on stack and ran kitchensink for few day. No failures.

The only place left is compiled inline caches. We DO clean up them at 
the end of safepoint if unloading happens:

   if (is_in_use()) {
     // Transitioning directly from live to unloaded -- so
     // we need to force a cache clean-up; remember this
     // for later on.
     CodeCache::set_needs_cache_clean(true);
   }

But there is only place where we may create new compiledIC after 
safepoint when we are resolving call from compiled code: 
SharedRuntime::resolve_sub_helper().

In this method we record nmethod in local var:

   CompiledICInfo virtual_call_info;

     CompiledIC::compute_monomorphic_entry(callee_method, h_klass,
                      is_optimized, static_bound, virtual_call_info,
                      CHECK_(methodHandle()));

then we take a lock (which is safepoint):

   // grab lock, check for deoptimization and potentially patch caller
   {
     MutexLocker ml_patch(CompiledIC_lock);

and then create compiledIC based on info in CompiledICInfo:

         CompiledIC* inline_cache = CompiledIC_before(caller_nm, 
caller_frame.pc());
         if (inline_cache->is_clean()) {
           inline_cache->set_to_monomorphic(virtual_call_info);
         }

at this point nmethod could be unloaded already during above 
lock/safepoint. But call site is patched and will be used to call this 
nmethod.

Note, we can't compute entry point under the same lock because:

   // Compute entry points. This might require generation of C2I converter
   // frames, so we cannot be holding any locks here. Furthermore, the
   // computation of the entry points is independent of patching the call.


Please, comment or find any flaws in my logic. The bug is impossible to 
reproduce.

I am suggesting next fix.

src/share/vm/runtime/sharedRuntime.cpp
@@ -1258,12 +1258,14 @@

      // Now that we are ready to patch if the Method* was redefined then
      // don't update call site and let the caller retry.
-
-    if (!callee_method->is_old()) {
+    // Don't update call site if nmethod was unloaded during safepoint
+    // which may happened during locking above.
+    if (!callee_method->is_old() && (callee_method->code() == callee_nm)) {
  #ifdef ASSERT
        // We must not try to patch to jump to an already unloaded method.
        if (dest_entry_point != 0) {
-        assert(CodeCache::find_blob(dest_entry_point) != NULL,
+        CodeBlob* cb = CodeCache::find_blob(dest_entry_point);
+        assert((cb != NULL) && cb->is_nmethod() && 
((nmethod*)cb)->is_in_use(),
                 "should not unload nmethod while locked");
        }
  #endif


Thanks,
Vladimir