RFR: JDK-8066859 java/lang/ref/OOMEInReferenceHandler.java failed with java.lang.Exception: Reference Handler thread died

Peter Levart peter.levart at gmail.com
Thu May 7 10:25:21 UTC 2015



On 05/07/2015 09:06 AM, Laurent Bourgès wrote:
>
> Peter,
>
> I looked at Cleaner out of curiosity and it seems it does not catch
> the oome from thunk.run !
>
> If oome1 is thrown by thunk.run at line 150 then it is caught at line
> 157 but your new try/catch block (oome2) only encapsulates the
> doPrivileged block.
>
> If this block also throws a new oome2 due to the first oome1 (no
> memory left), it will work but I would have preferred a more explicit
> solution and check oome1 first ...
>
> My 2 cents (I am not a reviewer).
>
> Laurent
>

Laurent,

You have a point, and I asked myself the same question: how should an
OOME thrown from thunk.run() be treated? The current behavior is to
exit() the JVM for any exception (Throwable), and I kept those
semantics. I only added a handler for an OOME thrown in the handler of
the 1st exception. I could have simply exited the VM when that OOME is
thrown, but exiting without leaving any trace would not help anyone
diagnose what went wrong. So I opted for keeping the VM running for a
while by delaying the handling of the 1st exception to "better times".
If better times never come, then the application is probably stuck
anyway.

An alternative would be to catch OOME from thunk.run() and ignore it
(printing it out would be ugly if the VM is left running), but that
would silently swallow OOMEs thrown from thunk.run() and no one would
notice that Cleaners might not have cleaned up the resources they
should.

The complete fix would be to inspect the code paths of the run()
methods of all Cleaner thunks and see if and where they can throw
exceptions (OOMEs in particular) and whether that can be prevented. I
did that and found myself wandering deep into HotSpot code that I
don't completely understand. Cleaners are used in the following places:

- in java.lang.invoke.CallSite, to invalidate the dependent nmethods 
when the context class is GC-ed - that one was added recently. 
Intermittent failures of the test predate its addition, so I would not
suspect this one immediately.

- in sun.misc.Perf, to detach the memory of a native ByteBuffer obtained 
from Perf.attach() when the ByteBuffer is GC-ed. The Java code-path does 
not have any allocations and the native code that does detaching is in 
hotspot/src/share/vm/prims/perf.cpp:

static JNINativeMethod perfmethods[] = {
     ...
   {CC"detach",              CC"("BB")V", FN_PTR(Perf_Detach)},
     ...

PERF_ENTRY(void, Perf_Detach(JNIEnv *env, jobject unused, jobject buffer))

   PerfWrapper("Perf_Detach");

   if (!UsePerfData) {
     // With -XX:-UsePerfData, detach is just a NOP
     return;
   }

   void* address = 0;
   jlong capacity = 0;

   // get buffer address and capacity
   {
     ThreadToNativeFromVM ttnfv(thread);
    address = env->GetDirectBufferAddress(buffer);
    capacity = env->GetDirectBufferCapacity(buffer);
   }

   PerfMemory::detach((char*)address, capacity, CHECK);

PERF_END

- in sun.nio.ch.IOVecWrapper, to deallocate native memory associated 
with the wrapper when it is GC-ed. The Java code-path does not perform 
any allocations and finally just calls native 
Unsafe.freeMemory(allocationAddress) where allocationAddress was 
obtained from Unsafe.allocateMemory(size). The native code for 
Unsafe.freeMemory is in hotspot/src/share/vm/prims/unsafe.cpp:

static JNINativeMethod methods[] = {
     ...
     {CC"freeMemory",         CC"("ADR")V", FN_PTR(Unsafe_FreeMemory)},
     ...

UNSAFE_ENTRY(void, Unsafe_FreeMemory(JNIEnv *env, jobject unsafe, jlong addr))
   UnsafeWrapper("Unsafe_FreeMemory");
   void* p = addr_from_java(addr);
   if (p == NULL) {
     return;
   }
   os::free(p);
UNSAFE_END

- in sun.nio.fs.NativeBuffer, to deallocate native memory allocated for 
NativeBuffer by Unsafe.allocateMemory(size) with exactly the same 
Cleaner.thunk (Deallocator) as used in sun.nio.ch.IOVecWrapper.
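
To illustrate why the Java side of such a thunk has no allocations,
here is a rough sketch of the shape of a Deallocator. It is not the JDK
source: NativeMemory is a hypothetical stand-in for
Unsafe.allocateMemory/freeMemory, and FakeMemory exists only so the
sketch can be exercised without touching real native memory.

```java
final class DeallocatorSketch {

    /** Hypothetical stand-in for Unsafe.allocateMemory / Unsafe.freeMemory. */
    interface NativeMemory {
        long allocate(long size);
        void free(long address);
    }

    /** Thunk in the style described above: remembers an address, frees it in run(). */
    static final class Deallocator implements Runnable {
        private final NativeMemory memory;
        private final long address;

        Deallocator(NativeMemory memory, long address) {
            this.memory = memory;
            this.address = address;
        }

        @Override
        public void run() {
            // Only field reads and one call: nothing is allocated on the
            // Java heap here.
            memory.free(address);
        }
    }

    /** Illustration-only fake that records calls instead of touching native memory. */
    static final class FakeMemory implements NativeMemory {
        long next = 4096;   // next fake address to hand out
        long lastFreed = 0; // address passed to the most recent free()
        int frees = 0;      // number of free() calls seen

        @Override
        public long allocate(long size) {
            long address = next;
            next += size;
            return address;
        }

        @Override
        public void free(long address) {
            lastFreed = address;
            frees++;
        }
    }
}
```

With a thunk of this shape, whether run() can throw depends entirely on
the native free call, which is what the question below is about.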

Can anyone confirm whether the above two native methods can throw any 
exception or not?

Anyway, if none of the Cleaner thunks' run() methods can throw any
exception, then my handling of OOME is redundant: a code path that is
never taken. But I would still leave it there in case some new use of
Cleaner comes along that has not been verified yet...

Regards, Peter



