(Preliminary) RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread
Peter Levart
peter.levart at gmail.com
Thu May 2 18:27:15 UTC 2013
On 04/30/2013 04:57 PM, Thomas Schatzl wrote:
> Hi all,
>
> the webrev at http://cr.openjdk.java.net/~tschatzl/7038914/webrev/
> presents a first stab at the CR "7038914: VM could throw uncaught OOME
> in ReferenceHandler thread".
>
> The problem is that under very heavy memory pressure, the reference
> handler throws an exception with the message "Exception:
> java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in
> thread "Reference Handler"".
>
> The change improves handling of out-of-memory conditions in the
> ReferenceHandler thread. Instead of crashing the thread, and then
> disabling reference processing, it catches this exception and continues.
>
> I'd like to discuss the change as I'm not really familiar with JDK
> coding style, handling of such situations and have some questions about
> it.
>
> Bugs.sun:
> http://bugs.sun.com/view_bug.do?bug_id=7038914
>
> JBS:
> https://jbs.oracle.com/bugs/browse/JDK-7038914
>
> Proposed webrev:
> http://cr.openjdk.java.net/~tschatzl/7038914/webrev/
>
> - first, I could not reliably reproduce the issue using the information
> in the CR. Only via code review (and an idea from Bengt Rutisson -
> thanks!) I implemented a nice way to reproduce an OOME in the reference
> handler. This involves implementing a custom
> java.lang.ref.ReferenceQueue and overriding the enqueue() method, and
> doing some allocation that causes an OOME within that method.
> My current theory is that synchronization/locking allocates some objects
> on the java heap, which are very small, so an OOME in that thread can be
> caused. I walked the locking code, but could not find a java heap
> allocation there (ObjectMonitor seems to be a C heap object) - maybe I
> overlooked it. Probably somebody else knows?
Hi Thomas,
I don't know if this is the case here, but what if the ReferenceHandler
thread is interrupted while wait()-ing and the construction of
InterruptedException triggers OOME?
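
To illustrate the idea, here is a minimal sketch. It is not the actual
Reference.java code; the class name, the lock object and waitOnce() are
made up for illustration. It only shows that a wait() on an interrupted
thread ends with a freshly created InterruptedException, which is a heap
allocation that could in principle fail when memory is exhausted:

```java
class InterruptedWaitSketch {
    // Hypothetical stand-in for the lock the handler thread waits on.
    private static final Object lock = new Object();

    // Waits once on the lock. Returns true if the wait ended because the
    // thread was interrupted, false if it timed out or was notified.
    static boolean waitOnce(long timeoutMillis) {
        synchronized (lock) {
            try {
                lock.wait(timeoutMillis);
                return false;
            } catch (InterruptedException x) {
                // An InterruptedException object had to be created to get
                // here. Under extreme memory pressure that allocation is
                // where an OutOfMemoryError might surface instead
                // (assumption, not verified against the VM sources).
                return true;
            }
        }
    }

    public static void main(String[] args) {
        // Set the interrupt status first, so wait() throws immediately.
        Thread.currentThread().interrupt();
        System.out.println("interrupted: " + waitOnce(10L));
    }
}
```

If something like this is what happens, a catch for OutOfMemoryError
around the wait() itself would be needed as well, not only around the
enqueuing.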
Regards, Peter
> It cannot be the invocation of the Cleaner.clean() methods above the
> enqueuing since it has its own try-catch block already.
> Anyway, since the reproducer I wrote shows the same symptoms as reported
> in the CR, I hope that this test case is sufficient to be regarded as a
> reproducer and the change as a fix.
>
> - the actual change in java/lang/ref/Reference as mentioned involves
> putting the entire main enqueuing procedure within a try-catch block.
> It only catches OOME to reduce the chance of catching anything that
> should not be caught.
> The problem is that this fix does not (and cannot) really fix bad
> programming in anyone overriding java.lang.ref.ReferenceQueue.enqueue(),
> i.e. if the OOME occurs before the original enqueue() method actually
> executes, corruption of the queue may still be possible.
> On the other hand, since overriding ReferenceQueue.enqueue() requires
> putting the custom ReferenceQueue into the boot class path, I assume
> that people doing that are aware of possible issues.
>
> - handling the OOME: in the catch block I put the following:
>
> // avoid crashing the reference handler thread,
> // but provide for some diagnosability
> assert false : e.toString();
>
> to provide some diagnosability in the case of an exception (when
> running with assertions). I copied that from other code that tries to
> catch similar problems in the clean() method of the Cleaners. There are
> other variants of managing this in the JDK, some involving calling
> System.exit(). I thought that was too drastic, so I didn't do that, but
> what is the appropriate way to handle this situation?
>
> - if the use of locks or the synchronization keyword is indeed the
> problem, I think it is possible to use nonblocking synchronization that
> is known to not allocate any memory for managing the reference queues
> instead. However I think to guard against misbehaving ReferenceQueue
> implementations you'd still want to have a try-catch block here.
>
> - is the location of the test correct? I.e. in the jdk
> test/java/lang/ref directory? Or is the correct place for that the
> hotspot test directories?
>
> Since this is (seems to be) a JDK only change, and this is my first time
> changing the JDK, I hope core-libs-dev is the right mailing list.
> Otherwise please direct me to the appropriate one.
>
> Thanks,
> Thomas
>
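
For reference, the catch-and-continue approach described above can be
sketched roughly as follows. This is a simplified model and not the code
from the webrev: the class name, drain() and the Runnable queue are
hypothetical stand-ins for the handler loop and the pending references,
and the OOME is simulated by throwing it explicitly. The sketch counts
failures instead of using "assert false", so that it behaves the same
with and without assertions enabled:

```java
import java.util.ArrayDeque;
import java.util.Queue;

class OomeTolerantLoopSketch {
    // Processes every pending task. An OutOfMemoryError thrown by one
    // task is caught so that the loop survives and handles the rest.
    // Returns the number of tasks that failed with an OOME.
    static int drain(Queue<Runnable> pending) {
        int failures = 0;
        while (!pending.isEmpty()) {
            Runnable task = pending.poll();
            try {
                task.run();
            } catch (OutOfMemoryError e) {
                // Avoid crashing the processing loop. Only OOME is caught,
                // so any other Throwable still propagates as before.
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Queue<Runnable> pending = new ArrayDeque<>();
        pending.add(() -> System.out.println("first task"));
        // Simulated failure; a real OOME would come from an allocation.
        pending.add(() -> { throw new OutOfMemoryError("simulated"); });
        pending.add(() -> System.out.println("third task"));
        System.out.println("failures: " + drain(pending));
    }
}
```

Note that a task which fails halfway through is simply dropped here,
which mirrors the concern above that an OOME before the real enqueue()
can still leave things in an inconsistent state.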