RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread
Peter Levart
peter.levart at gmail.com
Thu May 9 11:16:39 UTC 2013
Hi David,
On 05/09/2013 12:02 PM, David Holmes wrote:
> Hi Peter,
>
> Thanks for clarifying the test details. A few follow ups:
>
> - The reference class does get initialized early in the VM startup
> well before Main will be loaded. But the use of the weakref doesn't hurt.
Ok, so if this is guaranteed, we can remove the weakref construction.
>
> - 500ms may not be enough on some platforms, particularly some
> embedded systems. Ideally we could code up a handshake using another
> weakref that would guarantee that the handler thread really survived
> past the OOM. But at some point this just becomes a no-reg-hard
> situation :)
If we used a weakref then there should be a delay between the
referenceHandlerThread.interrupt() and the release of the
WeakReference's referent, otherwise the WeakReference could be cleared
and enqueued before Reference Handler attempts to throw
InterruptedException (this really happens - I tried without delay and
the clearing/enqueueing was quicker than interrupt). After this initial
delay (currently set to 500ms) releasing the referent and waiting for
WeakReference to be enqueued (while executing System.gc()) is analogous
to testing for referenceHandlerThread.isAlive() - the only difference
being the code that has to be executed in Reference Handler while
unwinding the stack until the state of thread changes to TERMINATED. Can
this be delayed for a long time?
>
> - I'm not certain that setting waste=null is sufficient to guarantee
> the next allocation will succeed under all GC's. GC folk can
> confim/deny that.
We can pre-allocate the test-fail exception before doing fillHeap() so
that we can conditionally throw it later, like this:
public class Test {
static Object[] fillHeap() {
Object[] first = null, last = null;
int size = 1 << 20;
while (size > 0) {
try {
Object[] array = new Object[size];
if (first == null) {
first = array;
} else {
last[0] = array;
}
last = array;
} catch (OutOfMemoryError oome) {
size = size >>> 1;
}
}
return first;
}
public static void main(String[] args) throws Exception {
ThreadGroup tg = Thread.currentThread().getThreadGroup();
for (
ThreadGroup tgn = tg;
tgn != null;
tg = tgn, tgn = tg.getParent()
)
;
Thread[] threads = new Thread[tg.activeCount()];
Thread referenceHandlerThread = null;
int n = tg.enumerate(threads);
for (int i = 0; i < n; i++) {
if ("Reference Handler".equals(threads[i].getName())) {
referenceHandlerThread = threads[i];
}
}
if (referenceHandlerThread == null) {
throw new IllegalStateException("Couldn't find Reference
Handler thread.");
}
// pre-allocate failure so that we don't cause OOME later
Exception failure = new Exception("Reference Handler thread
died.");
Object waste = fillHeap();
referenceHandlerThread.interrupt();
// allow referenceHandlerThread some time to throw OOME
Thread.sleep(500L);
if (waste != null && // keep 'waste' reference alive until this
point
!referenceHandlerThread.isAlive()// check that the handler
is still alive
) {
throw failure;
}
}
}
Regards, Peter
>
> Thanks,
> David
>
> On 9/05/2013 6:35 PM, Peter Levart wrote:
>> On 05/09/2013 07:53 AM, David Holmes wrote:
>>> Hi Thomas,
>>>
>>> On 9/05/2013 1:28 AM, Thomas Schatzl wrote:
>>>> Hi,
>>>>
>>>> please review the latest webrev for this patch that is available at
>>>>
>>>> http://cr.openjdk.java.net/~tschatzl/7038914/webrev.2/
>>>>
>>>> As mentioned, it incorporates the fix and reproducer from Peter
>>>> Levart.
>>>
>>> Fix is fine.
>>>
>>> I'm not sure about the test (sorry I didn't pay it attention earlier).
>>> Have you instrumented the code to verify that the test actually
>>> triggers an OOME? It may be possible that if the OOME did kill the
>>> reference thread that we wouldn't necessarily detect it using the
>>> sleeps etc. Also I don't understand the need for the actual weakRef
>>> usage and System.gc() etc. If the heap is full then the interrupt will
>>> wake the reference handler thread and the allocation will trigger the
>>> OOME. It doesn't matter if there are any references to process. The
>>> main logic might then reduce to:
>>>
>>> referenceHandlerThread.interrupt();
>>> for (int i = 0; i < 10; i++) {
>>> if (!referenceHandlerThread.isAlive())
>>> throw new Exception("Reference-handler thread died");
>>> Thread.sleep(1000);
>>> }
>>>
>>> which just gives it 10s to die else assumes all ok. ?
>>>
>>> But I can live with existing test (it just might vacuously pass)
>>>
>>> Cheers,
>>> David
>>
>> Hi David, Thomas,
>>
>> Yes, this was just a left-over from my bug reproducer that demonstrated
>> the situation where WeakReference was cleared, but nothing was enqueue-d
>> into the ReferenceQueue. Reference Handler thread otherwise just sleeps
>> in lock.wait() most of the time if there's nothing to do and
>> interrupting it's thread will trigger allocation of InterruptedException
>> and consequently OOME nevertheless.
>>
>> One think to watch: Reference Handler thread is started in the
>> j.l.r.Reference static initializer, so the Reference class must be
>> initialized. I don't know if there is any guarantee that this is done
>> before entering main(). So there should be something explicit in main()
>> method that ensures it.
>>
>> Also, throwing test-fail exception should be done after releasing the
>> heap or we'll just trigger another OOME without any message to describe
>> the situation.
>>
>> Waiting in a loop for ReferenceHandler thread is not needed since the
>> test will normally succeed and so the whole loop will execute to the
>> end. Only when the test fails the loop will exit faster. In my
>> experience, the thread dies between 10-20ms after interrupting, so
>> waiting for about 500ms is enough I think. Also no System.gc() and
>> additional waiting is needed to report the outcome - just releasing the
>> reference to 'waste' variable is enuugh in my experience to restore the
>> heap into working condition.
>>
>> Here's the updated test that does all that:
>>
>>
>> public class OOMEInReferenceHandler {
>> static Object[] fillHeap() {
>> Object[] first = null, last = null;
>> int size = 1 << 20;
>> while (size > 0) {
>> try {
>> Object[] array = new Object[size];
>> if (first == null) {
>> first = array;
>> }
>> else {
>> last[0] = array;
>> }
>> last = array;
>> }
>> catch (OutOfMemoryError oome) {
>> size = size >>> 1;
>> }
>> }
>> return first;
>> }
>>
>> public static void main(String[] args) throws Exception {
>> // ensure ReferenceHandler thread is started
>> new WeakReference<Object>(new Object());
>>
>> ThreadGroup tg = Thread.currentThread().getThreadGroup();
>> for (ThreadGroup tgn = tg;
>> tgn != null;
>> tg = tgn, tgn = tg.getParent());
>>
>> Thread[] threads = new Thread[tg.activeCount()];
>> Thread referenceHandlerThread = null;
>> int n = tg.enumerate(threads);
>> for (int i = 0; i < n; i++) {
>> if("Reference Handler".equals(threads[i].getName())) {
>> referenceHandlerThread = threads[i];
>> }
>> }
>>
>> if (referenceHandlerThread == null) {
>> throw new IllegalStateException("Couldn't find Reference
>> Handler thread.");
>> }
>>
>> Object waste = fillHeap();
>>
>> referenceHandlerThread.interrupt();
>>
>> Thread.sleep(500L);
>>
>> // release waste so we can allocate and throw normal exceptions
>> waste = null;
>>
>> // check that the handler is still alive
>> if (!referenceHandlerThread.isAlive()) {
>> throw new Exception("Reference Handler thread died.");
>> }
>> }
>> }
>>
>>
>>
>> Regards, Peter
>>
>>>
>>>> Bugs.sun.com:
>>>> http://bugs.sun.com/view_bug.do?bug_id=7038914
>>>>
>>>> JIRA:
>>>> https://jbs.oracle.com/bugs/browse/JDK-7038914
>>>>
>>>> Testing:
>>>> JPRT, with the new reproducer passing
>>>>
>>>> In need of reviewers and a sponsor for this patch; as mentioned
>>>> earlier
>>>> I will set Peter (plevart) as author for this patch, i.e. send a
>>>> patchset to the sponsor for this change with that author set.
>>>>
>>>> Thanks,
>>>> Thomas
>>>>
>>
More information about the hotspot-gc-dev
mailing list