RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread
David Holmes
david.holmes at oracle.com
Thu May 9 11:33:22 UTC 2013
On 9/05/2013 9:16 PM, Peter Levart wrote:
> Hi David,
>
> On 05/09/2013 12:02 PM, David Holmes wrote:
>> Hi Peter,
>>
>> Thanks for clarifying the test details. A few follow ups:
>>
>> - The reference class does get initialized early in the VM startup
>> well before Main will be loaded. But the use of the weakref doesn't hurt.
>
> Ok, so if this is guaranteed, we can remove the weakref construction.
>
>>
>> - 500ms may not be enough on some platforms, particularly some
>> embedded systems. Ideally we could code up a handshake using another
>> weakref that would guarantee that the handler thread really survived
>> past the OOM. But at some point this just becomes a no-reg-hard
>> situation :)
>
> If we used a weakref then there should be a delay between the
> referenceHandlerThread.interrupt() and the release of the
> WeakReference's referent, otherwise the WeakReference could be cleared
> and enqueued before Reference Handler attempts to throw
> InterruptedException (this really happens - I tried without delay and
> the clearing/enqueueing was quicker than interrupt). After this initial
> delay (currently set to 500ms) releasing the referent and waiting for
> WeakReference to be enqueued (while executing System.gc()) is analogous
> to testing for referenceHandlerThread.isAlive() - the only difference
> being the code that has to be executed in Reference Handler while
> unwinding the stack until the state of thread changes to TERMINATED. Can
> this be delayed for a long time?
What I was thinking was that after we interrupt we then create a new
weakref and we loop until the ref gets cleared, or we detect the
reference thread is not alive (with a sleep in between). One of those
two conditions must become true.
>>
>> - I'm not certain that setting waste=null is sufficient to guarantee
>> the next allocation will succeed under all GC's. GC folk can
>> confim/deny that.
>
> We can pre-allocate the test-fail exception before doing fillHeap() so
> that we can conditionally throw it later, like this:
So when we do the throw there is logic in the launcher that will try to
print the stacktrace, which may also encounter OOME unless we have freed
the heap. So I think we want to release memory, I just wasn't certain
that simply setting waste=null would guarantee it.
David
------
> public class Test {
> static Object[] fillHeap() {
> Object[] first = null, last = null;
> int size = 1 << 20;
> while (size > 0) {
> try {
> Object[] array = new Object[size];
> if (first == null) {
> first = array;
> } else {
> last[0] = array;
> }
> last = array;
> } catch (OutOfMemoryError oome) {
> size = size >>> 1;
> }
> }
> return first;
> }
>
> public static void main(String[] args) throws Exception {
> ThreadGroup tg = Thread.currentThread().getThreadGroup();
> for (
> ThreadGroup tgn = tg;
> tgn != null;
> tg = tgn, tgn = tg.getParent()
> )
> ;
>
> Thread[] threads = new Thread[tg.activeCount()];
> Thread referenceHandlerThread = null;
> int n = tg.enumerate(threads);
> for (int i = 0; i < n; i++) {
> if ("Reference Handler".equals(threads[i].getName())) {
> referenceHandlerThread = threads[i];
> }
> }
>
> if (referenceHandlerThread == null) {
> throw new IllegalStateException("Couldn't find Reference
> Handler thread.");
> }
>
> // pre-allocate failure so that we don't cause OOME later
> Exception failure = new Exception("Reference Handler thread
> died.");
>
> Object waste = fillHeap();
>
> referenceHandlerThread.interrupt();
>
> // allow referenceHandlerThread some time to throw OOME
> Thread.sleep(500L);
>
> if (waste != null && // keep 'waste' reference alive until this
> point
> !referenceHandlerThread.isAlive()// check that the handler
> is still alive
> ) {
> throw failure;
> }
> }
> }
>
>
>
> Regards, Peter
>
>>
>> Thanks,
>> David
>>
>> On 9/05/2013 6:35 PM, Peter Levart wrote:
>>> On 05/09/2013 07:53 AM, David Holmes wrote:
>>>> Hi Thomas,
>>>>
>>>> On 9/05/2013 1:28 AM, Thomas Schatzl wrote:
>>>>> Hi,
>>>>>
>>>>> please review the latest webrev for this patch that is available at
>>>>>
>>>>> http://cr.openjdk.java.net/~tschatzl/7038914/webrev.2/
>>>>>
>>>>> As mentioned, it incorporates the fix and reproducer from Peter
>>>>> Levart.
>>>>
>>>> Fix is fine.
>>>>
>>>> I'm not sure about the test (sorry I didn't pay it attention earlier).
>>>> Have you instrumented the code to verify that the test actually
>>>> triggers an OOME? It may be possible that if the OOME did kill the
>>>> reference thread that we wouldn't necessarily detect it using the
>>>> sleeps etc. Also I don't understand the need for the actual weakRef
>>>> usage and System.gc() etc. If the heap is full then the interrupt will
>>>> wake the reference handler thread and the allocation will trigger the
>>>> OOME. It doesn't matter if there are any references to process. The
>>>> main logic might then reduce to:
>>>>
>>>> referenceHandlerThread.interrupt();
>>>> for (int i = 0; i < 10; i++) {
>>>> if (!referenceHandlerThread.isAlive())
>>>> throw new Exception("Reference-handler thread died");
>>>> Thread.sleep(1000);
>>>> }
>>>>
>>>> which just gives it 10s to die else assumes all ok. ?
>>>>
>>>> But I can live with existing test (it just might vacuously pass)
>>>>
>>>> Cheers,
>>>> David
>>>
>>> Hi David, Thomas,
>>>
>>> Yes, this was just a left-over from my bug reproducer that demonstrated
>>> the situation where WeakReference was cleared, but nothing was enqueue-d
>>> into the ReferenceQueue. Reference Handler thread otherwise just sleeps
>>> in lock.wait() most of the time if there's nothing to do and
>>> interrupting it's thread will trigger allocation of InterruptedException
>>> and consequently OOME nevertheless.
>>>
>>> One think to watch: Reference Handler thread is started in the
>>> j.l.r.Reference static initializer, so the Reference class must be
>>> initialized. I don't know if there is any guarantee that this is done
>>> before entering main(). So there should be something explicit in main()
>>> method that ensures it.
>>>
>>> Also, throwing test-fail exception should be done after releasing the
>>> heap or we'll just trigger another OOME without any message to describe
>>> the situation.
>>>
>>> Waiting in a loop for ReferenceHandler thread is not needed since the
>>> test will normally succeed and so the whole loop will execute to the
>>> end. Only when the test fails the loop will exit faster. In my
>>> experience, the thread dies between 10-20ms after interrupting, so
>>> waiting for about 500ms is enough I think. Also no System.gc() and
>>> additional waiting is needed to report the outcome - just releasing the
>>> reference to 'waste' variable is enuugh in my experience to restore the
>>> heap into working condition.
>>>
>>> Here's the updated test that does all that:
>>>
>>>
>>> public class OOMEInReferenceHandler {
>>> static Object[] fillHeap() {
>>> Object[] first = null, last = null;
>>> int size = 1 << 20;
>>> while (size > 0) {
>>> try {
>>> Object[] array = new Object[size];
>>> if (first == null) {
>>> first = array;
>>> }
>>> else {
>>> last[0] = array;
>>> }
>>> last = array;
>>> }
>>> catch (OutOfMemoryError oome) {
>>> size = size >>> 1;
>>> }
>>> }
>>> return first;
>>> }
>>>
>>> public static void main(String[] args) throws Exception {
>>> // ensure ReferenceHandler thread is started
>>> new WeakReference<Object>(new Object());
>>>
>>> ThreadGroup tg = Thread.currentThread().getThreadGroup();
>>> for (ThreadGroup tgn = tg;
>>> tgn != null;
>>> tg = tgn, tgn = tg.getParent());
>>>
>>> Thread[] threads = new Thread[tg.activeCount()];
>>> Thread referenceHandlerThread = null;
>>> int n = tg.enumerate(threads);
>>> for (int i = 0; i < n; i++) {
>>> if("Reference Handler".equals(threads[i].getName())) {
>>> referenceHandlerThread = threads[i];
>>> }
>>> }
>>>
>>> if (referenceHandlerThread == null) {
>>> throw new IllegalStateException("Couldn't find Reference
>>> Handler thread.");
>>> }
>>>
>>> Object waste = fillHeap();
>>>
>>> referenceHandlerThread.interrupt();
>>>
>>> Thread.sleep(500L);
>>>
>>> // release waste so we can allocate and throw normal exceptions
>>> waste = null;
>>>
>>> // check that the handler is still alive
>>> if (!referenceHandlerThread.isAlive()) {
>>> throw new Exception("Reference Handler thread died.");
>>> }
>>> }
>>> }
>>>
>>>
>>>
>>> Regards, Peter
>>>
>>>>
>>>>> Bugs.sun.com:
>>>>> http://bugs.sun.com/view_bug.do?bug_id=7038914
>>>>>
>>>>> JIRA:
>>>>> https://jbs.oracle.com/bugs/browse/JDK-7038914
>>>>>
>>>>> Testing:
>>>>> JPRT, with the new reproducer passing
>>>>>
>>>>> In need of reviewers and a sponsor for this patch; as mentioned
>>>>> earlier
>>>>> I will set Peter (plevart) as author for this patch, i.e. send a
>>>>> patchset to the sponsor for this change with that author set.
>>>>>
>>>>> Thanks,
>>>>> Thomas
>>>>>
>>>
>
More information about the core-libs-dev
mailing list