RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

Peter Levart peter.levart at gmail.com
Tue May 7 14:10:15 UTC 2013


On 05/07/2013 03:26 PM, Thomas Schatzl wrote:
> Hi,
>
> On Tue, 2013-05-07 at 15:12 +0200, Peter Levart wrote:
>> On 05/07/2013 09:51 AM, Thomas Schatzl wrote:
>>> Hi all,
>>>
>>> On Tue, 2013-05-07 at 12:31 +1000, David Holmes wrote:
>>>> Catching ThreadDeath is futile. If someone is invoking stop() then you
>>>> can encounter the ThreadDeath anywhere and it is impossible to write
>>>> completely robust code in the face of such an async exception. So please
>>>> let's not even go there. stop() is long deprecated and should never be used.
>>>>
>>>> Backing up I think the try/catch(IE|OOME) around wait() is the most
>>>> reasonable solution here. Anyone messing with instrumentation or
>>>> overriding etc can break things - so be it - don't do that.
>>>> StackOverflowError can also completely break many things - again it is
>>>> effectively an async exception and writing async-exception-safe Java
>>>> code is impractical if not impossible.
>>>     I can understand this reasoning.
>>>
>>> I provided a new patch (this time for review)
>>> http://cr.openjdk.java.net/~tschatzl/7038914/webrev.1/ which implements
>>> this change as suggested.
>>>
>>> Regarding regression testing, I marked this bug as "noreg-other" with
>>> the explanation that it is too hard to write a proper regression test,
>>> and the note that any test would involve using methods that we don't
>>> give any guarantees for (overriding package private jdk methods,
>>> instrumentation).
>> Hi Thomas,
>>
>> Does the bug reproducer I sent to the list not work for you? The test
>> can check the return value of refQueue.poll() and decide if it passes or
>> not (null return means the ReferenceHandler thread has died and the bug
>> is here, non-null return means thread still works and there is no bug).
> I will check the code again, but unfortunately I think it does not help
> a lot.
>
> The problem of reproducing this issue is trying to get the
> ReferenceHandler to die, i.e. have the OOME occur in the reference
> handler thread.
>
> The allocation of the InterruptedException is so small that in almost
> all cases of OOME it still succeeds, or is not the actual cause of the
> OOME. So the probability that the java application threads get the
> OOME to handle is much higher, especially in the stress tests.
>
> There is a message emitted by the VM reading "java.lang.OutOfMemoryError
> thrown from the UncaughtExceptionHandler in thread "Reference Handler""
> that is sufficient to detect the problem itself (at least if you enable
> some flags).
>
> I will look at it again and report back if it can be used in some way.

On my computer the test always produced the same result, so it's pretty
reliable. The trick is in the fillHeap() method, which fills the heap so
that even "new Object[1]" throws OOME. A Throwable object takes at least
the same space as an Object[1], so the allocation of the
InterruptedException should fail too.
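
To make the idea concrete, here is a rough sketch of how such a check could
look. It is not the reproducer I posted, only an illustration under some
assumptions: the class name ReferenceHandlerProbe and the helpers
findReferenceHandler() and referenceHandlerAlive() are made up for this
example, the thread is looked up by its name "Reference Handler", and
interrupting it while the heap is full is assumed to be what makes it
allocate the InterruptedException. It uses remove(timeout) rather than
poll() so that the check does not depend on timing as much.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

final class ReferenceHandlerProbe {

    // Chain ever smaller arrays together until even new Object[1] fails.
    static Object fillHeap() {
        Object[] first = null;
        Object[] last = null;
        int size = 1 << 20;
        while (size > 0) {
            try {
                Object[] next = new Object[size];
                if (first == null) {
                    first = next;
                } else {
                    last[0] = next;   // keep the chain reachable
                }
                last = next;
            } catch (OutOfMemoryError e) {
                size >>>= 1;          // retry with a smaller array
            }
        }
        return first;
    }

    // Hypothetical helper: find the thread by name (assumed "Reference Handler").
    static Thread findReferenceHandler() {
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if ("Reference Handler".equals(t.getName())) {
                return t;
            }
        }
        return null;
    }

    // True if a cleared WeakReference shows up on its queue before the timeout.
    static boolean referenceHandlerAlive(long timeoutMillis) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        WeakReference<Object> ref = new WeakReference<>(referent, queue);
        referent = null;
        long deadline = System.currentTimeMillis() + timeoutMillis;
        Reference<?> enqueued = null;
        while (enqueued == null && System.currentTimeMillis() < deadline) {
            System.gc();                    // only a hint, hence the retry loop
            enqueued = queue.remove(100);   // wait up to 100 ms per attempt
        }
        return enqueued == ref;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread handler = findReferenceHandler();   // look it up before memory runs out
        Object ballast = fillHeap();
        if (handler != null) {
            handler.interrupt();   // assumption: this forces the IE allocation
        }
        Thread.sleep(500);                      // give the handler time to react
        Reference.reachabilityFence(ballast);   // ballast stays live up to here
        ballast = null;                         // release the heap again
        System.out.println(referenceHandlerAlive(5000) ? "PASS" : "FAIL");
    }
}
```

Running it with a small heap (for example -Xmx32m) keeps fillHeap() quick.
The main thread can of course get the OOME itself while the heap is full,
so a real test would have to be more careful about that.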

Regards, Peter

>
> Thanks,
>    Thomas
>
>




More information about the hotspot-gc-dev mailing list