RFR: 8016341 java/lang/ref/OOMEInReferenceHandler.java failing intermittently

David Holmes david.holmes at oracle.com
Mon Jul 8 07:19:56 UTC 2013

Hi Peter,

On 8/07/2013 5:01 PM, Peter Levart wrote:
> On 07/05/2013 01:22 AM, David Holmes wrote:
>> Hi Peter,
>> On 2/07/2013 5:19 PM, Peter Levart wrote:
>>> Looking at original code once again, I think this was actually a bug.
>>> The WeakReference instance constructed in (old) line 82, can be GCed
>>> right away, since nobody is using the local variable after assignment.
>> Of course. Doh! I was to busy thinking about the lifetime of the
>> referent object to consider the reference itself.
>>> If WeakReference is GCed it can not be enqueued. The promotion of local
>>> variable into a field is one way to fix this. The other would be to use
>>> the local variable somewhere down the code path, like for example in a
>>> final throw statement:
>>>   110          throw new IllegalStateException("Reference Handler thread
>>> stuck. weakRef.get(): " + weakRef.get());
>>> This would also reveal some more info about the WeakReference when
>>> there's no sure answer after 10 seconds and could be added to the test
>>> anyway.
>> Okay I've modified the test as suggested. Updated webrev at same
>> location:
>> http://cr.openjdk.java.net/~dholmes/8016341/webrev/
>> In testing it though I simply exposed the remaining flaws in the
>> ReferenceHandler code. We can still kill the ReferenceHandler thread
>> with an OOME when it tries to load the Cleaner class (running with a
>> 5M heap triggers this nicely if you use G1):
>>                // Fast path for cleaners
>>                 if (r instanceof Cleaner) {
>>                     ((Cleaner)r).clean();
>>                     continue;
>>                 }
>> and if that passes the clean() might throw OOME (it internally tries
>> to do a System.exit if an exception occurs but will likely encounter
>> another OOME trying to create the PrivilegedAction).
>> Even the:
>>                ReferenceQueue q = r.queue;
>>                 if (q != ReferenceQueue.NULL) q.enqueue(r);
>> might throw OOME because enqueue() might have to load the
>> FinalReference class.
>> So really catching the OOME around the wait() only patches a small
>> hole. We can't simply put a try/catch in the for(;;) loop because that
>> doesn't address the problem that if the class loading throws OOME then
>> subsequent attempts to load that class will also fail. We would have
>> to preload all possible classes. Even then we might just send the
>> ReferenceHandler thread into a busy loop of throwing OOME catching it
>> and retrying!
>> So I can fix the test to deal with the Xcomp issue but we may still
>> see intermittent failures, and the ReferenceHandler thread may still
>> die from OOME.
> Hi David,
> I think the test is fine now.
> Regarding the resilience of ReferenceHandler thread, I think we should
> try to preload FinalReference, Cleaner and InterruptedException in the
> initialization phase of ReferenceHandler thread. There should be no
> danger of OOME in related cases then.

We would need more than that as we need to ensure there is zero 
classloading needed at any point where OOME might be thrown. That would 
include PrivilegedAction and any transitive closure therefrom.

> As far as Cleaner's exception
> handler is concerned, I think it is not universally wise to just exit
> the VM when any exception is thrown by the Cleaner's thunk.run() method.
> What if that exception is OOME? That does not mean there's something
> wrong with Cleaner.thunk's code. Not only will this kill
> ReferenceHandler thread, but entire VM.

Yes "fatal errors" are not uncommon in the JDK, or at least not 
completely rare. The general question is whether continuing might do 
more harm than good given that some "critical" action has failed. Hard 
to know in general so aborting is fairly common response (ala hotspot 
running out of C heap memory).


> If the purpose of exiting VM was attracting attention to the possible
> bug in Cleaner.thunk's code, then this absolutely works, but wouldn't
> simple message to System.err be enough? Like for example:
>      public void clean() {
>          if (!remove(this))
>              return;
>          try {
>              thunk.run();
>          } catch (final Throwable x) {
>              try {
>                  new Error("Cleaner caught exception", x)
>                      .printStackTrace();
>              } catch (OutOfMemoryError oome) {
>                  // can't do much
>              }
>          }
>      }
> Regards, Peter
>> David
>> -----
>>> Regards, Peter
>>> On 07/02/2013 06:38 AM, David Holmes wrote:
>>>> This recently added test was found to fail under some conditions -
>>>> namely client compiler with -Xcomp. It seems that the use of all local
>>>> variables enabled the compiler to optimize things in a way that
>>>> stopped the weakref from being enqueued as expected. Simple fix was to
>>>> make the weakref a field.
>>>> http://cr.openjdk.java.net/~dholmes/8016341/webrev/
>>>> Thanks,
>>>> David

More information about the core-libs-dev mailing list