Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

Peter Levart peter.levart at gmail.com
Mon Jan 20 08:00:14 UTC 2014


On 01/20/2014 02:51 AM, David Holmes wrote:
> Hi Peter,
>
> On 17/01/2014 11:24 PM, Peter Levart wrote:
>> On 01/17/2014 02:13 PM, Peter Levart wrote:
>>>>> // Fast path for cleaners
>>>>> boolean isCleaner = false;
>>>>> try {
>>>>>   isCleaner = r instanceof Cleaner;
>>>>> } catch (OutofMemoryError oome) {
>>>>>   continue;
>>>>> }
>>>>>
>>>>> if (isCleaner) {
>>>>>   ((Cleaner)r).clean();
>>>>>   continue;
>>>>> }
>>>>>
>>>>
>>>> Hi David, Kalyan,
>>>>
>>>> I've caught-up now. Just thinking: is "instanceof Cleaner" throwing
>>>> OOME as a result of loading the Cleaner class? Wouldn't the above
>>>> code then throw some error also in ((Cleaner)r) - the checkcast,
>>>> since Cleaner class would not be successfully initialized?
>>>
>>> Well, no. The above code would just skip Cleaner processing in this
>>> situation. And will never be doing it again after the heap is freed...
>>> So it might be good to load and initialize Cleaner class as part of
>>> ReferenceHandler initialization to ensure correct operation...
>>
>> Well, yes and no. Let me try once more:
>>
>> Above code will skip Cleaner processing if the 1st time "instanceof
>> Cleaner" is executed, OOME is thrown as a consequence of full heap while
>> loading and initializing the Cleaner class.
>
> Yes - I was assuming that this would not fail the very first time and 
> so the Cleaner class would already be loaded. Failing to be able to 
> load the Cleaner class was one of the potential issues flagged earlier 
> with this problem. I was actually assuming that Cleaner would be 
> loaded already due to some actual Cleaner subclasses being used, but 
> this does not happen as part of the default initialization. :( The 
> irony being that if the Cleaner class is not loaded then r can not be 
> an instance of Cleaner and so we would fail to load the class in a 
> case where we didn't need it anyway.
>
> What I wanted to focus on here was an OOME from the instanceof itself, 
> but as you say that might trigger classloading of Cleaner (which is 
> not what I was interested in).
>
>> The 2nd time the "instanceof
>> Cleaner" is executed after such OOME, the same line would throw
>> NoClassDefFoundError as a consequence of referencing a class that failed
>> initialization. Am I right?
>
> instanceof is not one of the class initialization triggers, so we 
> should not see an OOME generated due to a class initialization 
> exception and so the class will not be put into the Erroneous state 
> and so subsequent attempts to use the class will not automatically 
> trigger NoClassdefFoundError.
>
> If OOME occurs during actual loading/linking of the class Cleaner it 
> is unclear what would happen on subsequent attempts. OOME is not a 
> LinkageError that must be rethrown on subsequent attempts, and it is 
> potentially a transient condition, so I would expect a re-load attempt 
> to be allowed. However we are now deep into the details of the VM and 
> it may well depend on the exact place from which the OOME originates.
>
> The bottom line with the current problem is that there are multiple 
> non-obvious paths by which the ReferenceHandler can encounter an OOME. 
> In such cases we do not want the ReferenceHandler to terminate - which 
> implies catching the OOME and continuing. However we also do not want 
> to silently skip Cleaner processing or reference queue processing - as 
> that would lead to hard to diagnoze bugs. But trying to report the 
> problem may not be possible due to being out-of-memory. It may be that 
> we need to break things up into multiple try/catch blocks, where each 
> catch does a System.gc() and then reports that the OOME occurred. Of 
> course the reporting must still be in a try/catch for the OOME. Though 
> at some point letting the ReferenceHandler die may be the only way to 
> "report" a major memory problem.
>
> David

Hm... If I give -verbose:class option to run a simple test program:

public class Test { public static void main(String... a) {} }

I see Cleaner class being loaded before Test class. I don't see by which 
tread or if it might get loaded after main() starts, but I suspect that 
loading of Cleaner is not a problem here. Initialization of Cleaner 
class is not performed by ReferenceHandler thread as you pointed out. 
The instanceof does not trigger it and if it returns true then Cleaner 
has already been initialized. So there must be some other cause for 
instanceof throwing OOME...

What do you say about this variant of ReferenceHandler.run() method:

         public void run() {
             for (;;) {
                 Reference r;
                 Cleaner c;
                 synchronized (lock) {
                     r = pending;
                     if (r != null) {
                         // instanceof operator might throw OOME 
sometimes. Just retry after
                         // yielding - might have better luck next time...
                         try {
                             c = r instanceof Cleaner ? (Cleaner) r : null;
                         } catch (OutOfMemoryError x) {
                             Thread.yield();
                             continue;
                         }
                         pending = r.discovered;
                         r.discovered = null;
                     } else {
                         // The waiting on the lock may cause an OOME 
because it may try to allocate
                         // exception objects, so also catch OOME here 
to avoid silent exit of the
                         // reference handler thread.
                         //
                         // Explicitly define the order of the two 
exceptions we catch here
                         // when waiting for the lock.
                         //
                         // We do not want to try to potentially load 
the InterruptedException class
                         // (which would be done if this was its first 
use, and InterruptedException
                         // were checked first) in this situation.
                         //
                         // This may lead to the VM not ever trying to 
load the InterruptedException
                         // class again.
                         try {
                             try {
                                 lock.wait();
                             } catch (OutOfMemoryError x) { }
                         } catch (InterruptedException x) { }
                         continue;
                     }
                 }

                 // Fast path for cleaners
                 if (c != null) {
                     c.clean();
                     continue;
                 }

                 ReferenceQueue q = r.queue;
                 if (q != ReferenceQueue.NULL) q.enqueue(r);
             }
         }



... it tries to not consume and skip Cleaner instances when OOME is caught.

I don't think ReferenceHandler is to make responsible for reporting 
OOMEs. Full heap is a global condition and ReferenceHandler is the last 
to accuse for it.

Regards, Peter

>
> David
>
>> Regards, Peter
>>




More information about the core-libs-dev mailing list