Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently
Per Liden
per.liden at oracle.com
Tue Mar 22 09:24:19 UTC 2016
Hi Peter,
On 2016-03-21 16:30, Peter Levart wrote:
> Hi Per,
>
> May I point you to my proposed change in Reference(Handler) for JDK 9,
> being discussed in the thread about JDK-8149925. It will hopefully
> remove the special-casing of sun.misc.Cleaner, change the way how
> pending references are being enqueued by ReferenceHandler thread and how
> other thread(s) can synchronize with it. Since you seem to have a great
> knowledge of VM part of things, I would very much like to hear what you
> think of that change. Here's the latest webrev:
>
> http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.08.part2/
>
>
> (see Reference.java and Bits.java for an example of how this
> synchronization with ReferenceHandler thread is to be used)
One thing I like about this approach is that it's only the
ReferenceHandler thread that pops of elements from the pending list and
enqueues them. That simplifies things a lot.
From a GC perspective I would however like to get away from the shared
pending list and the pending list lock entirety and instead provide a VM
downcall to get the pending list. The goal would of course be to have a
more robust way of transferring the pending list to Java land, instead
of today's secret handshake which is easy to get wrong. Also, not
requiring the pending list lock (which is a Java monitor) to be held
during a GC would also simplify things a lot on the GC side. E.g. the
ReferencePendingListLockerThread could be removed completely.
So, I imagine the ReferenceHandler could do something like this:
while (true) {
// getPendingReferences() is a downcall to the VM which
// blocks until the pending list becomes non-empty and
// returns the whole list, transferring it to from VM-land
// to Java-land in a safe and robust way.
Reference<Object> pending = getPendingReferences();
// Enqueue the references
while (pending != null) {
Reference<Object> r = pending;
pending = r.discovered;
r.discovered = null;
ReferenceQueue<? super Object> q = r.queue;
if (q != ReferenceQueue.NULL) {
q.enqueue(r);
}
}
}
I haven't thought through the details when it comes having additional
Java threads helping out with Cleaners. The ReferenceHandler would be
free to use whatever lists/locks is wants to handle this and the GC
wouldn't know anything about it. But, with the above approach at least
the interface between the ReferenceHandler and the VM would be pretty
clear and hard(er) to misuse.
cheers,
Per
>
> Regards, Peter
>
> On 03/21/2016 04:13 PM, Peter Levart wrote:
>> Hi Per, David,
>>
>> As things stand, there is a very good chance that sun.misc.Cleaner
>> will go away in JDK9, so all this speculation about the source of
>> OOME(s) can be put to rest. But for JDK 8u, I agree that this should
>> be sorted out.
>>
>> My feeling is that (instanceof Cleaner) can not result in allocation
>> and therefore can not trigger OOME if the Cleaner class is already
>> loaded at that time. I think that we were chasing the wrong rabbit. As
>> I have found later, there is a much more probable cause for
>> ReferenceHandler thread dying with OOME after the fix to catch OOME
>> from lock.wait(). It is triggered by the invocation of Cleaner.clean()
>> later down in the code. I even created a reproducer for it. See my
>> last two comments of the following issue:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8066859
>>
>> (but don't look at the proposed fix since it is not very good)
>>
>>
>> I think that for JDK 8u we could revert the code and do (instanceof
>> Cleaner) checks outside the synchronized block and in addition, find a
>> way to handle the OOME thrown from Cleaner.clean().
>>
>> What do you think?
>>
>>
>> Regards, Peter
>>
>>
>> On 03/21/2016 02:41 PM, Per Liden wrote:
>>> Hi David,
>>>
>>> On 2016-03-21 13:49, David Holmes wrote:
>>>> Hi Per,
>>>>
>>>> On 21/03/2016 10:20 PM, Per Liden wrote:
>>>>> Hi Peter & David,
>>>>>
>>>>> (Resurrecting an old thread here...)
>>>>>
>>>>> On 2014-01-22 03:19, David Holmes wrote:
>>>>>> Hi Peter,
>>>>>>
>>>>>> On 22/01/2014 12:00 AM, Peter Levart wrote:
>>>>>>> Hi, David, Kalyan,
>>>>>>>
>>>>>>> Summing up the discussion, I propose the following patch for
>>>>>>> ReferenceHandler:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> I can live with it, though it maybe that once Cleaner has been
>>>>>> preloaded
>>>>>> instanceof can no longer throw OOME. Can't be 100% sure. And there's
>>>>>> some duplication/verbosity in the commentary that could be trimmed
>>>>>> down :)
>>>>>
>>>>> While investigating a Reference pending list issue on the GC side of
>>>>> things I looked at the ReferenceHandler thread and noticed something
>>>>> which made me uneasy. The fix for JDK-8022321 added pre-loading of the
>>>>> Cleaer class to avoid OMME, but also moved the "instanceof Cleaner"
>>>>> inside the try/catch with a comment that it "sometimes" can throw an
>>>>> OOME. I understand this was done because we're not 100% sure if a OOME
>>>>> can still happen here, despite the pre-loading.
>>>>>
>>>>> However, if it can throw an OOME that means it's allocating, which in
>>>>> turn means it can provoke a GC. If that happens, it looks to me
>>>>> like we
>>>>> have a bug here. The ReferenceHandler thread is not allowed to
>>>>> provoke a
>>>>> GC while it's holding on to the pending list lock, since the pending
>>>>> list might be updated during a GC and "pending = r.discovered" will
>>>>> than
>>>>> overwrite something other than "r", silently dropping any newly
>>>>> discovered References which will never be discovered by the the GC
>>>>> again.
>>>>
>>>> Then the code was completely broken because it was obviously capable of
>>>> allocating whilst holding the lock. There is nothing in the Java
>>>> code to
>>>> indicate allocation should not happen and no way that Java code can
>>>> directly control that! We were only fixing the problem of the exception
>>>> killing the thread, not trying to address an undisclosed illegal
>>>> allocation problem!
>>>
>>> JDK-8022321 did indeed fix a real issue. It might also have
>>> unintentionally introduced a new one. Prior to JDK-8022321 we knew
>>> that the ReferenceHandler couldn't provoke a GC while manipulating
>>> the pending list, since the code was:
>>>
>>> synchronized (lock) {
>>> if (pending != null) {
>>> r = pending;
>>> pending = r.discovered;
>>> r.discovered = null;
>>> } else {
>>> ....
>>> }
>>> }
>>>
>>> The manipulation of the pending list is built on some secret/ugly
>>> rules and handshakes between the GC and the ReferenceHandler, which
>>> only works because we control of both.
>>>
>>>>
>>>> How would a GC thread update pending if the ReferenceHandlerThread
>>>> holds
>>>> the lock?
>>>
>>> The pending list lock is grabbed by the Java thread issuing the VM
>>> operation, on behalf of the GC to allow the GC the manipulate the
>>> pending list. If the thread issuing the VM operation is the
>>> ReferenceHandler, then the monitor is taken recursively, which is ok
>>> as long as ReferenceHandler isn't in the middle of unlinking an element.
>>>
>>>>
>>>>> On the other hand, if an OOME can never happen (i.e. no GC) here then
>>>>> we're good the comment is just incorrect. The instanceof check
>>>>> could be
>>>>> moved out of the try/catch block again, like it was prior to this
>>>>> change, just to make it obvious that we will not be able to cause new
>>>>> allocations inside the critical section. Or at a minimum, the comment
>>>>> saying OOME can still happen should be adjusted.
>>>>
>>>> I found it very difficult to determine with 100% certainty whether or
>>>> not the instanceof could ever trigger an allocation and hence
>>>> potentially an OOME.
>>>
>>> I agree, it's not obvious.
>>>
>>> cheers,
>>> Per
>>>
>>>>
>>>> With JVMCI it is now easier to imagine that compilation of this code by
>>>> a JVMCI compiler might lead to allocation while the lock is held!
>>>>
>>>> Cheers,
>>>> David
>>>>
>>>>> Thoughts?
>>>>>
>>>>> thanks,
>>>>> Per
>>>>>
>>>>> Btw, to the best of my knowledge, the pre-loading of Cleaner should
>>>>> avoid any GC activity from instanceof, but I can't say that am a 100%
>>>>> sure either.
>>>>>
>>>>>>
>>>>>> Any specific reason to use Unsafe to do the preload rather than
>>>>>> Class.forName ? Does this force Unsafe to be loaded earlier than it
>>>>>> otherwise would?
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>>
>>>>>>> all 10 java/lang/ref tests pass on my PC (including
>>>>>>> OOMEInReferenceHandler).
>>>>>>>
>>>>>>> I kindly ask Kalyan to try to re-run the OOMEInReferenceHandler test
>>>>>>> with this code and report any failure.
>>>>>>>
>>>>>>>
>>>>>>> Thanks, Peter
>>>>>>>
>>>>>>> On 01/21/2014 08:57 AM, David Holmes wrote:
>>>>>>>> On 21/01/2014 4:54 PM, Peter Levart wrote:
>>>>>>>>>
>>>>>>>>> On 01/21/2014 03:22 AM, David Holmes wrote:
>>>>>>>>>> Hi Peter,
>>>>>>>>>>
>>>>>>>>>> I do not see Cleaner being loaded prior to the main class on
>>>>>>>>>> either
>>>>>>>>>> Windows or Linux. Which platform are you on? Did you see it
>>>>>>>>>> loaded
>>>>>>>>>> before the main class or as part of executing it?
>>>>>>>>>
>>>>>>>>> Before. The main class is empty:
>>>>>>>>>
>>>>>>>>> public class Test { public static void main(String... a) {} }
>>>>>>>>>
>>>>>>>>> Here's last few lines of -verbose:class:
>>>>>>>>>
>>>>>>>>> [Loaded java.util.TimeZone from
>>>>>>>>> /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
>>>>>>>>> [Loaded sun.util.calendar.ZoneInfo from
>>>>>>>>> /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
>>>>>>>>> [Loaded sun.util.calendar.ZoneInfoFile from
>>>>>>>>> /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
>>>>>>>>> [Loaded sun.util.calendar.ZoneInfoFile$1 from
>>>>>>>>> /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
>>>>>>>>> [Loaded java.io.DataInput from
>>>>>>>>> /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
>>>>>>>>> [Loaded java.io.DataInputStream from
>>>>>>>>> /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
>>>>>>>>> *[Loaded sun.misc.Cleaner from
>>>>>>>>> /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
>>>>>>>>
>>>>>>>> Curious. I wonder what the controlling factor is ??
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm on linux, 64bit and using official EA build 121 of JDK 8...
>>>>>>>>>
>>>>>>>>> But if I try with JDK 7u45, I don't see it. So perhaps it would be
>>>>>>>>> good
>>>>>>>>> to trigger Cleaner loading and initialization as part of
>>>>>>>>> ReferenceHandler initialization to play things safe.
>>>>>>>>
>>>>>>>>
>>>>>>>> If we do that for Cleaner we may as well do it for
>>>>>>>> InterruptedException too.
>>>>>>>>
>>>>>>>>>> Also, it is not that I think ReferenceHandler is responsible for
>>>>>>>>>> reporting OOME, but that it is responsible for reporting that
>>>>>>>>>> it was
>>>>>>>>>> unable to perform a clean or enqueue because of OOME.
>>>>>>>>>
>>>>>>>>> This would be necessary if we skipped a Reference because of OOME,
>>>>>>>>> but
>>>>>>>>> if we just re-try until we eventually succeed, nothing is lost,
>>>>>>>>> nothing
>>>>>>>>> to report (but a slow response)...
>>>>>>>>
>>>>>>>> Agreed - just trying to clarify things.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Your suggested approach seems okay though I'm not sure why we
>>>>>>>>>> shouldn't help things along by calling System.gc() ourselves
>>>>>>>>>> rather
>>>>>>>>>> than just yielding and hoping things will get cleaned up
>>>>>>>>>> elsewhere.
>>>>>>>>>> But for the present purposes your approach will suffice I think.
>>>>>>>>>
>>>>>>>>> Maybe my understanding is wrong but isn't the fact that OOME is
>>>>>>>>> rised a
>>>>>>>>> consequence of that VM has already attempted to clear things up
>>>>>>>>> (executing a GC round synchronously) but didn't succeed to make
>>>>>>>>> enough
>>>>>>>>> free space to satisfy the allocation request? If this is only how
>>>>>>>>> some
>>>>>>>>> collectors/allocators are implemented and not a general rule,
>>>>>>>>> then we
>>>>>>>>> should put a System.gc() in place of Thread.yield(). Should we
>>>>>>>>> also
>>>>>>>>> combine that with Thread.yield()? I'm concerned of a possibility
>>>>>>>>> that we
>>>>>>>>> spin, consume too much CPU (ReferenceHandler thread has MAX
>>>>>>>>> priority) so
>>>>>>>>> that other threads dont' get enough CPU time to proceed and clean
>>>>>>>>> things
>>>>>>>>> up (we hope other threads will also get OOME and release things as
>>>>>>>>> their
>>>>>>>>> stacks unwind...).
>>>>>>>>
>>>>>>>> You are probably right about the System.gc() - OOME should be
>>>>>>>> thrown
>>>>>>>> after GC fails to create space, so it really needs some other
>>>>>>>> thread
>>>>>>>> to drop live references to allow further space to be reclaimed.
>>>>>>>>
>>>>>>>> But note that Thread.yield() can behave badly on some linux systems
>>>>>>>> too, so spinning is still a possibility - but either way this would
>>>>>>>> only be "really bad" on a uniprocessor system where yield() is
>>>>>>>> unlikely to misbehave.
>>>>>>>>
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Regards, Peter
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>> On 20/01/2014 6:42 PM, Peter Levart wrote:
>>>>>>>>>>> On 01/20/2014 09:00 AM, Peter Levart wrote:
>>>>>>>>>>>> On 01/20/2014 02:51 AM, David Holmes wrote:
>>>>>>>>>>>>> Hi Peter,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 17/01/2014 11:24 PM, Peter Levart wrote:
>>>>>>>>>>>>>> On 01/17/2014 02:13 PM, Peter Levart wrote:
>>>>>>>>>>>>>>>>> // Fast path for cleaners
>>>>>>>>>>>>>>>>> boolean isCleaner = false;
>>>>>>>>>>>>>>>>> try {
>>>>>>>>>>>>>>>>> isCleaner = r instanceof Cleaner;
>>>>>>>>>>>>>>>>> } catch (OutofMemoryError oome) {
>>>>>>>>>>>>>>>>> continue;
>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> if (isCleaner) {
>>>>>>>>>>>>>>>>> ((Cleaner)r).clean();
>>>>>>>>>>>>>>>>> continue;
>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi David, Kalyan,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I've caught-up now. Just thinking: is "instanceof Cleaner"
>>>>>>>>>>>>>>>> throwing
>>>>>>>>>>>>>>>> OOME as a result of loading the Cleaner class? Wouldn't the
>>>>>>>>>>>>>>>> above
>>>>>>>>>>>>>>>> code then throw some error also in ((Cleaner)r) - the
>>>>>>>>>>>>>>>> checkcast,
>>>>>>>>>>>>>>>> since Cleaner class would not be successfully initialized?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Well, no. The above code would just skip Cleaner
>>>>>>>>>>>>>>> processing in
>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>> situation. And will never be doing it again after the
>>>>>>>>>>>>>>> heap is
>>>>>>>>>>>>>>> freed...
>>>>>>>>>>>>>>> So it might be good to load and initialize Cleaner class as
>>>>>>>>>>>>>>> part of
>>>>>>>>>>>>>>> ReferenceHandler initialization to ensure correct
>>>>>>>>>>>>>>> operation...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Well, yes and no. Let me try once more:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Above code will skip Cleaner processing if the 1st time
>>>>>>>>>>>>>> "instanceof
>>>>>>>>>>>>>> Cleaner" is executed, OOME is thrown as a consequence of full
>>>>>>>>>>>>>> heap
>>>>>>>>>>>>>> while
>>>>>>>>>>>>>> loading and initializing the Cleaner class.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes - I was assuming that this would not fail the very
>>>>>>>>>>>>> first time
>>>>>>>>>>>>> and
>>>>>>>>>>>>> so the Cleaner class would already be loaded. Failing to be
>>>>>>>>>>>>> able to
>>>>>>>>>>>>> load the Cleaner class was one of the potential issues flagged
>>>>>>>>>>>>> earlier with this problem. I was actually assuming that
>>>>>>>>>>>>> Cleaner
>>>>>>>>>>>>> would
>>>>>>>>>>>>> be loaded already due to some actual Cleaner subclasses being
>>>>>>>>>>>>> used,
>>>>>>>>>>>>> but this does not happen as part of the default
>>>>>>>>>>>>> initialization. :(
>>>>>>>>>>>>> The irony being that if the Cleaner class is not loaded then r
>>>>>>>>>>>>> can
>>>>>>>>>>>>> not be an instance of Cleaner and so we would fail to load the
>>>>>>>>>>>>> class
>>>>>>>>>>>>> in a case where we didn't need it anyway.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What I wanted to focus on here was an OOME from the instanceof
>>>>>>>>>>>>> itself, but as you say that might trigger classloading of
>>>>>>>>>>>>> Cleaner
>>>>>>>>>>>>> (which is not what I was interested in).
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The 2nd time the "instanceof
>>>>>>>>>>>>>> Cleaner" is executed after such OOME, the same line would
>>>>>>>>>>>>>> throw
>>>>>>>>>>>>>> NoClassDefFoundError as a consequence of referencing a class
>>>>>>>>>>>>>> that
>>>>>>>>>>>>>> failed
>>>>>>>>>>>>>> initialization. Am I right?
>>>>>>>>>>>>>
>>>>>>>>>>>>> instanceof is not one of the class initialization triggers,
>>>>>>>>>>>>> so we
>>>>>>>>>>>>> should not see an OOME generated due to a class initialization
>>>>>>>>>>>>> exception and so the class will not be put into the Erroneous
>>>>>>>>>>>>> state
>>>>>>>>>>>>> and so subsequent attempts to use the class will not
>>>>>>>>>>>>> automatically
>>>>>>>>>>>>> trigger NoClassdefFoundError.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If OOME occurs during actual loading/linking of the class
>>>>>>>>>>>>> Cleaner it
>>>>>>>>>>>>> is unclear what would happen on subsequent attempts. OOME is
>>>>>>>>>>>>> not a
>>>>>>>>>>>>> LinkageError that must be rethrown on subsequent attempts, and
>>>>>>>>>>>>> it is
>>>>>>>>>>>>> potentially a transient condition, so I would expect a re-load
>>>>>>>>>>>>> attempt to be allowed. However we are now deep into the
>>>>>>>>>>>>> details of
>>>>>>>>>>>>> the VM and it may well depend on the exact place from which
>>>>>>>>>>>>> the
>>>>>>>>>>>>> OOME
>>>>>>>>>>>>> originates.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The bottom line with the current problem is that there are
>>>>>>>>>>>>> multiple
>>>>>>>>>>>>> non-obvious paths by which the ReferenceHandler can
>>>>>>>>>>>>> encounter an
>>>>>>>>>>>>> OOME. In such cases we do not want the ReferenceHandler to
>>>>>>>>>>>>> terminate
>>>>>>>>>>>>> - which implies catching the OOME and continuing. However we
>>>>>>>>>>>>> also do
>>>>>>>>>>>>> not want to silently skip Cleaner processing or reference
>>>>>>>>>>>>> queue
>>>>>>>>>>>>> processing - as that would lead to hard to diagnoze bugs. But
>>>>>>>>>>>>> trying
>>>>>>>>>>>>> to report the problem may not be possible due to being
>>>>>>>>>>>>> out-of-memory.
>>>>>>>>>>>>> It may be that we need to break things up into multiple
>>>>>>>>>>>>> try/catch
>>>>>>>>>>>>> blocks, where each catch does a System.gc() and then
>>>>>>>>>>>>> reports that
>>>>>>>>>>>>> the
>>>>>>>>>>>>> OOME occurred. Of course the reporting must still be in a
>>>>>>>>>>>>> try/catch
>>>>>>>>>>>>> for the OOME. Though at some point letting the
>>>>>>>>>>>>> ReferenceHandler
>>>>>>>>>>>>> die
>>>>>>>>>>>>> may be the only way to "report" a major memory problem.
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>> Hm... If I give -verbose:class option to run a simple test
>>>>>>>>>>>> program:
>>>>>>>>>>>>
>>>>>>>>>>>> public class Test { public static void main(String... a) {} }
>>>>>>>>>>>>
>>>>>>>>>>>> I see Cleaner class being loaded before Test class. I don't
>>>>>>>>>>>> see by
>>>>>>>>>>>> which tread or if it might get loaded after main() starts,
>>>>>>>>>>>> but I
>>>>>>>>>>>> suspect that loading of Cleaner is not a problem here.
>>>>>>>>>>>> Initialization
>>>>>>>>>>>> of Cleaner class is not performed by ReferenceHandler thread as
>>>>>>>>>>>> you
>>>>>>>>>>>> pointed out. The instanceof does not trigger it and if it
>>>>>>>>>>>> returns
>>>>>>>>>>>> true
>>>>>>>>>>>> then Cleaner has already been initialized. So there must be
>>>>>>>>>>>> some
>>>>>>>>>>>> other
>>>>>>>>>>>> cause for instanceof throwing OOME...
>>>>>>>>>>>>
>>>>>>>>>>>> What do you say about this variant of ReferenceHandler.run()
>>>>>>>>>>>> method:
>>>>>>>>>>>>
>>>>>>>>>>>> public void run() {
>>>>>>>>>>>> for (;;) {
>>>>>>>>>>>> Reference r;
>>>>>>>>>>>> Cleaner c;
>>>>>>>>>>>> synchronized (lock) {
>>>>>>>>>>>> r = pending;
>>>>>>>>>>>> if (r != null) {
>>>>>>>>>>>> // instanceof operator might throw OOME
>>>>>>>>>>>> sometimes. Just retry after
>>>>>>>>>>>> // yielding - might have better luck
>>>>>>>>>>>> next
>>>>>>>>>>>> time...
>>>>>>>>>>>> try {
>>>>>>>>>>>> c = r instanceof Cleaner ?
>>>>>>>>>>>> (Cleaner)
>>>>>>>>>>>> r :
>>>>>>>>>>>> null;
>>>>>>>>>>>> } catch (OutOfMemoryError x) {
>>>>>>>>>>>> Thread.yield();
>>>>>>>>>>>> continue;
>>>>>>>>>>>> }
>>>>>>>>>>>> pending = r.discovered;
>>>>>>>>>>>> r.discovered = null;
>>>>>>>>>>>> } else {
>>>>>>>>>>>> // The waiting on the lock may cause an
>>>>>>>>>>>> OOME
>>>>>>>>>>>> because it may try to allocate
>>>>>>>>>>>> // exception objects, so also catch
>>>>>>>>>>>> OOME
>>>>>>>>>>>> here
>>>>>>>>>>>> to avoid silent exit of the
>>>>>>>>>>>> // reference handler thread.
>>>>>>>>>>>> //
>>>>>>>>>>>> // Explicitly define the order of
>>>>>>>>>>>> the two
>>>>>>>>>>>> exceptions we catch here
>>>>>>>>>>>> // when waiting for the lock.
>>>>>>>>>>>> //
>>>>>>>>>>>> // We do not want to try to potentially
>>>>>>>>>>>> load
>>>>>>>>>>>> the InterruptedException class
>>>>>>>>>>>> // (which would be done if this was its
>>>>>>>>>>>> first
>>>>>>>>>>>> use, and InterruptedException
>>>>>>>>>>>> // were checked first) in this
>>>>>>>>>>>> situation.
>>>>>>>>>>>> //
>>>>>>>>>>>> // This may lead to the VM not ever
>>>>>>>>>>>> trying to
>>>>>>>>>>>> load the InterruptedException
>>>>>>>>>>>> // class again.
>>>>>>>>>>>> try {
>>>>>>>>>>>> try {
>>>>>>>>>>>> lock.wait();
>>>>>>>>>>>> } catch (OutOfMemoryError x) { }
>>>>>>>>>>>> } catch (InterruptedException x) { }
>>>>>>>>>>>> continue;
>>>>>>>>>>>> }
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> // Fast path for cleaners
>>>>>>>>>>>> if (c != null) {
>>>>>>>>>>>> c.clean();
>>>>>>>>>>>> continue;
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> ReferenceQueue q = r.queue;
>>>>>>>>>>>> if (q != ReferenceQueue.NULL) q.enqueue(r);
>>>>>>>>>>>> }
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ... it tries to not consume and skip Cleaner instances when
>>>>>>>>>>>> OOME is
>>>>>>>>>>>> caught.
>>>>>>>>>>>>
>>>>>>>>>>>> I don't think ReferenceHandler is to make responsible for
>>>>>>>>>>>> reporting
>>>>>>>>>>>> OOMEs. Full heap is a global condition and ReferenceHandler
>>>>>>>>>>>> is the
>>>>>>>>>>>> last to accuse for it.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards, Peter
>>>>>>>>>>>
>>>>>>>>>>> Hi David,
>>>>>>>>>>>
>>>>>>>>>>> I think the following variation is even better. It executes
>>>>>>>>>>> Thread.yield() after catching OOME but outside synchronized
>>>>>>>>>>> block so
>>>>>>>>>>> that given CPU slice can be used by GC threads to make progress
>>>>>>>>>>> enqueueing pending References (they are not able to enqueue them
>>>>>>>>>>> while
>>>>>>>>>>> ReferenceHandler is holding the lock):
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> public void run() {
>>>>>>>>>>> for (;;) {
>>>>>>>>>>> Reference r;
>>>>>>>>>>> Cleaner c;
>>>>>>>>>>> try {
>>>>>>>>>>> try {
>>>>>>>>>>> synchronized (lock) {
>>>>>>>>>>> r = pending;
>>>>>>>>>>> if (r != null) {
>>>>>>>>>>> // 'instanceof' might throw
>>>>>>>>>>> OOME
>>>>>>>>>>> sometimes so do this before
>>>>>>>>>>> // unlinking 'r' from the
>>>>>>>>>>> 'pending'
>>>>>>>>>>> chain...
>>>>>>>>>>> c = r instanceof Cleaner ?
>>>>>>>>>>> (Cleaner) r
>>>>>>>>>>> : null;
>>>>>>>>>>> // unlink 'r' from 'pending'
>>>>>>>>>>> chain
>>>>>>>>>>> pending = r.discovered;
>>>>>>>>>>> r.discovered = null;
>>>>>>>>>>> } else {
>>>>>>>>>>> // The waiting on the lock may
>>>>>>>>>>> cause an
>>>>>>>>>>> OOME because it may try to allocate
>>>>>>>>>>> // exception objects.
>>>>>>>>>>> lock.wait();
>>>>>>>>>>> continue;
>>>>>>>>>>> }
>>>>>>>>>>> }
>>>>>>>>>>> } catch (OutOfMemoryError x) {
>>>>>>>>>>> // Catch OOME from 'r instanceof
>>>>>>>>>>> Cleaner' or
>>>>>>>>>>> 'lock.wait()' 1st so that we don't
>>>>>>>>>>> // try to potentially load the
>>>>>>>>>>> InterruptedException class
>>>>>>>>>>> // (which would be done if this was its
>>>>>>>>>>> first
>>>>>>>>>>> use, and InterruptedException
>>>>>>>>>>> // were checked first) in this
>>>>>>>>>>> situation.
>>>>>>>>>>> // Give other threads CPU time so they
>>>>>>>>>>> hopefully release some objects and GC
>>>>>>>>>>> // clears some heap.
>>>>>>>>>>> // Also prevent CPU intensive
>>>>>>>>>>> spinning in
>>>>>>>>>>> case
>>>>>>>>>>> 'r instanceof Cleaner' above
>>>>>>>>>>> // persistently throws OOME for some
>>>>>>>>>>> time...
>>>>>>>>>>> Thread.yield();
>>>>>>>>>>> // retry
>>>>>>>>>>> continue;
>>>>>>>>>>> }
>>>>>>>>>>> } catch (InterruptedException x) {
>>>>>>>>>>> // Catch InterruptedException from
>>>>>>>>>>> 'lock.wait()'
>>>>>>>>>>> and retry
>>>>>>>>>>> continue;
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> // Fast path for cleaners
>>>>>>>>>>> if (c != null) {
>>>>>>>>>>> c.clean();
>>>>>>>>>>> continue;
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> ReferenceQueue q = r.queue;
>>>>>>>>>>> if (q != ReferenceQueue.NULL) q.enqueue(r);
>>>>>>>>>>> }
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Regards, Peter
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>
>
More information about the core-libs-dev
mailing list