(Preliminary) RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread
Peter Levart
peter.levart at gmail.com
Mon May 6 11:52:23 UTC 2013
On 05/06/2013 10:02 AM, Thomas Schatzl wrote:
> Hi,
>
> On Sat, 2013-05-04 at 21:23 +0200, Peter Levart wrote:
>> On 05/04/2013 07:38 PM, Vitaly Davidovich wrote:
>>
>>> Oops, that was my mistake - I thought the lock here was a j.u.c.Lock
>>> which of course doesn't even make sense given we're talking about
>>> ObjectMonitor. So disregard that bit.
>>>
>>> Ignoring OOM and continuing just seems very fragile as it means you
>>> somehow know that all state is still consistent. Most java code
>>> isn't async exception safe, so it's hard to reason about state after
>>> OOM. Maybe Reference Handler is OK in that regard though.
>>>
>> I think it is safe to ignore OOME here, since this is the only place
> As my test program shows, this may not be the only place where heap
> allocation can happen within that code.
>
> Alan also mentioned something about instrumentation that can add memory
> allocations basically anywhere.
> As the reference handler code is plain java code, it will be affected as
> other java code.
>
>> where heap allocation happens and it is known what provokes it.
>> Otherwise ReferenceHandler just shuffles existing pointers in existing
>> objects...
> In the current compilers, yes. But what about other compilers/VMs that
> may add more elaborate profiling information here and there? E.g. the
> graal compiler? Not sure if it does, probably it does not, but it seems
> shaky to rely on compiler code generation here.
>
> So is it possible to handle all known issues (especially if the fix is
> similar/known/understandable), not just the original one, here?
>
> I mean, if we fix this issue as you suggested (I am not against that, it
> looks reasonable), I would not know what to do with the test program
> except file another bug against the very same component with the same
> problem, with the same fix suggestion.
>
> I.e. the code still seems to buggy even after applying that fix, and
> with instrumentation I think I could create another reproducer. As
> reference processing is something critical to the VM, I think it is
> prudent to fix all known problems here and now if possible.
>
> Maybe there is some argument against simply wrapping the entire loop
> with a try-catch? Please elaborate if so (maybe I overlooked that?), or
> maybe you could suggest another solution?
>
> Additionally, just fixing only the exact issue here seems somewhat
> wasteful in terms of time (imho) - because next time another developer
> will likely spend lots of time trying to reproduce this again. (As
> mentioned, the former assignee also seemed to be unable to reproduce
> this for a long time).
>
> Thanks,
> Thomas
Hi Thomas,
I agree with you. If instrumentation/different JVM can throw OOME
practicaly anywhere, then we have a problem with every such "critical"
piece of code that is written in Java. What if instrumentation inserts
allocation at the start of each loop? You might think this could be
fixed by:
for (;;) {
// instrumentation added allocation #1 here
try {
for (;;) {
// instrumentation added allocation #2 here
...
}
}
catch (OutOfMemoryError ignore) {}
}
Say you successfully enter inner-loop and run it for a while. Then a
heap shortage situation happens and "allocation #2" throws OOME, which
is caught and ignored. In next iteration of outer-loop "allocation #1"
will probably throw OOME too, since heap is probably still full...
To be 100% sure such "critical" loop never ends, I only see two options:
1. move such loop to different environment (native code?) where
allocations are more predictable or
2. add facility to Java/JVM that can mark portions of Java code where
allocation happens more predictably
The 2nd option could be specified as a kind of annotation on a method,
for example:
private static class ReferenceHandler extends Thread {
@DontInstrument
public void run() {
...
}
For the 1st option, we could create a native counterpart of the
following static method:
public class CriticalLoop {
public static void doForever(Runnable runnable) {
for (;;) {
try {
runnable.run();
}
catch (OutOfMemoryError) {
// wait a little uninterruptibly
}
}
}
}
...and use it in ReferenceHandler to run the critical loop.
Regards, Peter
More information about the core-libs-dev
mailing list