Heap OOM for no apparent reason?
Y. S. Ramakrishna
y.s.ramakrishna at oracle.com
Thu Jun 2 17:46:47 UTC 2011
If your code is not catching the OOM exception, you'd
expect to see the stack trace when the program dies.
If it catches the exception and carries on, you'd want
it to print the exception detail. I don't know of
cases where the exception would just disappear.
In your case the report to stdout/stderr(?) that an OOM occurred and that
the heap is being dumped comes from inside the JVM
because you have asked for -XX:+HeapDumpOnOutOfMemoryError.
After this point, your allocating thread would have gotten
an OOME which it probably caught and swallowed, and hence
the silence wrt the stack trace you would normally see. You
will want to look at your Infinispan code to see how
it deals with the inability to allocate said large objects.
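
For illustration, a minimal sketch (hypothetical code, not
Infinispan's actual code) of the catch-and-swallow pattern that
would produce exactly this symptom:

// Hypothetical sketch (not Infinispan's code) of catch-and-swallow.
// Run with e.g.
//   java -Xmx256m -XX:+HeapDumpOnOutOfMemoryError SwallowedOome
// and you get the JVM's "dumping heap" message but no application
// stack trace, because the catch block discards the error.
public class SwallowedOome {
    static byte[] tryAllocate(int bytes) {
        try {
            return new byte[bytes];       // the failing allocation
        } catch (OutOfMemoryError e) {
            return null;                  // swallowed: the trace is lost
        }
    }

    public static void main(String[] args) {
        byte[] b = tryAllocate(512 * 1024 * 1024);  // 512 MB > -Xmx256m
        System.out.println(b == null ? "allocation failed silently"
                                     : "allocated " + b.length);
    }
}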
Recall that the size of a single object is limited by the size of,
and the available space in, the largest area (Eden or Old) of your
Java heap. As Kirk noted, the full gc was an attempt to allocate
an object that didn't fit into the available space in
Eden or in Old (so from that you can estimate the size of
the request).
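
As a sketch (sizes made up purely for illustration), a single
request larger than the free space in either generation fails this
way even though most of the heap is empty:

// Illustrative sizes only. With -Xmx256m, this ~300 MB request
// cannot fit in Eden or in Old even after a full GC, so the JVM
// throws OutOfMemoryError while most of the heap is still free.
public class BigAllocation {
    public static void main(String[] args) {
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("max heap: " + maxMb + " MB");
        byte[] big = new byte[300 * 1024 * 1024]; // full GC, then OOME
        System.out.println("allocated " + big.length + " bytes");
    }
}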
Note also that the JDK libraries will resize hashtables under
you and that can also cause large allocation requests
(but I don't know how they handle OOMs resulting from such
allocations).
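
For example, a sketch using java.util.HashMap: once the entry count
passes capacity times load factor (default 16 * 0.75), the library
allocates a new internal table roughly twice the old size, so one
small put() can turn into one large allocation. Presizing sidesteps
the doublings:

import java.util.HashMap;
import java.util.Map;

// Sketch: HashMap doubles its internal table when the entry count
// exceeds capacity * loadFactor, so a single put() can trigger an
// allocation twice the size of the previous table. For millions of
// entries that resize is itself a large allocation request.
public class HashtableResize {
    public static void main(String[] args) {
        Map<Integer, Integer> grown = new HashMap<Integer, Integer>();
        for (int i = 0; i < 3000000; i++) {
            grown.put(i, i);   // resizes at 12, 24, 48, ... entries
        }
        // Presizing avoids the intermediate doublings and the large
        // transient table allocations they cause.
        Map<Integer, Integer> presized =
            new HashMap<Integer, Integer>(4 * 1024 * 1024);
        for (int i = 0; i < 3000000; i++) {
            presized.put(i, i);
        }
    }
}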
-- ramki
On 06/02/11 09:56, Raman Gupta wrote:
> I do tend to think that somewhere a large object or array is being
> created. In particular, Infinispan is one library we are using that
> may be allocating large chunks of memory -- indeed, replacing
> Infinispan with a local cache does seem to "fix" the problem.
>
> However, more information from the JVM would really be useful in
> isolating the offending code in Infinispan. Ideally,
>
> a) any large allocations should show up as part of the heap dump if
> the allocation succeeded but then some other subsequent code caused
> the OOM, or
>
> b) if the allocation itself failed, the OOM exception should include a
> stack trace that would allow me to isolate the allocation point (as
> it does normally, but for some reason in this case doesn't).
>
> In this case the heap dump shows plenty of room in heap, and there is
> no stack trace at the OOM, so I don't really have any way to isolate
> the offending allocation point. In which situations does the OOM
> exception get printed without an associated stack trace?
>
> Cheers,
> Raman
>
>
> On 06/02/2011 12:01 PM, Charles K Pepperdine wrote:
>> Are you trying to create a humongous object or array? Accidentally?
>>
>> On Jun 2, 2011, at 5:35 PM, Raman Gupta wrote:
>>
>>> I did check the database but didn't find anything relevant. My search
>>> terms may not be optimal, though I did scan through all the results
>>> returned by "java.lang.OutOfMemoryError: Java heap space" as well as
>>> "0K->0K".
>>>
>>> I also suspected a bug in the collector and so I tried the same test
>>> with the G1 collector, with the same OOM result. I didn't save the log
>>> from the G1 test, but I can quite easily redo the test with any set of
>>> JVM parameters that may be helpful in debugging -- the OOM seems to be
>>> easily and consistently reproducible with this application.
>>>
>>> Cheers,
>>> Raman
>>>
>>> On 06/02/2011 09:20 AM, Charles K Pepperdine wrote:
>>>> 2011-05-27T11:35:18.214-0400: 147.823: [Full GC [PSYoungGen: 7662K->0K(72896K)] [PSOldGen: 169093K->62518K(174784K)] 176755K->62518K(247680K) [PSPermGen: 27342K->27342K(55232K)], 0.8074580 secs] [Times: user=0.29 sys=0.03, real=0.81 secs]
>>>> UseAdaptiveSizePolicy actions to meet *** throughput goal ***
>>>> GC overhead (%)
>>>> Young generation: 2.65 (attempted to grow)
>>>> Tenured generation: 0.54 (attempted to grow)
>>>> Tenuring threshold: (attempted to decrease to balance GC costs) = 1
>>>> 2011-05-27T11:35:19.021-0400: 148.631: [GC [PSYoungGen: 0K->0K(57856K)] 62518K->62518K(232640K), 0.0009670 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
>>>> UseAdaptiveSizePolicy actions to meet *** throughput goal ***
>>>> GC overhead (%)
>>>> Young generation: 2.01 (attempted to grow)
>>>> Tenured generation: 0.54 (no change)
>>>> Tenuring threshold: (attempted to decrease to balance GC costs) = 1
>>>> 2011-05-27T11:35:19.022-0400: 148.632: [Full GC [PSYoungGen: 0K->0K(57856K)] [PSOldGen: 62518K->52158K(174784K)] 62518K->52158K(232640K) [PSPermGen: 27342K->26866K(61056K)], 0.3614330 secs] [Times: user=0.32 sys=0.02, real=0.37 secs]
>>>>
>>>> First Full GC looks normal as PSOldGen is (174784-169093)K = 5691K from being full. Nothing happening in Perm.
>>>> The second is where things start to get weird. I don't see why that GC was called. Stranger still, it's ~800ms *after* the full gc and yet no application thread allocated any memory out of young gen.
>>>> For some reason that "failed" young gen collection triggers an immediate Full GC.
>>>>
>>>> Bug in the collector? Did you check the bug database?
>>>>
>>>> Regards,
>>>> Kirk
>>>>