Heap OOM for no apparent reason?
Raman Gupta
rocketraman at fastmail.fm
Thu Jun 2 16:56:03 UTC 2011
I do tend to think that somewhere a large object or array is being
created. In particular, Infinispan is one library we are using that
may be allocating large chunks of memory -- indeed, replacing
Infinispan with a local cache does seem to "fix" the problem.
However, more information from the JVM would really be useful in
isolating the offending code in Infinispan. Ideally,
a) a large allocation that succeeded should show up in the heap dump
if some subsequent code then caused the OOM, or
b) if the allocation itself failed, the OOM exception should include a
stack trace identifying the allocation point (as it normally does, but
for some reason doesn't in this case).
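To illustrate (b): a minimal, hypothetical reproduction (not code from
our application; the class name and sizes are made up) looks something
like this, and it does give me an allocation-site stack trace when I
run it:

    // Hypothetical reproduction of case (b): a single oversized
    // allocation that fails outright. Run with a small heap, e.g.
    // -Xmx256m, and the resulting java.lang.OutOfMemoryError: Java
    // heap space normally carries a stack trace naming main() as
    // the allocation point.
    public class HugeAllocation {
        public static void main(String[] args) {
            // Ask for 1 GB in one shot -- far more than -Xmx256m allows.
            byte[] block = new byte[1024 * 1024 * 1024];
            System.out.println(block.length);
        }
    }

That is exactly the behaviour I would expect in our application, but
am not seeing.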
In this case the heap dump shows plenty of room in heap, and there is
no stack trace at the OOM, so I don't really have any way to isolate
the offending allocation point. In which situations does the OOM
exception get printed without an associated stack trace?
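Since the OOM is easy to reproduce, one thing I can try on the next
run is to have the JVM capture a heap dump and a thread dump at the
moment the error is thrown. The flags below are just a sketch -- the
dump path and jar name are placeholders, and if I remember correctly
%p expands to the JVM's pid:

    # Sketch of diagnostic flags for the next reproduction run.
    # HeapDumpOnOutOfMemoryError writes an .hprof when the OOM is
    # thrown; OnOutOfMemoryError runs a command at the same moment
    # (kill -3 makes the JVM print a thread dump to stdout).
    java -XX:+HeapDumpOnOutOfMemoryError \
         -XX:HeapDumpPath=/tmp/app-oom.hprof \
         -XX:OnOutOfMemoryError="kill -3 %p" \
         -jar app.jar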
Cheers,
Raman
On 06/02/2011 12:01 PM, Charles K Pepperdine wrote:
> Are you trying to create a humongous object or array? Accidentally?
>
> On Jun 2, 2011, at 5:35 PM, Raman Gupta wrote:
>
>> I did check the database but didn't find anything relevant. My search
>> terms may not be optimal, though I did scan through all the results
>> returned by "java.lang.OutOfMemoryError: Java heap space" as well as
>> "0K->0K".
>>
>> I also suspected a bug in the collector and so I tried the same test
>> with the G1 collector, with the same OOM result. I didn't save the log
>> from the G1 test, but I can quite easily redo the test with any set of
>> JVM parameters that may be helpful in debugging -- the OOM seems to be
>> easily and consistently reproducible with this application.
>>
>> Cheers,
>> Raman
>>
>> On 06/02/2011 09:20 AM, Charles K Pepperdine wrote:
>>> 2011-05-27T11:35:18.214-0400: 147.823: [Full GC [PSYoungGen: 7662K->0K(72896K)] [PSOldGen: 169093K->62518K(174784K)] 176755K->62518K(247680K) [PSPermGen: 27342K->27342K(55232K)], 0.8074580 secs] [Times: user=0.29 sys=0.03, real=0.81 secs]
>>> UseAdaptiveSizePolicy actions to meet *** throughput goal ***
>>> GC overhead (%)
>>> Young generation: 2.65 (attempted to grow)
>>> Tenured generation: 0.54 (attempted to grow)
>>> Tenuring threshold: (attempted to decrease to balance GC costs) = 1
>>> 2011-05-27T11:35:19.021-0400: 148.631: [GC [PSYoungGen: 0K->0K(57856K)] 62518K->62518K(232640K), 0.0009670 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
>>> UseAdaptiveSizePolicy actions to meet *** throughput goal ***
>>> GC overhead (%)
>>> Young generation: 2.01 (attempted to grow)
>>> Tenured generation: 0.54 (no change)
>>> Tenuring threshold: (attempted to decrease to balance GC costs) = 1
>>> 2011-05-27T11:35:19.022-0400: 148.632: [Full GC [PSYoungGen: 0K->0K(57856K)] [PSOldGen: 62518K->52158K(174784K)] 62518K->52158K(232640K) [PSPermGen: 27342K->26866K(61056K)], 0.3614330 secs] [Times: user=0.32 sys=0.02, real=0.37 secs]
>>>
>>> First Full GC looks normal as PSOldGen is (174784-169093)K = 5691K from being full. Nothing happening in Perm.
>>> The second is where things start to get weird: I don't see why that GC was called at all. Stranger still, it comes ~800ms *after* the Full GC, and yet no application thread allocated any memory out of young gen in between.
>>> For some reason that "failed" young gen collection then triggers an immediate second Full GC.
>>>
>>> Bug in the collector? Did you check the bug database?
>>>
>>> Regards,
>>> Kirk
>>>
>