ParNew promotion failed, no expected OOM.

Jon Masamitsu jon.masamitsu at oracle.com
Wed Apr 11 23:28:07 PDT 2012


Ramki,

I never want to throw an OOM and then have to argue about  whether
the OOM was thrown prematurely.  That would be a bug.  As a consequence
of such an approach, I accept that there will be times when it would
have been more helpful if the OOM was thrown sooner.   That might be  a 
poorer
quality of service but not a bug (I think).

Jon

On 4/11/2012 11:24 AM, Srinivas Ramakrishna wrote:
> I believe this is missing the "gc overhead" threshold for the space limit.
> As I have commented in the past, i think the GC overhead limit should
> consider
> not just the space free in the whole heap, but rather the difference
> between the old gen
> capacity and the sum of the space used in the young gen and the old gen
> after a major
> GC has competed, as a percentage of the old gen capacity. It almost seems
> as though
> you have a largish object in the young gen which will not fit in the space
> free in the old gen,
> o it will never be promoted unless sufficient space clears up in the old
> gen, and from what
> you are describing, that won't happen until your program terminates its
> computation.
>
> I think we need to fix the space criteria for overhead limit to deal
> gracefully
> with these kinds of situations.
>
> On an unrelated note, for such a small heap, you should probably use
> ParallelOldGC rather
> than CMS, but I realize that you didn't explicitly ask for CMS, the mac
> just gave it to you
> because that's the default.
>
> -- ramki
>
> On Wed, Apr 11, 2012 at 7:24 AM, Dawid Weiss<dawid.weiss at gmail.com>  wrote:
>
>> Hi there,
>>
>> We are measuring certain aspects of our algorithm with a test suite
>> which attempts to run close to the physical heap's maximum size. We do
>> it by doing a form of binary search based on the size of data passed
>> to the algorithm, where the lower bound is always "succeeded without
>> an OOM" and the upper bound is "threw an OOM". This works nice but
>> occasionally we experience an effective deadlock in which full GCs are
>> repeatedly invoked, the application makes progress but overall it's
>> several orders of magnitude slower than usual (hours instead of
>> seconds).
>>
>> GC logs look like this:
>>
>> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0220371
>> secs][CMS: 69016K->69014K(81152K), 0.1370901 secs]
>> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1591765 secs] [Times: user=0.20 sys=0.00, real=0.16 secs]
>> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0170617
>> secs][CMS: 69016K->69014K(81152K), 0.1235417 secs]
>> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1406872 secs] [Times: user=0.18 sys=0.00, real=0.14 secs]
>> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0191855
>> secs][CMS: 69016K->69014K(81152K), 0.1296462 secs]
>> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1488816 secs] [Times: user=0.18 sys=0.00, real=0.15 secs]
>> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0232418
>> secs][CMS: 69016K->69014K(81152K), 0.1300695 secs]
>> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1533590 secs] [Times: user=0.20 sys=0.00, real=0.15 secs]
>> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0190998
>> secs][CMS: 69016K->69014K(81152K), 0.1319668 secs]
>> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1511436 secs] [Times: user=0.18 sys=0.00, real=0.15 secs]
>> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0168998
>> secs][CMS: 69017K->69015K(81152K), 0.1359254 secs]
>> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1528776 secs] [Times: user=0.20 sys=0.01, real=0.16 secs]
>> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0214651
>> secs][CMS: 69017K->69015K(81152K), 0.1209494 secs]
>> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1424941 secs] [Times: user=0.18 sys=0.00, real=0.14 secs]
>> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0200897
>> secs][CMS: 69017K->69015K(81152K), 0.1244227 secs]
>> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1445654 secs] [Times: user=0.18 sys=0.00, real=0.14 secs]
>> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0203377
>> secs][CMS: 69017K->69015K(81152K), 0.1353857 secs]
>> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1558016 secs] [Times: user=0.19 sys=0.00, real=0.16 secs]
>> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0201951
>> secs][CMS: 69017K->69015K(81152K), 0.1289750 secs]
>> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1492306 secs] [Times: user=0.19 sys=0.00, real=0.15 secs]
>> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0206677
>> secs][CMS: 69017K->69015K(81152K), 0.1280734 secs]
>> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1488114 secs] [Times: user=0.18 sys=0.00, real=0.15 secs]
>> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0150225
>> secs][CMS: 69017K->69015K(81152K), 0.1301056 secs]
>> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1451940 secs] [Times: user=0.19 sys=0.01, real=0.14 secs]
>>
>> The heap limit is intentionally left smallish and the routine where
>> this happens is in fact computational (it does allocate sporadic
>> objects but never releases them until finished).
>>
>> This behavior is easy to reproduce on my Mac (quad core),
>>
>> java version "1.6.0_31"
>> Java(TM) SE Runtime Environment (build 1.6.0_31-b04-414-11M3626)
>> Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01-414, mixed mode)
>>
>> I read a bit about the nature of "promotion failed" and it's clear to
>> me (or so I think) why this is happening here. My questions are:
>>
>> 1) why isn't OOM being triggered by gc overhead limit? It should
>> easily be falling within the default thresholds,
>> 2) is there anything one can do to prevent situation like the above
>> (other than manually fiddling with limits)?
>>
>> Thanks in advance for any pointers and feedback,
>>
>> Dawid
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120411/73ae2f7f/attachment.html 


More information about the hotspot-gc-use mailing list