ParNew promotion failed, no expected OOM.
Dawid Weiss
dawid.weiss at gmail.com
Wed Apr 11 11:53:00 PDT 2012
> I haven't looked at your numbers but the OOM due to the
> GC overhead is thrown very conservatively. In addition to
I realize this, but this seems like a good example of when the GC overhead
limit should fire... or so I think. There doesn't seem to be any space left
at all -- 69016K->69014K(81152K).
I realize these are full GCs because that's what -verbose:gc reports
(I included the details because I asked for them, but otherwise what
you see is just full GCs and no progress from the application itself).
What's puzzling to me is that this routine only allocates memory (hard
refs, there is nothing to collect), yet the garbage collector _does_
drop around 2KB on every full GC... Also, this routine is normally
blazing fast and should either complete or OOM very quickly, but
instead it stalls as if 99% of the time were spent doing full collections.
I really cannot explain this.
Is there any way to see which objects get _dropped_ on full GC runs?
I'm curious what these dropped objects are.
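
(If there is a way, one untested sketch of what I could try: take an
all-objects heap dump and a live-objects heap dump back to back through the
HotSpot diagnostic MXBean -- the live dump forces a full GC first, so
whatever shows up in the first dump but not in the second is what that
collection dropped. The class and file names below are made up.)

import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

// Untested sketch: dump all objects, then only live objects (the live dump
// triggers a full GC); diffing the two histograms in jhat/MAT should show
// roughly what the collection dropped.
public class HeapDiff {
    public static void main(String[] args) throws Exception {
        HotSpotDiagnosticMXBean hotspot = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        hotspot.dumpHeap("all-objects.hprof", false); // everything, dead included
        hotspot.dumpHeap("live-objects.hprof", true); // live only; forces a full GC
    }
}
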
Dawid
> spending too much time doing GC, the policy looks at how
> much free space is available in the heap. It may be that
> there is enough free space in the heap such that the policy
> does not want to trigger an OOM.
>
> You see the "promotion failure" message when the GC
> policy thinks there is enough space in the old gen to
> support a young collection. It's supposed to be the
> exception case and I wonder a bit why you see
> "promotion failure" messages repeatedly instead of
> just seeing "Full collections" but I can see how the
> policy could get stuck in a situation where it keeps
> thinking there is enough space in the old gen but
> in the end there isn't. Anyway, those are basically
> full collections.
>
> Jon
>
> On 04/11/12 07:24, Dawid Weiss wrote:
>> Hi there,
>>
>> We are measuring certain aspects of our algorithm with a test suite
>> which attempts to run close to the physical heap's maximum size. We do
>> it by doing a form of binary search based on the size of data passed
>> to the algorithm, where the lower bound is always "succeeded without
>> an OOM" and the upper bound is "threw an OOM". This works nice but
>> occasionally we experience an effective deadlock in which full GCs are
>> repeatedly invoked, the application makes progress but overall it's
>> several orders of magnitude slower than usual (hours instead of
>> seconds).
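>>
>> (For illustration, a much simplified sketch of what the harness does;
>> runAlgorithm() and the initial value of hi are made-up placeholders:)
>>
>> long lo = 0;                   // largest input size known to succeed
>> long hi = 1000000;             // some initial size known to OOM (made up)
>> while (hi - lo > 1) {
>>     long mid = lo + (hi - lo) / 2;
>>     try {
>>         runAlgorithm(mid);     // run the algorithm on mid units of data
>>         lo = mid;              // succeeded without an OOM
>>     } catch (OutOfMemoryError e) {
>>         hi = mid;              // threw an OOM
>>     }
>> }
>> // lo now holds the largest input size that fits in the heap.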
>>
>> GC logs look like this:
>>
>> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0220371
>> secs][CMS: 69016K->69014K(81152K), 0.1370901 secs]
>> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1591765 secs] [Times: user=0.20 sys=0.00, real=0.16 secs]
>> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0170617
>> secs][CMS: 69016K->69014K(81152K), 0.1235417 secs]
>> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1406872 secs] [Times: user=0.18 sys=0.00, real=0.14 secs]
>> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0191855
>> secs][CMS: 69016K->69014K(81152K), 0.1296462 secs]
>> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1488816 secs] [Times: user=0.18 sys=0.00, real=0.15 secs]
>> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0232418
>> secs][CMS: 69016K->69014K(81152K), 0.1300695 secs]
>> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1533590 secs] [Times: user=0.20 sys=0.00, real=0.15 secs]
>> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0190998
>> secs][CMS: 69016K->69014K(81152K), 0.1319668 secs]
>> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1511436 secs] [Times: user=0.18 sys=0.00, real=0.15 secs]
>> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0168998
>> secs][CMS: 69017K->69015K(81152K), 0.1359254 secs]
>> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1528776 secs] [Times: user=0.20 sys=0.01, real=0.16 secs]
>> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0214651
>> secs][CMS: 69017K->69015K(81152K), 0.1209494 secs]
>> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1424941 secs] [Times: user=0.18 sys=0.00, real=0.14 secs]
>> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0200897
>> secs][CMS: 69017K->69015K(81152K), 0.1244227 secs]
>> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1445654 secs] [Times: user=0.18 sys=0.00, real=0.14 secs]
>> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0203377
>> secs][CMS: 69017K->69015K(81152K), 0.1353857 secs]
>> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1558016 secs] [Times: user=0.19 sys=0.00, real=0.16 secs]
>> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0201951
>> secs][CMS: 69017K->69015K(81152K), 0.1289750 secs]
>> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1492306 secs] [Times: user=0.19 sys=0.00, real=0.15 secs]
>> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0206677
>> secs][CMS: 69017K->69015K(81152K), 0.1280734 secs]
>> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1488114 secs] [Times: user=0.18 sys=0.00, real=0.15 secs]
>> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0150225
>> secs][CMS: 69017K->69015K(81152K), 0.1301056 secs]
>> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
>> 0.1451940 secs] [Times: user=0.19 sys=0.01, real=0.14 secs]
>>
>> The heap limit is intentionally kept smallish, and the routine where
>> this happens is in fact computational (it does allocate objects
>> sporadically but never releases them until it finishes).
>>
>> This behavior is easy to reproduce on my Mac (quad core):
>>
>> java version "1.6.0_31"
>> Java(TM) SE Runtime Environment (build 1.6.0_31-b04-414-11M3626)
>> Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01-414, mixed mode)
>>
>> I read a bit about the nature of "promotion failed" and it's clear to
>> me (or so I think) why this is happening here. My questions are:
>>
>> 1) why isn't an OOM being triggered by the GC overhead limit? This case
>> should easily fall within the default thresholds,
>> 2) is there anything one can do to prevent situations like the above
>> (other than manually fiddling with the limits)?
>>
>> Thanks in advance for any pointers and feedback,
>>
>> Dawid
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use