ParNew promotion failed, no expected OOM.
Jon Masamitsu
jon.masamitsu at oracle.com
Wed Apr 11 11:41:35 PDT 2012
Dawid,
I haven't look at your numbers but the OOM due to the
GC overhead is thrown very conservatively. In addition to
spending too much time doing GC, the policy looks at how
much free space is available in the heap. It may be that
there is enough free space in the heap such that the policy
does not want to trigger an OOM.
You see the "promotion failure" message when the GC
policy thinks there is enough space in the old gen to
support a young collection. It's supposed to be the
exception case and I wonder a bit why you see
"promotion failure" messages repeatedly instead of
just seeing "Full collections" but I can see how the
policy could get stuck in a situation where it keeps
thinking there is enough space in the old gen but
in the end there isn't. Anyway those are basically
Full collections.
Jon
On 04/11/12 07:24, Dawid Weiss wrote:
> Hi there,
>
> We are measuring certain aspects of our algorithm with a test suite
> which attempts to run close to the physical heap's maximum size. We do
> it by doing a form of binary search based on the size of data passed
> to the algorithm, where the lower bound is always "succeeded without
> an OOM" and the upper bound is "threw an OOM". This works nice but
> occasionally we experience an effective deadlock in which full GCs are
> repeatedly invoked, the application makes progress but overall it's
> several orders of magnitude slower than usual (hours instead of
> seconds).
>
> GC logs look like this:
>
> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0220371
> secs][CMS: 69016K->69014K(81152K), 0.1370901 secs]
> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
> 0.1591765 secs] [Times: user=0.20 sys=0.00, real=0.16 secs]
> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0170617
> secs][CMS: 69016K->69014K(81152K), 0.1235417 secs]
> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
> 0.1406872 secs] [Times: user=0.18 sys=0.00, real=0.14 secs]
> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0191855
> secs][CMS: 69016K->69014K(81152K), 0.1296462 secs]
> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
> 0.1488816 secs] [Times: user=0.18 sys=0.00, real=0.15 secs]
> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0232418
> secs][CMS: 69016K->69014K(81152K), 0.1300695 secs]
> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
> 0.1533590 secs] [Times: user=0.20 sys=0.00, real=0.15 secs]
> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0190998
> secs][CMS: 69016K->69014K(81152K), 0.1319668 secs]
> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
> 0.1511436 secs] [Times: user=0.18 sys=0.00, real=0.15 secs]
> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0168998
> secs][CMS: 69017K->69015K(81152K), 0.1359254 secs]
> 86038K->86038K(100288K), [CMS Perm : 21285K->21285K(35724K)],
> 0.1528776 secs] [Times: user=0.20 sys=0.01, real=0.16 secs]
> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0214651
> secs][CMS: 69017K->69015K(81152K), 0.1209494 secs]
> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
> 0.1424941 secs] [Times: user=0.18 sys=0.00, real=0.14 secs]
> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0200897
> secs][CMS: 69017K->69015K(81152K), 0.1244227 secs]
> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
> 0.1445654 secs] [Times: user=0.18 sys=0.00, real=0.14 secs]
> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0203377
> secs][CMS: 69017K->69015K(81152K), 0.1353857 secs]
> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
> 0.1558016 secs] [Times: user=0.19 sys=0.00, real=0.16 secs]
> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0201951
> secs][CMS: 69017K->69015K(81152K), 0.1289750 secs]
> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
> 0.1492306 secs] [Times: user=0.19 sys=0.00, real=0.15 secs]
> [GC [ParNew (promotion failed): 17023K->18700K(19136K), 0.0206677
> secs][CMS: 69017K->69015K(81152K), 0.1280734 secs]
> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
> 0.1488114 secs] [Times: user=0.18 sys=0.00, real=0.15 secs]
> [GC [ParNew (promotion failed): 17023K->18905K(19136K), 0.0150225
> secs][CMS: 69017K->69015K(81152K), 0.1301056 secs]
> 86039K->86039K(100288K), [CMS Perm : 21285K->21285K(35724K)],
> 0.1451940 secs] [Times: user=0.19 sys=0.01, real=0.14 secs]
>
> The heap limit is intentionally left smallish and the routine where
> this happens is in fact computational (it does allocate sporadic
> objects but never releases them until finished).
>
> This behavior is easy to reproduce on my Mac (quad core),
>
> java version "1.6.0_31"
> Java(TM) SE Runtime Environment (build 1.6.0_31-b04-414-11M3626)
> Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01-414, mixed mode)
>
> I read a bit about the nature of "promotion failed" and it's clear to
> me (or so I think) why this is happening here. My questions are:
>
> 1) why isn't OOM being triggered by gc overhead limit? It should
> easily be falling within the default thresholds,
> 2) is there anything one can do to prevent situation like the above
> (other than manually fiddling with limits)?
>
> Thanks in advance for any pointers and feedback,
>
> Dawid
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
More information about the hotspot-gc-use
mailing list