CMS CMF

Fri Oct 19 21:32:17 UTC 2012

Hi Ramki,

Thanks for the response... I don't know (for certain) which flags have been set as all I've been handed is this anonymous GC log but everything you mention makes sense in the context of how things have been reported in the log. I wouldn't be so harsh as to call this a bug but it's a point where some improvements maybe found.. so I'll go file a bug ;-)

Regards,
Kirk

On 2012-10-19, at 11:20 PM, Srinivas Ramakrishna <ysr1729 at gmail.com> wrote:

> You probably have CMSScavengeBeforeRemark Enabled. The scavenge fails when initiated from the remark, but i am guessing that the failure is ignored and the remark ploughs on (slowly!). Then a full gc is attempted, but by then a sweep has started and for historical reasons it completes before the full gc can proceed. This points out several malinteractions in the case of CMSScavengeBeforeRemark that had not been fully considered by me when i designed it. In general, having more finely interruptible phases would have been better for CMS (as comments in the code state).... I think a simple fix is to not attempt the remark if the scavenge failed and just bail to a stop-world GC at that point. I think the change would not be too difficult. Please file a bug.
> 
> -- ramki
> 
> On Fri, Oct 19, 2012 at 2:07 PM, Kirk Pepperdine <kirk at kodewerk.com> wrote:
> Hi all,
> 
> I've got this GC log fragment that I'm puzzling over.
> 
> 2833.756: [GC[YG occupancy: 1600484 K (1887488 K)]2833.756: [GC 2833.757: [ParNew (promotion failed): 1600484K->1600484K(1887488K), 2.2601630 secs] 19739658K->20165811K(20761856K), 2.2605170 secs] [Times: user=10.47 sys=0.11, real=2.26 secs]
> 2836.017: [Rescan (parallel) , 2.2167670 secs]2838.234: [weak refs processing, 0.5441620 secs]2838.778: [class unloading, 0.0623720 secs]2838.841: [scrub symbol & string tables, 0.0079860 secs] [1 CMS-remark: 18565327K(18874368K)] 20165811K(20761856K), 5.1880910 secs] [Times: user=42.02 sys=0.25, real=5.19 secs]
> 2838.945: [CMS-concurrent-sweep-start]
> 2849.966: [Full GC 2849.966: [CMS2862.696: [CMS-concurrent-sweep: 19.533/23.751 secs] [Times: user=42.67 sys=4.29, real=23.75 secs]
>  (concurrent mode failure): 17074814K->4842456K(18874368K), 41.9426500 secs] 18962302K->4842456K(20761856K), [CMS Perm : 65645K->65265K(109292K)], 41.9429740 secs]
> 
> So, a ParNew is signaled @ 2833.757 which leads to a promotion failed. I would expect that @ 2833.757 + 2.26 or 2836.017 we'd see the concurrent mode failure being reported. However @ that time we instead see a Rescan/remark. The CMS finally comes @ 2849.966, more than 15 seconds after the allocation failure has been reported. Is there a change here in that the final phases of CMS were allowed to run rather than reverting to a Full GC or is there something else going on here that I've missed on.
> 
> Regards,
> Kirk
> 
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20121019/94e04952/attachment.htm>