CMSWaitDuration unstable behavior

Kirk Pepperdine kirk at kodewerk.com
Thu Aug 9 07:22:42 UTC 2012


Hi,

+1 on Ramki's comment. IME, forcing a scavenge prior to the remark has always been beneficial and for the same underlying reasons I don't see why it wouldn't benefit the initial mark.

Regards,
Kirk

On 2012-08-09, at 1:45 AM, Jon Masamitsu <jon.masamitsu at oracle.com> wrote:

> 
> 
> On 8/8/2012 11:56 AM, Srinivas Ramakrishna wrote:
>> ...
>> 
>> PS: Jon, if Michal takes the approach of CMSScavengeBeforeInitialMark, I'd
>> say it would be useful to the broader community (not
>> just ICMS users) if that were integrated into the main-line code, as it
>> would be a via-media for CMS scaling in the absence of the
>> piggybacking RFE which is really the best solution here.
> 
> Agreed.
> 
> 
> Jon
> 
>> thanks!
>> -- ramki
>> 
>> On Wed, Aug 8, 2012 at 8:11 AM, Jon Masamitsu <jon.masamitsu at oracle.com> wrote:
>> 
>>> Michal,
>>> 
>>> The engineer with the most experience on CMS left Oracle
>>> and I suspect this is not going to get fixed in the way you want.
>>> 
>>> I've created CR 7189971 to capture your comments and it will be
>>> reviewed along with other RFE's for CMS but I would not be
>>> optimistic.
>>> 
>>> Since you are customizing your own VM, did you consider
>>> explicitly invoking a young collection before the initial mark
>>> the way that it is done for the remark phase with the flag
>>> 
>>> CMSScavengeBeforeRemark
>>> 
>>> Jon
>>> 
>>> 
>>> On 8/7/2012 6:16 AM, Frajt, Michal wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> We have been using the incremental CMS collector for many years. We have a
>>>> distributed application framework based on the subscribe-unsubscribe model
>>>> where data unsubscriptions are handled by the application layer simply
>>>> dropping its strong reference to the distributed data. The underlying
>>>> application framework layer uses weak references to track the data
>>>> requirements of the application layer. We keep the old generation
>>>> processed permanently (incrementally) so that the weak references are
>>>> released and reported within a short period of time (minutes).
>>>> 
>>>> Unfortunately the incremental mode lacks support for CMSWaitDuration,
>>>> which places the initial mark phase right after a young space
>>>> collection. After some new gen sizing optimization we ended up in a
>>>> situation where the new gen is more or less big enough to hold most of
>>>> the live objects, with only a few promotions to the old gen. The
>>>> incremental CMS is then started every minute at a random moment, with
>>>> the new gen largely full of garbage. The initial mark takes 20-50 times
>>>> longer than a single new gen collection (40ms new gen, initial mark 1100ms).
>>>> 
>>>> We decided to customize OpenJDK 6 by adding CMSWaitDuration support to
>>>> the incremental mode. We took the same approach as the wait_on_cms_lock
>>>> method does with the CGC_lock object. Unfortunately we realized that the
>>>> CGC_lock mutex is also notified in situations other than the end of a
>>>> young space collection. These unrelated notifications come from
>>>> invocations of the desynchronize method, and they cause
>>>> wait_on_cms_lock to return earlier than required. The initial mark
>>>> phase is then started before the young space collection even though the
>>>> specified wait duration is long enough. We fixed it by waiting again if
>>>> the GenCollectedHeap::heap()->total_collections() counter has not
>>>> changed after the CGC_lock->wait method returns, but for no longer than
>>>> CMSWaitDuration in total. The initial mark is then always
>>>> placed (if CMSWaitDuration is long enough) after the young space
>>>> collection. Every initial mark phase now takes no longer than 17ms
>>>> (previously 1100ms).
>>>> 
>>>> We also tested the CMSWaitDuration behavior in the normal CMS mode. We
>>>> specified -XX:+UseCMSInitiatingOccupancyOnly and
>>>> -XX:CMSInitiatingOccupancyFraction=10 to force the CMS to run permanently
>>>> (shouldConcurrentCollect should be returning true). The CMS initial mark
>>>> is often started without waiting for the young space collection, which
>>>> makes the initial marking run 20-50 times longer. We consider this
>>>> unstable behavior of the CMSWaitDuration implementation, related to the
>>>> problem of wait-notify signaling on the CGC_lock object. We disabled
>>>> explicit GC invocation (-XX:+DisableExplicitGC) to be sure there is no
>>>> other reason to start the CMS initial mark phase before the young space
>>>> collection.
>>>> 
>>>> Is there any plan to get the CMSWaitDuration supported in the incremental
>>>> mode and/or get it fixed in the normal mode?
>>>> 
>>>> Thanks,
>>>> Michal Frajt
>>>> 
>>>> 
>>>> 




More information about the hotspot-gc-dev mailing list