RFR (XXS): 7110173: GCNotifier::pushNotification publishes stale data.
Tony Printezis
tony.printezis at oracle.com
Thu Nov 17 14:09:19 UTC 2011
Fred,
I haven't been following this thread closely, but I do have a question
(see below).
On 11/17/2011 8:43 AM, Frederic Parain wrote:
> Hi Mandy,
>
> It would have been nice to have notifications providing
> detailed information about CMS different phases, but it
> doesn't seem easy to do right now. I did the experiments
> with CMS using JConsole, and the result is not really
> satisfactory:
> - several notifications are sent for each CMS cycle
> but the notification lacks the information to identify
> the CMS phase associated with the notification (worse
> than that, they all claim they notify the "end of a major
> GC cycle" which is certainly not true for CMS.
Are you referring to notifications at the end of a concurrent cycle
(i.e., at the end of sweeping)? Depends on how you look at it, but a
concurrent cycle is a major cycle given that it collects the entire
heap. In fact, IIRC, the a concurrent cycle should increase the full gc
count.
Tony
> - according to the TraceCMSMemoryManagerStats constructor
> code, some information reported by the notification
> are inaccurate because the stats recorded for the current
> phase are still in _current_gc_stat while the notification
> code is using _last_gc_stat (so basically some notifications
> contain data from the previous GC cycle instead of data
> from the current cycle)
>
> So I agree that moving the pushNotification to inside the "if
> (countCollection)" statement would be a correct fix for CMS.
>
> +1 for the renaming of the 'countCollection' variable.
>
> Fred
>
> On 11/16/11 7:23 PM, Mandy Chung wrote:
>> John, Frederic,
>>
>> I looked at the implementation and I still think pushNotification
>> should only be called *once* at the end of the GC cycle but not
>> every MemoryManager::gc_end call (might be better to rename
>> gc_end to something else).
>>
>> MemoryManager::gc_end is called at the destructor of
>> TraceMemoryManagerStats and also TraceCMSMemoryManagerStats (
>> a subclass of TraceMemoryManagerStats). For the stop-the-world
>> collectors, there is a clear begin and end of GC cycle. But
>> for CMS, it has 3 stop-the-world phases. That's the reason why
>> it has a list of flags to specify what to record since at different
>> CMS phase, it will record different stats. At the end of entire
>> CMS cycle, it will increment the collector count and gc end time
>> etc. So the GC notification should be pushed only at the end of
>> one collection cycle (for CMS case - would be SWEEPING phase).
>>
>> Below is the code of the TraceCMSMemoryManagerStats constructor
>> that would make it clear:
>>
>> switch (phase) {
>> case CMSCollector::InitialMarking:
>> initialize(true /* fullGC */ ,
>> cause /* cause of the GC */,
>> true /* recordGCBeginTime */,
>> true /* recordPreGCUsage */,
>> false /* recordPeakUsage */,
>> false /* recordPostGCusage */,
>> true /* recordAccumulatedGCTime */,
>> false /* recordGCEndTime */,
>> false /* countCollection */ );
>> break;
>>
>> case CMSCollector::FinalMarking:
>> initialize(true /* fullGC */ ,
>> cause /* cause of the GC */,
>> false /* recordGCBeginTime */,
>> false /* recordPreGCUsage */,
>> false /* recordPeakUsage */,
>> false /* recordPostGCusage */,
>> true /* recordAccumulatedGCTime */,
>> false /* recordGCEndTime */,
>> false /* countCollection */ );
>> break;
>>
>> case CMSCollector::Sweeping:
>> initialize(true /* fullGC */ ,
>> cause /* cause of the GC */,
>> false /* recordGCBeginTime */,
>> false /* recordPreGCUsage */,
>> true /* recordPeakUsage */,
>> true /* recordPostGCusage */,
>> false /* recordAccumulatedGCTime */,
>> true /* recordGCEndTime */,
>> true /* countCollection */ );
>> break;
>>
>> default:
>> ShouldNotReachHere();
>> }
>>
>> I believe moving the pushNotification to inside the "if
>> (countCollection)"
>> statement would be a correct fix. It might be better to rename
>> "countCollection" variable to something like "gcCompleted".
>>
>> I suggest to run
>> jdk/test/com/sun/management/GarbageCollectorMXBean/GarbageCollectionNotificationContentTest.java
>>
>>
>>
>> on CMS and see how many notifications are sent and what data it
>> includes. You may need to modify it to print any other necessary
>> information. I can help further next week once I finish a few
>> things that I committed to complete this week.
>>
>> Mandy
>>
>>
>> On 11/16/11 2:00 AM, Frederic Parain wrote:
>>> Hi,
>>>
>>> The reason why the memory usage data are bundled with the notification
>>> is to prevent data lost. If the notification doesn't provide the
>>> memory usage data, the client has to perform a getLastGcInfo() call
>>> to retrieve the data each time it receives a notification. But several
>>> GC cycles might occur between the time the notification is received and
>>> the time getLastGcInfo() is invoked. And there's no API to get memory
>>> usage data older than the one from the last GC cycle. With the
>>> notification including the memory usage data, the client is ensured
>>> to get data for all GC cycles.
>>>
>>> That said, the notification contains two memory usage datasets, one
>>> created at the beginning of the GC cycle and one created at the end
>>> of the cycle. There's no reason to condition the notification to
>>> the recording of the post GC cycle data recording. It's reasonable
>>> to send a notification for a GC which only collects the pre GC data
>>> for instance. So I think it's ok to move the notification code out
>>> of the recordPostGCUsage==true condition as long as the values in
>>> the non recorded memory usage data set can clearly be identified
>>> as being invalid. Right now they are initialized to zero, so it
>>> looks ok to me.
>>>
>>> Moving the notification code after the "if(countCollection)" block
>>> looks good too.
>>>
>>> However, I'm not an official reviewer, so you'll need more approvals
>>> for this changeset.
>>>
>>> Fred
>>>
>>> On 11/16/11 03:26 AM, Mandy Chung wrote:
>>>> I'm including Frederic and serviceability-dev since this is part of
>>>> the
>>>> instrumentation done for serviceability.
>>>>
>>>> On 11/15/2011 4:59 PM, David Holmes wrote:
>>>>> Hi John,
>>>>>
>>>>> On 16/11/2011 3:33 AM, John Cuthbertson wrote:
>>>>>> Thanks for the review. I did and I did that consciously. As you
>>>>>> say -
>>>>>> with the current code the listener would only be notified if the
>>>>>> _recordPostGCUsage field of the TraceMemoryManagerStats object is
>>>>>> true
>>>>>> (which it is by default) and none of the TraceMemoryManagerStats
>>>>>> instances created by the collectors change this. But there's
>>>>>> nothing to
>>>>>> stop a collector creating a TraceMemoryManagerStats object with
>>>>>> _recordPostGCUsage false and_recordGCEndTime true. In this case
>>>>>> wouldn't
>>>>>> we still want to notify the listener? I may be wrong (in which case
>>>>>> I'll
>>>>>> add the extra guard) but I believe that recording (some) data and
>>>>>> notification should be two independent operations.
>>>>>
>>>>> It depends on what data the listener is expecting to receive. If you
>>>>> push the notification when there hasn't been an update does that make
>>>>> sense? I don't know what the spec is for this.
>>>>>
>>>>
>>>> The GC notifier and pushNotification call was added as this changeset:
>>>> http://hg.openjdk.java.net/jdk8/jdk8/hotspot/rev/78542e2b5e35
>>>>
>>>> The spec for this is:
>>>> http://download.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/GarbageCollectionNotificationInfo.html
>>>>
>>>>
>>>>
>>>>
>>>>> Anyway I just wanted to flag the change in potential behaviour. If
>>>>> no-one from GC has any issue with this then it's fine by me.
>>>>>
>>>>
>>>> I think David is right that it should check recordPostGCUsage==true to
>>>> push a notification. The notification is sent with the last GC
>>>> statistics. If my memory serves correctly, if recordPostGCUsage is
>>>> false, it doesn't update the last GC usage but sending a notification
>>>> when recordPostGCUsage is false seems to be incorrect. Also, the
>>>> recording was specifically added for CMS since CMS has separate phases
>>>> that it needs to record different things.
>>>>
>>>> Frederic would be the best person to comment on this. Frederic - it
>>>> seems to me that the pushNotification can be moved to the end of
>>>> gc_end
>>>> function but within the if (countCollection) statement. Sorry I don't
>>>> have time to check the details out. It'd be good if you can help.
>>>>
>>>> Thanks
>>>> Mandy
>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> JohnC
>>>>>>
>>>>>> On 11/14/11 22:35, David Holmes wrote:
>>>>>>> Hi John,
>>>>>>>
>>>>>>> On 15/11/2011 4:46 AM, John Cuthbertson wrote:
>>>>>>>> Can I have a couple of volnteers to review the fix for this CR?
>>>>>>>> The
>>>>>>>> webrev can be found at:
>>>>>>>> http://cr.openjdk.java.net/~johnc/7110173/webrev.0/.
>>>>>>>
>>>>>>> In the original code the pushNotification is conditional on
>>>>>>> recordPostGCUsage, but you've removed that guard by moving the
>>>>>>> code.
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> The issue here was that the routine
>>>>>>>> GCNotifier::pushNotification(),
>>>>>>>> which uses GC data held in GCMemoryManager::_last_gc_stat, was
>>>>>>>> being
>>>>>>>> called before the values in GCMemoryManager::_last_gc_stat were
>>>>>>>> being
>>>>>>>> populated for the current GC. As a result the JVM could pass
>>>>>>>> uninitialized or stale data to a listener. The fix is to move the
>>>>>>>> call
>>>>>>>> to GCNotifier::pushNotification() after the code that populates
>>>>>>>> GCMemoryManager::_last_gc_stat. I also modified the GCStatInfo
>>>>>>>> constructor to fully initialize instances of that class.
>>>>>>>>
>>>>>>>> Testing: The supplied test case on Windows, a crafted test case on
>>>>>>>> Solaris, and the nsk GC tests.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> JohnC
>>>>>>
>>>
>>
>
More information about the hotspot-gc-dev
mailing list