RFR (XXS): 7110173: GCNotifier::pushNotification publishes stale data.

Frederic Parain frederic.parain at oracle.com
Thu Nov 17 13:43:27 UTC 2011


Hi Mandy,

It would have been nice to have notifications providing
detailed information about CMS different phases, but it
doesn't seem easy to do right now. I did the experiments
with CMS using JConsole, and the result is not really
satisfactory:
    - several notifications are sent for each CMS cycle
      but the notification lacks the information to identify
      the CMS phase associated with the notification (worse
      than that, they all claim they notify the "end of a major
      GC cycle" which is certainly not true for CMS.
    - according to the TraceCMSMemoryManagerStats constructor
      code, some information reported by the notification
      are inaccurate because the stats recorded for the current
      phase are still in _current_gc_stat while the notification
      code is using _last_gc_stat (so basically some notifications
      contain data from the previous GC cycle instead of data
      from the current cycle)

So I agree that moving the pushNotification to inside the "if
(countCollection)" statement would be a correct fix for CMS.

+1 for the renaming of the 'countCollection' variable.

Fred

On 11/16/11 7:23 PM, Mandy Chung wrote:
> John, Frederic,
>
> I looked at the implementation and I still think pushNotification
> should only be called *once* at the end of the GC cycle but not
> every MemoryManager::gc_end call (might be better to rename
> gc_end to something else).
>
> MemoryManager::gc_end is called at the destructor of
> TraceMemoryManagerStats and also TraceCMSMemoryManagerStats (
> a subclass of TraceMemoryManagerStats). For the stop-the-world
> collectors, there is a clear begin and end of GC cycle. But
> for CMS, it has 3 stop-the-world phases. That's the reason why
> it has a list of flags to specify what to record since at different
> CMS phase, it will record different stats. At the end of entire
> CMS cycle, it will increment the collector count and gc end time
> etc. So the GC notification should be pushed only at the end of
> one collection cycle (for CMS case - would be SWEEPING phase).
>
> Below is the code of the TraceCMSMemoryManagerStats constructor
> that would make it clear:
>
> switch (phase) {
> case CMSCollector::InitialMarking:
> initialize(true /* fullGC */ ,
> cause /* cause of the GC */,
> true /* recordGCBeginTime */,
> true /* recordPreGCUsage */,
> false /* recordPeakUsage */,
> false /* recordPostGCusage */,
> true /* recordAccumulatedGCTime */,
> false /* recordGCEndTime */,
> false /* countCollection */ );
> break;
>
> case CMSCollector::FinalMarking:
> initialize(true /* fullGC */ ,
> cause /* cause of the GC */,
> false /* recordGCBeginTime */,
> false /* recordPreGCUsage */,
> false /* recordPeakUsage */,
> false /* recordPostGCusage */,
> true /* recordAccumulatedGCTime */,
> false /* recordGCEndTime */,
> false /* countCollection */ );
> break;
>
> case CMSCollector::Sweeping:
> initialize(true /* fullGC */ ,
> cause /* cause of the GC */,
> false /* recordGCBeginTime */,
> false /* recordPreGCUsage */,
> true /* recordPeakUsage */,
> true /* recordPostGCusage */,
> false /* recordAccumulatedGCTime */,
> true /* recordGCEndTime */,
> true /* countCollection */ );
> break;
>
> default:
> ShouldNotReachHere();
> }
>
> I believe moving the pushNotification to inside the "if (countCollection)"
> statement would be a correct fix. It might be better to rename
> "countCollection" variable to something like "gcCompleted".
>
> I suggest to run
> jdk/test/com/sun/management/GarbageCollectorMXBean/GarbageCollectionNotificationContentTest.java
>
>
> on CMS and see how many notifications are sent and what data it
> includes. You may need to modify it to print any other necessary
> information. I can help further next week once I finish a few
> things that I committed to complete this week.
>
> Mandy
>
>
> On 11/16/11 2:00 AM, Frederic Parain wrote:
>> Hi,
>>
>> The reason why the memory usage data are bundled with the notification
>> is to prevent data lost. If the notification doesn't provide the
>> memory usage data, the client has to perform a getLastGcInfo() call
>> to retrieve the data each time it receives a notification. But several
>> GC cycles might occur between the time the notification is received and
>> the time getLastGcInfo() is invoked. And there's no API to get memory
>> usage data older than the one from the last GC cycle. With the
>> notification including the memory usage data, the client is ensured
>> to get data for all GC cycles.
>>
>> That said, the notification contains two memory usage datasets, one
>> created at the beginning of the GC cycle and one created at the end
>> of the cycle. There's no reason to condition the notification to
>> the recording of the post GC cycle data recording. It's reasonable
>> to send a notification for a GC which only collects the pre GC data
>> for instance. So I think it's ok to move the notification code out
>> of the recordPostGCUsage==true condition as long as the values in
>> the non recorded memory usage data set can clearly be identified
>> as being invalid. Right now they are initialized to zero, so it
>> looks ok to me.
>>
>> Moving the notification code after the "if(countCollection)" block
>> looks good too.
>>
>> However, I'm not an official reviewer, so you'll need more approvals
>> for this changeset.
>>
>> Fred
>>
>> On 11/16/11 03:26 AM, Mandy Chung wrote:
>>> I'm including Frederic and serviceability-dev since this is part of the
>>> instrumentation done for serviceability.
>>>
>>> On 11/15/2011 4:59 PM, David Holmes wrote:
>>>> Hi John,
>>>>
>>>> On 16/11/2011 3:33 AM, John Cuthbertson wrote:
>>>>> Thanks for the review. I did and I did that consciously. As you say -
>>>>> with the current code the listener would only be notified if the
>>>>> _recordPostGCUsage field of the TraceMemoryManagerStats object is true
>>>>> (which it is by default) and none of the TraceMemoryManagerStats
>>>>> instances created by the collectors change this. But there's
>>>>> nothing to
>>>>> stop a collector creating a TraceMemoryManagerStats object with
>>>>> _recordPostGCUsage false and_recordGCEndTime true. In this case
>>>>> wouldn't
>>>>> we still want to notify the listener? I may be wrong (in which case
>>>>> I'll
>>>>> add the extra guard) but I believe that recording (some) data and
>>>>> notification should be two independent operations.
>>>>
>>>> It depends on what data the listener is expecting to receive. If you
>>>> push the notification when there hasn't been an update does that make
>>>> sense? I don't know what the spec is for this.
>>>>
>>>
>>> The GC notifier and pushNotification call was added as this changeset:
>>> http://hg.openjdk.java.net/jdk8/jdk8/hotspot/rev/78542e2b5e35
>>>
>>> The spec for this is:
>>> http://download.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/GarbageCollectionNotificationInfo.html
>>>
>>>
>>>
>>>> Anyway I just wanted to flag the change in potential behaviour. If
>>>> no-one from GC has any issue with this then it's fine by me.
>>>>
>>>
>>> I think David is right that it should check recordPostGCUsage==true to
>>> push a notification. The notification is sent with the last GC
>>> statistics. If my memory serves correctly, if recordPostGCUsage is
>>> false, it doesn't update the last GC usage but sending a notification
>>> when recordPostGCUsage is false seems to be incorrect. Also, the
>>> recording was specifically added for CMS since CMS has separate phases
>>> that it needs to record different things.
>>>
>>> Frederic would be the best person to comment on this. Frederic - it
>>> seems to me that the pushNotification can be moved to the end of gc_end
>>> function but within the if (countCollection) statement. Sorry I don't
>>> have time to check the details out. It'd be good if you can help.
>>>
>>> Thanks
>>> Mandy
>>>
>>>> David
>>>> -----
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> JohnC
>>>>>
>>>>> On 11/14/11 22:35, David Holmes wrote:
>>>>>> Hi John,
>>>>>>
>>>>>> On 15/11/2011 4:46 AM, John Cuthbertson wrote:
>>>>>>> Can I have a couple of volnteers to review the fix for this CR? The
>>>>>>> webrev can be found at:
>>>>>>> http://cr.openjdk.java.net/~johnc/7110173/webrev.0/.
>>>>>>
>>>>>> In the original code the pushNotification is conditional on
>>>>>> recordPostGCUsage, but you've removed that guard by moving the code.
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> The issue here was that the routine GCNotifier::pushNotification(),
>>>>>>> which uses GC data held in GCMemoryManager::_last_gc_stat, was being
>>>>>>> called before the values in GCMemoryManager::_last_gc_stat were
>>>>>>> being
>>>>>>> populated for the current GC. As a result the JVM could pass
>>>>>>> uninitialized or stale data to a listener. The fix is to move the
>>>>>>> call
>>>>>>> to GCNotifier::pushNotification() after the code that populates
>>>>>>> GCMemoryManager::_last_gc_stat. I also modified the GCStatInfo
>>>>>>> constructor to fully initialize instances of that class.
>>>>>>>
>>>>>>> Testing: The supplied test case on Windows, a crafted test case on
>>>>>>> Solaris, and the nsk GC tests.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> JohnC
>>>>>
>>
>

-- 
Frederic Parain - Oracle
Grenoble Engineering Center - France
Phone: +33 4 76 18 81 17
Email: Frederic.Parain at oracle.com




More information about the hotspot-gc-dev mailing list