jmx-dev RFR 8020467: Inconsistency between usage.getUsed() and isUsageThresholdExceeded() with CMS Old Gen pool
Bengt Rutisson
bengt.rutisson at oracle.com
Wed Oct 23 07:43:30 PDT 2013
Hi Jaroslav,
On 2013-10-23 16:32, Jaroslav Bachorik wrote:
> On 23.10.2013 15:15, Bengt Rutisson wrote:
>>
>> On 2013-10-23 14:55, Jaroslav Bachorik wrote:
>>> Hi Bengt,
>>>
>>> On 23.10.2013 14:40, Bengt Rutisson wrote:
>>>>
>>>> Hi Jaroslav,
>>>>
>>>> A couple of questions.
>>>>
>>>> I don't understand why this is a CMS only problem? Why don't the other
>>>> collectors have the same issue? I guess it is less likely that the
>>>> other
>>>> collectors start (or complete) a GC without a lot of allocation going
>>>> on. But at least G1 should have the same problem.
>>>
>>> I don't really know. If there are other pools that can have the "used"
>>> value 0 before a GC happens then yes, they are susceptible to the same
>>> problem.
>>
>> I think all the "old" pools can have 0 used before a GC happens. But
>> except for CMS and G1 it is less likely that a GC happens unless you do
>> allocations. As long as they keep the 0 used the test will pass. So, my
>> guess is that to be on the safe side all "old" pools should make sure to
>> do a full GC first.
>>
>>>
>>>>
>>>> Also, from the problem description in the CR I would have guessed that
>>>> you want the GC to happen between these two statements:
>>>>
>>>> p.setUsageThreshold(1);
>>>> MemoryUsage u = p.getUsage();
>>>
>>> This is all but a heuristic here. The problem lies in the fact that it
>>> is not possible to retrieve the pool usage and the "threshold
>>> exceeded" flag consistently in one atomic operation. I might get
>>> usable data from the first call and then I don't need to force GC.
>>
>> Right. This is why I think you want to avoid a GC after you have fetched
>> getUsage() but before you do isUsageThresholdExceeded(). With your
>> suggested patch you are explicitly inserting a GC at that point. To me
>> this sounds like the opposite of what you want to do.
>
> I've updated the patch. The GC is called even before the first attempt
> to get the pool memory usage and System.gc() is used to perform GC (no
> MXBean checks). This should simplify the change a bit.
>
> http://cr.openjdk.java.net/~jbachorik/8020467/webrev.02
Thanks for doing this update so quickly!
Have you been able to verify that this change still fixes the issue? I
think it should, but it would be good if we could verify it.
This code worries me a little bit:
114 private static MemoryUsage getUsage(MemoryPoolMXBean p) {
115 MemoryUsage u = null;
116 do {
117 System.gc();
118 u = p.getUsage();
119 } while (u.getUsed() == 0);
120 return u;
121 }
I think one call to System.gc() should be enough. And if it is not, if
we still get 0 as used, I think that another System.gc() call will just
render the same result. Thus, I'm a bit worried that this will be an
endless loop.
Since the test actually handles the case where used is 0, I think it is
enough to just do a single call to System.gc() and then get the usage data.
Thanks,
Bengt
>
> -JB-
>
>>
>>>
>>>>
>>>> Now you have added the GC just after these statements. I thought that
>>>> was what caused the problem. That you read the usage data at one
>>>> point,
>>>> then a GC happens and you compare the cached usage
>>>> data to the new data that p.isUsageThresholdExceeded() will fetch.
>>>>
>>>> Looking at the promoteToOldGen() method I assume that the intent is
>>>> that
>>>> the code should be using the return value. So my guess is that this
>>>> code:
>>>>
>>>> 94 if (p.getName().equals("CMS Old Gen")) {
>>>> 95 promoteToOldGen(p, u);
>>>> 96 }
>>>>
>>>> Should be:
>>>>
>>>> 94 if (p.getName().equals("CMS Old Gen")) {
>>>> 95 u = promoteToOldGen(p, u);
>>>> 96 }
>>>
>>> Indeed. It was meant to re-fetch the usage after GC.
>>
>> OK. Good. With this code I think it should work. Now you make sure to
>> get the GC before you do getUsage().
>>
>>>
>>>>
>>>> With that, I think it might work. But I still don't understand why
>>>> this
>>>> is only a CMS problem.
>>>>
>>>> One more question about the promoteToOldGen() and forceGC() methods. I
>>>> don't really know much about how the different beans work, but are we
>>>> sure that the MemoryPoolMXBeans and GarbageCollectorMXBeans use the
>>>> same
>>>> pool names? That is, are you sure that forceGC() actually will do
>>>> anything?
>>>
>>> They use the pool names as reported by the GC infrastracture so they
>>> should be the same.
>>
>> Ok.
>>
>>>
>>>>
>>>> As for just doing a System.gc() to force a GC I think you can rely on
>>>> that System.gc() does a full GC in Hotspot unless someone sets
>>>> -XX:+DisableExplicitGC on the command line. Considering that you are
>>>> relying on Hotspot specifc names for pools I don't think it is a
>>>> limitation to the test to rely on the Hotspot implementatoin of
>>>> System.gc().
>>>
>>> Good to know. I guess I could simplify the change and just call
>>> System.gc(), after all.
>>
>> Yes, I think that' simpler.
>>
>> Thanks,
>> Bengt
>>
>>>
>>> Thanks,
>>>
>>> -JB-
>>>
>>>>
>>>> Thanks,
>>>> Bengt
>>>>
>>>>
>>>>
>>>>
>>>> On 2013-10-23 10:18, Staffan Larsen wrote:
>>>>> On 23 okt 2013, at 10:12, Jaroslav Bachorik
>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>
>>>>>> On 23.10.2013 10:08, Staffan Larsen wrote:
>>>>>>> I think you can simplify the logic for forcing a GC to just a
>>>>>>> simple
>>>>>>> call to "System.gc();". AFAIK System.gc() will cause a full
>>>>>>> collection to happen for all collectors.
>>>>>> Hm, will it now? I had the impression that it was just hinting
>>>>>> the GC
>>>>>> system to perform GC but it might decide to ignore it. I need to be
>>>>>> sure that the GC was performed before continuing - otherwise I might
>>>>>> get inconsistent data again.
>>>>> According to the spec it's just a hint, but I think the
>>>>> implementation
>>>>> happens to be a force. But better safe than sorry. :)
>>>>>
>>>>> /Staffan
>>>>>
>>>>>> -JB-
>>>>>>
>>>>>>> /Staffan
>>>>>>>
>>>>>>> On 23 okt 2013, at 10:02, Jaroslav Bachorik
>>>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>>>
>>>>>>>> On 22.10.2013 22:04, Mandy Chung wrote:
>>>>>>>>> Hi Jaroslav,
>>>>>>>>>
>>>>>>>>> On 10/22/13 6:47 AM, Jaroslav Bachorik wrote:
>>>>>>>>>> Please, review the following test fix:
>>>>>>>>>>
>>>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8020467
>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8020467/webrev.01
>>>>>>>>>>
>>>>>>>>> Have you considered to force GC when getUsed() == 0 regardless of
>>>>>>>>> which
>>>>>>>>> memory pool it is? This will avoid special casing for CMS old
>>>>>>>>> gen in
>>>>>>>>> the test and will handle similar issue in the future for a
>>>>>>>>> different
>>>>>>>>> collector implementation. To make the test reliable, the test
>>>>>>>>> should
>>>>>>>>> still pass if the memory pool has no object in it (G1 survivor
>>>>>>>>> space
>>>>>>>>> case?).
>>>>>>>> Hi Mandy,
>>>>>>>>
>>>>>>>> I don't know whether GC will help for other pools - but I can
>>>>>>>> enable it for all pools - it should not hurt.
>>>>>>>>
>>>>>>>> The test should pass even with on object in the monitored pool
>>>>>>>> since the pool should not report an exceeded threshold.
>>>>>>>>
>>>>>>>> -JB-
>>>>>>>>
>>>>>>>>> Mandy
>>>>>>>>>
>>>>>>>>>> The test tries to make sure that the "pool usage threshold"
>>>>>>>>>> trigger
>>>>>>>>>> and the reported pool memory usage are not contradicting each
>>>>>>>>>> other.
>>>>>>>>>> The problem is that it is not possible to get the "pool usage
>>>>>>>>>> threshold exceeded" flag and the pool memory usage atomicly in
>>>>>>>>>> regard
>>>>>>>>>> to the GC. Specifically, when "CMS Old Gen" pool is examined
>>>>>>>>>> and the
>>>>>>>>>> usage is retrieved before a GC promotes some objects to the old
>>>>>>>>>> gen
>>>>>>>>>> but the usage threshold is checked after the GC has promoted
>>>>>>>>>> some
>>>>>>>>>> instance into the old gen the test will fail.
>>>>>>>>>>
>>>>>>>>>> The patch makes sure that there are some instances promoted in
>>>>>>>>>> "CMS
>>>>>>>>>> Old Gen" before checking the "pool usage threshold" to get
>>>>>>>>>> semi-consistent view.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> -JB-
>>>>
>>>
>>
>
More information about the serviceability-dev
mailing list