jmx-dev RFR 8020467: Inconsistency between usage.getUsed() and isUsageThresholdExceeded() with CMS Old Gen pool

Bengt Rutisson bengt.rutisson at oracle.com
Wed Oct 23 06:15:44 PDT 2013


On 2013-10-23 14:55, Jaroslav Bachorik wrote:
> Hi Bengt,
>
> On 23.10.2013 14:40, Bengt Rutisson wrote:
>>
>> Hi Jaroslav,
>>
>> A couple of questions.
>>
>> I don't understand why this is a CMS only problem? Why don't the other
>> collectors have the same issue? I guess it is less likely that the other
>> collectors start (or complete) a GC without a lot of allocation going
>> on. But at least G1 should have the same problem.
>
> I don't really know. If there are other pools that can have the "used" 
> value 0 before a GC happens then yes, they are susceptible to the same 
> problem.

I think all the "old" pools can have 0 used before a GC happens. But 
except for CMS and G1 it is less likely that a GC happens unless you do 
allocations. As long as they keep the 0 used the test will pass. So, my 
guess is that to be on the safe side all "old" pools should make sure to 
do a full GC first.

>
>>
>> Also, from the problem description in the CR I would have guessed that
>> you want the GC to happen between these two statements:
>>
>> p.setUsageThreshold(1);
>> MemoryUsage u = p.getUsage();
>
> This is all but a heuristic here. The problem lies in the fact that it 
> is not possible to retrieve the pool usage and the "threshold 
> exceeded" flag consistently in one atomic operation. I might get 
> usable data from the first call and then I don't need to force GC.

Right. This is why I think you want to avoid a GC after you have fetched 
getUsage() but before you do isUsageThresholdExceeded(). With your 
suggested patch you are explicitly inserting a GC at that point. To me 
this sounds like the opposite of what you want to do.

>
>>
>> Now you have added the GC just after these statements. I thought that
>> was what caused the problem. That you read the usage data at one point,
>> then a GC happens and you compare the cached usage
>> data to the new data that p.isUsageThresholdExceeded() will fetch.
>>
>> Looking at the promoteToOldGen() method I assume that the intent is that
>> the code should be using the return value. So my guess is that this 
>> code:
>>
>>    94         if (p.getName().equals("CMS Old Gen")) {
>>    95             promoteToOldGen(p, u);
>>    96         }
>>
>> Should be:
>>
>>    94         if (p.getName().equals("CMS Old Gen")) {
>>    95             u = promoteToOldGen(p, u);
>>    96         }
>
> Indeed. It was meant to re-fetch the usage after GC.

OK. Good. With this code I think it should work. Now you make sure to 
get the GC before you do getUsage().

>
>>
>> With that, I think it might work. But I still don't understand why this
>> is only a CMS problem.
>>
>> One more question about the promoteToOldGen() and forceGC() methods. I
>> don't really know much about how the different beans work, but are we
>> sure that the MemoryPoolMXBeans and GarbageCollectorMXBeans use the same
>> pool names? That is, are you sure that forceGC() actually will do 
>> anything?
>
> They use the pool names as reported by the GC infrastracture so they 
> should be the same.

Ok.

>
>>
>> As for just doing a System.gc() to force a GC I think you can rely on
>> that System.gc() does a full GC in Hotspot unless someone sets
>> -XX:+DisableExplicitGC on the command line. Considering that you are
>> relying on Hotspot specifc names for pools I don't think it is a
>> limitation to the test to rely on the Hotspot implementatoin of
>> System.gc().
>
> Good to know. I guess I could simplify the change and just call 
> System.gc(), after all.

Yes, I think that' simpler.

Thanks,
Bengt

>
> Thanks,
>
> -JB-
>
>>
>> Thanks,
>> Bengt
>>
>>
>>
>>
>> On 2013-10-23 10:18, Staffan Larsen wrote:
>>> On 23 okt 2013, at 10:12, Jaroslav Bachorik
>>> <jaroslav.bachorik at oracle.com> wrote:
>>>
>>>> On 23.10.2013 10:08, Staffan Larsen wrote:
>>>>> I think you can simplify the logic for forcing a GC to just a simple
>>>>> call to "System.gc();". AFAIK System.gc() will cause a full
>>>>> collection to happen for all collectors.
>>>> Hm, will it now? I had the impression that it was just hinting the GC
>>>> system to perform GC but it might decide to ignore it. I need to be
>>>> sure that the GC was performed before continuing - otherwise I might
>>>> get inconsistent data again.
>>> According to the spec it's just a hint, but I think the implementation
>>> happens to be a force. But better safe than sorry. :)
>>>
>>> /Staffan
>>>
>>>> -JB-
>>>>
>>>>> /Staffan
>>>>>
>>>>> On 23 okt 2013, at 10:02, Jaroslav Bachorik
>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>
>>>>>> On 22.10.2013 22:04, Mandy Chung wrote:
>>>>>>> Hi Jaroslav,
>>>>>>>
>>>>>>> On 10/22/13 6:47 AM, Jaroslav Bachorik wrote:
>>>>>>>> Please, review the following test fix:
>>>>>>>>
>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8020467
>>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8020467/webrev.01
>>>>>>>>
>>>>>>> Have you considered to force GC when getUsed() == 0 regardless of
>>>>>>> which
>>>>>>> memory pool it is?  This will avoid special casing for CMS old 
>>>>>>> gen in
>>>>>>> the test and will handle similar issue in the future for a 
>>>>>>> different
>>>>>>> collector implementation.  To make the test reliable, the test 
>>>>>>> should
>>>>>>> still pass if the memory pool has no object in it (G1 survivor 
>>>>>>> space
>>>>>>> case?).
>>>>>> Hi Mandy,
>>>>>>
>>>>>> I don't know whether GC will help for other pools - but I can
>>>>>> enable it for all pools - it should not hurt.
>>>>>>
>>>>>> The test should pass even with on object in the monitored pool
>>>>>> since the pool should not report an exceeded threshold.
>>>>>>
>>>>>> -JB-
>>>>>>
>>>>>>> Mandy
>>>>>>>
>>>>>>>> The test tries to make sure that the "pool usage threshold" 
>>>>>>>> trigger
>>>>>>>> and the reported pool memory usage are not contradicting each 
>>>>>>>> other.
>>>>>>>> The problem is that it is not possible to get the "pool usage
>>>>>>>> threshold exceeded" flag and the pool memory usage atomicly in
>>>>>>>> regard
>>>>>>>> to the GC. Specifically, when "CMS Old Gen" pool is examined 
>>>>>>>> and the
>>>>>>>> usage is retrieved before a GC promotes some objects to the old 
>>>>>>>> gen
>>>>>>>> but the usage threshold is checked after the GC has promoted some
>>>>>>>> instance into the old gen the test will fail.
>>>>>>>>
>>>>>>>> The patch makes sure that there are some instances promoted in 
>>>>>>>> "CMS
>>>>>>>> Old Gen" before checking the "pool usage threshold" to get
>>>>>>>> semi-consistent view.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> -JB-
>>
>



More information about the serviceability-dev mailing list