RFR 8215523: jstat reports incorrect values for OU for CMS GC

Poonam Parhar poonam.bajaj at oracle.com
Fri Jun 21 20:30:05 UTC 2019


Updating the subject line with the updated bug synopsis.

And also, much thanks to Thomas Schatzl for his remarkable help in 
tweaking and refining this fix.

regards,
Poonam

On 6/21/19 12:21 PM, Poonam Parhar wrote:
> Hello,
>
> Please review the changes for the fix of:
>
> Bug 8215523 <https://bugs.openjdk.java.net/browse/JDK-8215523>: jstat 
> reports incorrect values for OU for CMS GC
> Webrev: http://cr.openjdk.java.net/~poonam/8215523/webrev.00/
>
> Problem:
> With CMS, at certain points, 'OU' (Old usage) values reported in the 
> jstat output increase first and then drop down. This is wrong because 
> the 'old usage' drops down without any old generation collection.
>
> Cause:
> For the 'used' values of different memory spaces, jstat uses a 
> periodic task to read these values and update the corresponding jstat 
> counters.
>
> For CMS generation, there is a sampler that periodically updates the 
> 'used' counter:
>
>  class GenerationUsedHelper : public PerfLongSampleHelper {
> 103 private:
> 104 Generation* _gen;
> 105
> 106 public:
> 107 GenerationUsedHelper(Generation* g) : _gen(g) { }
> 108
> 109 inline jlong take_sample() {
> 110 return _gen->used();
> 111 }
> 112 };
>
> The above used() call in turn makes a call to 
> CompactibleFreeListSpace::used().
>
> 376 size_t CompactibleFreeListSpace::used() const {
> 377 return capacity() - free();
> 378 }
> 379
> 380 size_t CompactibleFreeListSpace::free() const {
> 381 // "MT-safe, but not MT-precise"(TM), if you will: i.e.
> 382 // if you do this while the structures are in flux you
> 383 // may get an approximate answer only; for instance
> 384 // because there is concurrent allocation either
> 385 // directly by mutators or for promotion during a GC.
> 386 // It's "MT-safe", however, in the sense that you are guaranteed
> 387 // not to crash and burn, for instance, because of walking
> 388 // pointers that could disappear as you were walking them.
> 389 // The approximation is because the various components
> 390 // that are read below are not read atomically (and
> 391 // further the computation of totalSizeInIndexedFreeLists()
> 392 // is itself a non-atomic computation. The normal use of
> 393 // this is during a resize operation at the end of GC
> 394 // and at that time you are guaranteed to get the
> 395 // correct actual value. However, for instance, this is
> 396 // also read completely asynchronously by the "perf-sampler"
> 397 // that supports jvmstat, and you are apt to see the values
> 398 // flicker in such cases.
> 399 assert(_dictionary != NULL, "No _dictionary?");
> 400 return (_dictionary->total_chunk_size(DEBUG_ONLY(freelistLock())) +
> 401 totalSizeInIndexedFreeLists() +
> 402 _smallLinearAllocBlock._word_size) * HeapWordSize;
> 403 }
>
> As stated above in the code comments, free() can return an imprecise 
> approximate value when allocations are happening in the 
> CompactibleFreeListSpace at the same time, e.g. during direct 
> allocations or promotions.
>
> Solution:
> For CMS, maintain an internal variable for the space usage. This 
> internal variable gets updated only at points where the CMS space 
> usage computation is guaranteed to be accurate. jstat and other memory 
> usage monitoring tools get access to this stable value of used space.
>
> Testing: jdk-tier1,jdk-tier2,jdk-tier3,hs-tier1,hs-tier2,hs-tier3
>
> Thanks,
> Poonam
>
>




More information about the hotspot-gc-dev mailing list