RFR: 8229189: Improve JFR leak profiler tracing to deal with discontiguous heaps
David Holmes
david.holmes at oracle.com
Sun Sep 1 22:42:51 UTC 2019
Adding JFR dev list. I don't know if JFR folk hang out on runtime-dev.
David
On 31/08/2019 12:25 am, Erik Österlund wrote:
> Hi,
>
> Ping. This patch is a prerequisite for pushing the now reviewed
> 8224815: Remove non-GC uses of CollectedHeap::is_in_reserved()
>
> And that is needed for the ZGC mac port.
>
> Thanks,
> /Erik
>
> On 2019-08-07 10:18, Erik Österlund wrote:
>> Hi,
>>
>> The JFR leak profiler has marking bit maps that assume a contiguous
>> Java heap. ZGC is discontiguous, and therefore does not work with JFR.
>> If one tried to use the JFR leak profiler with ZGC, it would allocate
>> a bit map for the multi-terabyte "reserved region", even though
>> perhaps only 64 MB is used, spread out across this address space. That
>> is one of the reasons the leak profiler is turned off for ZGC.
>>
>> In order to enable leak profiler support on ZGC, the tracing must also
>> use the Access API instead of raw oop loads. But that is outside the
>> scope of this RFE; here we deal only with the discontiguous address
>> space problem.
>>
>> My solution involves implementing a segmented bit map that makes no
>> assumptions about the layout of the Java heap. Given an address, it
>> uses the high order bits of the pointer to look up a bitmap fragment
>> in a hash table. If there is no such fragment, one is created and
>> added to the table. The low order bits (shifted by
>> LogMinObjAlignmentInBytes) are used to find the bit in the bit map
>> for marking an object in the traversal.
>>
>> In order to not cause regressions in the speed, some optimizations
>> have been made:
>>
>> 1) The table uses & instead of % to look up buckets, which works
>> because the table size is always kept a power of two.
>> 2) The hot paths get inlined.
>> 3) There is a cache for the last fragment, as there is a high
>> probability that two subsequent bit accesses for two objects found
>> during tracing in the heap do not cross the fragment granule (64 MB
>> heap memory) boundary. This is something G1 exploits for the cross
>> region check, and the same general idea is applied here. The code
>> also asks first if a bit is marked and then marks it, as two calls.
>> The cache + inlining allows the compiler to look up the fragment only
>> once for the two operations.
>> 4) Keep the table sparse.
>>
>> As a result, no regressions that are outside of the noise can be
>> noticed with this new more GC-agnostic approach.
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8229189
>>
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8229189/webrev.00/
>>
>> Thanks,
>> /Erik
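
For readers following the thread, the segmented bit map described in the
quoted message can be sketched roughly as below. This is an illustration
only, not code from the webrev: the class name, method names and constants
are hypothetical, 8-byte object alignment is assumed in place of
LogMinObjAlignmentInBytes, and std::unordered_map stands in for the
power-of-two hash table that the patch indexes with &.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical sketch of a segmented mark bit map with a one-entry cache.
class SegmentedBitMap {
  static constexpr unsigned kLogObjAlignment = 3;  // assumed 8-byte alignment
  static constexpr unsigned kLogGranule = 26;      // 64 MB of heap per fragment
  static constexpr std::size_t kBitsPerFragment =
      std::size_t(1) << (kLogGranule - kLogObjAlignment);

  using Fragment = std::vector<bool>;

  // High order address bits -> fragment. Only granules that are actually
  // touched get a fragment, so a sparse heap costs little memory.
  std::unordered_map<std::uint64_t, Fragment> _table;

  // Cache of the most recently used fragment.
  std::uint64_t _cached_key = 0;
  Fragment* _cached = nullptr;

  Fragment& fragment_for(std::uint64_t addr) {
    const std::uint64_t key = addr >> kLogGranule;
    if (_cached != nullptr && _cached_key == key) {
      return *_cached;  // fast path: same granule as the previous access
    }
    Fragment& f = _table[key];  // inserts an empty fragment if missing
    if (f.empty()) {
      f.resize(kBitsPerFragment);
    }
    // References to unordered_map elements stay valid across rehashing,
    // so caching the pointer is safe.
    _cached_key = key;
    _cached = &f;
    return f;
  }

  // Low order address bits, shifted by the object alignment, pick the bit.
  static std::size_t bit_index(std::uint64_t addr) {
    assert((addr & ((std::uint64_t(1) << kLogObjAlignment) - 1)) == 0);
    const std::uint64_t mask = (std::uint64_t(1) << kLogGranule) - 1;
    return static_cast<std::size_t>((addr & mask) >> kLogObjAlignment);
  }

 public:
  bool is_marked(std::uint64_t addr) {
    return fragment_for(addr)[bit_index(addr)];
  }

  void mark(std::uint64_t addr) {
    fragment_for(addr)[bit_index(addr)] = true;
  }

  std::size_t fragment_count() const { return _table.size(); }
};
```

A tracer would typically call is_marked(addr) and then mark(addr) on the
same address; with the cache, the second call takes the fast path and does
not touch the hash table again.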