RFR: 8229189: Improve JFR leak profiler tracing to deal with discontiguous heaps

Markus Gronlund markus.gronlund at oracle.com
Mon Sep 2 14:12:07 UTC 2019


Hi Erik,

Looks good.

Thank you for the detailed explanation.

Markus

-----Original Message-----
From: Erik Österlund 
Sent: 7 August 2019 10:19
To: hotspot-runtime-dev at openjdk.java.net
Subject: RFR: 8229189: Improve JFR leak profiler tracing to deal with discontiguous heaps

Hi,

The JFR leak profiler has marking bit maps that assume a contiguous Java heap. The ZGC heap is discontiguous, so these bit maps do not work with it. If one tried to use the JFR leak profiler with ZGC, it would allocate a bit map for the multi-terabyte "reserved region", even though perhaps only 64 MB is used, spread out across this address space. That is one of the reasons the leak profiler is turned off for ZGC.

In order to enable leak profiler support on ZGC, the tracing must also use the Access API instead of raw oop loads. But that is outside the scope of this RFE; here we deal only with the discontiguous address space problem.

My solution involves implementing a segmented bit map that makes no assumptions about the layout of the Java heap. Given an address, it uses the high-order bits of the pointer to look up a bit map fragment in a hash table. If there is no such fragment, one is created and added to the table. The low-order bits (shifted by LogMinObjAlignmentInBytes) are used to find the bit in the fragment for marking an object in the traversal.
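The lookup described above can be sketched roughly as follows. This is only an illustration, not the code in the webrev: the class name, the constants, and the use of std::unordered_map are my assumptions, and it assumes a 64-bit platform with 8-byte object alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical constants for illustration; the actual patch may use other values.
constexpr unsigned kLogGranuleBytes = 26;    // one fragment covers 64 MB of heap
constexpr unsigned kLogMinObjAlignment = 3;  // assumes 8-byte object alignment
constexpr std::size_t kBitsPerFragment =
    std::size_t{1} << (kLogGranuleBytes - kLogMinObjAlignment);

class SegmentedBitMap {
 public:
  // True if the bit for addr is set. A missing fragment means "not marked".
  bool is_marked(std::uintptr_t addr) const {
    auto it = fragments_.find(addr >> kLogGranuleBytes);
    if (it == fragments_.end()) return false;
    const std::size_t bit = bit_index(addr);
    return (((*it->second)[bit / 64] >> (bit % 64)) & 1u) != 0;
  }

  // Sets the bit for addr, creating the covering fragment on first use.
  void mark(std::uintptr_t addr) {
    std::unique_ptr<Words>& frag = fragments_[addr >> kLogGranuleBytes];
    if (!frag) {
      frag = std::make_unique<Words>(kBitsPerFragment / 64, std::uint64_t{0});
    }
    const std::size_t bit = bit_index(addr);
    (*frag)[bit / 64] |= std::uint64_t{1} << (bit % 64);
  }

  std::size_t fragment_count() const { return fragments_.size(); }

 private:
  using Words = std::vector<std::uint64_t>;

  // Low-order bits of the address, shifted by the object alignment.
  static std::size_t bit_index(std::uintptr_t addr) {
    const std::uintptr_t mask = (std::uintptr_t{1} << kLogGranuleBytes) - 1;
    return static_cast<std::size_t>((addr & mask) >> kLogMinObjAlignment);
  }

  // Key: high-order bits of the address (the granule number).
  std::unordered_map<std::uintptr_t, std::unique_ptr<Words>> fragments_;
};
```

With these assumed values each fragment holds 2^23 bits (1 MB), so memory use grows with the number of 64 MB granules actually touched rather than with the size of the reserved address range.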

To avoid speed regressions, some optimizations have been made:

1) The table uses & instead of % to look up buckets, which works because the table size is always kept at a power of two.
2) The hot paths get inlined.
3) There is a cache for the last fragment, as there is a high probability that two subsequent bit accesses for two objects found during tracing in the heap do not cross the boundary of the configured fragment granule (64 MB of heap memory). This is something G1 exploits for the cross region check, and the same general idea is applied here. The code also asks first if a bit is marked and then marks it, as two calls. The cache + inlining allows the compiler to look up the fragment only once for the two operations.
4) The table is kept sparse.
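To make optimizations 1) and 3) concrete, here is a rough sketch of a chained table with masked bucket lookup and a last-fragment cache. It is not the webrev code: all names, the fixed table size, and the hit counter (there only to make the cache observable) are assumptions for illustration, again assuming a 64-bit platform and 8-byte alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

constexpr unsigned kLogGranule = 26;   // hypothetical: 64 MB granule, as in the text
constexpr unsigned kLogAlignment = 3;  // assumes 8-byte object alignment

struct Fragment {
  std::uintptr_t granule = 0;        // high-order address bits for this fragment
  std::vector<std::uint64_t> bits;   // one bit per aligned address in the granule
  std::unique_ptr<Fragment> next;    // chaining within a bucket
};

class FragmentTable {
 public:
  explicit FragmentTable(std::size_t size) : buckets_(size) {
    assert(size != 0 && (size & (size - 1)) == 0);  // power of two required
  }

  // Both calls go through fragment_for(); back-to-back calls for the same
  // granule are served by the cache. Note that in this sketch is_marked()
  // creates the fragment if it is missing.
  bool is_marked(std::uintptr_t addr) {
    const std::size_t bit = bit_index(addr);
    return ((fragment_for(addr).bits[bit / 64] >> (bit % 64)) & 1u) != 0;
  }
  void mark(std::uintptr_t addr) {
    const std::size_t bit = bit_index(addr);
    fragment_for(addr).bits[bit / 64] |= std::uint64_t{1} << (bit % 64);
  }
  std::size_t cache_hits() const { return cache_hits_; }

 private:
  static std::size_t bit_index(std::uintptr_t addr) {
    const std::uintptr_t mask = (std::uintptr_t{1} << kLogGranule) - 1;
    return static_cast<std::size_t>((addr & mask) >> kLogAlignment);
  }

  Fragment& fragment_for(std::uintptr_t addr) {
    const std::uintptr_t granule = addr >> kLogGranule;
    if (last_ != nullptr && last_->granule == granule) {  // last-fragment cache
      ++cache_hits_;
      return *last_;
    }
    // Mask instead of modulo; valid because the size is a power of two.
    std::unique_ptr<Fragment>* slot = &buckets_[granule & (buckets_.size() - 1)];
    while (*slot != nullptr && (*slot)->granule != granule) {
      slot = &(*slot)->next;
    }
    if (*slot == nullptr) {  // populate the table on first use
      *slot = std::make_unique<Fragment>();
      (*slot)->granule = granule;
      (*slot)->bits.assign(std::size_t{1} << (kLogGranule - kLogAlignment - 6), 0);
    }
    last_ = slot->get();
    return *last_;
  }

  std::vector<std::unique_ptr<Fragment>> buckets_;
  Fragment* last_ = nullptr;  // stays valid: fragments are never moved or freed
  std::size_t cache_hits_ = 0;
};
```

In the sketch the is_marked()/mark() pair on the same address only walks the bucket chain once; the second call is answered from the cached fragment pointer.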

As a result, no regressions that are outside of the noise can be noticed with this new more GC-agnostic approach.

Bug:
https://bugs.openjdk.java.net/browse/JDK-8229189

Webrev:
http://cr.openjdk.java.net/~eosterlund/8229189/webrev.00/

Thanks,
/Erik

