RFR: 8312132: Add tracking of multiple address spaces in NMT [v5]

Fri Mar 22 14:40:34 UTC 2024

On Wed, 20 Mar 2024 17:06:06 GMT, Johan Sjölen <jsjolen at openjdk.org> wrote:

>> Hi,
>> 
>> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>> 
>> ## `MemoryFileTracker`
>> 
>> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>> 
>> ```c++
>>     static MemoryFile* make_device(const char* descriptive_name);
>>     static void free_device(MemoryFile* device);
>> 
>>     static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>>                                 MEMFLAGS flag, const NativeCallStack& stack);
>>     static void free_memory(MemoryFile* device, size_t offset, size_t size);
>> 
>> 
>> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>> 
>> ```c++
>> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
>> }
>> void ZNMT::commit(zoffset offset, size_t size) {
>>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
>> }
>> void ZNMT::uncommit(zoffset offset, size_t size) {
>>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
>> }
>> 
>> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> 
>> 
>> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. 
>> 
>> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>> 
>> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance bo...
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Include os.inline.hpp

Hi @jdksjolen,

Nice work! The general direction is good.

I've glanced over parts of the change. Unfortunately, I won't be able to provide more feedback until after Easter because of a lack of time.

Cheers, Thomas

src/hotspot/share/nmt/nmtNativeCallStackStorage.hpp line 30:

> 28: #include "utilities/growableArray.hpp"
> 29: #include "utilities/nativeCallStack.hpp"
> 30: 

Here, I would love it if we had smaller than pointer-sized callstack IDs. 

The simplest way would be to copy callstacks into a growable area, and use their numerical indices as ID. We can get by with 16 bits, or 32 bits if we think 64K callstacks are not enough (they should be enough).

Those can then be combined very efficiently with MEMFLAG and VMAState in the VMATree nodes.

The hashmap for reverse id lookup could be a standard hashmap of key=callstack, value=ID.

As a future possible improvement, that could then replace the MallocSiteTable which does almost the same thing. (only have to avoid using malloc then, because of recursivities).

src/hotspot/share/nmt/nmtNativeCallStackStorage.hpp line 84:

> 82:       return *_stack;
> 83:     }
> 84:   };

What is the point of the StackIndex class?

src/hotspot/share/nmt/nmtTreap.hpp line 45:

> 43: // the tree which is O(log n) so we are safe from stack overflow.
> 44: 
> 45: // TreapNode has LEQ nodes on the left, GT nodes on the right.

Simplification: why do we need a separate template parameter for the comparison? Why not just use K's operators < > == ?

src/hotspot/share/nmt/nmtTreap.hpp line 48:

> 46: template<typename K, typename V, int(*CMP)(K,K)>
> 47: class TreapNode {
> 48:   template<typename InnerK, typename InnerV, int(*CMPP)(InnerK,InnerK)>

Could we do without the following friend declarations? Also, does the TreapCHeap friend not need to be declared as template? (At least my CDT complains about this)

src/hotspot/share/nmt/nmtTreap.hpp line 54:

> 52:   uint64_t _priority;
> 53:   K _key;
> 54:   V _value;

Should both key and value be const? After all, you don't want to modify either after the node was added to the tree.

src/hotspot/share/nmt/vmatree.hpp line 36:

> 34: 
> 35: 
> 36: class VMATree {

If we do keep a separate comparator template argument (see my remark before, maybe we don't): do we really need this as external function in a C++ file? I think we want this inlined, no?

src/hotspot/share/nmt/vmatree.hpp line 46:

> 44: 
> 45:   // Each node has some stack and a flag associated with it.
> 46:   struct Metadata {

all members const?

src/hotspot/share/nmt/vmatree.hpp line 63:

> 61:     }
> 62:   };
> 63: 

In my original proposal, a `MappingState` (https://github.com/tstuefe/jdk/blob/6be830cd2e90a009effb016fbda2e92e1fca8247/src/hotspot/share/nmt/vmaTree.cpp#L106C7-L114) is the combination of ( MEMFLAGS + VMAstate ).

A `MappingStateChange` https://github.com/tstuefe/jdk/blob/6be830cd2e90a009effb016fbda2e92e1fca8247/src/hotspot/share/nmt/vmaTree.cpp#L38-L65 is the combination of two `MappingState`. Its "is_noop()" method compares the combination of both ( MEMFLAGS + VMAstate ) tuples.

This `State` here is neither. It contains only the incoming VMAState, and carries the full information only for the outgoing state. Its ` is_noop` only compares VMAState. This renders `is_noop()` pretty useless since it alone is not sufficient when checking if areas for mergeability. Consequently, you then tree-search the adjacent node whenever is_noop is used ( e.g. line 161, `if (is_noop(stA) && Metadata::equals(stA.metadata, leqA_n->val().metadata)) { ` .

I would suggest following my proposal here and carrying the full information for both incoming and outgoing states. Advantages would be:
- it removes the need for tree searches whenever we check whether neighboring areas can be folded.
- it would also make the code easier to read and arguably more robust
- it is arguably more logical - semantically, there is no difference between VMAState and the rest of the state information, all of them are part of the total region state that determines if regions are mergeable.

If you are worried about data size: all the information we need can still be encoded in a 64-bit value, as suggested above. And compared with a single cmpq. 16 bits for MEMFLAGS, 2 bits for VMAState, and 32 bits are more than enough as a callstack id (in fact, 16 bits should be enough).

src/hotspot/share/nmt/vmatree.hpp line 312:

> 310:   }
> 311: 
> 312:   SummaryDiff commit_mapping(size_t from, size_t sz, Metadata& metadata) {

So, the `Merge` template parameter only tries to solve the problem that callers don't specify MEMFLAGS on commit or uncommit, right?

We can
- rethink that; require all callers of commit/uncommit to provide MEMFLAGS too. I know I have been arguing against it, but I like the simpler implementation of NMT. And arguably, it closer reflects `mmap` realities, where reserving and committing don't have the same roles as in hotspot.
- don't make merging a topic for the tree. If we want the MEMFLAGS from prior reservations to carry over, just search the tree prior to registering the mapping. Since we need a "tell me the region this pointer points to" functionality anyway, just let's reuse that one. 
- or, at least, revert to a simple bool or enum that describes merging behavior. Keep it simple and stupid.

src/hotspot/share/nmt/vmatree.hpp line 326:

> 324:       // The releasing API also has no call stack, so we inherit the callstack also.
> 325:       merge_into.flag = existent.flag;
> 326:       merge_into.stack_idx = existent.stack_idx;

Okay, here we don't have to merge at all, no? I don't see any reason to preserve meta info for a released area. 

Therefore, I would just register the area with mtNone and empty-callstack-id.

-------------

PR Review: https://git.openjdk.org/jdk/pull/18289#pullrequestreview-1954770433
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535682200
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535683085
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535565061
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535559268
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535560678
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535566453
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535567099
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535579388
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535644027
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535650870