RFR: 8312132: Add tracking of multiple address spaces in NMT [v105]

Thu May 23 06:59:18 UTC 2024

On Wed, 22 May 2024 14:12:44 GMT, Johan Sjölen <jsjolen at openjdk.org> wrote:

>> Hi,
>> 
>> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>> 
>> ## `MemoryFileTracker`
>> 
>> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>> 
>> ```c++
>>     static MemoryFile* make_device(const char* descriptive_name);
>>     static void free_device(MemoryFile* device);
>> 
>>     static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>>                                 MEMFLAGS flag, const NativeCallStack& stack);
>>     static void free_memory(MemoryFile* device, size_t offset, size_t size);
>> 
>> 
>> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>> 
>> ```c++
>> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
>> }
>> void ZNMT::commit(zoffset offset, size_t size) {
>>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
>> }
>> void ZNMT::uncommit(zoffset offset, size_t size) {
>>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
>> }
>> 
>> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> 
>> 
>> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. 
>> 
>> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>> 
>> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance bo...
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Lower number of pages

More comments, but out of time. Will continue tomorrow or next week.

test/hotspot/gtest/nmt/test_vmatree.cpp line 67:

> 65:   using Tree = VMATree;
> 66:   using Node = Tree::TreapNode;
> 67:   using NCS = NativeCallStackStorage;

Tree: not consistently used. I would prefer consistent use of VMATree, but if you alias, it should at least be used consistently.
NCS: not used at all.
If you need these definitions, can they not be somewhere at the start of the file?

test/hotspot/gtest/nmt/test_vmatree.cpp line 70:

> 68:   NativeCallStackStorage ncs(true);
> 69:   NativeCallStackStorage::StackIndex si1 = ncs.push(stack1);
> 70:   NativeCallStackStorage::StackIndex si2 = ncs.push(stack2);

Should be provided by the fixup class, or as global statics.

test/hotspot/gtest/nmt/test_vmatree.cpp line 71:

> 69:   NativeCallStackStorage::StackIndex si1 = ncs.push(stack1);
> 70:   NativeCallStackStorage::StackIndex si2 = ncs.push(stack2);
> 71: 

Please make these auto functions normal conventional functions. I really dislike that style.

It has real disadvantages. For example, no IDE I know can resolve a call graph across them, or even into them. E.g., in CDT, one of the most capable C++ IDEs, the call graph for VMATreeTest::in_type_of gives me .... nothing. I need to do a dumb full text search to find the call sites. And if the term to search for is very generic, I am out of luck. I know several developers that rely heavily on call graph search in CDT, I certainly do.

New techniques should serve a purpose. Lambdas as replacement for our old Closures makes sense, since they bring benefit (avoiding runtime polymorphy). But here I don't see the point.

test/hotspot/gtest/nmt/test_vmatree.cpp line 77:

> 75:     for (int i = 0; i < 100; i++) {
> 76:       tree.reserve_mapping(i * 100, 100, rd);
> 77:     }

No need to do this 100 times, and it somewhat obfuscates what you want to do here. Another problem with 100 times is that you seem to like EXPECT more than ASSERT, and if error we get 100+ error messages.

Comment is inprecise. Only 2 nodes *if properties match*. Please change comment to something like "adjacent areas having the same properties should be merged", then simplify to something like "reserve 1, reserve 2, expect count 2"

test/hotspot/gtest/nmt/test_vmatree.cpp line 81:

> 79:     treap(tree).visit_range_in_order(0, 999999, [&](Node* x) {
> 80:       found_nodes++;
> 81:     });

Recurring pattern. Give the fixture class a `count_nodes()` and do it there. 
Ideally, with some sugar for asserts to keep testing code small, e.g. `assert_count(int expected_count)`.

test/hotspot/gtest/nmt/test_vmatree.cpp line 125:

> 123:     for (int i = 0; i < 100; i++) {
> 124:       tree.release_mapping(i*100, 100);
> 125:     }

If you feel you need the 100-times-reservation, write a helper function in the fixture. reserve_100() or so.

test/hotspot/gtest/nmt/test_vmatree.cpp line 126:

> 124:       tree.release_mapping(i*100, 100);
> 125:     }
> 126:     EXPECT_EQ(nullptr, treap_root(tree)) << "Releasing all memory should result in an empty tree";

Recurring. Give fixture something like assert_null_root()

test/hotspot/gtest/nmt/test_vmatree.cpp line 156:

> 154:       }
> 155:       i++;
> 156:     });

This doesn't really test the state, nor the stack. It also seems to be a lot of code for a single-use test. Similar in other tests.

---

Proposal: since you have the recurring pattern of doing something with the tree, then checking its expected state, write a check function in the fixture (e.g. assert_tree_state() ) that does that.

Then use it like this:

// Committing in middle of reservation ends with a sequence of 4 nodes
TEST_VM_F(VMATreeTest, commit_in_middle_of_reservation) {

	... reserve, commit in middle, then

	const AddressState expected_states[] = {
		{ 0,  .... },
		{ 25,  .... },
		{ 50,  .... },
		{ 100,  .... } };
	assert_tree_state(expected_states, 4);
}

And to preserve your sanity when constructing AddressState with its many components, you can add a helper that allows that in a human-friendly form.

For example:
A string that encodes flag, stack, and reserved/Commit state in three letters. 
First letter: A-H, let this be one of 8 selected MEMFLAGS or - for mtNone
Second letter: a-d, let this be one of 4 selected stacks, or - for no stack/empty
Third letter: R or C , resreved, committed or none, or - for unreserved
Season to taste.

Then, e.g. for reserving 0..100, and committing 25..50, you write:

const AddressState expected_states[] = {
	makestate( 0, "---", "AaR"),
	makestate(25, "AaR", "AaC"),
	makestate(50, "AaC", "AaR"),
	makestate( 0, "AaR", "---") };
assert_tree_state(expected_states, 4);

Now you can write a bunch of corner-case tests without being too verbose, intent is immediately clear. As an added bonus, it checks for tree state *exactly*, e.g. which nodes are part of the tree, which are not, in which order, etc.

If you reuse that string format for reserving etc, you could do this:

    reserve_or_commit(0, 100, "AaR");
    reserve_or_commit(25, 50, "AaC");

or just 

    do(0, 100, "AaR");
    do(25, 50, "AaC");

>From there one, one could even auto-build the expected state, but if too much automatism goes into the test, this increases the possibility for errors creeping in one never finds because they make the test go green.

test/hotspot/gtest/nmt/test_vmatree.cpp line 351:

> 349: struct SimpleVMATracker : public CHeapObj<mtTest> {
> 350:   const size_t page_size = 4096;
> 351:   enum Tpe { Reserved, Committed, Free };

Wow, we save one letter! ;-)

test/hotspot/gtest/nmt/test_vmatree.cpp line 366:

> 364:   };
> 365:   // Page (4KiB) granular array
> 366:   const size_t num_pages = 1024 * 512;

constexpr. And then use it :) see line below

test/hotspot/gtest/nmt/test_vmatree.cpp line 377:

> 375: 
> 376:   VMATree::SummaryDiff do_it(Tpe tpe, size_t start, size_t size, NativeCallStack stack, MEMFLAGS flag) {
> 377:     assert(size % page_size == 0 && start % page_size == 0, "page alignment");

use is_aligned

test/hotspot/gtest/nmt/test_vmatree.cpp line 440:

> 438:   for (int i = 0; i < operation_count; i++) {
> 439:     const size_t page_start = (size_t)(os::random() % tr->num_pages);
> 440:     const size_t num_pages = (size_t)(os::random() % (tr->num_pages - page_start));

Proposal: roll dice for two positions in num_pages range. Then sort them (swap if A > B). 

That gives you a fairer distribution, since you don't overemphasize the end of the range.

test/hotspot/gtest/nmt/test_vmatree.cpp line 447:

> 445:     const size_t size = num_pages * page_size;
> 446: 
> 447:     const MEMFLAGS flag = (MEMFLAGS)(os::random() % mt_number_of_types);

I would maybe scale down, not use mt_number_of_types. 

We want to stress merging, too. So the total number of possible states should not that large. E.g. 8 states, with e.g. 4 flags and 2 stacks.

-------------

PR Review: https://git.openjdk.org/jdk/pull/18289#pullrequestreview-2067634593
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1611021414
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1611024758
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1610959890
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1611037411
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1611035446
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1611038979
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1611039692
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1611067145
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1611069569
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1611091143
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1611075904
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1611095608
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1611098358