RFR: 8328944: NMT reports "unknown" memory

Thomas Stuefe stuefe at openjdk.org
Mon Nov 4 08:36:42 UTC 2024


On Fri, 1 Nov 2024 20:44:50 GMT, Gerard Ziemski <gziemski at openjdk.org> wrote:

> We use mtNone value in several functions default parameters, which may show up in NMT reports.
> 
> We address this, by removing the default value and forcing, where possible, for the callers to declare NMT tag that reflects the usage.
> 
> Eventually the goal would be not to use mtNone anywhere, but we are not there quite yet.
> 
> TODO: update copyrights.
> 
> Testing: undergoing MARCH5 tier1-5 ...

Had a cursory look. The direction is okay. Some general comments:

1) We have two cases: a) where today we report mtNone and b) where today we report a tag other than mtNone, but the tag was set after the initial reservation

For the latter case, if the case is simple (the reservation was for a single thing, and its purpose was clear at reservation) there must have existed a subsequent NMT registration, which you should then remove since it should not be necessary anymore. I will point them out if I stumble over them, but may miss a bunch.

2) About 
> Eventually the goal would be not to use mtNone anywhere, but we are not there quite yet.

I think this is neither realistic nor desirable. mtNone means "don't know yet; will do later". This is perfectly valid for cases where
- you allocate something on behalf of someone else; obviously, you don't know what he intends to do with it
- when you reserve a range that is later shared by multiple tags.

If we move to more fine granular tags, or something like hierarchical tags, the latter point will become more important. Therefore, mtNone has its place.

src/hotspot/cpu/aarch64/compressedKlass_aarch64.cpp line 72:

> 70:     assert(immediate > 0 && Assembler::operand_valid_for_logical_immediate(/*is32*/false, immediate),
> 71:            "Invalid immediate %d " UINT64_FORMAT, index, immediate);
> 72:     result = os::attempt_reserve_memory_at((char*)immediate, size, false, mtMetaspace);

mtMetaspace is wrong. There is no right tag apart from mtNone, since this address range is a combination of mtClass+mtClassShared, depending on how class space is configured. mtNone signals "will be set later", which is the correct course here IHMO.

The tag is set later, see metaspace.cpp and metaspaceShared.cpp

src/hotspot/os/bsd/gc/z/zPhysicalMemoryBacking_bsd.cpp line 82:

> 80: 
> 81:   // Reserve address space for backing memory
> 82:   _base = (uintptr_t)os::reserve_memory(max_capacity, false, mtGC);

This is Java heap.

src/hotspot/share/cds/archiveBuilder.cpp line 330:

> 328:   size_t buffer_size = estimate_archive_size();
> 329:   ReservedSpace rs(buffer_size, MetaspaceShared::core_region_alignment(),
> 330:                    os::vm_page_size(), nullptr, mtMetaspace);

This is not Metaspace. Possibly mtClassShared, though that one is reserved for the runtime mapping of the CDS archive; this is the dumptime image. mtInternal would be the best fit, IMHO.

src/hotspot/share/cds/metaspaceShared.cpp line 1387:

> 1385:       total_space_rs = ReservedSpace(total_range_size, base_address_alignment,
> 1386:                                      os::vm_page_size(), (char*)base_address, mtClass);
> 1387:     } else {

All the changes here in `MetaspaceShared::reserve_address_space_for_archives` are unnecessary; we tag the address region already. No need to do this multiple times, and in every case branch. The ReservedSpace objects here are not used for much and mostly ephemeral anyway - they are only used to reserve memory here, but after that have not much meaning, and apart from the class space rs don't even survive the invocation.

src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 249:

> 247:             _bitmap_bytes_per_slice, bitmap_page_size);
> 248: 
> 249:   ReservedSpace bitmap(_bitmap_size, bitmap_page_size, mtGC);

remove nmt registration below?

src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 269:

> 267: 
> 268:   if (ShenandoahVerify) {
> 269:     ReservedSpace verify_bitmap(_bitmap_size, bitmap_page_size, mtGC);

remove nmt registration below?

src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 288:

> 286: 
> 287:   ReservedSpace aux_bitmap(_bitmap_size, aux_bitmap_page_size, mtGC);
> 288:   os::trace_page_sizes_for_requested_size("Aux Bitmap",

remove nmt registration below?

src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 306:

> 304: 
> 305:   ReservedSpace region_storage(region_storage_size, region_page_size, mtGC);
> 306:   os::trace_page_sizes_for_requested_size("Region Storage",

remove nmt registration below?

src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 340:

> 338: 
> 339:     if (_collection_set == nullptr) {
> 340:       cset_rs = ReservedSpace(cset_size, cset_align, os::vm_page_size(), nullptr, mtGC);

remove NMT registration in ShenandoahCollectionSet?

src/hotspot/share/jfr/recorder/storage/jfrVirtualMemory.cpp line 107:

> 105:   _rs = ReservedSpace(reservation_size_request_bytes,
> 106:                       os::vm_allocation_granularity(), os::vm_page_size(),
> 107:                       nullptr, mtTracing);

remove NMT registraton?

src/hotspot/share/memory/heap.cpp line 231:

> 229:   // reserve space for _segmap
> 230:   ReservedSpace seg_rs(reserved_segments_size, mtCode);
> 231:   if (!_segmap.initialize(seg_rs, committed_segments_size)) {

remove NMT registration

src/hotspot/share/memory/metaspace.cpp line 605:

> 603:     assert(is_aligned(result, Metaspace::reserve_alignment()), "Alignment too small for metaspace");
> 604:     rs = ReservedSpace::space_for_range(result, size, Metaspace::reserve_alignment(),
> 605:                                                       os::vm_page_size(), false, false, mtMetaspace);

Please revert to mtNone; same reason as given in metaspaceShared.cpp. You can add a comment that tags are adjusted afterward, when the distribution of mtClass and mtClassShared is clear.

src/hotspot/share/memory/metaspace.cpp line 746:

> 744:       }
> 745:       rs = ReservedSpace(size, Metaspace::reserve_alignment(),
> 746:                          os::vm_page_size() /* large */, (char*)base, mtMetaspace);

mtClass, but actually not needed, since we set mtClass below. Again, not necessary to do this for every branch individually we take here when reserving class space memory.

src/hotspot/share/memory/metaspace.cpp line 775:

> 773: 
> 774:     // Mark metaspace as such
> 775:     MemTracker::record_virtual_memory_tag((address)rs.base(), mtMetaspace);

No, please revert. mtClass is correct here.

src/hotspot/share/memory/metaspace/testHelpers.cpp line 82:

> 80:     // have reserve limit -> non-expandable context
> 81:     _rs = ReservedSpace(reserve_limit * BytesPerWord, Metaspace::reserve_alignment(),
> 82:                         os::vm_page_size(), nullptr, MemTag::mtMetaspace);

Give this mtTest

src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 258:

> 256:   ReservedSpace rs(word_size * BytesPerWord,
> 257:                    Settings::virtual_space_node_reserve_alignment_words() * BytesPerWord,
> 258:                    os::vm_page_size(), nullptr, mtMetaspace);

remove tracking below.

This is actually wrong, and only works out of accident correctly, since this constructor just happens to be called for mtMetaspace; more correct would be to hand in the tag. This is on my plate, there is a JBS issue somewhere.

src/hotspot/share/memory/virtualspace.hpp line 50:

> 48:   // ReservedSpace
> 49:   ReservedSpace(char* base, size_t size, size_t alignment,
> 50:                 size_t page_size, bool special, bool executable, MemTag mem_tag);

The growth in complexity is unfortunate.

src/hotspot/share/memory/virtualspace.hpp line 67:

> 65: 
> 66:   void reserve(size_t size, size_t alignment, size_t page_size,
> 67:                char* requested_address, bool executable, MemTag mem_tag);

Why do the instance methods get MemTag? Why is MemTag not stored as a (const) attribute of ReservedSpace? 

But then, we would have double accounting - we'd store the tag both in NMT and in ReservedSpace instances. If we change them post-reservation, we would need to change both. Sigh.

So we pass in MemTag just for the ReservedSpace to pass it into os::reserve_memory. Seeing how often ReservedSpace is used in situations where the tag is not clear at reservation time, I wonder whether we are not better off letting the user of ReservedSpace register the NMT tag post reservation like they do today.

src/hotspot/share/nmt/memReporter.cpp line 83:

> 81:   if (mem_tag != mtNone) {
> 82:     out->print("(%s" SIZE_FORMAT "%s type=%u", alloc_type,
> 83:       amount_in_current_scale(amount), scale, (unsigned)mem_tag);

No, please revert.

src/hotspot/share/nmt/memReporter.cpp line 252:

> 250:    // report malloc'd memory
> 251:   if (amount_in_current_scale(MAX2(malloc_memory->malloc_size(), pk_malloc)) > 0) {
> 252:     print_malloc(malloc_memory->malloc_counter(), mtNMT);

Please revert, the original was correct. print_malloc is used for both summary and detail reports; for summary, we pass in mtNone deliberately since the report happens under a common tag.

src/hotspot/share/nmt/virtualMemoryTracker.cpp line 364:

> 362:       // thread does not detach from VM before exits, and leads to
> 363:       // leak JavaThread object
> 364:       if ((reserved_rgn->mem_tag() == mtThreadStack) && (reserved_rgn->mem_tag() == mem_tag)) {

Please explain the changes in this function, especially the changed conditions. The added conditions for tag equality seem wrong to me.

src/hotspot/share/runtime/safepointMechanism.cpp line 61:

> 59:     const size_t page_size = os::vm_page_size();
> 60:     const size_t allocation_size = 2 * page_size;
> 61:     char* polling_page = os::reserve_memory(allocation_size, false, mtSafepoint);

remove registration

src/hotspot/share/utilities/debug.cpp line 712:

> 710: 
> 711: void initialize_assert_poison() {
> 712:   char* page = os::reserve_memory(os::vm_page_size(), false, mtInternal);

remove registration

test/hotspot/jtreg/runtime/NMT/MallocRoundingReportTest.java line 60:

> 58:                 NMTTestUtils.runJcmdSummaryReportAndCheckOutput(
> 59:                         "Test (reserved=" + numKB + "KB, committed=" + numKB + "KB)",
> 60:                         "(malloc=" + numKB + "KB type=12 #1) (at peak)"

Again, this is wrong. The memory had not been reserved for NMT usage, so it should not be tagged with mtNMT (12)

test/hotspot/jtreg/runtime/NMT/MallocRoundingReportTest.java line 68:

> 66:                 NMTTestUtils.runJcmdSummaryReportAndCheckOutput(
> 67:                         "Test (reserved=0KB, committed=0KB)",
> 68:                         "(malloc=0KB type=12) (peak=" + numKB + "KB #1)"

ditto

-------------

PR Review: https://git.openjdk.org/jdk/pull/21843#pullrequestreview-2412301432
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827274229
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827277342
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827280468
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827292469
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827310565
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827310673
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827310815
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827310928
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827312298
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827312806
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827312951
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827314192
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827314524
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827314999
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827316399
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827318234
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827329282
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827323214
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827334776
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827339410
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827348792
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827361075
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827361821
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827363362
PR Review Comment: https://git.openjdk.org/jdk/pull/21843#discussion_r1827363506


More information about the shenandoah-dev mailing list