RFR: 8360048: NMT crash in gtest/NMTGtests.java: fatal error: NMT corruption: Block at 0x0000017748307120: header canary broken [v2]
Coleen Phillimore
coleenp at openjdk.org
Tue Jul 15 12:11:44 UTC 2025
On Mon, 14 Jul 2025 22:16:08 GMT, Gerard Ziemski <gziemski at openjdk.org> wrote:
>>> Are you accepting nits here? I see other PR was already reviewed.
>>
>> The intent was to just create a proxy for the other PR so we could hit the integrate button. But that hasn't happened yet so nits accepted.
>>
>>> Really hard to understand where the fix is. Bug synopsis also does not help :) I assume the problem is really in the test?
>>
>> I have similar thoughts and am asking for some clarification from Gerard (as original reviewer).
>
>> Really hard to understand where the fix is. Bug synopsis also does not help :) I assume the problem is really in the test?
>
> Agree, it's confusing, Afshin said:
>
>> The canary header test failed since there were concurrent remove and free() from the tree. The remove operations are synch'ed with corresponding NMT lock.
>
> but frankly I don't see any locks involved in this code path:
>
> This where we detect the issue:
>
>
> inline OutTypeParam MallocHeader::resolve_checked_impl(InTypeParam memblock) {
> char msg[256];
> address corruption = nullptr;
> if (!is_valid_malloced_pointer(memblock, msg, sizeof(msg))) {
> fatal("Not a valid malloc pointer: " PTR_FORMAT ": %s", p2i(memblock), msg);
> }
> OutTypeParam header_pointer = (OutTypeParam)memblock - 1;
> if (!header_pointer->check_block_integrity(msg, sizeof(msg), &corruption)) {
> header_pointer->print_block_on_error(tty, corruption != nullptr ? corruption : (address)header_pointer);
> fatal("NMT corruption: Block at " PTR_FORMAT ": %s", p2i(memblock), msg);
> }
> return header_pointer;
> }
>
>
> called by:
>
>
> inline MallocHeader* MallocHeader::resolve_checked(void* memblock) {
> return MallocHeader::resolve_checked_impl<void*, MallocHeader*>(memblock);
> }
>
>
> called by:
>
>
> void* MallocTracker::record_free_block(void* memblock) {
> ...
> MallocHeader* header = MallocHeader::resolve_checked(memblock);
> ...
> }
>
>
> called by:
>
>
> static inline void* record_free(void* memblock) {
> ...
> return MallocTracker::record_free_block(memblock);
> }
>
>
> called by:
>
>
> void os::free(void *memblock) {
> ...
> void* const old_outer_ptr = MemTracker::record_free(memblock);
> ...
> }
>
>
> called by:
>
>
> void Treap::remove_all() {
> ...
> _allocator.free(head);
> ...
> }
>
>
> called by:
>
>
> static void test_add_committed_region_adjacent_overlapping() {
> RegionsTree* rtree = VirtualMemoryTracker::Instance::tree();
> rtree->tree().remove_all();
>
> size_t size = 0x01000000;
> ReservedSpace rs = MemoryReserver::reserve(size, mtTest);
> MemTracker::NmtVirtualMemoryLocker nvml;
> ...
>
>
> As you can see in the old code, we call `remove_all` before we lock (MemTracker::NmtVirtualMemoryLocker)
>
> I think the simplest temp fix here would be to do:
>
>
> static void test_add_committed_region_adjacent_overlapping() {
> MemTracker::NmtVirtualMemoryLocker nvml;
>
> RegionsTree* rtree = VirtualMemoryTracker::Instance::tree();
> rtree->tree().remove_all();
>
> size_t size = 0x01000000;
> ReservedSpace rs = MemoryReserver::reserve(size, mtTest);
>
>
> In the new code we...
@gerard-ziemski Can you prepare just the test fix for us to review? This should be a further RFE.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26284#issuecomment-3073351811
More information about the hotspot-runtime-dev
mailing list