RFR: JDK-8304815: Use NMT for more precise hs_err location printing
Thomas Stuefe
stuefe at openjdk.org
Thu Mar 23 17:18:49 UTC 2023
(This is a byproduct of work on the arm port for https://github.com/openjdk/jdk/pull/10907. I needed better debugging information in the hs-err file and in gdb.)
Back in 2022 @zhengyu123 had the very nice idea of using NMT mapping info for smartening up pp in gdb: [JDK-8280289](https://bugs.openjdk.org/browse/JDK-8280289).
The same idea can be applied to hs_err file location printing. NMT has information about all its mappings and can tell us which region, if any, a given unknown pointer points into.
This could be even more useful if the "find malloc block" part of that functionality were smarter. As it is now, it only works if the pointer in question points to the start of a user-allocated area. It would be nice if the code could (carefully) search for the next valid-looking malloc header instead.
--------------
This patch does this: we introduce a new API, MemTracker::print_containing_region(void*), that tries to make sense of a given unknown pointer.
It searches NMT's mmap regions and prints the containing region if one is found. For malloc'ed pointers, it carefully sniffs out the immediate surroundings of the pointer, trying to find what looks like a valid malloc header. It uses SafeFetch so that it does not trip over unmapped or protected pages. Note that, of course, we may get false positives if it finds something that merely looks like a valid header. But even that could be useful (e.g. a remnant dead header may indicate that we access memory after free).
The output looks like this (it's arm, so 32-bit pointers):
Register to memory mapping:
-> r0 = 0x728a6ae0 into life malloced block starting at 0x728a6ae0, size 104, tag mtSynchronizer
-> r1 = 0x75b02010 into life malloced block starting at 0x75b02010, size 184, tag mtObjectMonitor
-> r2 = 0x728a6ae0 into life malloced block starting at 0x728a6ae0, size 104, tag mtSynchronizer
r3 = 0x0 is nullptr
-> r4 = 0x728a6ae0 into life malloced block starting at 0x728a6ae0, size 104, tag mtSynchronizer
r5 = 0xb6d3bbc8: <offset 0x018f6bc8> in /shared/projects/openjdk/jdk-jdk/output-fastdebug-arm/images/jdk/lib/server/libjvm.so at 0xb5445000
r6 = 0xffffffff is an unknown value
r7 = 0x0 is nullptr
r8 = 0x0000000a is an unknown value
r9 = 0x728a4308 is a thread
r10 = 0x0000000b is an unknown value
-> fp = 0x753fe8cc in mmap'd memory region [0x75380000 - 0x75400000] by Thread Stack
r12 = 0x0 is nullptr
-> sp = 0x753fe8b0 in mmap'd memory region [0x75380000 - 0x75400000] by Thread Stack
lr = 0xb69bc1d0: <offset 0x015771d0> in /shared/projects/openjdk/jdk-jdk/output-fastdebug-arm/images/jdk/lib/server/libjvm.so at 0xb5445000
pc = 0xb69c8670: <offset 0x01583670> in /shared/projects/openjdk/jdk-jdk/output-fastdebug-arm/images/jdk/lib/server/libjvm.so at 0xb5445000
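For illustration only, the "search backwards for a plausible malloc header" idea described above could be sketched roughly as follows. This is not the code from the patch: `FakeHeader`, `kCanary`, `Arena`, `safe_read32` and `find_header` are all hypothetical names, the header layout is invented, and `safe_read32` is a bounds-checked stand-in for what SafeFetch does in HotSpot (return failure instead of faulting on unreadable memory).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical header layout; NOT the real NMT malloc header.
struct FakeHeader {
  uint32_t canary;  // marks something that "looks like" a header
  uint32_t size;    // size of the user payload following the header
};
static_assert(sizeof(FakeHeader) == 8, "sketch assumes an 8-byte header");

constexpr uint32_t  kCanary = 0xE99EE99Eu;  // arbitrary value for this sketch
constexpr uintptr_t kAlign  = 8;            // assumed malloc alignment

// Readable address range [lo, hi). Stand-in for "pages that are mapped".
struct Arena { uintptr_t lo; uintptr_t hi; };

// Stand-in for SafeFetch: reads a 32-bit word only if it lies inside the
// arena, and reports failure instead of touching memory outside of it.
bool safe_read32(const Arena& a, uintptr_t addr, uint32_t* out) {
  if (addr < a.lo || addr >= a.hi || a.hi - addr < sizeof(uint32_t)) return false;
  std::memcpy(out, reinterpret_cast<const void*>(addr), sizeof(uint32_t));
  return true;
}

// Walk backwards from p in aligned steps, looking for a header whose canary
// matches and whose payload range contains p. Returns the header address,
// or 0 if nothing plausible is found within max_distance bytes.
uintptr_t find_header(const Arena& a, uintptr_t p, size_t max_distance) {
  assert(a.lo <= a.hi);
  const uintptr_t aligned = p & ~(kAlign - 1);
  for (size_t dist = 0; dist <= max_distance; dist += kAlign) {
    if (aligned < sizeof(FakeHeader) + dist) break;  // would underflow
    const uintptr_t cand = aligned - sizeof(FakeHeader) - dist;
    if (cand < a.lo) break;                          // left the readable range
    uint32_t canary = 0, size = 0;
    if (!safe_read32(a, cand, &canary) || !safe_read32(a, cand + 4, &size)) continue;
    if (canary != kCanary) continue;                 // does not look like a header
    const uintptr_t payload = cand + sizeof(FakeHeader);
    if (p >= payload && p - payload < size) return cand;
  }
  return 0;
}
```

As in the patch description, a match only means "this looks like a header": stale bytes from a freed block would pass the same check, which is why such a result should be read as a hint rather than a proof.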
The small caveat here is that NMT reporting needs ThreadCritical, and thus using it for location printing may block error reporting if the crash happened inside a ThreadCritical section. We face the same issue today when printing the NMT report as part of the hs-err file.
I think the usefulness of these printouts justifies this risk. However, I opened https://bugs.openjdk.org/browse/JDK-8304824 to investigate a better locking strategy for NMT.
-------------
Commit messages:
- JDK-8304815-Use-NMT-for-more-precise-hs_err-location-printing
Changes: https://git.openjdk.org/jdk/pull/13162/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13162&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8304815
Stats: 198 lines in 11 files changed: 169 ins; 8 del; 21 mod
Patch: https://git.openjdk.org/jdk/pull/13162.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/13162/head:pull/13162
PR: https://git.openjdk.org/jdk/pull/13162