Proposal: Extend Native Memory Tracking across the whole process via interposition

Thomas Stuefe tstuefe at redhat.com
Fri Dec 1 17:31:53 UTC 2023


Hi, community,

I experimented with extending Native Memory Tracking across the whole
process. I want to share my findings and propose a new JDK feature to allow
us to do that.

TL;DR

Proposed is a "native memory interposition library" shipped with the JDK
that would intercept all native memory calls from everywhere and redirect
them to NMT.

Motivation:

NMT is very useful but limited in its coverage. It only covers Hotspot and
a select few sites from the JDK. Most of the JDK, third-party native code,
and system libraries are not covered. This is a large hole in our
observability. I have seen people do (and have done myself, e.g. [1])
strange things to hunt memory leaks in native code. This is especially
tricky in locked-down customer scenarios.

But NMT is a capable tracker. We could use it for much more than just
tracking Hotspot.

In the past, developers have attempted to extend NMT instrumentation over
parts of the JDK (e.g. [2]), which met resistance from Oracle. This is
understandable: a naive extension would require libraries to link against
libjvm and instrument their code. That introduces new dependencies
nobody wants.

---

I propose a different way that works without instrumenting any caller code.
I hope this proposal proves less controversial than brute-force NMT
instrumentation of the JDK. And it would allow introspection of non-JDK
parts too.

We could ship an interception library (a "libjnmt.so") within the JDK. That
library, if preloaded, would redirect native memory requests to NMT.
Customers who want to analyze the native memory footprint of their apps
could start the JVM with LD_PRELOAD=libjnmt.so and then use NMT for
introspection.

Both Oracle and we keep improving NMT; extending its reach across the
whole process would leverage that investment nicely.

It also meshes well with other improvements. For example, we report NMT
numbers via JFR since [4] - with interposition, we could now expose
third-party native allocations via JFR. The new jcmd "System.map" would
automatically show memory mappings from outside Hotspot. There is a
precedent (libjsig), so shipping interposition libraries is not that
strange.

---

I have a Linux-based POC that works and looks promising [3]. With that
prototype, I can see:

- allocations from the JDK - e.g., now I finally see mapped byte buffers.
- allocations from third-party user code
- most allocations from system libraries, e.g., from the system zlib
- allocations via the new FFI interface

The prototype tracks both mmap and malloc. Technically, the tricky part was
handling the initialization window: correctly handling allocations that
start during the process's C++ initialization, and then dynamically handing
them over to libjvm once it is loaded and NMT is initialized. Another
tricky problem was preventing circularities caused by the interception
itself. The prototype solves these problems and is already stable enough to
be used.
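
To illustrate the two problems just described, here is a minimal sketch. It
is not the prototype's code: all names (sketch_malloc,
sketch_register_tracker, record_fn, demo_tracker) are hypothetical, it wraps
malloc under a different name instead of interposing it, it ignores free,
and it is not thread-safe apart from the per-thread guard. It only shows the
general shape: allocations seen before a tracker exists are logged and
replayed at handover, and a thread-local flag keeps the tracker's own
allocations from being tracked recursively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical callback standing in for "hand this allocation to NMT". */
typedef void (*record_fn)(void *ptr, size_t size);

#define EARLY_MAX 64

/* Allocations seen before a tracker is registered (the initialization window). */
struct early_entry { void *ptr; size_t size; };
static struct early_entry early_log[EARLY_MAX];
static size_t early_count = 0;

static record_fn tracker = NULL;

/* Per-thread guard: set while we are inside our own bookkeeping. */
static _Thread_local int in_hook = 0;

/* Stand-in for an interposed malloc. A real preload library would export
 * this as "malloc" and forward to the libc implementation. */
void *sketch_malloc(size_t size) {
    void *p = malloc(size);
    if (p == NULL || in_hook) {
        return p;                    /* nested call from the tracker: do not record */
    }
    in_hook = 1;
    if (tracker != NULL) {
        tracker(p, size);            /* tracker is live: report directly */
    } else if (early_count < EARLY_MAX) {
        early_log[early_count].ptr = p;   /* too early: remember for later */
        early_log[early_count].size = size;
        early_count++;
    }
    in_hook = 0;
    return p;
}

/* Handover: replay everything logged so far, then switch to live tracking. */
void sketch_register_tracker(record_fn fn) {
    assert(fn != NULL);
    in_hook = 1;
    for (size_t i = 0; i < early_count; i++) {
        fn(early_log[i].ptr, early_log[i].size);
    }
    early_count = 0;
    tracker = fn;
    in_hook = 0;
}

/* Demo tracker that itself allocates, to exercise the recursion guard. */
size_t demo_bytes = 0;
void demo_tracker(void *ptr, size_t size) {
    (void)ptr;
    void *scratch = sketch_malloc(8);   /* nested call, skipped by the guard */
    free(scratch);
    demo_bytes += size;
}
```

In this sketch, an allocation made before sketch_register_tracker() is
called only shows up in demo_bytes after the handover, and the 8-byte
scratch allocation inside demo_tracker is never counted.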

Note that the patch is not complex or large. Some small interaction with
the JVM is needed, though, so this cannot be done just with an outside
library.

The prototype was developed and tested on Linux x64 and with glibc 2.31. It
seems stable so far, but of course, the work is in an early stage, and bugs
may exist. If you want to play with the prototype, build it [3] and then
call:

LD_PRELOAD=${JDK_DIR}/lib/server/libjnmt.so ${JDK_DIR}/bin/java \
  -XX:NativeMemoryTracking=detail <program> <args>

Example: quarkus with "third-party code" injected that leaks periodically
[5]:

LEAK_MALLOC=1 LEAK_MMAP=1 LD_PRELOAD=${JDK_DIR}/lib/server/libjnmt.so \
  ${JDK_DIR}/bin/java -agentpath:/shared/projects/jvmti-leak/leaker.so \
  -XX:NativeMemoryTracking=detail \
  -jar ./quarkus-profiling-workshop/target/quarkus-app/quarkus-run.jar

In Summary mode, we see the slowly growing leaks:

-External (via interposition) (reserved=82216KB, committed=82216KB)
                            (malloc=81588KB #585) (at peak)
                            (mmap: reserved=628KB, committed=628KB, at peak)


and in Detail mode, their call stacks:

[0x00007ff067ee7000 - 0x00007ff067ee8000] reserved and committed 4KB for External (via interposition) from
    [0x00007ff067ef5056]the_mmap(void*, unsigned long, int, int, int, long)+0x66 in libjnmt.so
    [0x00007ff067ef5781]mmap+0x71 in libjnmt.so
    [0x00007ff067ee955a]leak_mmap+0x3f in leaker.so
    [0x00007ff067ee95b1]leakleak+0x1c in leaker.so
    [0x00007ff067ee95c6]leakleakleak+0x12 in leaker.so
    [0x00007ff067ee95db]leakabit+0x12 in leaker.so
    [0x00007ff067ee95f8]leaky_thread+0x1a in leaker.so


[0x00007ff067ef5166]the_malloc(unsigned long)+0x106 in libjnmt.so
[0x00007ff067ee94ae]do_malloc+0xb8 in leaker.so
[0x00007ff067ee9518]leak_malloc+0x20 in leaker.so
[0x00007ff067ee95a7]leakleak+0x12 in leaker.so
[0x00007ff067ee95c6]leakleakleak+0x12 in leaker.so
[0x00007ff067ee95db]leakabit+0x12 in leaker.so
[0x00007ff067ee95f8]leaky_thread+0x1a in leaker.so
                             (malloc=17679KB type=External (via interposition) #34) (at peak)

---

What about MEMFLAGS?

The prototype does not extend MEMFLAGS apart from introducing a new
"External" category that tracks allocations done via interposition. The
question of MEMFLAGS - in particular, opening it up to outside extension -
has been contentious. It is orthogonal to this proposal - nice but not
required.

This proposal makes external allocations visible under the new "External"
tag:
- In NMT summary mode, we only have the "External" total, which is already
useful even as a lump sum: it shows the footprint non-Hotspot libraries
contribute to RSS. An RSS increase that is reflected neither by Hotspot
allocations nor by "External" can only stem from a select few places, e.g.
from libc malloc retention.
- In NMT detail mode, this proposal shows us the call stacks to foreign
call sites, pinpointing at least the libraries involved.
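
The reasoning in the summary-mode bullet boils down to simple arithmetic,
sketched below. The function names and the idea of comparing deltas between
two snapshots are my illustration, not part of the prototype. Also note that
NMT "committed" memory is not necessarily resident, so the result is a rough
hint rather than an exact figure, and it can be negative.

```c
#include <assert.h>
#include <stdio.h>

/* All values are in KB and are deltas between two snapshots.
 * Returns the part of an RSS change that is explained neither by
 * Hotspot's own NMT categories nor by the "External" category. */
long unexplained_rss_kb(long rss_delta_kb,
                        long hotspot_delta_kb,
                        long external_delta_kb) {
    return rss_delta_kb - (hotspot_delta_kb + external_delta_kb);
}

/* Prints a one-line report for a labeled pair of snapshots. */
void report_unexplained(const char *label,
                        long rss_delta_kb,
                        long hotspot_delta_kb,
                        long external_delta_kb) {
    assert(label != NULL);
    printf("%s: %ld KB of RSS growth not covered by NMT\n", label,
           unexplained_rss_kb(rss_delta_kb, hotspot_delta_kb,
                              external_delta_kb));
}
```

A remainder that stays clearly positive over time would point at one of the
few untracked sources, e.g. libc malloc retention.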

---

What do you think? Does this make sense?

Thanks, Thomas


[1] https://github.com/SAP/SapMachine/wiki/SapMachine-MallocTracer
[2] https://mail.openjdk.org/pipermail/core-libs-dev/2022-November/096197.html
[3] https://github.com/tstuefe/jdk/tree/libjnmt
[4] https://bugs.openjdk.org/browse/JDK-8157023
[5] https://github.com/tstuefe/jvmti_leak