Re: Proposal: track zlib native memory usage with NMT

Thomas Stüfe thomas.stuefe at gmail.com

Wed Mar 22 22:01:33 UTC 2023

> Hi all,
> 
> I am currently working on https://bugs.openjdk.org/browse/JDK-8296360; I
> was preparing the final PR [1], but then Alan asked me to discuss this on
> core-libs first.
> 
> Backstory:
> 
> NMT tracks hotspot native allocations but does not cover the JDK libraries
> (small exception: Unsafe.allocateMemory). However, the native memory
> footprint of JDK libraries can be significant. We have no in-VM tracker for
> these and need tools like valgrind or our SapMachine MallocTracer [2] to
> observe them.
> 
> At SAP, in our proprietary VM, we have instrumented the whole JDK. Not via
> NMT, but with an older tracker we developed 15 years ago. It has run stably
> and productively on many platforms for ~15 years now, atop quite different
> libc implementations. It is impractical to contribute it upstream, however.
> It would not make much sense either, since it overlaps a lot with NMT.
> 
> In 2018 I discussed the possibility of extending NMT across the JDK
> wholesale, and the associated risks and difficulties [3]. The reception was
> not great, so I backed off and did other things.
> 
> But the issue of JDK native footprint comes up regularly. It is a thorn in
> our side. Recently we had a customer whose zlib footprint exploded into the
> tens of GB range. Luckily the customer used our proprietary VM, so we
> pinpointed the cause quickly. With OpenJDK we would have been flying blind.
> I want to change that.
> 
> 
> Proposal:
> 
> In contrast to my first proposal from 2018, I'd like to instrument the core
> libs only in small steps, and only for selected allocation hotspots. There
> are not that many. I would start with zlib.
> 
> My proposal does not touch zlib itself. The solution works with both the
> bundled zlib and the system zlib. All the instrumentation happens in the JVM
> zlib wrapper. It mainly uses the zalloc mechanism of zlib streams to
> reroute the allocations.
> 
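For readers unfamiliar with the zalloc mechanism mentioned above, here is a minimal sketch of what such hooks can look like. It is not the code from the PR. The zstats struct and the counting_* names are hypothetical, and the callbacks only mirror the shape of zlib's alloc_func and free_func (zlib itself uses its own voidpf and uInt typedefs) so that the sketch compiles without zlib headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-stream counters; not part of zlib or NMT. */
typedef struct {
    size_t live_blocks;      /* allocations not yet freed */
    size_t bytes_requested;  /* running total of requested bytes */
} zstats;

/* Same shape as zlib's alloc_func: (opaque, items, size). */
static void *counting_zalloc(void *opaque, unsigned items, unsigned size) {
    zstats *st = (zstats *)opaque;
    size_t n = (size_t)items * size;
    void *p = malloc(n);
    if (p != NULL) {
        st->live_blocks++;
        st->bytes_requested += n;
    }
    return p;
}

/* Same shape as zlib's free_func: (opaque, address). */
static void counting_zfree(void *opaque, void *address) {
    zstats *st = (zstats *)opaque;
    if (address != NULL) {
        assert(st->live_blocks > 0);  /* catches a free without a matching alloc */
        st->live_blocks--;
    }
    free(address);
}
```

With real zlib, callbacks of this kind would be stored in the zalloc, zfree and opaque fields of a z_stream before the stream is initialized (for example before deflateInit or inflateInit). Because every allocation for that stream then goes through the same pair of callbacks, the alloc and free sides stay balanced without patching zlib itself.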
> 
> Costs and risks:
> 
> There is a risk involved with instrumentations like these. If we overlook
> instrumentation points we end up with unbalanced malloc/frees (instrumented
> malloc+raw free, or vice versa) which would corrupt the C-heap since NMT
> uses malloc headers. But in this case, the risk is very small since the
> instrumentations are few.

I always meant to ask, why is it that we chose to dedicate the beginning of the memory chunk to NMT, and not the end?

If we used the end, then in this case with an unbalanced malloc/free, we would still be OK.

The end seems more natural to me. Either way we need to track the size, and in the current scheme we also need to know the size of the header.
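
To make the header scheme concrete, here is a minimal sketch of a header-prefixed allocator. It only illustrates the general technique; the hdr type, HDR_MAGIC and the hdr_* functions are made up for this example, and NMT's real header layout differs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define HDR_MAGIC 0xC0FFEEu

/* Hypothetical header; the union pads it so the payload stays aligned. */
typedef union {
    struct { size_t size; unsigned magic; } info;
    max_align_t align;
} hdr;

/* Allocate header + payload, return a pointer to the payload. */
static void *hdr_malloc(size_t size) {
    hdr *h = (hdr *)malloc(sizeof(hdr) + size);
    if (h == NULL) return NULL;
    h->info.size = size;
    h->info.magic = HDR_MAGIC;
    return h + 1;  /* NOT the pointer malloc returned */
}

/* Read the recorded size back from the header in front of the payload. */
static size_t hdr_size(const void *payload) {
    const hdr *h = (const hdr *)payload - 1;
    assert(h->info.magic == HDR_MAGIC);
    return h->info.size;
}

/* Step back to the header and free the block malloc actually returned. */
static void hdr_free(void *payload) {
    if (payload == NULL) return;
    hdr *h = (hdr *)payload - 1;
    assert(h->info.magic == HDR_MAGIC);
    h->info.magic = 0;
    free(h);
}
```

The pointer the caller sees is offset from the one malloc returned, so handing it to a raw free() is undefined behavior, and handing a raw malloc pointer to hdr_free() reads memory in front of the block. With the bookkeeping at the end instead, the caller's pointer would be the same one malloc returned, but the tracker would still need the size to find that trailer.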

cheers

> 
> (Side note: our internal tracker sidesteps this problem entirely by
> avoiding malloc headers. Instead, it uses a hash table to match pointers
> with their meta information. But that has other cons and I do not plan to
> change the way NMT works.)
> 
> Performance-wise, instead of calling into the libc directly, we would call
> into HotSpot, then into the libc. That indirection will cost some cycles.
> If NMT is off, little happens besides the real malloc call. Even NMT summary
> mode is very cheap. So I don't expect this to be a problem, but I will run
> performance tests.
> 
> --------
> 
> My patch [1] works and can be built and tested. But the PR is still a work
> in progress. I just wanted to make sure nobody generally objects to my work.
> 
> Cheers, Thomas
> 
> [1] https://github.com/openjdk/jdk/pull/10988
> [2] https://github.com/SAP/SapMachine/wiki/SapMachine-MallocTracer
> [3] https://mail.openjdk.org/pipermail/hotspot-dev/2018-November/035358.html