Extend NMT to JDK native libraries?

Thomas Stüfe thomas.stuefe at gmail.com
Mon Nov 26 20:43:21 UTC 2018


Hi Zhengyu,


On Wed, Nov 21, 2018 at 5:26 PM Zhengyu Gu <zgu at redhat.com> wrote:
>
> >
> > But note the unbalanced-malloc issue. The more we expose NMT to the
> > JDK wild, the more probable are such bugs. Within the hotspot all is
> > nice and tidy.
>
> Hotspot also has a few unbalanced-malloc instances.

Oh, where? Really, raw malloc with os::free, or the other way around?

Would they not create the problem I described, corrupted C-heap etc?

> And yes, we will
> find more in library, I would consider it as initial investment to fix
> them, especially, if we can do them module-by-module, no?

Yes, they should be fixed if possible. My concern is that it is
realistic to assume that we miss some, or even if not that new ones
will creep in if the code changes. And that the penalties for these
unbalanced mallocs is severe - crashes and heap corruptions - if we
use malloc headers.

Even worse, they would only show up, in release builds, if NMT is
switched on, since only NMT adds malloc headers in release builds.
That would be annoying, telling a customer to switch on NMT just for
him to hit such a code path and crash the VM.

I don't know, I seem to miss something here. Do you not see this as
serious problem preventing NMT from covering code outside hotspot?

>
> Mismatched statistics is quite annoying ... Did see people actually
> counting bytes and expecting to match :-) JDK-8191369 actually was
> driven by customers, who tried to match smap.
>

I remember :) and yes, we have many curious customers too. But that is
fine. We want NMT to be as exact as possible.

Cheers, Thomas

>
> Thanks,
>
> -Zhengyu
>
> >
> > If we wanted to avoid these bugs, we would have to remove malloc
> > headers from both os::malloc() and NMT MallocTracker and add a malloc
> > pointer hashtable of some sorts to the latter. This is not very
> > difficult, but would still need an initial investment.
>
>
>
> >
> > Thanks, Thomas
> >
> >>
> >> Thanks,
> >>
> >> -Zhengyu
> >>
> >>>
> >>> (I think even if we were to instrument parts of the JDK - e.g. just
> >>> NIO - this would already be very helpful. In parts we do this already
> >>> for mtOther.).
> >>>
> >>> On Wed, Nov 21, 2018 at 3:54 PM Zhengyu Gu <zgu at redhat.com> wrote:
> >>>>
> >>>> FYI: There was a phase 2 RFE: Native Memory Tracking (Phase 2)
> >>>> https://bugs.openjdk.java.net/browse/JDK-6995789
> >>>>
> >>>> -Zhengyu
> >>>>
> >>>> On 11/21/18 9:28 AM, Thomas Stüfe wrote:
> >>>>> Hi all,
> >>>>>
> >>>>> (yet again not sure if this is serviceablity-dev or not - I start at
> >>>>> hs-dev, feel free to move this mail around.)
> >>>>>
> >>>>> Do we have any plans to extend NMT to cover native JDK libaries too?
> >>>>> That would be a really cool feature.
> >>>>>
> >>>>> --
> >>>>>
> >>>>> We at SAP have done a similar thing in the past:
> >>>>>
> >>>>> We have a monitoring facility in our port which tracks C-heap
> >>>>> allocations, non-imaginatively called "malloc statistic". This feature
> >>>>> predates NMT somewhat - had we had NMT at that time, we would not have
> >>>>> bothered. Our Malloc statistic is less powerful than NMT and
> >>>>> implementation-wise completely at odds with it, so I never felt the
> >>>>> urge to bring it upstream. However, one thing we did do is we extended
> >>>>> its coverage to the JDK native code.
> >>>>>
> >>>>> This has been quite helpful in the past to find leaks in JDK, see
> >>>>> e.g.: https://bugs.openjdk.java.net/browse/JDK-8155211
> >>>>>
> >>>>> We did this by exposing os::malloc, os::free etc from libjvm.so
> >>>>> ("JVM_malloc", "JVM_realloc", "JVM_free"). In the JDK native code, we
> >>>>> then either manually replaced calls to raw ::malloc(), ::free() etc
> >>>>> with JVM_malloc(), JVM_free(). Or, in places where this was possible,
> >>>>> we did this replacement stuff wholesale by employing a header which
> >>>>> re-defined malloc(), free() etc JVM_malloc, JVM_free etc. Of course,
> >>>>> we also had to add a number of linkage dependencies to the libjvm.so.
> >>>>>
> >>>>> All this is pretty standard stuff.
> >>>>>
> >>>>> One detail stood out: malloc headers are evil. In our experience, JDK
> >>>>> native code was more difficult to control and "unbalanced
> >>>>> malloc/frees" kept creeping in - especially with the
> >>>>> wholesale-redefinition technique. Unbalanced mallocs/frees means cases
> >>>>> where malloc() is instrumented but ::free() stays raw, or the other
> >>>>> way around. Both combinations are catastrophic since os::malloc uses
> >>>>> malloc headers. We typically corrupted the C-Heap and crashed, often
> >>>>> much later in completely unrelated places.
> >>>>>
> >>>>> These types of bugs were very hard to spot and hence very expensive.
> >>>>> And they can creep in in many ways. One example, there exist a
> >>>>> surprising number of system APIs which return results in C-heap and
> >>>>> require the user to free that, which of course must happen with raw
> >>>>> ::free(), not os::free().
> >>>>>
> >>>>> We fixed this by not using malloc headers. That means a pointer
> >>>>> returned by os::malloc() is compatible with raw ::free() and vice
> >>>>> versa. The only bad thing happening would be our statistic numbers
> >>>>> being slightly off.
> >>>>>
> >>>>> Instead of malloc headers we use a hand-groomed hash table to track
> >>>>> the malloced memory. It is actually quite fast, fast enough that this
> >>>>> malloc statistic feature is on-by-default in our port.
> >>>>>
> >>>>> --
> >>>>>
> >>>>> Of course, if we extend NMT to JDK native code we also would want to
> >>>>> extend it to mmap() etc - we never did this with our statistic, since
> >>>>> it only tracked malloc.
> >>>>>
> >>>>> What do you think? Did anyone else play with similar ideas? Would it
> >>>>> be worth the effort?
> >>>>>
> >>>>> Cheers, Thomas
> >>>>>


More information about the hotspot-dev mailing list