RFR: 8342504: Remove NMT header and footer canaries [v2]

Johan Sjölen jsjolen at openjdk.org
Fri Feb 28 09:49:57 UTC 2025


On Fri, 28 Feb 2025 09:29:23 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> > > As I said in preparation of this work, I don't oppose it, but I am not happy.
> > > ASAN is not a replacement. ASAN is a special build, slow, needs tons of additional memory, stops at the first (often false) positive, and is often bitrotted since, to my knowledge, no vendor builds ASAN-enabled JVMs regularly. More importantly, if you have a problem in the field, it is easy to convince a customer to switch on NMT. You will not convince a customer to swap their production JVMs for ASAN-enabled ones.
> > > But okay, let's remove it. I hope the capabilities this will enable are worth the loss of this capability.
> > 
> > 
> > Do you have any recent examples of NMT canaries helping out "in the field"? I do feel a little uneasy about losing this feature, assuming it still helps out in the field.
> 
> Thinking about this, I actually rely on this feature a lot more often:
> 
> A) Customer: "Look at my hs-err file"
> 
> B) Me: (sees crashes somewhere in libc, or something that looks like corrupted C-heap) "Switch on NMT please"
> 
> C) Customer: "ok" - does it, no change.
> 
> D) Now I know it is not a simple double free or overwrite of memory allocated by HotSpot or by direct byte buffers. I can exclude, for now, _us_ (the VM vendor) as a culprit and take a closer look at whatever third-party JNI libraries are running in the process. In the end, it may still turn out we broke something, but for now a misuse of C-heap from non-JDK code is more likely. So, NMT is a useful first-responder tool in these cases.
> 
> This is not that rare. And this feature will be a lot more useful once we have full-process - or at least, with Johan's plans - full JDK integration of NMT. Because then, the surface of instrumented code is larger.
> 
> If, at step (B), I had asked the customer: "Can you please swap the JVM for one I give you that has ASAN enabled", that would run into tons of problems: they may not be in a position to switch out binaries (maybe deployed out of reach somewhere), they may not be allowed to, an ASAN build may be too slow, and ASAN crashes may be unacceptable in production (and these crashes will be much more likely than NMT finding some problem).
> 
> For that reason, I propose to leave the footer canary as it is. I cannot see what purpose removing it serves (other than "oh it is simpler now"), and arguably it's the more important of the two canaries for catching end-of-buffer overruns. Let's shrink the leading canary to one byte, ideally placed right in front of the user payload.

This sounds like a good middle-ground.
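
For readers who want the proposed layout spelled out, here is a minimal sketch of a guarded allocation with a one-byte leading canary placed directly in front of the payload and a two-byte footer canary behind it. This is not NMT's actual MallocHeader code: the names (guarded_malloc, check_block, guarded_free), the canary values, and the 16-byte header size are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical layout (NOT HotSpot's real MallocHeader):
//   [ size_t size | padding | 1-byte canary ][ payload ][ 2-byte footer canary ]
namespace sketch {

constexpr std::uint8_t kHeaderCanary = 0xE9;             // assumed value
constexpr std::uint8_t kFooterCanary[2] = {0xE8, 0x8E};  // assumed value
constexpr std::size_t kHeaderSize = 16;  // assumed; keeps the payload 16-byte aligned
static_assert(kHeaderSize >= sizeof(std::size_t) + 1, "header too small");

enum class Check { Ok, HeaderCorrupt, FooterCorrupt };

// Allocates size bytes of payload surrounded by the header and footer.
inline void* guarded_malloc(std::size_t size) {
  auto* raw = static_cast<std::uint8_t*>(
      std::malloc(kHeaderSize + size + sizeof(kFooterCanary)));
  if (raw == nullptr) return nullptr;
  std::memcpy(raw, &size, sizeof(size));   // remember the payload size
  raw[kHeaderSize - 1] = kHeaderCanary;    // last header byte, right before payload
  std::uint8_t* payload = raw + kHeaderSize;
  std::memcpy(payload + size, kFooterCanary, sizeof(kFooterCanary));
  return payload;
}

// Verifies both canaries. The header is checked first because the footer
// position is derived from the size stored in the header.
inline Check check_block(const void* p) {
  assert(p != nullptr);
  const auto* payload = static_cast<const std::uint8_t*>(p);
  const std::uint8_t* raw = payload - kHeaderSize;
  if (raw[kHeaderSize - 1] != kHeaderCanary) return Check::HeaderCorrupt;
  std::size_t size;
  std::memcpy(&size, raw, sizeof(size));
  if (std::memcmp(payload + size, kFooterCanary, sizeof(kFooterCanary)) != 0) {
    return Check::FooterCorrupt;
  }
  return Check::Ok;
}

// Checks the block, then releases it; returns what the check found.
inline Check guarded_free(void* p) {
  Check result = check_block(p);
  std::free(static_cast<std::uint8_t*>(p) - kHeaderSize);
  return result;
}

}  // namespace sketch
```

In this sketch, a one-byte write past the end of the payload is reported as FooterCorrupt at free time, and a one-byte write just before the payload is reported as HeaderCorrupt. That is the kind of cheap first-responder signal discussed above, without needing a special build.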

-------------

PR Comment: https://git.openjdk.org/jdk/pull/21560#issuecomment-2690200172

