RFR: 8293972: runtime/NMT/NMTInitializationTest.java#default_long-off failed with "Suspiciously long bucket chains in lookup table." [v4]
Thomas Stuefe
stuefe at openjdk.org
Thu Jul 20 08:42:43 UTC 2023
On Thu, 20 Jul 2023 07:22:31 GMT, David Holmes <dholmes at openjdk.org> wrote:
> Can someone remind me what the realloc issue is please. On the surface this looks like a big simplification.
>
> Thanks.
In brief, NMTPreInit is a mechanism that allows us to initialize NMT together with all other VM subsystems after normal argument processing.
Before NMTPreInit, NMT was controlled by the launcher. That was complex and brittle, and it prevented us from using NMT when hotspot was embedded into a different launcher, e.g. IntelliJ, Eclipse, or our gtestlauncher. The latter prevented us from having NMT-related gtests.
NMTPreInit needs to track early-state mallocs and reallocs to handle them correctly in case these blocks are handed into os::realloc or os::free after NMT initialization. Not doing so corrupts the C-heap because of a malloc header mismatch.
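To make the header mismatch concrete, here is a minimal sketch of the general idea, not the actual NMTPreInit code. All names (`preinit_sketch`, `Header`, `preinit_table`, `tracked_*`) are hypothetical. It assumes that blocks handed out before initialization carry no tracking header and are remembered in a lookup table, while blocks handed out afterwards carry a header when tracking is enabled:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <unordered_map>

// Hypothetical sketch; none of these names are HotSpot APIs.
namespace preinit_sketch {

// Stand-in for a tracking header placed in front of every post-init payload.
struct alignas(std::max_align_t) Header { std::size_t size; };

bool initialized = false;  // set once the tracking options have been parsed
bool enabled = false;      // whether post-init blocks carry a Header

// Lookup table of headerless blocks handed out before initialization.
std::unordered_map<void*, std::size_t>& preinit_table() {
  static std::unordered_map<void*, std::size_t> table;
  return table;
}

void initialize(bool enable) { enabled = enable; initialized = true; }

void* tracked_malloc(std::size_t n) {
  if (!initialized) {                       // early state: raw block, remembered
    void* p = std::malloc(n);
    if (p != nullptr) preinit_table()[p] = n;
    return p;
  }
  if (!enabled) return std::malloc(n);
  Header* h = static_cast<Header*>(std::malloc(sizeof(Header) + n));
  if (h == nullptr) return nullptr;
  h->size = n;
  return h + 1;                             // payload starts after the header
}

void tracked_free(void* p) {
  if (p == nullptr) return;
  if (preinit_table().erase(p) > 0 || !enabled) {
    std::free(p);                           // headerless block
    return;
  }
  std::free(static_cast<Header*>(p) - 1);   // step back over the header
}

void* tracked_realloc(void* p, std::size_t n) {
  if (p == nullptr) return tracked_malloc(n);
  auto& table = preinit_table();
  auto it = table.find(p);
  if (it == table.end()) {                  // block was allocated after init
    assert(initialized);
    if (!enabled) return std::realloc(p, n);
    Header* h = static_cast<Header*>(
        std::realloc(static_cast<Header*>(p) - 1, sizeof(Header) + n));
    if (h == nullptr) return nullptr;
    h->size = n;
    return h + 1;
  }
  if (!initialized) {                       // early realloc: keep table in sync
    void* q = std::realloc(p, n);
    if (q != nullptr) { table.erase(it); table[q] = n; }
    return q;
  }
  // Early block reallocated after init: move it into the post-init layout.
  std::size_t old_size = it->second;
  void* q = tracked_malloc(n);
  if (q == nullptr) return nullptr;
  std::memcpy(q, p, old_size < n ? old_size : n);
  table.erase(it);
  std::free(p);
  return q;
}

}  // namespace preinit_sketch
```

Without the table, `tracked_free` and `tracked_realloc` would step back over a header that an early block never had, which is the kind of mismatch described above.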
This PR wants to remove the capability to handle early-state reallocs. It tries to achieve that by moving NMT argument processing to the start of CJVM. The assumption is that no C-heap reallocs happen before that. The problem with that assumption is that we cannot be sure. We have a lot of code that runs from the moment libjvm.so is loaded, long before CJVM is invoked. In the case of static linking (gtestlauncher or whatever Google tries to do), it runs right from the point of initial program load.
This PR bets on no reallocs happening before CJVM. Should our bet go wrong, the C-heap *will* be corrupted if:
- NMT is enabled
- we realloc before NMT initialization and free or realloc after NMT initialization.
-----
We had a similar discussion for https://bugs.openjdk.org/browse/JDK-8301811 and https://bugs.openjdk.org/browse/JDK-8299196 - see my comment there. For some reason, NMTPreInit seems to attract a lot of cleanup efforts.
Dealing with malloc and free looks deceptively simple, so NMTPreInit seems unnecessarily complex. But the issue itself is complex, and NMTPreInit is the simplest possible answer to that complex problem. Dumbing it down further incurs a much greater risk and/or a good deal of inflexibility.
We *could* get rid of it if we agree that no code must run, under any circumstances, before NMT initialization. No code at all, because otherwise the use of C-heap allocation is very difficult to police. And getting it wrong is very costly.
That would mean forbidding global C++ objects and platform-dependent DLL initialization. That, in turn, should be first agreed upon by the hotspot community and then has to actually happen. And I'm not sure it would be the right decision since global C++ objects are a powerful tool, especially for testing and prototyping.
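As an illustration of why this is hard to police, here is a small sketch in which a global C++ object reallocs during dynamic initialization, before any entry point has had a chance to run. Every name in it (`early_sketch`, `vm_init_started`, `counted_realloc`, `Buffer`, `Registry`) is hypothetical and only stands in for the kind of code meant above:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical sketch; names are illustrative, not HotSpot code.
namespace early_sketch {

bool vm_init_started = false;  // would be set at the top of the VM entry point
int early_reallocs = 0;        // reallocs observed before that point

void* counted_realloc(void* p, std::size_t n) {
  if (!vm_init_started) ++early_reallocs;
  return std::realloc(p, n);
}

// A growable buffer, the kind of helper that reallocs as a matter of course.
struct Buffer {
  char* data = nullptr;
  std::size_t capacity = 0;
  void reserve(std::size_t n) {
    if (n <= capacity) return;
    void* p = counted_realloc(data, n);
    assert(p != nullptr);
    data = static_cast<char*>(p);
    capacity = n;
  }
  ~Buffer() { std::free(data); }
};

// A global object: its constructor runs during dynamic initialization,
// i.e. when the library is loaded, or before main() in a static link.
struct Registry {
  Buffer names;
  Registry() { names.reserve(16); names.reserve(64); }
};
Registry g_registry;

}  // namespace early_sketch
```

By the time any entry point could set `vm_init_started`, the constructor of `g_registry` has already gone through `counted_realloc` twice. Nothing at the call site of `reserve` hints that it may run that early.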
My gut feeling is that I don't want to depend on this assumption and that even if we get this right now, an inability to run any form of complex code before CJVM ties us down in ways I don't yet foresee.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/14607#issuecomment-1643515829