RFR: 8347564: ZGC: Crash in DependencyContext::clean_unloading_dependents

Axel Boldt-Christmas aboldtch at openjdk.org
Tue Jan 28 07:26:48 UTC 2025


On Tue, 28 Jan 2025 04:04:39 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

> Good work, Axel. `Cleaner`-based solution to manage nmethod dependencies on `java.lang.invoke.CallSite`s does look redundant. If that's the case, then I'd expect that by the time cleanup action is invoked corresponding dependency list is empty. Is it the case in practice?

For our STW collectors this should always be the case, as the cleaner will run its code after the collection where the CallSite dies, which will in turn have unlinked the nmethods.

For the concurrent collectors, because references processing is done before nmethod unlinking, we might run the cleaner code before the GC gets to the nmethod, and as such observe a none empty list. 

I've run some sanity stress tests locally with G1 and the following patch:

diff --git a/src/hotspot/share/prims/methodHandles.cpp b/src/hotspot/share/prims/methodHandles.cpp
index 97e3eae1a2f..fadcec9bd90 100644
--- a/src/hotspot/share/prims/methodHandles.cpp
+++ b/src/hotspot/share/prims/methodHandles.cpp
@@ -1330,6 +1331,7 @@ JVM_ENTRY(void, MHN_clearCallSiteContext(JNIEnv* env, jobject igcls, jobject con
     NoSafepointVerifier nsv;
     MutexLocker ml(THREAD, CodeCache_lock, Mutex::_no_safepoint_check_flag);
     DependencyContext deps = java_lang_invoke_MethodHandleNatives_CallSiteContext::vmdependencies(context());
+    guarantee(deps.is_empty(), "just checking");
     deps.remove_and_mark_for_deoptimization_all_dependents(&deopt_scope);
     // This is assumed to be an 'atomic' operation by verification.
     // So keep it under lock for now.

But I'll take it a spin through our CI as well. But if this were to occur it would mean we must have either missed adding a dependent CallSite oop to a linked nmethod, or that GC failed to unlink a nmethod with a dead oop. Neither would be very healthy. 

> But the original problem makes me wonder whether `Cleaner`-based approach to manage native resources is broken or not.

_Tangent:
`Cleaner`-based native resource management may be broken but not because of this. (It suffers from a similar problem to finalisers, namely that they are triggered at the GCs discretion, and the GC have no real insight into the cost of a "native resource" so it may never be cleaned. Trying to create scoped lifetimes, with try-with-resource should always be preferred)_

> Dependent instance is used to avoid referent to be artificially kept alive by cleanup action. If it is not safe to access it in order to perform cleanup, how the whole approach is intended to work? Or does anything make 'CallSite` use case special?

`CallSite` is special, because the GC knows about it. The reason being is that the GC must remove all nmethod dependencies when unlinking, otherwise we would have a dangling pointer and an ABA problem. When unlinking the GC runs the following code.

    for (Dependencies::DepStream deps(this); deps.next(); ) {
      if (deps.type() == Dependencies::call_site_target_value) {
        // CallSite dependencies are managed on per-CallSite instance basis.
        oop call_site = deps.argument_oop(0);
        MethodHandles::clean_dependency_context(call_site);
      }

The `call_site` in question here may or may not be dead. But regardless it will have a link to the currently unlinking nmethod, which must be removed. After unlinking the GC will phase the system (by a handshake or by the property of doing all this work in a safepoint). If it was not removed it is a dangling pointer, not only that it is a pointer we are more than likely going to reuse for a new nmethod, which will make them indistinguishable, leading to ABA.

 There is a world where we move this to java. Having some object representing the nmethod. Letting CallSites have a linked list of weak references to these objects, etc. Having thee GC solve the memory reclamation problem.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23194#issuecomment-2618104075


More information about the core-libs-dev mailing list