RFR: JDK-8268088: Clarify Method::clear_jmethod_ids() related comments in ClassLoaderData::~ClassLoaderData() [v2]

Tue Jun 8 13:29:17 UTC 2021

On Tue, 8 Jun 2021 04:30:59 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> Hi Ioi, thanks for the thoughts!
>> 
>>> If the issue is specific to JVMTI code only, the comment should reflect that. Currently it seems to suggest that all possible JNI code could use an invalid jmethodID. However, using an invalid jmethodID with JNI APIs such as CallStaticVoidMethod will lead to a crash.
>> 
>> The issue that we encountered recently was with a JVMTI profiling agent. However, we can't conclude from that it is only restricted to JVMTI usages. Since we don't have any evidences to exclude non-JVMTI cases, narrowing down the scope of Coleen's existing comment is not a good idea. Expending the comment with clarifications and example (JVMTI agent) is appropriate at this point.
>> 
>>> I don't think the JNI spec allows invalid jmethodID to be used.
>> 
>> I haven't found the specific language in the JNI spec. The quoted spec language in updated comment does not strongly indicate that.
>> 
>>> 
>>> Furthermore, we should consider freeing the JNIMethodBlock when JVMTI is not enabled, so we can avoid the memory leak.
>>> 
>> 
>> It's not just JVMTI vs non-JVMTI issues. The JNIMethodBlocks are not always created. When application native code creates (directly or indirectly) jmethod_ids, how does VM know when it's safe to release the memory? It faces the same issue as the JVMTI agent case. https://github.com/openjdk/jdk/commit/66ef04af110862a1a6b727f12ad5eed9c26cd5a9 addresses from unloading point view. However, that's unsafe either.
>> 
>>> We may even want to update the JVMTI spec to disallow invalid jmethodID, but will that introduce incompatibility with 3rd party code?
>>> 
>> 
>> If we update the JNI spec to ban such usages and fix all existing third party code (if we can), we can change the VM to release memory safely. That would be welcomed by developers. However, any unfixed native code would still run into risks of crashes.
>> 
>>> Lastly, do we have any existing test cases that cover the use of invalid jmethodIDs?
>> 
>> Unfortunately there's no existing test case. It would help prevent the crash in production if we had a test case that caught the issue early. It's a good to-do item in addition to the comment clarification, before looking into a potential longer term solution for the memory leak.
>
>> The issue that we encountered recently was with a JVMTI profiling agent. However, we can't conclude from that it is only restricted to JVMTI usages. Since we don't have any evidences to exclude non-JVMTI cases, narrowing down the scope of Coleen's existing comment is not a good idea. Expending the comment with clarifications and example (JVMTI agent) is appropriate at this point.
> 
> I have looked at every use of `jmethodID` in jni.cpp. All of them will try to dereference the `Method*` without a NULL check. So any code that tries to invoke JNI calls with a `jmethodID` associated with a deallocated class will crash immediately.
> 
>> > I don't think the JNI spec allows invalid jmethodID to be used.
>> 
>> I haven't found the specific language in the JNI spec. The quoted spec language in updated comment does not strongly indicate that.
> 
> See https://docs.oracle.com/en/java/javase/16/docs/specs/jni/design.html
> 
> "A field or method ID does not prevent the VM from unloading the class from which the ID has been derived. After the class is unloaded, the method or field ID becomes invalid.
> ...
> The programmer must not pass illegal pointers or arguments of the wrong type to JNI functions. Doing so could result in arbitrary consequences, including a corrupted system state or VM crash."
> 
>> > Furthermore, we should consider freeing the JNIMethodBlock when JVMTI is not enabled, so we can avoid the memory leak.
>> 
>> It's not just JVMTI vs non-JVMTI issues. The JNIMethodBlocks are not always created. When application native code creates (directly or indirectly) jmethod_ids, how does VM know when it's safe to release the memory? It faces the same issue as the JVMTI agent case. [66ef04a](https://github.com/openjdk/jdk/commit/66ef04af110862a1a6b727f12ad5eed9c26cd5a9) addresses from unloading point view. However, that's unsafe either.
>> 
>> > We may even want to update the JVMTI spec to disallow invalid jmethodID, but will that introduce incompatibility with 3rd party code?
>> 
>> If we update the JNI spec to ban such usages and fix all existing third party code (if we can), we can change the VM to release memory safely. That would be welcomed by developers. However, any unfixed native code would still run into risks of crashes.
> 
> From above, we can conclude that both the JNI spec and implementation disallow the use of invalid jmethodIDs. No existing code 3rd party JNI code can run with invalid jmethodIDs without crashing.
> 
> It's only JVMTI that has the concept of `JVMTI_ERROR_INVALID_METHODID`.  So the only reason of not freeing the `JNIMethodBlock` is because of JVMTI.
> 
> However, the concept of  `JVMTI_ERROR_INVALID_METHODID` is inherently unsafe. Even the existing JVMTI implementation is buggy because classes can be deallocated concurrently (ZGC does that):
> 
> 
> static jvmtiError JNICALL
> jvmti_GetMethodDeclaringClass(jvmtiEnv* env,
>             jmethodID method,
>             jclass* declaring_class_ptr) {
>   ...
>   Method* checked_method = Method::checked_resolve_jmethod_id(method);
>   if (checked_method == NULL) {
>       return JVMTI_ERROR_INVALID_METHODID;
>   }
>   if (declaring_class_ptr == NULL) {
>       return JVMTI_ERROR_NULL_POINTER;
>   }
> 
>   >>>>>> ZGC concurrently deallocates the class
>  
>   err = jvmti_env->GetMethodDeclaringClass(checked_method, declaring_class_ptr); >>> will crash
> 
> 
> In [JDK-8268088](https://bugs.openjdk.java.net/browse/JDK-8268088), you described a JVMTI-based profiler that "captured stack traces and then symbolized methods using jmethod_ids at a later point during runtime execution, which could happen after unloading the associated classes and loaders."
> 
> This is unsafe and can crash with ZGC even if we don't free the JNIMethodBlocks, because `checked_method` could point to a deallocated `Method`. See more in [JDK-8268364](https://bugs.openjdk.java.net/browse/JDK-8268364).
> 
> To handle this correctly, the profiler should ensure that the classes with the collected stacks cannot be GCed. This can be done by creating a GlobalRef on the classloader associated with the jmethodID:
> 
> - GetMethodDeclaringClass -> GetClassLoader -> NewGlobalRef
> 
> The profiler can release the GlobalRef after it has processed the jmethodID.

I added comments to both of these bugs JDK-8268088 and JDK-8268364.

There are places in the JNI spec that don't specify a "valid method ID" and some places just say "a method ID".  We might be able to fix those.  We will crash for invalid method ID in all the JNI entry points.  There is -Xcheck:jni that does check the method ID though.  Guess I'm repeating my comment above.

As far as the new comment goes, for now, it should include JNI until which time we clarify the spec. I don't see the harm in that.

I was trying to think of a way to check jmethodID with SafeFetch or something to to return JVMTI_ERROR_INVALID_METHODID without running into ABA problems.  Maybe that would work and we can deallocate the jmethodID block.

That said, I never saw the PR  66ef04a and I'd want to review that carefully. My first quick click-through was not positive.  I hid JMethodIDBlocks in method.cpp on purpose!  Plus it seems like a huge change for JDK 11.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4383