Clarifying jmethodID Usage and Potential JVM Crashes

Jaroslav Bachorík jaroslav.bachorik at datadoghq.com
Tue Jun 20 05:45:13 UTC 2023


On Tue, Jun 20, 2023 at 4:38 AM David Holmes <david.holmes at oracle.com>
wrote:

> On 19/06/2023 11:46 pm, Jaroslav Bachorík wrote:
> > Hi,
> >
> > On Thu, Jun 1, 2023 at 5:49 PM <daniel.daugherty at oracle.com
> > <mailto:daniel.daugherty at oracle.com>> wrote:
> >
> >     Hmmm... seems like concurrent class unloading has revealed a
> situation
> >     in some
> >     code where class refs should have been held, but were not held
> because
> >     it's very
> >     difficult to do, and the code (mostly) got away with it for a long
> >     time...
> >
> >     David's quote from the JNI spec makes things very, very clear:
> >
> >       > "A field or method ID does not prevent the VM from unloading the
> >     class from
> >       > which the ID has been derived. After the class is unloaded, the
> >     method or
> >       > field ID becomes invalid and may not be passed to any function
> >     taking
> >     such
> >       > an ID. The native code, therefore, must make sure to:
> >       >   - keep a live reference to the underlying class, or
> >       >   - recompute the method or field ID
> >       >
> >       > if it intends to use a method or field ID for an extended period
> of
> >     time."
> >
> >     and the JVM/TI spec does depend on the JNI spec for certain pieces
> >     like the
> >     semantics of jmethodID.
> >
> >     I agree with David that fixing this is going to be difficult and a
> major
> >     effort which is why we've all only chipped at this boulder before...
> >
> >
> > I apologize for being a stickler, but it seems that the concurrent class
> > unloading and the ephemeral nature of jmethodIDs could cause problems
> > even for a much less exotic usecase of the JVMTI GetStackTrace
> > (
> https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#GetStackTrace
> <
> https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#GetStackTrace
> >).
> > IIUC, the stacktrace consists of stack frames and each stack frame
> > contains a jmethodID identifying the associated Java method. Further on,
> > obtaining the stacktrace will not create references to corresponding
> > classes so the jmethodIDs can go 'bad' at any given moment, without an
> > official way to make sure such a 'bad' jmethodID would not crash JVM.
>
> Yep all correct. The jvmtiFrameInfo contains the jmethodID as applicable
> when the stacktrace was taken. If the current scope is not ensuring
> liveness then those method ids could be invalidated at any time.
>
> I agree this is a hole in the JVM TI specification, but it is not a hole
> I know how to patch. Additional API are needed here along with changes
> to the validity requirements of jMethodIds.
>

Thanks! Yes, this makes sense. I just wanted to make sure my perception was
correct.

Cheers,

-JB-


>
> Cheers,
> David
> -----
>
> > Cheers,
> >
> > -JB-
> >
> >
> >     Dan
> >
> >
> >     On 6/1/23 4:28 AM, David Holmes wrote:
> >      > On 1/06/2023 5:16 pm, Jaroslav Bachorík wrote:
> >      >> Hi David,
> >      >>
> >      >> On Thu, Jun 1, 2023 at 3:56 AM David Holmes
> >     <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
> >      >> <mailto:david.holmes at oracle.com
> >     <mailto:david.holmes at oracle.com>>> wrote:
> >      >>
> >      >>     Hi Jaroslav,
> >      >>
> >      >>     On 31/05/2023 9:12 pm, Jaroslav Bachorík wrote:
> >      >>      > Dear Team,
> >      >>      >
> >      >>      > I've been investigating the unusual JVM crashes occurring
> in
> >      >>     JVMTI calls
> >      >>      > on a J9 JVM. During my investigation, I scrutinized the
> >      >> `jmethodID`
> >      >>      > definition closely, available here: [jmethodID
> >      >>      >
> >      >>
> >     definition](
> https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> <https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> >
> >      >>
> >     <
> https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> <https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> >>
> >      >>
> >     <
> https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> <https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> >
> >      >>
> >     <
> https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> <https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> >>>).
> >      >>
> >      >>      >
> >      >>      > To paraphrase, the definition suggests that `jmethodID`
> >      >> identifies a
> >      >>      > Java method, initializer, or constructor. These
> >     identifiers, once
> >      >>      > returned by JVM TI functions and events, can be safely
> >     stored.
> >      >>     However,
> >      >>      > when the class is unloaded, they become invalid,
> >     rendering them
> >      >>      > unsuitable for use.
> >      >>      >
> >      >>      > My interpretation is that the JVMTI user should verify the
> >      >>     validity of a
> >      >>      > `jmethodID` value before using it to prevent potential
> >     crashes.
> >      >>     Would
> >      >>      > you agree with this interpretation?
> >      >>
> >      >>     Not quite - as you note you can't verify the jmethodID
> >     validity.
> >      >> What
> >      >>     the user needs to do, in line with what Dan was saying, is
> >     ensure
> >      >> that
> >      >>     they keep track of the classes to which the methods belong
> >     and keep
> >      >>     them
> >      >>     alive if necessary. Now that may be easier said than done,
> but
> >      >> that is
> >      >>     the gist of it. This comes from the JNI spec:
> >      >>
> >      >>     "A field or method ID does not prevent the VM from unloading
> >     the
> >      >> class
> >      >>     from which the ID has been derived. After the class is
> >     unloaded, the
> >      >>     method or field ID becomes invalid and may not be passed to
> any
> >      >>     function
> >      >>     taking such an ID. The native code, therefore, must make
> >     sure to:
> >      >>
> >      >>           keep a live reference to the underlying class, or
> >      >>           recompute the method or field ID
> >      >>
> >      >>     if it intends to use a method or field ID for an extended
> >     period of
> >      >>     time."
> >      >>
> >      >>      > This sounds like a sensible requirement, but its practical
> >      >>     application
> >      >>      > remains unclear. As far as I know, methods can be unloaded
> >      >>     concurrently
> >      >>      > to the native code executing JVMTI functions. This
> >     introduces a
> >      >>      > potential race condition where the JVM unloads the
> >     methods during
> >      >>     the
> >      >>      > check->use flow, making it only a partial solution. To
> >     complicate
> >      >>      > matters further, no method exists to confirm whether a
> >      >>     `jmethodID` is valid.
> >      >>      >
> >      >>      > Theoretically, we could monitor the `CompiledMethodUnload`
> >      >> event to
> >      >>      > track the validity state, creating a constantly expanding
> >     set of
> >      >>      > unloaded `jmethodID` values or a bloom filter, if one
> >     does not
> >      >> care
> >      >>      > about few potential false positives. This strategy,
> however,
> >      >> doesn't
> >      >>      > address the potential race condition, and it could even
> >      >>     exacerbate it
> >      >>      > due to possible event delays. This delay might mistakenly
> >      >> validate a
> >      >>      > `jmethodID` value that has already been unloaded, but for
> >      >> which the
> >      >>      > event hasn't been delivered yet.
> >      >>      >
> >      >>      > Honestly, I don't see a way to use `jmethodID` safely
> >     unless the
> >      >>     code
> >      >>      > using it suspends the entire JVM and doesn't resume until
> >     it's
> >      >>     finished
> >      >>      > with that `jmethodID`. Any other approach might lead to
> JVM
> >      >>     crashes, as
> >      >>      > we've observed with J9.
> >      >>      >
> >      >>      > Lastly, it's noteworthy that Hotspot takes meticulous
> >     measures to
> >      >>     ensure
> >      >>      > that using jmethodIDs for unloaded methods doesn't crash
> the
> >      >> JVM and
> >      >>      > even provides useful information. This observation has
> >     led me to
> >      >>      > question whether the documentation aligns with the Hotspot
> >      >>      > implementation, especially given that following closely
> the
> >      >>      > documentation appears to increase the risk associated
> >     with the
> >      >>     use of
> >      >>      > `jmethodID` values.
> >      >>
> >      >>     There have been folk who wanted to make this area more
> >     user-friendly
> >      >>     but
> >      >>     that shouldn't be mistaken for moving towards a world where
> >      >> jmethodIDs
> >      >>     are always safe to use.
> >      >>
> >      >>
> >      >> Yes, I see your point. Unfortunately, this confirms my worries
> that
> >      >> using AsyncGetCallTrace (ASGCT) on a system strictly adhering to
> >     the
> >      >> JVMTI spec of jmethoID is not really possible without risking
> >     random
> >      >> and quite frequent crashes on systems with concurrent class
> >     unloading
> >      >> enabled.
> >      >> FTR, ASGCT will record the stack trace as a list of frames, each
> >     one
> >      >> containing the corresponding jmethodID value. Considering that
> the
> >      >> most common usage of ASGCT is in a signal handler it makes it
> >      >> impossible to use JVMTI calls to resolve the holder class and
> >     create
> >      >> a strong reference to prevent it from being unloaded.
> >      >> And even if this would be possible we would need to figure out
> when
> >      >> to release the class reference when it is no more needed - and
> >     it is
> >      >> not really clear how we could do that reliably, leaving us with
> the
> >      >> option of holding the class references indefinitely or risking
> >      >> crashing JVM.
> >      >
> >      > I would have to agree you cannot use jmethodID's for that
> >     purpose, not
> >      > without also recording (and keeping-alive) the associated
> >     classes. As
> >      > to when you would clear such references I can't say as I don't
> know
> >      > how the ASGCT stack record would be used.
> >      >
> >      >> I want to emphasize that not being able to resolve additional
> >     details
> >      >> for a jmethodID pointing to a method of an unloaded class is not
> an
> >      >> issue, as long as the JVMTI call does not crash. I think that
> >      >> https://bugs.openjdk.org/browse/JDK-8268364
> >     <https://bugs.openjdk.org/browse/JDK-8268364>
> >      >> <https://bugs.openjdk.org/browse/JDK-8268364
> >     <https://bugs.openjdk.org/browse/JDK-8268364>> did address exactly
> the
> >      >> problem of concurrent class unloading causing races in the code
> >     that
> >      >> is checking for validity of jmethodID and then using it.
> >      >
> >      > Yes but internal to the JVMTI implementation. The basic null check
> >      > that was used was found to be insufficient when used with ZGC and
> so
> >      > we would crash more often than was considered reasonable. There
> is a
> >      > quoted comment in the code in
> >      > https://bugs.openjdk.org/browse/JDK-8268088
> >     <https://bugs.openjdk.org/browse/JDK-8268088>:
> >      >
> >      >   // These aren't deallocated and are going to look like a leak,
> but
> >      > that's
> >      >   // needed because we can't really get rid of jmethodIDs because
> we
> >      > don't
> >      >   // know when native code is going to stop using them. The spec
> >     says
> >      > that
> >      >   // they're "invalid" but existing programs likely rely on their
> >     being
> >      >   // NULL after class unloading.
> >      >
> >      > that kind of sums up the position of trying to accommodate "bad
> >     code"
> >      > in a reasonable way.
> >      >
> >      >> Can this be summarize in a way that the user is not guaranteed
> >     to get
> >      >> any additional information for an invalid jmethodID but it would
> be
> >      >> really nice for JVM not to crash when jmethodID becomes invalid
> as
> >      >> there is no way for the user to check for its validity in an
> atomic
> >      >> manner
> >      >> - and yes, even calling GetMethodDeclaringClass in order to
> obtain
> >      >> the class one could create a strong reference is a subject to
> racy
> >      >> behaviour so it really can not be used as a workaround.
> >      >
> >      > The only real solution IMO would be to encode jmethodIDs in a way
> >     that
> >      > validity can be tracked and then have all JVMTI methods be able to
> >      > check that validity and guarantee atomicity of the method such
> that
> >      > the id can't become invalid whilst in use. Whether that is at all
> >      > feasible/practical I do not know - but it sounds like a major
> effort
> >      > to me.
> >      >
> >      > Cheers,
> >      > David
> >      >
> >      >> Cheers,
> >      >>
> >      >> -JB-
> >      >>
> >      >>
> >      >>     Cheers,
> >      >>     David
> >      >>
> >      >>      > I welcome your thoughts and perspectives on this matter.
> >      >>      >
> >      >>      > Best regards,
> >      >>      >
> >      >>      > Jaroslav
> >      >>
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/serviceability-dev/attachments/20230620/02943b7c/attachment-0001.htm>


More information about the serviceability-dev mailing list