Clarifying jmethodID Usage and Potential JVM Crashes
Jaroslav Bachorík
jaroslav.bachorik at datadoghq.com
Tue Jun 20 05:45:13 UTC 2023
On Tue, Jun 20, 2023 at 4:38 AM David Holmes <david.holmes at oracle.com>
wrote:
> On 19/06/2023 11:46 pm, Jaroslav Bachorík wrote:
> > Hi,
> >
> > On Thu, Jun 1, 2023 at 5:49 PM <daniel.daugherty at oracle.com
> > <mailto:daniel.daugherty at oracle.com>> wrote:
> >
> > Hmmm... seems like concurrent class unloading has revealed a
> situation
> > in some
> > code where class refs should have been held, but were not held
> because
> > it's very
> > difficult to do, and the code (mostly) got away with it for a long
> > time...
> >
> > David's quote from the JNI spec makes things very, very clear:
> >
> > > "A field or method ID does not prevent the VM from unloading the
> > class from
> > > which the ID has been derived. After the class is unloaded, the
> > method or
> > > field ID becomes invalid and may not be passed to any function
> > taking
> > such
> > > an ID. The native code, therefore, must make sure to:
> > > - keep a live reference to the underlying class, or
> > > - recompute the method or field ID
> > >
> > > if it intends to use a method or field ID for an extended period
> of
> > time."
> >
> > and the JVM/TI spec does depend on the JNI spec for certain pieces
> > like the
> > semantics of jmethodID.
> >
> > I agree with David that fixing this is going to be difficult and a
> major
> > effort which is why we've all only chipped at this boulder before...
> >
> >
> > I apologize for being a stickler, but it seems that the concurrent class
> > unloading and the ephemeral nature of jmethodIDs could cause problems
> > even for a much less exotic usecase of the JVMTI GetStackTrace
> > (
> https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#GetStackTrace
> <
> https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#GetStackTrace
> >).
> > IIUC, the stacktrace consists of stack frames and each stack frame
> > contains a jmethodID identifying the associated Java method. Further on,
> > obtaining the stacktrace will not create references to corresponding
> > classes so the jmethodIDs can go 'bad' at any given moment, without an
> > official way to make sure such a 'bad' jmethodID would not crash JVM.
>
> Yep all correct. The jvmtiFrameInfo contains the jmethodID as applicable
> when the stacktrace was taken. If the current scope is not ensuring
> liveness then those method ids could be invalidated at any time.
>
> I agree this is a hole in the JVM TI specification, but it is not a hole
> I know how to patch. Additional API are needed here along with changes
> to the validity requirements of jMethodIds.
>
Thanks! Yes, this makes sense. I just wanted to make sure my perception was
correct.
Cheers,
-JB-
>
> Cheers,
> David
> -----
>
> > Cheers,
> >
> > -JB-
> >
> >
> > Dan
> >
> >
> > On 6/1/23 4:28 AM, David Holmes wrote:
> > > On 1/06/2023 5:16 pm, Jaroslav Bachorík wrote:
> > >> Hi David,
> > >>
> > >> On Thu, Jun 1, 2023 at 3:56 AM David Holmes
> > <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
> > >> <mailto:david.holmes at oracle.com
> > <mailto:david.holmes at oracle.com>>> wrote:
> > >>
> > >> Hi Jaroslav,
> > >>
> > >> On 31/05/2023 9:12 pm, Jaroslav Bachorík wrote:
> > >> > Dear Team,
> > >> >
> > >> > I've been investigating the unusual JVM crashes occurring
> in
> > >> JVMTI calls
> > >> > on a J9 JVM. During my investigation, I scrutinized the
> > >> `jmethodID`
> > >> > definition closely, available here: [jmethodID
> > >> >
> > >>
> > definition](
> https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> <https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> >
> > >>
> > <
> https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> <https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> >>
> > >>
> > <
> https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> <https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> >
> > >>
> > <
> https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> <https://docs.oracle.com/en/java/javase/17/docs/specs/jvmti.html#jmethodID
> >>>).
> > >>
> > >> >
> > >> > To paraphrase, the definition suggests that `jmethodID`
> > >> identifies a
> > >> > Java method, initializer, or constructor. These
> > identifiers, once
> > >> > returned by JVM TI functions and events, can be safely
> > stored.
> > >> However,
> > >> > when the class is unloaded, they become invalid,
> > rendering them
> > >> > unsuitable for use.
> > >> >
> > >> > My interpretation is that the JVMTI user should verify the
> > >> validity of a
> > >> > `jmethodID` value before using it to prevent potential
> > crashes.
> > >> Would
> > >> > you agree with this interpretation?
> > >>
> > >> Not quite - as you note you can't verify the jmethodID
> > validity.
> > >> What
> > >> the user needs to do, in line with what Dan was saying, is
> > ensure
> > >> that
> > >> they keep track of the classes to which the methods belong
> > and keep
> > >> them
> > >> alive if necessary. Now that may be easier said than done,
> but
> > >> that is
> > >> the gist of it. This comes from the JNI spec:
> > >>
> > >> "A field or method ID does not prevent the VM from unloading
> > the
> > >> class
> > >> from which the ID has been derived. After the class is
> > unloaded, the
> > >> method or field ID becomes invalid and may not be passed to
> any
> > >> function
> > >> taking such an ID. The native code, therefore, must make
> > sure to:
> > >>
> > >> keep a live reference to the underlying class, or
> > >> recompute the method or field ID
> > >>
> > >> if it intends to use a method or field ID for an extended
> > period of
> > >> time."
> > >>
> > >> > This sounds like a sensible requirement, but its practical
> > >> application
> > >> > remains unclear. As far as I know, methods can be unloaded
> > >> concurrently
> > >> > to the native code executing JVMTI functions. This
> > introduces a
> > >> > potential race condition where the JVM unloads the
> > methods during
> > >> the
> > >> > check->use flow, making it only a partial solution. To
> > complicate
> > >> > matters further, no method exists to confirm whether a
> > >> `jmethodID` is valid.
> > >> >
> > >> > Theoretically, we could monitor the `CompiledMethodUnload`
> > >> event to
> > >> > track the validity state, creating a constantly expanding
> > set of
> > >> > unloaded `jmethodID` values or a bloom filter, if one
> > does not
> > >> care
> > >> > about few potential false positives. This strategy,
> however,
> > >> doesn't
> > >> > address the potential race condition, and it could even
> > >> exacerbate it
> > >> > due to possible event delays. This delay might mistakenly
> > >> validate a
> > >> > `jmethodID` value that has already been unloaded, but for
> > >> which the
> > >> > event hasn't been delivered yet.
> > >> >
> > >> > Honestly, I don't see a way to use `jmethodID` safely
> > unless the
> > >> code
> > >> > using it suspends the entire JVM and doesn't resume until
> > it's
> > >> finished
> > >> > with that `jmethodID`. Any other approach might lead to
> JVM
> > >> crashes, as
> > >> > we've observed with J9.
> > >> >
> > >> > Lastly, it's noteworthy that Hotspot takes meticulous
> > measures to
> > >> ensure
> > >> > that using jmethodIDs for unloaded methods doesn't crash
> the
> > >> JVM and
> > >> > even provides useful information. This observation has
> > led me to
> > >> > question whether the documentation aligns with the Hotspot
> > >> > implementation, especially given that following closely
> the
> > >> > documentation appears to increase the risk associated
> > with the
> > >> use of
> > >> > `jmethodID` values.
> > >>
> > >> There have been folk who wanted to make this area more
> > user-friendly
> > >> but
> > >> that shouldn't be mistaken for moving towards a world where
> > >> jmethodIDs
> > >> are always safe to use.
> > >>
> > >>
> > >> Yes, I see your point. Unfortunately, this confirms my worries
> that
> > >> using AsyncGetCallTrace (ASGCT) on a system strictly adhering to
> > the
> > >> JVMTI spec of jmethoID is not really possible without risking
> > random
> > >> and quite frequent crashes on systems with concurrent class
> > unloading
> > >> enabled.
> > >> FTR, ASGCT will record the stack trace as a list of frames, each
> > one
> > >> containing the corresponding jmethodID value. Considering that
> the
> > >> most common usage of ASGCT is in a signal handler it makes it
> > >> impossible to use JVMTI calls to resolve the holder class and
> > create
> > >> a strong reference to prevent it from being unloaded.
> > >> And even if this would be possible we would need to figure out
> when
> > >> to release the class reference when it is no more needed - and
> > it is
> > >> not really clear how we could do that reliably, leaving us with
> the
> > >> option of holding the class references indefinitely or risking
> > >> crashing JVM.
> > >
> > > I would have to agree you cannot use jmethodID's for that
> > purpose, not
> > > without also recording (and keeping-alive) the associated
> > classes. As
> > > to when you would clear such references I can't say as I don't
> know
> > > how the ASGCT stack record would be used.
> > >
> > >> I want to emphasize that not being able to resolve additional
> > details
> > >> for a jmethodID pointing to a method of an unloaded class is not
> an
> > >> issue, as long as the JVMTI call does not crash. I think that
> > >> https://bugs.openjdk.org/browse/JDK-8268364
> > <https://bugs.openjdk.org/browse/JDK-8268364>
> > >> <https://bugs.openjdk.org/browse/JDK-8268364
> > <https://bugs.openjdk.org/browse/JDK-8268364>> did address exactly
> the
> > >> problem of concurrent class unloading causing races in the code
> > that
> > >> is checking for validity of jmethodID and then using it.
> > >
> > > Yes but internal to the JVMTI implementation. The basic null check
> > > that was used was found to be insufficient when used with ZGC and
> so
> > > we would crash more often than was considered reasonable. There
> is a
> > > quoted comment in the code in
> > > https://bugs.openjdk.org/browse/JDK-8268088
> > <https://bugs.openjdk.org/browse/JDK-8268088>:
> > >
> > > // These aren't deallocated and are going to look like a leak,
> but
> > > that's
> > > // needed because we can't really get rid of jmethodIDs because
> we
> > > don't
> > > // know when native code is going to stop using them. The spec
> > says
> > > that
> > > // they're "invalid" but existing programs likely rely on their
> > being
> > > // NULL after class unloading.
> > >
> > > that kind of sums up the position of trying to accommodate "bad
> > code"
> > > in a reasonable way.
> > >
> > >> Can this be summarize in a way that the user is not guaranteed
> > to get
> > >> any additional information for an invalid jmethodID but it would
> be
> > >> really nice for JVM not to crash when jmethodID becomes invalid
> as
> > >> there is no way for the user to check for its validity in an
> atomic
> > >> manner
> > >> - and yes, even calling GetMethodDeclaringClass in order to
> obtain
> > >> the class one could create a strong reference is a subject to
> racy
> > >> behaviour so it really can not be used as a workaround.
> > >
> > > The only real solution IMO would be to encode jmethodIDs in a way
> > that
> > > validity can be tracked and then have all JVMTI methods be able to
> > > check that validity and guarantee atomicity of the method such
> that
> > > the id can't become invalid whilst in use. Whether that is at all
> > > feasible/practical I do not know - but it sounds like a major
> effort
> > > to me.
> > >
> > > Cheers,
> > > David
> > >
> > >> Cheers,
> > >>
> > >> -JB-
> > >>
> > >>
> > >> Cheers,
> > >> David
> > >>
> > >> > I welcome your thoughts and perspectives on this matter.
> > >> >
> > >> > Best regards,
> > >> >
> > >> > Jaroslav
> > >>
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/serviceability-dev/attachments/20230620/02943b7c/attachment-0001.htm>
More information about the serviceability-dev
mailing list