RFR: 8255452: Doing GC during JVMTI MethodExit event posting breaks return oop

Dean Long dlong at openjdk.java.net
Fri Oct 30 08:51:47 UTC 2020


On Thu, 29 Oct 2020 12:44:58 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:

> The imasm::remove_activation() call does not deal with safepoints very well. However, when the MethodExit JVMTI event is being called, we call into the runtime in the middle of remove_activation(). If the value being returned is an object type, then the top-of-stack contains the oop. However, the GC does not traverse said oop in any oop map, because it is simply not expected that we safepoint in the middle of remove_activation().
> 
> The JvmtiExport::post_method_exit() function we end up calling, reads the top-of-stack oop, and puts it in a handle. Then it calls JVMTI callbacks, that eventually call Java and a bunch of stuff that safepoints. So after the JVMTI callback, we can expect the top-of-stack oop to be broken. Unfortunately, when we continue, we therefore end up returning a broken oop.
> 
> Notably, the fact that InterpreterRuntime::post_method_exit is a JRT_ENTRY, is wrong, as we can safepoint on the way back to Java, which will break the return oop in a similar way. So this patch makes it a JRT_BLOCK_ENTRY, moving the transition to VM and back, into a block of code that is protected against GC. Before the JRT_BLOCK is called, we stash away the return oop, and after the JRT_BLOCK_END, we restore the top-of-stack oop. In the path when InterpreterRuntime::post_method_exit is called when throwing an exception, we don't have the same problem of retaining an oop result, and hence the JRT_BLOCK/JRT_BLOCK_END section is not performed in this case; the logic is the same as before for this path. 
> 
> This is a JVMTI bug that has probably been around for a long time. It crashes with all GCs, but was discovered recently after concurrent stack processing, as StefanK has been running better GC stressing code in JVMTI, and the bug reproduced more easily with concurrent stack processing, as the timings were a bit different. The following reproducer failed pretty much 100% of the time:
> while true; do make test JTREG="RETAIN=all" TEST=test/hotspot/jtreg/vmTestbase/nsk/jdi/MethodExitEvent/returnValue/returnValue003/returnValue003.java TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews -Xint" ; done 
> 
> With my fix I can run this repeatedly without any more failures. I have also sanity checked the patch by running tier 1-5, so that it does not introduces any new issues on its own. I have also used Stefan's nice external GC stressing with jcmd technique that was used to trigger crashes with other GCs, to make sure said crashes no longer reproduce either.

Marked as reviewed by dlong (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/930


More information about the serviceability-dev mailing list