RFR: JDK-8253001: [JVMCI] Add API for getting stacktraces independently of current thread

Wed Sep 23 19:56:49 UTC 2020

On Wed, 23 Sep 2020 18:54:33 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:

>> Tuning in to provide some background on why Truffle needs this and why we spent a lot of time stabilizing this PR. If
>> we could have gone a different route, we would have.
>>
>> Truffle introduces the separation of guest and host language. By host language we mean the Java host VM. This is
>> either HotSpot (relevant for this PR) or SubstrateVM (Native Image). Guest languages are interpreters implemented on
>> top of Truffle, like JavaScript, Ruby, or Python, but also Espresso, our Java implementation based on Truffle. Truffle
>> uses Graal and JVMCI to compile these guest languages to optimized machine code using a technique called the first
>> Futamura projection. This Graal compilation is limited to JDKs that provide JVMCI APIs.
>>
>> In Truffle we use the notion of guest and host stack frames. Guest stack frames represent a method activation in the
>> guest language and host stack frames represent host Java method activations. A guest stack frame consists of the
>> caller location and guest local variables.
>>
>> Truffle languages need to access guest frames of the current thread to construct a stack trace or to lazily access
>> variables in a parent guest frame. There are two techniques to do this:
>> 1. Have a separate stack data structure on the heap that keeps track of the guest frames for each thread.
>> 2. Walk the live local variables in Java host frames to access the guest frame and caller location.
>>
>> We use technique (1) to implement Truffle guest stack traces on a JVM without JVMCI support. This is pretty simple
>> and allows us to walk the guest stack for any thread we need. But there are downsides:
>> * For each method invocation we have additional overhead for maintaining the extra heap data structure.
>> * The frame object always escapes the current compilation scope and can therefore not be escape analyzed by Graal.
>> 
>> Both of these issues are deal-breakers, performance-wise. With Truffle we want to be competitive with other specialized
>> VMs, so technique (1) is not good enough. JVMCI currently exposes stack walking APIs for Truffle that allow us to
>> access the host frame local variables of the current thread. This allows us to lazily reconstruct the guest frames
>> from certain known and live local variables in the host frames. We also have special logic to reconstruct read-only
>> guest frames from optimized Truffle+Graal compiled methods without the need to invalidate the optimized code.
>>
>> We have been using technique (2) successfully for many years, but now with the growing maturity of Truffle we have
>> new requirements:
>> 1. We need to be able to walk all the root pointers of a guest language. This includes all active guest frames. This
>>    is needed to allow languages to walk all live objects (e.g. Ruby needs that) and to compute the size of the live
>>    objects of a Truffle language.
>> 2. We need to be able to read locals from other threads to produce the guest stack trace of other threads in the
>>    Truffle debugger.
>>
>> This was not a big issue before, because we were mostly dealing with single-threaded languages (JavaScript).
>>
>> The Truffle debugger should not be confused with the Java host debugger. The Truffle debugger works based on the
>> Truffle instrumentation framework and cannot debug Java host code. It only shows guest stack frames and statements
>> and is entirely agnostic to which Java methods were used to implement it and which Java VM it runs on. It is built
>> entirely in Java, without the use of JVMTI. This allows us to debug guest code without having the Java debugger
>> attached. It allows us to enable debugging on demand in a production scenario when it is needed, and only for a guest
>> language instance that needs it, without slowing down other code in the host VM (e.g. in an app server). Truffle
>> debugging works on SubstrateVM (native-image), which currently has no support for JVMTI. Enabling the debugger
>> without using it also comes without any peak performance overhead (only some memory and warmup overhead).
>>
>> To summarize:
>> 1. We cannot use the StackWalker API as it does not allow us to access local variables.
>> 2. We cannot manually allocate extra objects for guest language frames, as this would hurt performance.
>> 3. We cannot use JVMTI because:
>>    3a. We need it to implement language features, not just debugger features.
>>    3b. There is no way to enable it on demand for an individual guest application (we run multiple guest
>>        applications per host VM).
>>    3c. There is no Java API that allows it to be used in the same process.
>>
>> Therefore our best idea was to introduce this new JVMCI API. We are of course open to other suggestions, if they
>> solve our problem. This is also not an entirely new feature; this PR is an extension to the existing JVMCI
>> functionality to walk the stack frames with local variable access.
>>
>> I hope these clarifications were helpful.
>
> java.lang.StackWalker does expose locals as well. What am I missing?

@fisk @coleenp, it appears as though StackWalker can only be used for the current thread. Am I missing some other,
potentially internal, API that extends StackWalker to work on other threads?
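
To make the limitation concrete, here is a minimal sketch using only public `java.lang` APIs. The class and method names
(`StackWalkDemo`, `currentThreadFrames`, `otherThreadFrames`) are made up for illustration and are not part of this PR.
As far as I can tell, `StackWalker.walk` always walks the calling thread, while `Thread.getStackTrace` accepts any thread
but only yields `StackTraceElement`s, which carry no local variables.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.stream.Collectors;

// Hypothetical demo class; not part of JVMCI or of this PR.
final class StackWalkDemo {

    // Walks the frames of the *calling* thread. The public StackWalker API
    // has no parameter for selecting a different thread.
    static List<String> currentThreadFrames() {
        return StackWalker.getInstance()
                .walk(frames -> frames
                        .map(StackWalker.StackFrame::getMethodName)
                        .collect(Collectors.toList()));
    }

    // Works for an arbitrary thread, but returns StackTraceElements only:
    // class, method, file and line information, no local variables.
    static StackTraceElement[] otherThreadFrames(Thread thread) {
        return thread.getStackTrace();
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(1);

        // Park a worker thread so that it has a live stack to inspect.
        Thread worker = new Thread(() -> {
            started.countDown();
            try {
                done.await();
            } catch (InterruptedException ignored) {
                // Exit quietly if interrupted.
            }
        }, "worker");
        worker.start();
        started.await();

        System.out.println("current thread: " + currentThreadFrames());
        System.out.println("worker thread:  " + Arrays.toString(otherThreadFrames(worker)));

        done.countDown();
        worker.join();
    }
}
```

Neither call gives access to the locals of the worker thread's frames, which is the gap I am asking about.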

-------------

PR: https://git.openjdk.java.net/jdk/pull/110