RFR: JDK-8253001: [JVMCI] Add API for getting stacktraces independently of current thread

Wed Sep 23 22:11:32 UTC 2020

On Wed, 23 Sep 2020 20:23:18 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> You are right; the java.lang.StackWalker does not have a thread parameter. If one is needed, I imagine we can add one
>> (using a handshake). However, I was under the impression that only the debugger case needed this for other remote
>> threads, in which case JVMTI seems like the natural solution. So yeah, is the non-debug case in need of remote stack
>> traces with locals?
>
> I'll let @chumer answer that question.
> 
> However, I have another one of my own. As far as I can see, the only use of `java.lang.LiveStackFrame` and
> `java.lang.LiveStackFrameInfo` in the JDK code base are in the
> [LocalsAndOperands](https://github.com/openjdk/jdk/blob/6bab0f539fba8fb441697846347597b4a0ade428/test/jdk/java/lang/StackWalker/LocalsAndOperands.java)
> test where they are used via reflection. Do you know if there are/were plans to make these classes public?

The special case we support in the `StackWalker` API is intentionally limited, because a thread examining its own stack
is the least risky and most performant scenario.  The `StackWalker::walk` API point, in particular, is carefully
designed so that its implementation can use unsafe "dangling" pointers from the thread into its own
stack.  This reduces copying and buffering, which is obviously the least expensive way to "take a quick peek" at what's
on the stack.
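
For readers less familiar with the self-walking case, here is a minimal sketch of it using only the public
`java.lang.StackWalker` API.  The class and method names (`SelfWalkDemo`, `topFrames`, `inner`, `outer`) are made up
for illustration.  The point to notice is that the frame stream is only valid inside the function passed to `walk`,
which is what lets the implementation avoid copying the whole stack up front.

```java
import java.util.List;
import java.util.stream.Collectors;

public class SelfWalkDemo {
    // Returns the method names of the top `limit` frames of the calling thread's own stack.
    // The stream of frames must be consumed inside the lambda; it is closed when walk() returns.
    static List<String> topFrames(int limit) {
        return StackWalker.getInstance().walk(frames ->
                frames.limit(limit)
                      .map(StackWalker.StackFrame::getMethodName)
                      .collect(Collectors.toList()));
    }

    // Two hypothetical callers, present only so that the stack has something to show.
    static List<String> inner() { return topFrames(3); }
    static List<String> outer() { return inner(); }

    public static void main(String[] args) {
        System.out.println(outer());
    }
}
```

Because of the `limit(3)`, a caller that "just wants to look at a few bits" only pays for the frames it actually pulls
from the stream.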

It is reasonable to ask to extend such functionality to a second, uncooperative thread, but this brings in lots of
extra baggage:

- How does the requesting thread get permission to look inside the target thread?  (New security analysis.)
- At what point does the target thread get its state taken as a snapshot?  Any random moment?
- How is the target thread "held still" while it is being sampled?  (And then, "Where is this term 'safepoint' defined in
  the JVM specifications?")
- Can a target thread refuse or defer the request, to defend some particular encapsulation?
- How is that state stored, and what are the time and space costs for such storage?
- What happens if the requesting thread just wants to look at a few bits?  Do we still buffer up a whole backtrace?
- Or, is the target thread required to execute callbacks provided by the requesting thread, with a temporary view, and if
  so, what limits are there on such callbacks?
- Can the observation process ever cause the target thread to fail, or will any and all failures (OOME, SOE, etc.) be
  attributed to the requesting thread?
- What happens if the requesting thread makes two requests in a row:  Are there any guarantees about relations between
  the two sets of results?  (To be fair, this is also an issue with the self-walking case.)
- What happens if the requesting thread asks to change a value in a frame or pop or re-invoke or replace a frame?  (Not
  allowed in the self-walking case either, but a plausible extension.)

If only "just adding a thread parameter" were a straightforward extension…  Instead, we have serious user model issues
(see above), and serious implementation issues (see the PR).

I think we could perhaps add cross-thread access to the current `StackWalker` API, if we came up with answers to the
above.  I think, in order to engineer it correctly, we would want to factor it as the composition of a self-walking
request, *plus* a cross-call mechanism which would allow one thread to ask another thread to run a function.  Jumbling
these complex operations together into a big pile of new code would be the wrong way to do it.  The self-walking API is
pretty well understood, and there is a good literature on cross-call mechanisms too.  Let's break the problem up.
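
To make the proposed factoring concrete, here is a rough sketch of "self-walk plus cross-call" in plain Java.  It is
not a proposal for the actual mechanism: the names (`CrossCallDemo`, `crossCall`, `targetLoop`, `selfWalk`) are
hypothetical, and the target thread here cooperates by polling a queue, whereas a real implementation would presumably
use something like a VM-level handshake.  The sketch only shows that once one thread can ask another to run a
function, the stack walk itself is the ordinary self-walking request.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Supplier;
import java.util.stream.Collectors;

public class CrossCallDemo {
    // Hypothetical mailbox of requests that the target thread runs at its own poll points.
    private final BlockingQueue<FutureTask<?>> mailbox = new LinkedBlockingQueue<>();

    // Requester side: ask the target thread to run fn, then block until the result is ready.
    // A failure inside fn surfaces here as an ExecutionException, not in the target thread.
    public <T> T crossCall(Supplier<T> fn) throws Exception {
        FutureTask<T> task = new FutureTask<>(fn::get);
        mailbox.put(task);
        return task.get();
    }

    // Target side: a stand-in for the thread's real work loop, servicing one request at a time.
    void targetLoop() {
        try {
            while (true) {
                mailbox.take().run();
            }
        } catch (InterruptedException e) {
            // Interrupted by the requester: leave the loop and let the thread end.
        }
    }

    // The ordinary self-walking request; it inspects whichever thread happens to execute it.
    static List<String> selfWalk() {
        return StackWalker.getInstance().walk(frames ->
                frames.map(StackWalker.StackFrame::getMethodName)
                      .collect(Collectors.toList()));
    }

    // Starts a target thread, cross-calls the self-walk into it, and returns the target's frames.
    public static List<String> demo() throws Exception {
        CrossCallDemo d = new CrossCallDemo();
        Thread target = new Thread(d::targetLoop, "target");
        target.setDaemon(true);
        target.start();
        try {
            return d.crossCall(CrossCallDemo::selfWalk);
        } finally {
            target.interrupt();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

The returned list includes `targetLoop`, so the frames really are those of the target thread even though the
requester initiated the walk.  In this toy version the target chooses when it is observed (at `mailbox.take()`), and
the result is copied out as a list, which are two of the design questions raised above.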

BTW, the current `StackWalker` API could certainly accept minor extensions to inspect locals, and/or to perform frame
replacement, as hinted above.  The JVM currently benefits from performing on-stack replacement when it can tell that a
slow loop is worth (re-)optimizing as a fast loop.  There's no reason the JDK libraries (say, the streams runtime, in
particular) shouldn't have a shot at doing something similar.  That would require internal JDK hooks to self-inspect
and replace loops with improved "customizations", on the fly.

All of the above comments apply only to what might be called the self-inspecting, self-reflective, or "introspective"
modes of stack walking.  Debuggers usually don't do this (except in one-world environments like Lisp and Smalltalk),
but rather operate from the side, through a privileged channel "under the virtual metal" like JVMTI.  I suppose for
those use cases, JVMTI is plenty good.  If there is some trick for self-attachment (either direct or through a
conspirator process), then some introspection is also possible, via JVMTI.

For best performance, a more "one world" implementation is desirable, but this implies that we create a whole category
of "debugging/monitoring code".  Such debugging/monitoring code would (like today's runtime internals, such as those
that use `Unsafe`) have privileges beyond regular application code.  It might also have eBPF-like limitations on
resource usage, so that its executions could be hidden "under the metal" of regular executions.  IMO these are
promising ideas.  They might help us define better, more cooperative debugging/monitoring primitives.  I raise the
ideas here because I
think there may be a root issue here:  How can we use the JDK's on-line introspection APIs for more purposes?  How can
we inject privileged monitoring code into Java executions?

Adding yet another stack walking mechanism to the JVM seems to me like an inefficient way to move, a little bit, in the
direction of cooperative debugging/monitoring facilities in the JDK.  Conversely, if we can create a way to do
(privileged) cross-calls, then we won't need yet another stack walking mechanism.

I guess this is where I end up:  Please consider refactoring this into an extension (if any is needed) to the
self-inspection API (`StackWalker`) and something like a cross-call API.  Then we should consider hooking it up to JVMCI.

-------------

PR: https://git.openjdk.java.net/jdk/pull/110