RFR: JDK-8253001: [JVMCI] Add API for getting stacktraces independently of current thread

Thu Sep 24 06:41:43 UTC 2020

On Wed, 23 Sep 2020 22:06:33 GMT, John R Rose <jrose at openjdk.org> wrote:

>> I'll let @chumer answer that question.
>> 
>> However, I have another one of my own. As far as I can see, the only use of `java.lang.LiveStackFrame` and
>> `java.lang.LiveStackFrameInfo` in the JDK code base are in the
>> [LocalsAndOperands](https://github.com/openjdk/jdk/blob/6bab0f539fba8fb441697846347597b4a0ade428/test/jdk/java/lang/StackWalker/LocalsAndOperands.java)
>> test where they are used via reflection. Do you know if there are/were plans to make these classes public?
>
> The special case we support in the `StackWalker` API is intentionally limited, because a thread examining its own stack
> is the least risky and most performant scenario.  The `StackWalker::walk` API point, in particular, is carefully
> designed so that its internal implementation can internally use unsafe "dangling" pointers from the thread into its own
> stack.  This reduces copying and buffering, which is obviously the least expensive way to "take a quick peek" at what's
> on the stack.  It is reasonable to ask to extend such functionality to a second, uncooperative thread, but this brings
> in lots of extra baggage:
> - How does the requesting thread get permission to look inside the target thread?  (New security analysis.)
> - At what point does the target thread get its state taken as a snapshot?  Any random moment?
> - How is the target thread "held still" while it is being sampled?  (And then, "Where is this term 'safepoint' defined in
>   the JVM specifications?")
> - Can a target thread refuse or defer the request, to defend some particular encapsulation?
> - How is that state stored, and what are the time and space costs for such storage?
> - What happens if the requesting thread just wants to look at a few bits?  Do we still buffer up a whole backtrace?
> - Or, is the target thread required to execute callbacks provided by the requesting thread, with a temporary view, and if
>   so, what limits are there on such callbacks?
> - Can the observation process ever cause the target thread to fail, or will any and all failures (OOME, SOE, etc.) be
>   attributed to the requesting thread?
> - What happens if the requesting thread makes two requests in a row:  Are there any guarantees about relations between
>   the two sets of results?  (To be fair, this is also an issue with the self-walking case.)
> - What happens if the requesting thread asks to change a value in a frame or pop or re-invoke or replace a frame?  (Not
>   allowed in the self-walking case either, but a plausible extension.)
> 
> If only "just adding a thread parameter" were a straightforward extension…  Instead, we have serious user model issues
> (see above), and serious implementation issues (see the PR).
> I think we could perhaps add cross-thread access to the current `StackWalker` API, if we came up with answers to the
> above.  I think, in order to engineer it correctly, we would want to factor it as the composition of a self-walking
> request, *plus* a cross-call mechanism which would allow one thread to ask another thread to run a function.  Jumbling
> these complex operations together into a big pile of new code would be the wrong way to do it.  The self-walking API is
> pretty well understood, and there is a good literature on cross-call mechanisms too.  Let's break the problem up.
>  BTW, the current `StackWalker` API could certainly accept minor extensions to inspect locals, and/or to perform frame
>  replacement, as hinted above.  The JVM currently benefits from performing on-stack replacement when it can tell that a
>  slow loop is worth (re-)optimizing as a fast loop.  There's no reason the JDK libraries (say, the streams runtime, in
>  particular) shouldn't have a shot at doing something similar.  That would require internal JDK hooks to self-inspect and
>  replace loops with improved "customizations", on the fly.
> 
> All of the above comments apply only to what might be called the self-inspecting, self-reflective, or "introspective"
> modes of stack walking.  Debuggers usually don't do this (except in one-world environments like Lisp and SmallTalk),
> but rather operate from the side, through a privileged channel "under the virtual metal" like JVMTI.  I suppose for
> those use cases, JVMTI is plenty good.  If there is some trick for self-attachment (either direct or through a
> conspirator process), then some introspection is also possible, via JVMTI.  For best performance, a more "one world"
> implementation is desirable, but this implies that we create a whole category of "debugging/monitoring code".  Such
> debugging/monitoring code would (like today's runtime internals like those that use `Unsafe`) have privileges beyond
> regular application code.  It might also have eBPF-like limitations on resource usage, so that its executions could be
> hidden "under the metal" of regular executions.  IMO these are promising ideas.  They might help us define better,
> more cooperative debugging/monitoring primitives.  I raise the ideas here because I think there may be a root issue
> here:  How can we use the JDK's on-line introspection APIs for more purposes?  How can we inject privileged monitoring
> code into Java executions?  Adding yet another stack walking mechanism to the JVM seems to me like an inefficient way
> to move, a little bit, in the direction of cooperative debugging/monitoring facilities in the JDK.  Conversely, if we
> can create a way to do (privileged) cross-calls, then we won't need yet another stack walking mechanism.  I guess this
> is where I end up:  Please consider refactoring this into an extension (if any is needed) to the self-inspection API
> (`StackWalker`) and something like a cross-call API.  Then we should consider hooking it up to JVMCI.
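
The factoring suggested in the quoted text (a self-walking request plus a cross-call mechanism) can be illustrated with a minimal cooperative sketch. `CrossCallTarget`, `call`, `poll`, and `workLoop` are hypothetical names invented for this illustration and are not JDK or JVMCI APIs. The sketch assumes the target thread polls for requests voluntarily, which sidesteps the questions about permissions, safepoints, and refusal rather than answering them.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class CrossCallSketch {
    // Hypothetical cooperative cross-call channel; not part of the JDK.
    static final class CrossCallTarget {
        private final BlockingQueue<Runnable> requests = new LinkedBlockingQueue<>();

        // Requesting thread: ask the target thread to run fn on its own stack.
        <T> CompletableFuture<T> call(Supplier<T> fn) {
            CompletableFuture<T> result = new CompletableFuture<>();
            requests.add(() -> {
                try {
                    result.complete(fn.get());
                } catch (Throwable t) {
                    // Failures are reported back to the requester.
                    result.completeExceptionally(t);
                }
            });
            return result;
        }

        // Target thread: run pending requests at a point where it agrees to be inspected.
        void poll() {
            Runnable request;
            while ((request = requests.poll()) != null) {
                request.run();
            }
        }
    }

    // The self-walking half: an ordinary StackWalker walk of the current thread.
    static List<String> selfWalk() {
        return StackWalker.getInstance().walk(frames ->
                frames.map(StackWalker.StackFrame::getMethodName)
                      .collect(Collectors.toList()));
    }

    // Stand-in for the target thread's real work, with a cooperative poll point.
    static void workLoop(CrossCallTarget target, AtomicBoolean stop) {
        while (!stop.get()) {
            target.poll();
            Thread.onSpinWait();
        }
    }

    // Requesting side: start a worker, cross-call selfWalk into it, return its frames.
    static List<String> sampleWorker() throws Exception {
        CrossCallTarget target = new CrossCallTarget();
        AtomicBoolean stop = new AtomicBoolean();
        Thread worker = new Thread(() -> workLoop(target, stop), "worker");
        worker.start();
        try {
            return target.call(CrossCallSketch::selfWalk).get(5, TimeUnit.SECONDS);
        } finally {
            stop.set(true);
            worker.join();
        }
    }
}
```

The walk itself runs entirely on the worker thread, so the returned method names include `workLoop` and `poll`; only the collected result crosses threads.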

John, there is a lot to be said here in the solution domain. But before we get there, I want to get answers about the
problem domain, so I know whether we are solving a real or an imaginary problem. The crucial question boils down to this:
"Is remote thread stack sampling with locals needed in the non-debugger case?" If so, we can start discussing the solution
domain of that. But I suspect we already have all the APIs in place that are needed.
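
As a point of comparison for the question above, here is a minimal sketch of one existing cross-thread facility, `Thread.getStackTrace()`. It returns only `StackTraceElement`s (class, method, line) and no locals. Whether it is among the APIs meant above is an assumption; the class and helper names (`RemoteTraceSketch`, `parkUntilReleased`) are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;

class RemoteTraceSketch {
    // Sample another thread's stack from the calling thread.
    static StackTraceElement[] sample() throws InterruptedException {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread target = new Thread(() -> parkUntilReleased(started, release), "target");
        target.start();
        started.await(); // the target is now inside parkUntilReleased
        try {
            // Frames only: no local variables or operand stack values.
            return target.getStackTrace();
        } finally {
            release.countDown();
            target.join();
        }
    }

    // Hypothetical stand-in for work the target thread is doing when sampled.
    static void parkUntilReleased(CountDownLatch started, CountDownLatch release) {
        started.countDown();
        try {
            release.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```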

-------------

PR: https://git.openjdk.java.net/jdk/pull/110