More useful structured concurrency stack traces

Tue Jul 9 19:10:36 UTC 2024

Probably best to bring this to loom-dev, as there has been some 
exploration into this, but we decided not to expose any APIs at this time.

-Alan

On 09/07/2024 19:50, Louis Wasserman wrote:
> My understanding of the structured concurrency APIs now in preview is 
> that when a subtask is forked, exceptions thrown in that subtask 
> will have stack traces going up to the beginning of that subtask, not 
> e.g. up the structured concurrency task tree.  (My tests suggest this 
> is the case for simple virtual threads without structured 
> concurrency.)  Most concurrency frameworks on the JVM that I’ve 
> encountered share the property that stack traces for exceptions don’t 
> trace through the entire causal chain – and, not unrelatedly, that 
> developers struggle to debug concurrent applications, especially with 
> stack traces from production and not full debuggers attached.
>
> In some cases, like chained CompletableFutures, this seems necessary 
> to ensure that executing what amounts to a loop does not result in 
> stack traces that grow linearly with the number of chained futures.  
> But when structured concurrency is involved, it seems more plausible 
> to me that the most useful possible stack traces would go up the tree 
> of tasks – that is, whenever a task was forked, the stack trace would 
> look roughly as if it were a normal/sequential/direct invocation of 
> the task.  This could conceivably cause stack overflows where they 
> didn’t happen before, but only for code that violates the expectations 
> we have around normal sequential code: you can’t recurse unboundedly; 
> use iteration instead.
>
> I’m curious if there are ways we could make the upcoming structured 
> concurrency APIs give those stack traces all the way up the tree, or 
> provide hooks to enable you to do that yourself.  Last year’s JVMLS 
> talk on Continuations Under the Covers demonstrated how stacks were 
> redesigned in ways that frequently and efficiently snapshot the stack 
> itself – not just the trace, but the thing that includes all the 
> variables in use.  There’s a linked list of StackChunks, and all but 
> maybe the top of the stack has those elements frozen, etc, and the top 
> of the stack gets frozen when the thread is yielded.  Without 
> certainty about how stack traces are managed in the JVM today, I would 
> imagine you could possibly do something similar – you’d add a way to 
> cheaply snapshot a reference to the current stack trace that can be 
> traversed later.  If you’re willing to hold on to all the references 
> currently on the stack – which might be acceptable for the structured 
> concurrency case in particular, where you might be able to assume 
> you’ll return to the parent task and its stack at some point – you 
> might be able to do this by simply wrapping the existing StackChunks.  
> Then, each `fork` or `StructuredTaskScope` creation might snapshot the 
> current call stack, and you’d stitch together the stack traces 
> later…somewhere.  That part is a little more open ended: would you add 
> a new variant of `fillInStackTrace`?  Would it only apply to 
> exceptions that bubbled up to the task scope?  Or would we be adding 
> new semantics to what happens when you throw an exception or walk the 
> stack in general?  The most plausible vision I have at this point is 
> an API that spawns a virtual thread which receives a stack trace of 
> some sort – or perhaps snapshots the current stack trace – and 
> prepends that trace to all stack traces within the virtual thread’s 
> execution.
>
> I suppose this is doable today if you’re willing to pay the 
> performance cost of explicitly getting the current stack trace every 
> time you fork a task or start a scope.  That is kind of antithetical 
> to the point of virtual threads – making forking tasks very efficient 
> – but it’s something you might be willing to turn on during testing.
>
> Right now, my inspiration for this question is attempting to improve 
> the stack trace situation with Kotlin coroutines, where Google 
> production apps have complained about the difficulty of debugging with 
> the current stack traces.  But this is something I'd expect to apply 
> equally well to all JVM languages: the ability to snapshot and string 
> together stack trace causal chains like this in production could 
> significantly improve the experience of debugging concurrent code.
>
> -- 
> Louis Wasserman
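
For illustration, the "doable today" approach described above (explicitly capturing the current stack trace at every fork and stitching it onto failures) might look roughly like the sketch below. This is only a sketch under assumptions: `withForkSite`, `runAndCatch` and `mentions` are hypothetical names, not part of any JDK or structured concurrency API, and a plain single-thread executor stands in for a virtual-thread executor or `StructuredTaskScope` so that the example does not depend on preview features.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class StitchedTraces {

    /** Hypothetical helper: wraps a task so its failures also carry the forking thread's frames. */
    static <T> Callable<T> withForkSite(Callable<T> task) {
        // Captured eagerly on the forking thread; this is the per-fork cost discussed above.
        StackTraceElement[] forkSite = new Throwable().getStackTrace();
        return () -> {
            try {
                return task.call();
            } catch (Exception e) {
                // Append the parent's frames below the subtask's own frames.
                StackTraceElement[] own = e.getStackTrace();
                StackTraceElement[] stitched = Arrays.copyOf(own, own.length + forkSite.length);
                System.arraycopy(forkSite, 0, stitched, own.length, forkSite.length);
                e.setStackTrace(stitched);
                throw e;
            }
        };
    }

    static int failingSubtask() {
        throw new IllegalStateException("subtask failed");
    }

    /** Runs the failing subtask on another thread and returns the exception it threw. */
    static Throwable runAndCatch(boolean stitch) throws InterruptedException {
        // Assumption: on JDK 21+ a virtual-thread executor could be used here instead.
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<Integer> task = StitchedTraces::failingSubtask;
            Future<Integer> future = executor.submit(stitch ? withForkSite(task) : task);
            future.get();
            throw new AssertionError("subtask was expected to fail");
        } catch (ExecutionException e) {
            return e.getCause();
        } finally {
            executor.shutdown();
        }
    }

    /** True if any frame of the throwable's stack trace belongs to the named method. */
    static boolean mentions(Throwable t, String methodName) {
        return Arrays.stream(t.getStackTrace())
                .anyMatch(frame -> frame.getMethodName().equals(methodName));
    }

    public static void main(String[] args) throws Exception {
        Throwable plain = runAndCatch(false);
        Throwable stitched = runAndCatch(true);
        System.out.println("plain trace reaches the forking method:    " + mentions(plain, "runAndCatch"));
        System.out.println("stitched trace reaches the forking method: " + mentions(stitched, "runAndCatch"));
        stitched.printStackTrace();
    }
}
```

Without the wrapper, the exception's trace stops at the worker thread's entry point and never mentions the method that submitted the task. With the wrapper, the forking thread's frames follow the subtask's own frames. Capturing the trace on every fork is exactly the overhead the original message calls out, so something like this seems more suitable for testing than for production.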