RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing

Mon Oct 26 11:29:10 UTC 2020

On Sat, 24 Oct 2020 07:34:57 GMT, Richard Reingruber <rrich at openjdk.org> wrote:

>> The escape barrier reallocates scalarized objects potentially deep into the stack of a remote thread. Each allocation can safepoint, causing referenced frames to be invalid. Some sprinklings were added that deal with that, but I believe it was subsequently broken with the integration of the new vector API, that has its own new deoptimization code that did not know about this. Not surprisingly, the integration of the new vector API had no idea about this subtlety, and allocates an object, and then reads an object deep from the stack of a remote thread (using an escape barrier). I suppose the issue is that all these 3 things were integrated at almost the same time. The problematic code sequence is in VectorSupport::allocate_vector() in vectorSupport.cpp, which is called from Deoptimization::realloc_objects(). It first allocates an oop (possibly safepointing), and then reads a vector oop from the stack. This is usually fine, but not through the escape barrier, with concurrent stack s
 canning. While I have not seen any crashes yet, I can see from code inspection, that there is no way that this works correctly.
>> 
>> In order to make this less fragile for future changes, we should really have a RAII object that keeps the target thread's stack of the escape barrier, stable and processed, across safepoints. This patch fixes that. Then it becomes much easier to reason about its correctness, compared to hoping the various hooks are applied after each safepoint.
>> 
>> With this new robustness fix, the thread running the escape barrier, keeps the target thread stack processed, straight through safepoints on the requesting thread, making it easy and intuitive to understand why this works correctly. The RAII object basically just has to cover the code block that pokes at the remote stack and goes in and out of safepoints, arbitrarily. Arguably, this escape barrier doesn't need to be blazingly fast, and can afford keeping stacks sane through its operation.
>
> I'm really glad you caught that one! And I like the abstraction provided by KeepStackGCProcessedMark.
> 
> There is one execution path you missed coming from `VM_GetOrSetLocal::deoptimize_objects(javaVFrame* jvf)`. This code should probably be moved into EscapeBarrier. `EscapeBarrier::deoptimize_objects(int depth)` could be changed to be more generic `EscapeBarrier::deoptimize_objects(int from_depth, int to_depth)`. VM_GetOrSetLocal::doit_prologue() could call eb.deoptimize_objects(_depth, _depth) then. That would be better but maybe not yet really good...
> 
> Update: I see now that there is also a stackwalk in `VM_GetOrSetLocal::doit_prologue()` which needs to be taken care of with regard to concurrent stack processing. I'd like to try to refactor this. Will propose a patch.
> 
> Thanks again, Richard.

Hi Erik, the last commit in https://github.com/reinrich/jdk/commits/pr-832-with-better-encapsulation would be the refactoring I would like to do. It removes the code not compliant with concurrent thread stack processing from VM_GetOrSetLocal::doit_prologue(). Instead EscapeBarrier::deoptimize_objects(int d1, int d2) is called. You added already a KeepStackGCProcessedMark to that method and I changed it to accept a range [d1, d2] of frames do the object deoptimization for.

I'm not sure how to handle this from a process point of view. Can the refactoring be done within this change? Should a new item or subtask be created for it. I'd be glad if you could give an advice on that.

Thanks, Richard.

-------------

PR: https://git.openjdk.java.net/jdk/pull/832