RFR: 8255243: Reinforce escape barrier interactions with ZGC conc stack processing [v2]
Serguei Spitsyn
sspitsyn at openjdk.java.net
Wed Oct 28 19:47:47 UTC 2020
On Tue, 27 Oct 2020 13:31:36 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:
>> The escape barrier reallocates scalarized objects potentially deep into the stack of a remote thread. Each allocation can safepoint, causing referenced frames to be invalid. Some sprinklings were added that deal with that, but I believe it was subsequently broken with the integration of the new vector API, that has its own new deoptimization code that did not know about this. Not surprisingly, the integration of the new vector API had no idea about this subtlety, and allocates an object, and then reads an object deep from the stack of a remote thread (using an escape barrier). I suppose the issue is that all these 3 things were integrated at almost the same time. The problematic code sequence is in VectorSupport::allocate_vector() in vectorSupport.cpp, which is called from Deoptimization::realloc_objects(). It first allocates an oop (possibly safepointing), and then reads a vector oop from the stack. This is usually fine, but not through the escape barrier, with concurrent stack s
canning. While I have not seen any crashes yet, I can see from code inspection, that there is no way that this works correctly.
>>
>> In order to make this less fragile for future changes, we should really have a RAII object that keeps the target thread's stack of the escape barrier, stable and processed, across safepoints. This patch fixes that. Then it becomes much easier to reason about its correctness, compared to hoping the various hooks are applied after each safepoint.
>>
>> With this new robustness fix, the thread running the escape barrier, keeps the target thread stack processed, straight through safepoints on the requesting thread, making it easy and intuitive to understand why this works correctly. The RAII object basically just has to cover the code block that pokes at the remote stack and goes in and out of safepoints, arbitrarily. Arguably, this escape barrier doesn't need to be blazingly fast, and can afford keeping stacks sane through its operation.
>
> Erik Österlund has updated the pull request incrementally with two additional commits since the last revision:
>
> - Better encapsulate object deoptimization in EscapeBarrier also to facilitate correct interaction with concurrent thread stack processing.
>
> The Stackwalk for object deoptimization in VM_GetOrSetLocal::doit_prologue is not prepared for concurrent thread stack processing.
> EscapeBarrier::deoptimize_objects(int depth) is extended to cover a range of frames from depth d1 to depth d2. It is also prepared for concurrent thread stack processing. With this change it is used to deoptimize objects in the prologue of VM_GetOrSetLocal.
> - Review comments
Hi Erik and Richard,
Changes in the serviceability files looks fine.
Thanks,
Serguei
-------------
Marked as reviewed by sspitsyn (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/832
More information about the serviceability-dev
mailing list