RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v8]

Erik Österlund eosterlund at openjdk.java.net
Thu Oct 1 10:15:32 UTC 2020


On Thu, 1 Oct 2020 09:50:45 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:

>> Marked as reviewed by rehn (Reviewer).
>
>> _Mailing list message from [Kim Barrett](mailto:kim.barrett at oracle.com) on
>> [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_
>> I've only looked at scattered pieces, but what I've looked at seemed to be
>> in good shape. Only a few minor comments.
>> 
>> ------------------------------------------------------------------------------
>> src/hotspot/share/runtime/frame.cpp
>> 456 // for(StackFrameStream fst(thread); !fst.is_done(); fst.next()) {
>> 
>> Needs to be updated for the new constructor arguments. Just in general, the
>> class documentation seems to need some updating for this change.
> 
> Fixed.
> 
>> ------------------------------------------------------------------------------
>> src/hotspot/share/runtime/frame.cpp
>> 466 StackFrameStream(JavaThread *thread, bool update, bool process_frames);
>> 
>> Something to consider is that bool parameters like that, especially when
>> there are multiple, are error prone. An alternative is distinct enums, which
>> likely also obviates the need for comments in calls.
> 
> Coleen also had the same comment, and we agreed to file a follow-up RFE to clean that up. This also applies to all the
> existing parameters passed in. So I would still like to do that, but in a follow-up RFE. Said RFE will also make the
> parameter use in RegisterMap and vframeStream explicit.
>> ------------------------------------------------------------------------------
>> src/hotspot/share/runtime/thread.hpp
>> 956 void set_processed_thread(Thread *thread) { _processed_thread = thread; }
>> 
>> I think this should assert that either _processed_thread or thread are NULL.
>> Or maybe the RememberProcessedThread constructor should be asserting that
>> _cur_thr->processed_thread() is NULL.
> 
> Fixed.
> 
>> ------------------------------------------------------------------------------
> 
> Thanks for the review Kim!

In my last PR update, I included a fix to an exception handling problem that I encountered after lots of stress testing
that I have been running for a while now. I managed to catch the issue, get a reliable reproducer, and fix it.

The root problem is that the hook I had placed in SharedRuntime::exception_handler_for_return_address has been ignored.
The reason is that the stack is not walkable at this point. The hook then just ignores it. This had some unexpected
consequences. After looking closer at this code, I found that if we did have a walkable stack when we call
SharedRuntime::raw_exception_handler_for_return_address, that would have been the only hook we need at all for
exception handling. It is always the common root point where we unwind into a caller frame due to an exception throwing
into the caller, and we need to look up the rethrow handler of the caller. However, we are indeed not walkable here. To
deal with this, I have rearranged the exceptino hooks a bit. First of all, I have deleted all before_unwind hooks for
exception handling, because they should not be needed if the after_unwind hook is reliably called on the caller side
instead. And those hooks do indeed need to be there, because we do not always have a point where we can call
before_unwind (e.g. C1 unwind exception code, that just unwinds and looks up the rethrow handler via
SharedRuntime::exception_handler_for_return_address). I have then traced all paths from
SharedRuntime::raw_exception_handler_for_return_address into runtime rethrow handlers called, for each rethrow
exception handler PC exposed in the function. They are:

* OptoRuntime::rethrow_C when unwinding into C2 code
* exception_handler_for_pc_helper via Runtime1::handle_exception_from_callee_id when unwinding into C1 code
* JavaCallWrapper::~JavaCallWrapper when unwinding into a Java call stub
* InterpreterRuntime::exception_handler_for_exception when unwinding into an interpreted method
* Deoptimization::fetch_unroll_info (with exec_mode == Unpack_exception) when unwinding into a deoptimized nmethod

Each rethrow handler returned has a corresponding comment saying which rethrow runtime rethrow handler it will end up
in, once the stack has been walkable and we have transferred control into the caller. And all of those runtime hooks
now have an after_unwind() hook.

The good news is that now the responsibility for who calls the unwind hook for exception is clearer: it is never done
by the callee, and always done by the caller, in its rethrow handler, at which point the stack is walkable.

In order to avoid further issues where an unwind hook is ignored, I have changed them to assert that there is a
last_Java_frame present. Previously I did not assert that, because there was shared code between runtime native
transitions and the native wrapper, that called an unwind hook. This prevented the unwind hooks to assert this, as
compiler threads would still perform native transitions, that just ignored the request. I moved the problematic hook
for native up one level in the hierarchy to a path where it is only called by native wrappers (where we always have a
last_Java_frame), so that I can finally assert that the unwind hooks always are called at points where we have a
last_Java_frame. This makes me feel confident that I do not have another hook that is being accidentally ignored.

However, the relationship for the various exception handling code executed in the caller, the callee, and between the
two (before we are walkable) is rather complicated. So it would be good to have someone that knows the exception code
very well have a look at this, to make sure I have not missed anything. I have rerun all testing and done a load of
stress testing to sanity check this. The reproducer I eventually found that reproduced the issue with 100% success
rate, was run many times with the new patch, and no longer reproduces any issue.

-------------

PR: https://git.openjdk.java.net/jdk/pull/296


More information about the hotspot-dev mailing list