RFR: 8261391: ZGC crash - SEGV in RevokeOneBias::do_thread [v3]
David Holmes
david.holmes at oracle.com
Mon Feb 22 07:50:15 UTC 2021
Correction ...
On 22/02/2021 3:11 pm, David Holmes wrote:
> Hi Erik,
>
> On 19/02/2021 6:16 pm, Erik Osterlund wrote:
>> Hi David,
>>
>> As for the general problem of accessing a private handle from another
>> thread, my thinking was that this could only possibly be safe if a
>> thread ”grabs on to” a remote thread so it’s not concurrently running.
>> Otherwise doing so would never be safe with or without the new
>> concurrent root processing code.
>
> It is safe whenever the two threads use a protocol that ensures safety.
> So bundling a local handle into a synchronous VM-op or synchronous
> handshake op is safe, because the originator won't move on until the op
> has been completed. But any kind of async op would be dangerous - as
> would trying to "return" a Handle via the op.
>
>> So the reasoning was that if we make it safe for safepoint operations
>> to access any thread’s private oops, and for handshakes to access the
>> handshakee’s private oops, then that should cover that kind of
>> interactions.
>> However, I did think about the possibility of the handshaker’s private
>> oops being accessed by the handshakee, and thought like you that you
>> just shouldn’t do that. You should use global handles for such shared
>> interactions. But ever since then I have now and then thought ”what if
>> I missed a handle”. And this bug proves that I did indeed miss a handle.
>
> The current case seems to me to be what you would normally encounter.
> It is the handshaker that creates the handle and the op and then
> potentially has the handshakee execute the op - hence the it has to be
> safe for the handshakee to access the handshaker's handles. The reverse
> is not expected because, AFAICS, the handshakee never creates the op so
> it would only expose local handles if it "returned" them via the op -
> and that is inherently unsafe!
Robbin pointed out that the more general issue of safety here involves
e.g walking the stack of the handshakee which may contain oops; it isn't
to per-se make Handles created by the handshakee appear safe.
David
-----
>>
>> Therefore, while I agree that passing a private handle to be used by
>> another thread is something we shouldn’t do, I would sleep better at
>> night knowing that it is safe to do so anyway, should it happen, in
>> case there is another strange interaction that I missed, or some new
>> code comes in that thinks it’s safe. This is the more robust way of
>> dealing with it.
>
> On reflection I can see there are situations where we can construct a
> safe exchange of data using handles, so yes this must work. But I don't
> like the fact the code performing the exchange has to be aware that
> there is some special call that needs to be done to in fact make it safe
> though (beyond the protocol) - that is error prone. The Handle logic
> should encapsulate this if possible. And ideally we would detect when
> such a cross-thread access in in fact unsafe.
>
> So notwithstanding I think there is more to be looked at here, the
> current fix is a necessary and desirable one, and I withdraw my
> (non-binding) objections.
>
> Thanks,
> David
> -----
>
>> Thanks,
>> /Erik
>>
>>> On 18 Feb 2021, at 13:48, David Holmes <david.holmes at oracle.com> wrote:
>>>
>>> Correction ...
>>>
>>>> On 18/02/2021 12:22 pm, David Holmes wrote:
>>>> More generally I am concerned by the fact that we have a potential
>>>> bug if a Handle created by one thread is accessed by another,
>>>> without first calling start_processing() on the creator of the
>>>> Handle. This would seem to be error prone to me unless it is
>>>> actually hidden away inside the Handle implementation - though how
>>>> to do that efficiently is unclear. I'm very concerned that it is too
>>>> easy to write code (eg. handing something off to the service thread
>>>> or notification thread) that will be buggy.
>>>
>>> I was confusing Handles with Resources in terms of lifecycle - not
>>> realising Handles are stack-local to a Thread and can only be shared
>>> with another thread very, very carefully.
>>>
>>> David
>>>
>>>> Perhaps @fisk can comment on the general problem.
>>>> Thanks,
>>>> David
>>>> src/hotspot/share/runtime/handshake.cpp line 84:
>>>>> 82: public:
>>>>> 83: AsyncHandshakeOperation(AsyncHandshakeClosure* cl,
>>>>> JavaThread* target, jlong start_ns)
>>>>> 84: : HandshakeOperation(cl, target, NULL),
>>>>> _start_time_ns(start_ns) {}
>>>> Not at all clear why requester is always NULL here. There is
>>>> obviously a requester thread even in the async case. If that thread
>>>> created a Handle in the op then don't we need to call
>>>> start_processing on it?
>>>> src/hotspot/share/runtime/handshake.cpp line 328:
>>>>> 326:
>>>>> 327: void Handshake::execute(HandshakeClosure* hs_cl) {
>>>>> 328: HandshakeOperation cto(hs_cl, NULL, Thread::current());
>>>> It is unfortunate that we need to manifest the current thread here
>>>> (and elsewhere).
>>>> -------------
>>>> Changes requested by dholmes (Reviewer).
>>>> PR: https://git.openjdk.java.net/jdk/pull/2571
More information about the hotspot-runtime-dev
mailing list