RFR: 8257831: Suspend with handshakes [v2]

David Holmes david.holmes at oracle.com
Wed Mar 31 23:28:09 UTC 2021



On 1/04/2021 12:35 am, Robbin Ehn wrote:
> On Wed, 31 Mar 2021 10:48:11 GMT, Robbin Ehn <rehn at openjdk.org> wrote:
> 
>>> Thanks @dcubed-ojdk - no it won't. The problem is that the signal can hit very late in a thread's termination process, after the JavaThread destructor has executed, so no query of JavaThread state is possible. So we needed something in the Thread state and the SR_lock was a good enough proxy for that. It may be possible to use a different Thread field (perhaps _ParkEvent but encapsulated in a Thread::has_terminated() helper method).
>>
>> SR_handler is used for OS-level suspend/resume (not to conflict with this change-set).
>> This feature is only used by JFR (AFAIK), and JFR only samples threads on it's ThreadsList.
>> This means the JavaThread can never be terminated, hence this code will always pass.
>>
>> If someone else is using OS-level suspend/resume without a ThreadsList, the bug is there is no ThreadsList AFAICT.
>>
>> Since ThreadLocalStorage::thread() is cleared last in ~Thread with Thread::clear_thread_current(); may be in the ~Thread destructor.
>> So the argument is that would be safe to read stuff from Thread but not JavaThread?
>> Since the compiler (and CPU) may reorder and optimize away code, so I argue reading from a half destroyed object is not a great idea.
>> E.g. Monitor* _SR_lock; is not a volatile pointer; since reading from this memory is UB after destruction, compiler is free to remove or not publish the store to NULL.
>>
>> So I suggest either to remove this check, since the only user is using a ThreadsList and any other should also be using that too.
>> Or call Thread::clear_thread_current() before the JavaThread destructor is called, that way we can be certain that there is no UB.
> 
> I got some offline input from David, there seem to be an issue with signal being delivered after the ThreadsListHandle destructor. If that is the case a ThreadsList doesn't help here.
> 
> So I suggested we keep this out of this change-set and just take another suitable field to mirror what we have today.
> 
> `ParkEvent * _ParkEvent;` ?

Yes nicely packaged as:

bool Thread::has_terminated() {
   return _ParkEvent == NULL;
}

Also note:

  // It's possible we can encounter a null _ParkEvent, etc., in 
stillborn threads.
   // We NULL out the fields for good hygiene.
   ParkEvent::Release(_ParkEvent); _ParkEvent   = NULL;

the comment is wrong as it can't be NULL even if stillborn. And now we 
NULL it out for good hygiene and as a late-stage termination indicator.

And yes we can make _ParkEvent volatile to dissuade the compiler from 
eliding the NULLing out.

Thanks,
David

> 
> -------------
> 
> PR: https://git.openjdk.java.net/jdk/pull/3191
> 


More information about the hotspot-runtime-dev mailing list