RFR(s): 8228758: assert(_no_handle_mark_nesting == 0) failed: allocating handle inside NoHandleMark

Tue Sep 3 23:35:16 UTC 2019

On 4/09/2019 6:38 am, Daniel D. Daugherty wrote:
> Just jumping in with a comment on one point below...
> 
> On 9/3/19 8:43 AM, Robbin Ehn wrote:
>> Hi David,
>>
>> Truncated:
>>
>>>
>>> You stated that if you need to not use any handleMarks then you most 
>>> likely don't want polls - and no polls is what a LEAF routine has to 
>>> guarantee:
>>>
>>> QUICK -> No handles -> no polls == LEAF
>>
>> I'll look at the other way around:
>> QUICK can have polls therefore they cannot be leafs.
>> (at least the one triggering this assert)
>>
>>>
>>>> Since we missing the cleaner, accidentally using a HandleMark would 
>>>> lead to a leak. The NoHandleMark protected against this leak.
>>>
>>> Maybe I'm misunderstanding exactly how HandleMarks nest and behave. I 
>>> would have thought this nesting would be allowed:
>>>
>>> entry_point
>>>    NoHandleMark
>>>    -> subroutine
>>>       HandleMark  // prevents leak of Handle to outer scope
>>>       allocate handle
>>>
>>> whereas this would cause an error
>>>
>>> entry_point
>>>    NoHandleMark
>>>    -> subroutine
>>>       allocate handle
>>>
>>> but IIUC a NoHandleMark prevents all and any handle use in its own 
>>> scope or a nest scope - no?
>>
>> Yes, sorry, NoHandleMark increments _no_handle_mark_nesting and
>> HandleArea::allocate_handle checks that it is 0, otherwise this assert.
>> (it also checks that _handle_mark_nesting is larger than 0, so it must 
>> be inside a HandleMark)
>>
>>>
>>> I'm not sure that is viable, or the right way to try and address 
>>> this. In the extreme it would require all VM susbsystems to be fully 
>>> re-entrant and that is not a reasonable expectation.
>>
>> We already have that today with safepoint, many parts of subsystems 
>> can't handle
>> safepoints. The one adding the new code to be run in a handshake must 
>> make sure
>> that works. Not saying we should be able to run anything in a 
>> handshake, but
>> fundamental things like Handles should be allowed.
> 
> I'm going to have to disagree with the comparison here. I don't think
> comparing handshakes with blocking for safepoints is quite right. A
> thread blocking for a safepoint while executing a subsystem's code is
> a synchronous thing in my mind:
> 
>      I'm calling into the subsystem and that subsystem uses locks so my
>      thread might safepoint while in there...
> 
>      The subsystem has to synchronously prepare itself for grabbing that
>      lock by Handle-izing oops, etc...
> 
> When I think about handshakes, I think about them as happening 
> asynchronously
> like async-signals resulting in calls to handlers... Maybe my mental 
> model of
> handshakes is wrong... :-)

I agree with you Dan - the analogy is somewhat off. The reasons for 
deferring safepoints in a region are quite simple to know and express 
and enforce - and basically amount to "if I hit a safepoint GC might 
touch stuff I'm using". The actual safepoint just causes blocking which 
is in general benign to the thread. In contrast handshakes cause the 
current thread to start executing arbitrary code with no regard for the 
code that was being executed. That's much harder to account for. My 
biggest concern is locking itself, which must include polls, but then 
handshake code might call into the locking code again.

> In any case, I'm going to hold off on reviewing the code since I think
> you're still considering making changes based on David's last set of
> comments...

I don't think any changes are pending/under-consideration. I gave the 
okay on making the proposed changes, we're just discussing some of the 
motivations.

Cheers,
David

> Dan
> 
> 
> 
>>
>>>
>>>> This changes the polling model a bit and we have not proper look at 
>>>> this.
>>>> Maybe we should add the constraints to the poll and to find any 
>>>> potential
>>>> troublesome polls.
>>>
>>> I'm not sure how we would discover the problems until we have a 
>>> handshake that executes some code that violates a constraint. The 
>>> constraints themselves are not obvious. Subsystems do not clearly 
>>> identify themselves as being reentrant or not - most things are not 
>>> reentrant by their nature.
>>
>> As we do with safepoints, NoSafepointVerifier (since it covers both 
>> cases).
>>
>> /Robbin
>>
>>>
>>> Cheers,
>>> David
>>>
>>>> Thanks, Robbin
>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>>
>>>>>> Thanks, Robbin
>>>>>>
>>>>>> Some extra words about NoHandleMark:
>>>>>> I believe NoHandleMark should be removed.
>>>>>> In the LEAF case, there is no point in creating a Handle within a
>>>>>> NoSafepointVerifier, since we don't safepoint.
>>>>>> The other usecase seems to be, I'm in java don't do VM things, and 
>>>>>> someone tried
>>>>>> to protected that with a NoHandleMark. There many ways to create a 
>>>>>> much more
>>>>>> robust check for that.
>>>>>> But a discussion for another time.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> Issue:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8228758
>>>>>>>>
>>>>>>>> Changeset:
>>>>>>>> http://cr.openjdk.java.net/~rehn/8228758/webrev/
>>>>>>>>
>>>>>>>> Passes t1-3 (and performance benchmarks)
>>>>>>>>
>>>>>>>> Thanks, Robbin
>