RFR 8243491: Implementation of Foreign-Memory Access API (Second Incubator)
Peter Levart
peter.levart at gmail.com
Sat May 2 08:45:28 UTC 2020
Hi Maurizio,
On 5/1/20 4:36 PM, Maurizio Cimadamore wrote:
>
> Hi Peter,
> this does look better yes.
>
> I suspect this doesn't affect performance negatively, right? (in most
> cases, when acquiring, state will be OPEN).
>
> Now there's dup(). I think implementing dup() on a root scope is not
> too hard - for the child you probably need some trick, but probably
> not too bad - in the sense that in a Child scope, the cleanup action
> is really the increment of the root exit scope. So you could have a:
>
> close(boolean doCleanup)
>
> like we do now, and avoid the cleanup on dup (which in the case of the
> Child will avoid the adder increment). I *believe* this should be
> functionally equivalent to what we have now.
>
I'll look into optimizing this part.
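For reference, the close(boolean doCleanup) idea could be sketched roughly like this (a hypothetical, simplified skeleton for a root scope; the class and member names are illustrative and this is not the actual MemoryScope code):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of the suggestion above: dup() invalidates the current scope
// WITHOUT running its cleanup action, and hands the same cleanup action
// over to a fresh scope, so the cleanup still runs at most once overall.
class RootScopeSketch {
    private final Runnable cleanupAction;
    private final AtomicBoolean closed = new AtomicBoolean();

    RootScopeSketch(Runnable cleanupAction) {
        this.cleanupAction = cleanupAction;
    }

    RootScopeSketch dup() {
        close(false);                          // invalidate, but skip cleanup
        return new RootScopeSketch(cleanupAction);
    }

    void close() {
        close(true);
    }

    private void close(boolean doCleanup) {
        if (!closed.compareAndSet(false, true))
            throw new IllegalStateException("already closed");
        if (doCleanup && cleanupAction != null)
            cleanupAction.run();               // only on a real close
    }
}
```

For a child scope the same trick would skip the increment of the root's counter, as suggested above.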
> One question: the more I look at the code, the less I'm sure that a
> close vs. access race cannot occur. I'm considering this situation:
>
> * thread 1 does acquire, and finds state == OPEN
> * thread 2 does close, set state to CLOSING, then checks if the adders
> match
>
> But, how can we be sure that the value we get back from the adder
> (e.g. acquires.sum()) is accurate and reflects the fact that thread
> (1) has entered already?
>
Because of ordering. All 4 actions are volatile reads or writes. So we
can linearize them:
Thread 1 does:
a) increments 'acquires'
b) reads OPEN from 'state'
Thread 2 does:
c) writes CLOSING to 'state'
d) sums 'acquires'
'a' happens before 'b'; 'b' read OPEN, so it precedes 'c' in the
synchronization order; and 'c' happens before 'd'. Therefore 'a'
happens before 'd'.
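As a sketch (a hypothetical skeleton, not the actual MemoryScope code; names are illustrative), the four actions map onto a volatile 'state' field and two LongAdders like this:

```java
import java.util.concurrent.atomic.LongAdder;

class ScopeSketch {
    static final int OPEN = 0, CLOSING = 1, CLOSED = 2;

    final LongAdder acquires = new LongAdder();
    final LongAdder releases = new LongAdder();
    volatile int state = OPEN;

    // Thread 1 (acquiring):
    boolean tryAcquireFastPath() {
        acquires.increment();        // (a) volatile write
        return state == OPEN;        // (b) volatile read
    }

    // Thread 2 (closing):
    boolean observesPendingAcquire() {
        state = CLOSING;             // (c) volatile write
        long r = releases.sum();     // sum 'releases' first...
        long a = acquires.sum();     // (d) ...then 'acquires' -- cannot miss (a)
        return a != r;               // a pending acquirer was detected
    }
}
```

If (b) returned OPEN, then (a) happened before (d), so the closing thread is guaranteed to see the incremented 'acquires' sum.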
> The API doesn't seem to provide any such guarantee:
>
> " The returned value is /NOT/ an atomic snapshot; invocation in the
> absence of concurrent updates returns an accurate result, but
> concurrent updates that occur while the sum is being calculated might
> not be incorporated."
>
As far as the relation of action 'a' to action 'd' is concerned, those
two actions are not concurrent: 'a' clearly happens before 'd', so in
'd' we are guaranteed to see the effect of 'a'.
Surely we may also see the effects of other concurrent updates. But all
those concurrent updates do is increment the 'acquires' counter (more
precisely, one of the cells inside the LongAdder). So we may see an
even larger sum, never a smaller one (unless overflow happens, but then
the sum won't be any "closer" to the sum of 'releases', only further
away). What is important here is that we will not miss the update of
action 'a' above. It is also important that we first sum 'releases' and
only then sum 'acquires'. That way we can never miss an update of
'acquires' when we have seen the matching update of 'releases' made by
the same thread, because an acquiring thread first increments
'acquires' and only when finished increments 'releases'.
It's all a play of action orderings.
> I guess perhaps the trick is in that "while" ? E.g. there's no
> guarantee only if the concurrent update occurs _while_ sum() is called.
>
The while loop is just an addition to prevent spurious acquire failures.
It has nothing to do with "detection".
> Now I think this is ok - because when acquire races with close we can
> have two cases:
>
> 1) "state" has been set to CLOSING _before_ it is read inside acquire()
> 2) "state" has been set to CLOSING _after_ it is read inside acquire()
>
> In the case (1), acquire will start spinning, so nothing harmful
> can really happen here. Either the read of "state" from acquire()
> happened when "state" is about to transition to CLOSED (in which case
> it will fail soon after) - or the read happened before close() had a
> chance to look at the counter - in which case there might be a chance
> that the counter will be updated concurrently (e.g. acquire() thread
> calls increment() while close() thread calls sum()). But there will be
> two outcomes here: either the adder has missed the update, in which
> case the segment will be closed, and acquire() will fail; or the adder
> got the update, in which case close() will fail and acquire() will fail.
>
> In the case (2) we have a happens-before edge between the "state" read
> performed by acquire() and the "state" write performed by close().
> Which means that, by the time we get to calling acquires.sum() we are
> guaranteed that the thread doing the close() will have seen the adder
> update from the thread doing the acquire (since the update comes
> _before_ the volatile read in the acquire() method). So, for this case
> we have that:
>
> * [acquire] acquires.increment() happens before
> * [acquire] read of state == OPEN happens before
> * [close] state = CLOSING happens before
> * [close] acquires.sum()
>
> Which, I think, proves that the thread performing the acquire cannot
> possibly have assumed that the scope is OPEN while also updating the
> adder concurrently with the call to sum() in the close thread.
>
But you just showed above that acquires.increment() (by the acquiring
thread) happens before acquires.sum() (by the closing thread) - they
don't execute concurrently, so the closing thread cannot possibly see
the sum of 'acquires' equal to the sum of 'releases' (which was taken
just before the sum of 'acquires'), and it will therefore transition
the state back to OPEN and throw an exception. So close will fail and
acquire will succeed.
Regards, Peter
> Maurizio
>
> On 01/05/2020 14:00, Peter Levart wrote:
>>
>> On 4/30/20 8:10 PM, Maurizio Cimadamore wrote:
>>>
>>> On 30/04/2020 01:06, Peter Levart wrote:
>>>> Think differently: what if the client succeeded in closing the
>>>> segment, just because it did it in a time window when no thread in
>>>> the thread pool held an open scope (this is entirely possible with
>>>> parallel stream for example since threads periodically acquire and
>>>> close scopes). This would have the same effect on threads in the
>>>> thread pool - they would not be able to continue their work... What
>>>> I'm trying to say is that this is just a mechanism to make things
>>>> safe, not to coordinate work. If the program wants to avoid
>>>> trouble, it must carefully coordinate the work of its threads.
>>>
>>> This appears to me to be a bit of a slippery slope? Sure, if somebody
>>> prematurely calls close() on a segment while other threads are
>>> accessing it, it could be seen as undefined behavior (a la C
>>> specifications ;-) ), but then, if you pull on the same string, why
>>> even bother with confinement in the first place? If you call close()
>>> prematurely and you get a VM crash that's on you?
>>
>>
>> Luckily, I think I have fixed this shortcoming in the alternative
>> MemoryScope:
>>
>>
>> http://cr.openjdk.java.net/~plevart/jdk-dev/8243491_MemoryScope/v2/MemoryScope.java
>>
>>
>>
>> The trick is in using a 'state' with 3 values: OPEN, CLOSING, CLOSED ...
>>
>>
>> The acquiring thread does the following in order:
>>
>> - increments the 'acquires' scalable counter (volatile write)
>>
>> - reads the 'state' (volatile read) and then enters a spin-loop:
>>
>> - if state == STATE_OPEN the acquire succeeded (this is fast
>> path); else
>>
>> - if state == STATE_CLOSING it spin-loops re-reading 'state' in
>> each iteration; else
>>
>> - if state == STATE_CLOSED it increments 'releases' scalable
>> counter and throws exception
>>
>>
>> The closing thread does the following in order:
>>
>> - writes STATE_CLOSING to 'state' (volatile write)
>>
>> - sums the 'releases' scalable counter (volatile reads)
>>
>> - sums the 'acquires' scalable counter (volatile reads)
>>
>> - compares both sums and:
>>
>> - if they don't match then it writes back STATE_OPEN to 'state'
>> (volatile write) and throws exception; else
>>
>> - it writes STATE_CLOSED to 'state' (volatile write) and executes
>> cleanup action
>>
>>
>> This, I think, is better, isn't it?
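The protocol described above could be sketched roughly as follows (a hypothetical, simplified skeleton; names are illustrative and this is not the actual MemoryScope implementation linked above):

```java
import java.util.concurrent.atomic.LongAdder;

// Sketch of the 3-state acquire/close protocol: acquirers increment
// 'acquires', spin while a close is in flight, and undo themselves if
// the scope turned out to be closed; the closer publishes CLOSING, then
// compares the counter sums and either rolls back or commits to CLOSED.
class MemoryScopeSketch {
    static final int STATE_OPEN = 0, STATE_CLOSING = 1, STATE_CLOSED = 2;

    private final LongAdder acquires = new LongAdder();
    private final LongAdder releases = new LongAdder();
    private volatile int state = STATE_OPEN;

    void acquire() {
        acquires.increment();                  // volatile write
        int s;
        while ((s = state) == STATE_CLOSING)   // spin while a close is in flight
            Thread.onSpinWait();
        if (s == STATE_CLOSED) {
            releases.increment();              // undo our acquire
            throw new IllegalStateException("already closed");
        }
        // s == STATE_OPEN: acquire succeeded (fast path)
    }

    void release() {
        releases.increment();
    }

    void close(Runnable cleanupAction) {
        state = STATE_CLOSING;                 // volatile write
        long r = releases.sum();               // sum 'releases' first...
        long a = acquires.sum();               // ...then 'acquires'
        if (a != r) {
            state = STATE_OPEN;                // roll back: scope stays usable
            throw new IllegalStateException("scope is acquired");
        }
        state = STATE_CLOSED;
        cleanupAction.run();
    }
}
```

Note how a racing acquirer that is spinning on STATE_CLOSING simply resumes when a failed close rolls the state back to STATE_OPEN.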
>>
>>
>> Regards, Peter
>>
>>
>>
>>
>>
>>> Maurizio
>>>
>>>
More information about the core-libs-dev mailing list