RFR (S) 8182397: Race in field updates when creating ArrayKlasses can lead to crash

Erik Österlund erik.osterlund at oracle.com
Thu Jul 27 12:56:54 UTC 2017


Hi Andrew,

On 2017-07-27 10:58, Andrew Haley wrote:
> On 25/07/17 19:13, Erik Österlund wrote:
>>> On 25 Jul 2017, at 18:57, Andrew Haley <aph at redhat.com> wrote:
>>>
>>> On 25/07/17 14:41, Erik Österlund wrote:
>>>> On 2017-07-25 14:42, Andrew Haley wrote:
>>>>> On 25/07/17 12:13, Erik Österlund wrote:
>>>>>> For example, take this example pseudo code for performing what I refer
>>>>>> to as a stable load between two fields modified concurrently with
>>>>>> potential ABA issues:
>>>>>>
>>>>>> loop {
>>>>>>     x_start = load_relaxed(field_A)
>>>>>>     y = load_consume(field_B)
>>>>>>     x = load_consume(field_A)
>>>>>>     if (x_start == x) break;
>>>>>> }
>>>>>>
>>>>>> // use x->foo
>>>>> I don't understand this pseudocode.  What is the base address for field_A
>>>>> and field_B ?
>>>> field_A and field_B could be two different registers pointing at
>>>> different addresses - i.e. they are arbitrary pointers. The key in this
>>>> example is that field_A is reloaded, and then we compare if the reloaded
>>>> value is equal to the original value (with a possible ABA problem), and
>>>> stop the loop then. But the original and reloaded value could reside in
>>>> different registers, and when we continue using x->foo afterwards, the
>>>> compiler could elect to use either one of the two registers as base
>>>> pointers in the dereference - either the one from the reloaded value of
>>>> field_A or for the original value, as they are equal to each other.
>>> OK, I see what you're getting at.  Compilers have to be pretty
>>> smart to make consume work properly.
>> Precisely.
> I've been thinking about this some more, and I think this example does
> not have a problem after all.
>
> loop {
>     x_start = load_relaxed(field_A)
>     y = load_consume(field_B)
>     x = load_consume(field_A)
>     if (x_start == x) break;
> }
>
> // use x->foo
>
> It is true that if the compiler uses x_start instead of x to load
> x->foo then there is no data dependency between the load of x and the
> use of x->foo.  However, this does not matter, because the load of
> x->foo is *control dependent* on load_consume(field_A), and this is
> enough to keep things in order.
>
> Sure, this would fail on some mis-specified architecture which
> recognizes data dependencies but not control dependencies, but that's
> their problem and they'll have to use load acquire until their
> hardware spec is fixed.
>

The problem I described was that data dependent loads on source level 
are not necessarily mapped to data dependent loads in generated machine 
code, because that is what we were talking about. I am glad we seem to 
agree about this.

Now whether that is a problem or not on a given architecture or not is a 
totally different question. I see you are arguing that it shouldn't be 
and that if it is, then that is their problem. I don't know if I see it 
the same way. At the very least, I think we agree that we can not leave 
normal loads around without at the very least wrapping them in an API 
with some well defined semantics that at the very least allows certain 
platforms to choose what to do in such a scenario.

Having said that, I do believe that e.g. PPC requires an isync to 
respect that loads before a conditional branch are consistent with loads 
after the branch, unless there is a data dependency between a load 
before the branch and a load after the branch to force the issue across 
that branch. And in this case there is not. Because it is totally fine 
to speculate across the branch unless a data dependency ties loads 
before the branch with loads after the branch. Then once the branch is 
committed and x_start == x as speculated, the already calculated outcome 
of that branch is perfectly valid unless either isync prevented such 
speculation or an explicit data dependency made the speculation illegal.

That is precisely why some implementations of load acquire on PPC use 
ld; cmp; bc; isync instead of ld; lwsync to force the load before the 
conditional branch to complete first. Because it would certainly not be 
forced to complete unless isync was there, and therefore not data 
dependent loads would not have to be consistent.

In a very similar fashion, I believe that at least according to 
specification (not necessarily implementations of it - don't know about 
that), ARMv7 will similarly require an isb after that conditional branch.

Thanks,
/Erik


More information about the hotspot-dev mailing list