RFR (S) 8182397: Race in field updates when creating ArrayKlasses can lead to crash
Erik Österlund
erik.osterlund at oracle.com
Thu Jul 27 12:56:54 UTC 2017
Hi Andrew,
On 2017-07-27 10:58, Andrew Haley wrote:
> On 25/07/17 19:13, Erik Österlund wrote:
>>> On 25 Jul 2017, at 18:57, Andrew Haley <aph at redhat.com> wrote:
>>>
>>> On 25/07/17 14:41, Erik Österlund wrote:
>>>> On 2017-07-25 14:42, Andrew Haley wrote:
>>>>> On 25/07/17 12:13, Erik Österlund wrote:
>>>>>> For example, take this example pseudo code for performing what I refer
>>>>>> to as a stable load between two fields modified concurrently with
>>>>>> potential ABA issues:
>>>>>>
>>>>>> loop {
>>>>>> x_start = load_relaxed(field_A)
>>>>>> y = load_consume(field_B)
>>>>>> x = load_consume(field_A)
>>>>>> if (x_start == x) break;
>>>>>> }
>>>>>>
>>>>>> // use x->foo
>>>>> I don't understand this pseudocode. What is the base address for field_A
>>>>> and field_B ?
>>>> field_A and field_B could be two different registers pointing at
>>>> different addresses - i.e. they are arbitrary pointers. The key in this
>>>> example is that field_A is reloaded, and then we compare if the reloaded
>>>> value is equal to the original value (with a possible ABA problem), and
>>>> stop the loop then. But the original and reloaded value could reside in
>>>> different registers, and when we continue using x->foo afterwards, the
>>>> compiler could elect to use either one of the two registers as base
>>>> pointers in the dereference - either the one from the reloaded value of
>>>> field_A or for the original value, as they are equal to each other.
>>> OK, I see what you're getting at. Compilers have to be pretty
>>> smart to make consume work properly.
>> Precisely.
> I've been thinking about this some more, and I think this example does
> not have a problem after all.
>
> loop {
> x_start = load_relaxed(field_A)
> y = load_consume(field_B)
> x = load_consume(field_A)
> if (x_start == x) break;
> }
>
> // use x->foo
>
> It is true that if the compiler uses x_start instead of x to load
> x->foo then there is no data dependency between the load of x and the
> use of x->foo. However, this does not matter, because the load of
> x->foo is *control dependent* on load_consume(field_A), and this is
> enough to keep things in order.
>
> Sure, this would fail on some mis-specified architecture which
> recognizes data dependencies but not control dependencies, but that's
> their problem and they'll have to use load acquire until their
> hardware spec is fixed.
>
The problem I described was that data dependent loads on source level
are not necessarily mapped to data dependent loads in generated machine
code, because that is what we were talking about. I am glad we seem to
agree about this.
Now whether that is a problem or not on a given architecture or not is a
totally different question. I see you are arguing that it shouldn't be
and that if it is, then that is their problem. I don't know if I see it
the same way. At the very least, I think we agree that we can not leave
normal loads around without at the very least wrapping them in an API
with some well defined semantics that at the very least allows certain
platforms to choose what to do in such a scenario.
Having said that, I do believe that e.g. PPC requires an isync to
respect that loads before a conditional branch are consistent with loads
after the branch, unless there is a data dependency between a load
before the branch and a load after the branch to force the issue across
that branch. And in this case there is not. Because it is totally fine
to speculate across the branch unless a data dependency ties loads
before the branch with loads after the branch. Then once the branch is
committed and x_start == x as speculated, the already calculated outcome
of that branch is perfectly valid unless either isync prevented such
speculation or an explicit data dependency made the speculation illegal.
That is precisely why some implementations of load acquire on PPC use
ld; cmp; bc; isync instead of ld; lwsync to force the load before the
conditional branch to complete first. Because it would certainly not be
forced to complete unless isync was there, and therefore not data
dependent loads would not have to be consistent.
In a very similar fashion, I believe that at least according to
specification (not necessarily implementations of it - don't know about
that), ARMv7 will similarly require an isb after that conditional branch.
Thanks,
/Erik
More information about the hotspot-dev
mailing list