RFR: 8187033: [PPC] Improve performance of ObjectStreamClass.getClassDataLayout()

Hans Boehm hboehm at google.com
Fri Sep 15 17:54:17 UTC 2017


On Fri, Sep 15, 2017 at 7:32 AM, Peter Levart <peter.levart at gmail.com>
wrote:

> Hi,
>
> On 09/15/2017 07:12 AM, Kazunori Ogata wrote:
>
>> Hi Hans,
>>
>> Thank you for your comment
>>
>> Hans Boehm <hboehm at google.com> wrote on 2017/09/15 09:44:56:
>>
>>> Just to be clear, the semi-plausible failure scenario here is that a
>>> reader will see a non-null slots value, but then read an uninitialized
>>> value from the ClassDataSlot array. This could happen if the compiler or
>>> hardware correctly speculated (e.g. based on a previous program
>>> execution) the value of dataLayout, used that to first load a value from
>>> the ClassDataSlot array, and only then confirmed that the dataLayout value
>>> was correct. None of this is likely to happen today, but is potentially
>>> profitable, and allowed by the current memory model. I think the extra
>>> assumption here should at least be documented.
>>>
>> In this scenario, the load from the ClassDataSlot array is also a
>> speculated load, so it should be confirmed after the load from dataLayout
>> is confirmed, and the full fence should be detected during the
>> confirmation of the speculated load from the array slot.  Otherwise, the
>> full fence won't work as expected, I think.
>>
>
> Disclaimer: I don't know this code well. But it still doesn't look
> technically correct to me after the change.


> Just a reminder that the final code in question is the following:
>
> 1196     ClassDataSlot[] getClassDataLayout() throws InvalidClassException {
> 1197         // REMIND: synchronize instead of relying on fullFence()?
> 1198         ClassDataSlot[] slots = dataLayout;
> 1199         if (slots == null) {
> 1200             slots = getClassDataLayout0();
> 1201             VarHandle.fullFence();
> 1202             dataLayout = slots;
> 1203         }
> 1204         // return slots;  // Assume this code is inlined and
>              //                   slots[17] is accessed by the caller.

               // To get a self-contained example, we assume this line is really:
  1204b        tmp = slots[17];

> 1205     ...


> Does speculative read of 'dataLayout' into local variable 'slots' mean
> that 'slots' can change? Isn't this disallowed in Java (as opposed to C++)?
>

The problem occurs if this is transformed (by hardware or compiler) to:

1196     ClassDataSlot[] getClassDataLayout() throws InvalidClassException {
1197         // REMIND: synchronize instead of relying on fullFence()?
             <prefetch dataLayout>
1198         ClassDataSlot[] slots = DATA_LAYOUT_GUESS;
1199         if (slots == null) {
                 if (dataLayout != DATA_LAYOUT_GUESS) <recover>
1200             slots = getClassDataLayout0();
1201             VarHandle.fullFence();
1202             dataLayout = slots;
1203         }
1204b        tmp = slots[17];
1204.5       if (dataLayout != DATA_LAYOUT_GUESS) <recover>
1205     ...

(This is only an illustration. If the problem were to occur in real life,
it would probably occur as a result of a different optimization. DEC Alpha
allowed this sort of thing for entirely different reasons.)

Observe that

(1) This transformation is allowed by the Java memory model, since
dataLayout is not a final field.
(2) This code breaks if another thread runs all of the initialization code,
including the code that sets slots[17] and the code that sets dataLayout,
between 1204b and 1204.5, but the check in 1204.5 still succeeds (because
we guessed well). tmp will contain the pre-initialization value of
slots[17].
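
To make the interleaving in (2) concrete, here is a single-threaded replay
of it. This is only a toy sketch: memory is modeled as an int array, and the
names SpeculationReplay, heap and replay() are mine, not anything in the JDK.
DATA_LAYOUT_GUESS stands for the address the reader happened to predict.

```java
final class SpeculationReplay {
    static final int NULL = 0;
    // Hypothetical address that the reader predicts dataLayout will hold.
    static final int DATA_LAYOUT_GUESS = 100;

    // Toy memory: every cell starts at 0, i.e. "uninitialized".
    static int[] heap = new int[256];
    // Toy version of the dataLayout field; it holds an address into heap.
    static int dataLayout = NULL;

    // Replays the steps in program order of the problematic interleaving
    // and returns the value the reader ends up with in tmp.
    static int replay() {
        heap = new int[256];
        dataLayout = NULL;

        // Reader, 1198 and 1204b: use the guess and load through it early.
        int slots = DATA_LAYOUT_GUESS;
        int tmp = heap[slots + 17];

        // Writer: runs the whole initialization between 1204b and 1204.5.
        heap[DATA_LAYOUT_GUESS + 17] = 42;   // sets slots[17]
        // The writer's fullFence() would sit here; it orders only the
        // writer's own stores.
        dataLayout = DATA_LAYOUT_GUESS;      // publishes the array

        // Reader, 1204.5: the validation succeeds because the guess was good.
        if (dataLayout != DATA_LAYOUT_GUESS) {
            throw new IllegalStateException("recover");
        }
        return tmp;                          // stale, pre-initialization value
    }
}
```

The validation passes, yet replay() hands back the value that slots[17] had
before the writer stored 42 into it.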

The fence is not executed by the reading thread, and has no impact on
ordering within the reading thread.

C++ fences have no effect unless they are paired with another fence or
ordered atomic operation in the other thread involved in the communication.
I think that is the current intent for Java as well.
(At the hardware level, that's not entirely true, because dependency-based
ordering can sometimes be used instead. But the only place in which we've
been more or less successful in translating those guarantees to the PL
level is for Java final fields. And those aren't used here.)
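
For comparison, a minimal sketch of what a paired version could look like,
with an ordered operation on both sides: the writer publishes with
setRelease and the reader loads with getAcquire. This is not the change
under review. LazyLayout, computeLayout() and the use of int[] in place of
ClassDataSlot[] are stand-ins I made up to keep the example self-contained;
only the VarHandle calls are real API.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

final class LazyLayout {
    // Hypothetical stand-in for the dataLayout field (int[] instead of
    // ClassDataSlot[] so the sketch compiles on its own).
    private int[] dataLayout;

    private static final VarHandle DATA_LAYOUT;
    static {
        try {
            DATA_LAYOUT = MethodHandles.lookup()
                    .findVarHandle(LazyLayout.class, "dataLayout", int[].class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Hypothetical stand-in for getClassDataLayout0().
    private static int[] computeLayout() {
        int[] slots = new int[32];
        slots[17] = 42;
        return slots;
    }

    int[] getLayout() {
        // Acquire load: later loads in this thread (e.g. slots[17]) may not
        // be reordered before it.
        int[] slots = (int[]) DATA_LAYOUT.getAcquire(this);
        if (slots == null) {
            slots = computeLayout();
            // Release store: the array contents are ordered before the
            // publication. Racing threads may each compute a layout, as in
            // the original code.
            DATA_LAYOUT.setRelease(this, slots);
        }
        return slots;
    }
}
```

Here the reader's acquire load pairs with the writer's release store, so a
reader that sees the non-null array also sees its initialized contents.
Declaring the field volatile would give at least the same guarantee.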

Hans


> Regards, Peter
>
>

