RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions

Mikael Gerdin mikael.gerdin at oracle.com
Tue Nov 4 14:09:34 UTC 2014


Hi Erik,

On 2014-11-04 14:21, Erik Österlund wrote:
> Hi guys,
>
> Small comment on this discussion.
>
> On 04 Nov 2014, at 13:35, Mikael Gerdin <mikael.gerdin at oracle.com> wrote:
>
>> Hi David,
>>
>> On 2014-11-04 12:15, David Holmes wrote:
>>> Hi Mikael,
>>>
>>> Thanks for fixing this.
>>>
>>> Given x86 is a TSO system my understanding from the previous discussion
>>> in 6973570 is that once you have the compiler barrier the volatile write
>>> to the dummy variable in storestore() is no longer needed - it's only
>>> purpose was to introduce the compiler barrier (which it failed to do).
>>
>> I felt that removing the dummy store would be more risky since I have no good way of verifying that removing the volatile write doesn't break ordering for some other caller of storestore().
>>
>>>
>>> Secondly, do we observe the same bug with the release_store operations
>>> or are we just being conservative? I would not expect to need the
>>> compiler barrier prior to the Atomic::store, but I would assume this is
>>> completely harmless and performance neutral, so we should have it for
>>> completeness regardless.
>>
>> I tried changing the problematic code to do a release_store of _gc_time_stamp and saw the same reordering problem, that's the reason for adding compiler_barrier to the release_store variants.
>>
>> Any suggestion on where to put compiler_barrier in order to use it from atomic_linux_x86.inline.hpp?
>>
>>>
>>> Finally, you answered Dean regarding the simple acquire() function but
>>> the load_acquire variants all rely on volatile semantics - which we've
>>> seen not to provide what we expected. So perhaps the load_acquire
>>> functions also need a compile_barrier inserted ?
>>
>> Perhaps. I haven't observed any problems with the acquire functions so I can't determine if they need the compiler barrier or not.
>>
>> It may be a good idea to add the compiler barrier to those functions as well, what are the other reviewers' opinions on this?
>
> The issue is that volatiles in the C++ standard prevent reordering of one volatile access w.r.t. another volatile access, but not other non-volatile accesses.
> Both acquire and release semantics however should consider reordering of non-volatile accesses to. Therefore, not issuing a compiler barrier for acquire may not be shown as an issue in code yet, but I'm convinced it's hazardous not to.
>
> Example of a pair of release_store synchronizing with a load_acquire:
>
> release_store in T1:
> non-volatile write x_1
> release			<-- now enforced properly with compiler barrier
> volatile write x_2
>
> load_acquire in T2:
> volatile load x_2
> acquire 			<--- not currently enforced properly!
> non-volatile load x_1
>
> Now by fixing the compiler barrier of release store, the non-volatile write to x_1 is properly kept above the volatile x_2.
> However, by not having a compiler barrier for the load_acquire, there is nothing AFAIK that prevents the volatile load x_2 from reordering with the non-volatile load x_1, since volatile accesses are only constrained in order with respect to other volatile accesses.
>
> My 50 cents...

Thanks for your feedback Erik, I think I've been convinced to add the 
compiler barrier to the load_acquire variants as well. I'll prepare an 
updated webrev.

/Mikael

>
> /Erik
>
>>
>> Thanks for the review David.
>> /Mikael
>>
>>
>>>
>>> Thanks,
>>> David
>>>
>>> On 3/11/2014 11:01 PM, Mikael Gerdin wrote:
>>>> Hi all,
>>>>
>>>> Please review this attempt at fixing the OrderAccess functions on Linux
>>>> x86 with GCC.
>>>>
>>>> While working on another bug I recently discovered that g++ was
>>>> reordering stores across a call to OrderAccess::storestore on Linux x86.
>>>>
>>>> The G1 code attempts to do an ordered publishing of two values:
>>>> _saved_mark_word = _top;
>>>> OrderAccess::storestore();
>>>> _gc_time_stamp = curr_gc_time_stamp;
>>>>
>>>> The types involved are
>>>> HeapWord* _top, _saved_mark_word;
>>>> volatile unsigned _gc_time_stamp;
>>>>
>>>> The incorrect behavior seems to have started when JDK-6973570 was fixed
>>>> in JDK 7.
>>>> Below, _top is at offset 0x58, _saved_mark_word at 0x18 and
>>>> _gc_time_stamp at 0x138, %rbx is "this".
>>>>
>>>> /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so:
>>>>
>>>>
>>>>    3d9f4d:       39 d0                   cmp    %edx,%eax
>>>>    3d9f4f:       73 1c                   jae    3d9f6d
>>>> <G1OffsetTableContigSpace::set_saved_mark()+0x3d>
>>>>    3d9f51:       48 8b 43 58             mov    0x58(%rbx),%rax
>>>>    3d9f55:       48 89 43 18             mov    %rax,0x18(%rbx)
>>>>    3d9f59:       48 8b 05 40 f9 70 00    mov    0x70f940(%rip),%rax    #
>>>> ae98a0 <_DYNAMIC+0x12f8>
>>>>    3d9f60:       48 c7 00 00 00 00 00    movq   $0x0,(%rax)
>>>>    3d9f67:       89 93 38 01 00 00       mov    %edx,0x138(%rbx)
>>>>
>>>> /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so
>>>>
>>>>
>>>>    3da05d:       39 d0                   cmp    %edx,%eax
>>>>    3da05f:       73 15                   jae    3da076
>>>> <G1OffsetTableContigSpace::set_saved_mark()+0x36>
>>>>    3da061:       48 8b 43 58             mov    0x58(%rbx),%rax
>>>>    3da065:       c7 45 f4 00 00 00 00    movl   $0x0,-0xc(%rbp)
>>>>    3da06c:       89 93 38 01 00 00       mov    %edx,0x138(%rbx)
>>>>    3da072:       48 89 43 18             mov    %rax,0x18(%rbx)
>>>>
>>>> In b109 the store of %rax to 0x18(%rbx) has been ordered after the store
>>>> of %edx to 0x138(%rbx) in the same build as JDK-6973570 was integrated.
>>>>
>>>> My suggestion to fix this is to extend all the OrderAccess::release*
>>>> variants on x86 with a:
>>>> __asm__ volatile ("" : : : "memory");
>>>> to attempt to prevent GCC from reordering any memory accesses across
>>>> those function calls.
>>>>
>>>> I've verified that this solves the issue in the assembly with our
>>>> current JDK 9 build platform compilers.
>>>> I've also verified that this particular piece of code is compiled
>>>> correctly on our other x86 platforms: Solaris, Windows and OS X.
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/
>>>> Bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8061964
>>>> Testing:
>>>> JPRT, inspecting generated assembly for the function
>>>> G1OffsetTableContigSpace::record_top_and_timestamp (as the method is
>>>> currently named).
>>>> Suggestions of further testing is greatly appreciated.
>>>>
>>>> Thanks
>>>> Mikael
>


More information about the hotspot-dev mailing list