RFR: 8073093: AARCH64: C2 generates poor code for ByteBuffer accesses

Vitaly Davidovich vitalyd at gmail.com
Wed Feb 18 21:03:51 UTC 2015


Thanks Vladimir.  I was actually asking about the ByteBuffer elimination
itself; when I tried Andrew's example on 7u60 (i.e. a single method with a
ByteBuffer.wrap(...).getLong(...)), the ByteBuffer allocation was not
removed.
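
For reference, a minimal sketch of the kind of single-method test being
discussed. The class name, array contents, and loop count are made up for
illustration; only the ByteBuffer.wrap(...).getLong(...) pattern comes from
the thread. Whether the temporary HeapByteBuffer is actually eliminated
depends on the JDK build and on the inlining decisions C2 makes.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for the test method discussed in this thread.
public class ByteBufferGetLongSketch {

    // Wraps the array in a short-lived HeapByteBuffer and reads 8 bytes
    // in the buffer's default (big-endian) order. If escape analysis
    // succeeds, C2 can scalar-replace the wrapper and leave only the
    // bounds checks, the load, and (on little-endian hardware) a byte swap.
    public static long getLong(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes).getLong(offset);
    }

    public static void main(String[] args) {
        byte[] data = new byte[16];
        data[7] = 1; // big-endian: the long at offset 0 has value 1

        long sum = 0;
        // Loop enough times that the method is likely to be compiled by C2.
        for (int i = 0; i < 1_000_000; i++) {
            sum += getLong(data, 0);
        }
        System.out.println(sum); // prints 1000000
    }
}
```

To look at the generated code, one option is to run with
-XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly, which needs the hsdis
disassembler plugin to be installed for the JVM in use.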

On Wed, Feb 18, 2015 at 3:59 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com
> wrote:

> The code which eliminates MemBars for scalarized objects was added in jdk8:
>
> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/6f3fd5150b67
>
> Another store barrier change was also pushed into jdk8:
>
> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fcf521c3fbc6
>
> I don't remember that we did anything special with membars in jdk9.
>
> Regards,
> Vladimir
>
>
> On 2/18/15 6:27 AM, Vitaly Davidovich wrote:
>
>> Indeed, that's quite nice and not what I saw in Java 7, so it's good to see
>> that this case is EA'd out.  Does anyone know if there was EA work done in
>> Java 9, or is this simply an inlining policy change that makes EA work (as
>> John alluded to)?
>>
>> sent from my phone
>> On Feb 18, 2015 6:13 AM, "Andrew Haley" <aph at redhat.com> wrote:
>>
>>> On 02/18/2015 09:15 AM, Andrew Haley wrote:
>>>
>>>> On 18/02/15 09:14, Florian Weimer wrote:
>>>>
>>>>> Wow, looks nice.  What OpenJDK build did you use?  I want to see if
>>>>> this happens on x86_64, too.
>>>>>
>>>>
>>>> I'm working on JDK9.  You don't have this code yet.  I'll do an x86
>>>> build.
>>>>
>>>
>>>    0x00007f2948acbf8c: mov    0xc(%rdx),%r10d    ;*synchronization entry
>>>                                                  ; - java.nio.HeapByteBuffer::<init>@-1 (line 84)
>>>                                                  ; - java.nio.ByteBuffer::wrap@7 (line 373)
>>>                                                  ; - java.nio.ByteBuffer::wrap@4 (line 396)
>>>                                                  ; - bytebuffertests.ByteBufferTests3::getLong@1 (line 23)
>>>                                                  ; implicit exception: dispatches to 0x00007f2948acbff5
>>>    ;; B2: #      B5 B3 <- B1  Freq: 0.999999
>>>
>>>    ;; MEMBAR-release ! (empty encoding)
>>>
>>>    0x00007f2948acbf90: test   %ecx,%ecx
>>>    0x00007f2948acbf92: jl     0x00007f2948acbfb5  ;*iflt
>>>                                                  ; - java.nio.Buffer::checkIndex@1 (line 545)
>>>                                                  ; - java.nio.HeapByteBuffer::getLong@18 (line 465)
>>>                                                  ; - bytebuffertests.ByteBufferTests3::getLong@5 (line 23)
>>>
>>>    ;; B3: #      B6 B4 <- B2  Freq: 0.999999
>>>
>>>    0x00007f2948acbf94: mov    %r10d,%ebp
>>>    0x00007f2948acbf97: sub    %ecx,%ebp          ;*isub
>>>                                                  ; - java.nio.Buffer::checkIndex@10 (line 545)
>>>                                                  ; - java.nio.HeapByteBuffer::getLong@18 (line 465)
>>>                                                  ; - bytebuffertests.ByteBufferTests3::getLong@5 (line 23)
>>>
>>>    0x00007f2948acbf99: cmp    $0x8,%ebp
>>>    0x00007f2948acbf9c: jl     0x00007f2948acbfd5  ;*if_icmple
>>>                                                  ; - java.nio.Buffer::checkIndex@11 (line 545)
>>>                                                  ; - java.nio.HeapByteBuffer::getLong@18 (line 465)
>>>                                                  ; - bytebuffertests.ByteBufferTests3::getLong@5 (line 23)
>>>
>>>    ;; B4: #      N95 <- B3  Freq: 0.999998
>>>
>>>    0x00007f2948acbf9e: movslq %ecx,%r10
>>>    0x00007f2948acbfa1: mov    0x10(%rdx,%r10,1),%rax
>>>    0x00007f2948acbfa6: bswap  %rax               ;*invokestatic reverseBytes
>>>                                                  ; - java.nio.Bits::swap@1 (line 61)
>>>                                                  ; - java.nio.HeapByteBuffer::getLong@41 (line 466)
>>>                                                  ; - bytebuffertests.ByteBufferTests3::getLong@5 (line 23)
>>>
>>> So, just the same except that there is no explicit fence instruction
>>> to remove.  It's a shame for AArch64 because that fence really kills
>>> performance but it's bad for x86 too.  Even on machines that don't
>>> emit fence instructions the fence still acts as a compiler barrier.
>>>
>>> Andrew.
>>>
>>>


