[aarch64-port-dev ] RFR: JDK-8205004: AArch64: fix failures in jtreg ArraysEqCmpTest
Dmitrij Pochepko
dmitrij.pochepko at bell-sw.com
Fri Jun 15 17:38:52 UTC 2018
Hi,
We should issue loads as early as possible, since it's a main
bottleneck. So, adding branch before ldrw will make code a bit slower.
You're right about leaving cmp(cnt2, cnt1); br(NE, DONE); in short path,
because it'll be a bit faster due to late load of cnt2, which has a
chance to be completed while jumping to SHORT
regarding LEN_IS_ZERO removal: we must keep it. Otherwise it'll compare
garbage memory later, because loaded data is shifted by lslv(tmp5, tmp4,
tmp5); to remove extra data. In case of 0 length, shift will be 0, so, a
memory after this particular array will be compared.
Can you try patch in attachment? This one move branch after load and
minimize performance impact
Thanks,
Dmitrij
On 15.06.2018 06:50, Joshua Zhu wrote:
> Hi Dmitrij,
>
> I saw the below comments in original codes.
> 5137 // on most CPUs a2 is still "locked"(surprisingly) in ldrw and it's
> 5138 // faster to perform another branch before comparing a1 and a2
> before
> 5139 cmp(cnt1, elem_per_word);
> 5140 br(LE, SHORT); // short or same
>
> and if as the comments say, I think we can move
> 5141 cmpoop(a1, a2);
> 5142 br(EQ, SAME);
> before
> 5132 cbz(a1, A_IS_NULL);
> 5133 ldrw(cnt1, Address(a1, length_offset));
> 5134 cbz(a2, A_IS_NULL);
> 5135 ldrw(cnt2, Address(a2, length_offset));
>
> and add additional code:
> cmp(cnt2, cnt1);
> br(NE, DONE);
> in the SHORT path just as my patch did after
> 5209 bind(SHORT);
>
> and label LEN_IS_ZERO and its path can also be deleted like below:
> bind(SHORT);
> cmp(cnt2, cnt1);
> br(NE, DONE);
> cbz(cnt1, SAME);
>
> Best Regards,
> Joshua
>
>> Hi,
>>
>> I think you can just move existing checks:
>>
>> 5148 cmp(cnt2, cnt1);
>> 5149 br(NE, DONE);
>>
>> and
>>
>> 5141 cmpoop(a1, a2);
>> 5142 br(EQ, SAME);
>>
>> before jump to short case: 5140 br(LE, SHORT); // short or same
>>
>> Then you don't need to add any additional code.
>>
>> (cmpoop(a1, a2) check should go first, because cnt1 and cnt2 loads are
>> possibly still executed)
>>
>> Thanks,
>> Dmitrij
>>
>> On 14.06.2018 16:26, Joshua Zhu wrote:
>>> Hi,
>>>
>>> Please review the following change.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205004
>>> Webrev: http://cr.openjdk.java.net/~zyao/8205004/
>>>
>>> This bug causes three failures in Jtreg ArraysEqCmpTest.
>>> test ArraysEqCmpTest.testArray(byte[]) test
>>> ArraysEqCmpTest.testArray(unsigned byte[]) test
>>> ArraysEqCmpTest.testArray(char[])
>>>
>>> Array length needs to be compared for short arrays.
>>> Please see detailed description in JBS.
>>>
>>> Best Regards,
>>> Joshua
>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: array_eq_fix.diff
Type: text/x-patch
Size: 1253 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/aarch64-port-dev/attachments/20180615/103d00e6/array_eq_fix.diff>
More information about the aarch64-port-dev
mailing list