[aarch64-port-dev ] [10] RFR: 8187472 - AARCH64: array_equals intrinsic doesn't use prefetch for large arrays

Dmitrij Pochepko dmitrij.pochepko at bell-sw.com
Wed Nov 15 21:31:26 UTC 2017


Hi,

Ningsheng, Andrew,

thank you for looking into in. Your pre-review of this patch helps a 
lot. I was originally focused on Arrays::equals, without paying too much 
attention to String::equals, so, haven't found these issues.


The cause of this slowdown was a small bug Andrew mentioned on first 
webrev(iRegP/iRegI change). webrev.03 you were using has leftover of 
this problem 
(http://cr.openjdk.java.net/~dpochepk/8187472/webrev.03/src/hotspot/cpu/aarch64/aarch64.ad.udiff.html). 
And it caused this intrinsic to be ignored by matcher, so, no intrinsic 
was used at all.


I changed back iRegP to iRegI and it fixed slowdown. I also changed R3 
to R2 to keep 2nd string in .ad file to fix 2nd problem Andrew found and 
it seems to work fine.

new webrev (only .ad file was changed): 
http://cr.openjdk.java.net/~dpochepk/8187472/webrev.04/



Ningsheng's benchmark numbers for this patch on ThunderX:


Before:

Benchmark                           Mode  Cnt    Score   Error Units
StringOps.jmhTimeStringEqualsLarge  avgt   10  235.631 ? 0.054 ns/op


After:

Benchmark                           Mode  Cnt    Score   Error Units
StringOps.jmhTimeStringEqualsLarge  avgt   10  216.660 ? 0.304 ns/op


I still have to test this patch a bit more to be sure no other problems 
appears.


Thanks,

Dmitrij


On 15.11.2017 15:16, Andrew Haley wrote:
> On 15/11/17 10:34, Andrew Haley wrote:
>> Gosh.  I think that's a very bad patch.
> In fact it's even worse than I thought.  It causes crashes in testing.
> Nobody seems to have noticed that the .ad file is wrong.
>
> Note this:
>
> instruct string_equalsL(iRegP_R1 str1, iRegP_R3 str2, iRegI_R4 cnt,
>                          iRegI_R0 result, rFlagsReg cr)
>
> instruct array_equalsB(iRegP_R1 ary1, iRegP_R2 ary2, iRegI_R0 result,
>                        iRegP_R4 tmp, rFlagsReg cr)
>
> Spot the difference?  str2 is in R3, ary2 is in R2.  That means you
> can't call the same stub from both of these patterns.
>
> If you're adding a stub call you must assert that the passed registers
> are correct.  Add the assertions, fix the register usage, and test it
> properly, both with and without UseSIMDForArrayEquals.
>



More information about the aarch64-port-dev mailing list