[aarch64-port-dev ] AArch64: follow up array copy investigation on misaligned peeling

Andrew Haley aph at redhat.com
Wed Feb 17 13:34:02 UTC 2016


On 02/17/2016 01:21 PM, Hui Shi wrote:
> For the StringConcat test (
> http://people.linaro.org/~hui.shi/arraycopy/StringConcatTest.java), though
> array copy takes only 25% of the cycles in this test, the entire test still
> sees a 3.5% improvement with this combined load/store optimization.  However,
> I am wondering whether this is the proper way to improve these
> test-bit-load-store code sequences. It would require an extra, truly
> “disjoint” array copy stub, since the current disjoint array copy only means
> that a forward array copy is safe; alternatively, it would require a
> no-"overlap" test at runtime. My personal preference is to leave the array
> copy code unchanged and keep it simple and consistent for now.

OK, that makes sense.

My plan (such as it is) for tidying up the tail code is to convert
three bit-test-and-branches into a single 8-way computed jump with an
optimum sequence for all 8 cases.  Sure, it will usually be
mispredicted, but it's just a single jump.
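
To make the shape of that idea concrete, here is a rough Java-level
sketch, not the actual stub code. The class and method names are
hypothetical, the arrays are assumed not to overlap, and the fallthrough
byte moves only illustrate the single dispatch; a real stub would use a
hand-picked load/store sequence for each of the 8 cases. Whether a JIT
turns the switch into a true computed jump is also implementation
dependent.

```java
// Hypothetical illustration of the two tail-copy shapes discussed above.
// Both copy the last (count & 7) bytes of a non-overlapping byte copy.
final class TailCopy {

    // Current shape: three bit tests, each guarding a fixed-size move
    // (4 bytes, then 2 bytes, then 1 byte).
    static void copyTailBitTests(byte[] src, int s, byte[] dst, int d, int count) {
        if ((count & 4) != 0) {
            dst[d] = src[s];
            dst[d + 1] = src[s + 1];
            dst[d + 2] = src[s + 2];
            dst[d + 3] = src[s + 3];
            s += 4;
            d += 4;
        }
        if ((count & 2) != 0) {
            dst[d] = src[s];
            dst[d + 1] = src[s + 1];
            s += 2;
            d += 2;
        }
        if ((count & 1) != 0) {
            dst[d] = src[s];
        }
    }

    // Proposed shape: one 8-way dispatch on (count & 7). Each case falls
    // through to the next, so case n copies bytes n-1 down to 0.
    static void copyTailSwitch(byte[] src, int s, byte[] dst, int d, int count) {
        switch (count & 7) {
            case 7: dst[d + 6] = src[s + 6]; // fall through
            case 6: dst[d + 5] = src[s + 5]; // fall through
            case 5: dst[d + 4] = src[s + 4]; // fall through
            case 4: dst[d + 3] = src[s + 3]; // fall through
            case 3: dst[d + 2] = src[s + 2]; // fall through
            case 2: dst[d + 1] = src[s + 1]; // fall through
            case 1: dst[d] = src[s];         // fall through
            case 0: break;                   // nothing left to copy
        }
    }
}
```

The first version takes up to three conditional branches per tail; the
second takes one indirect branch, at the cost of larger code and a
branch that is harder to predict.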

But really, once we're down to 3.5% of a contrived
string-concatenation-intensive test, it's questionable whether this is
what we need to be spending time on.

Thanks,

Andrew.

