RFR: 8179444: AArch64: Put zero_words on a diet
Andrew Haley
aph at redhat.com
Sun May 7 08:41:13 UTC 2017
Hi,
On 05/05/17 23:54, White, Derek wrote:
> src/cpu/aarch64/vm/macroAssembler_aarch64.cpp:
> - zero_words(reg, reg):
> - Comment mentions that it's used by the ClearArray pattern, but
> it's also used by generate_fill().
OK, but there's no contradiction there. The comment just explains why
zero_words must be small.
> - The old zero_words(), via block_zero() and fill_words(), would
> align base to 16-bytes before doing stps, the new code doesn't. It
> may be worth conditionally pre-aligning if AvoidUnalignedAcesses is
> true. And could conditionally remove pre-aligning in zero_blocks.
But that'd bloat zero_words, surely. And I haven't see the new code
running any slower on any hardware,
> Line 4999: This is pre-existing, but it's confusing - could you
> rename "ShortArraySize" to "SmallArraySize"?
OK.
> - zero_words(reg, int):
> - I don't see a great way to handle unaligned accesses without
> blowing up the code size. So no real request here.
> - It looks like worst case is 15 unaligned stp instructions.
> - With some benchmarking someday I might argue for calling this
> constant count version for smaller copies.
I don't understand this.
> - Unless ClearArray only zero's from the first array element on -
> then we could guess if base is 16-byte unaligned by looking at the
> array header size.
>
> - zero_dcache_blocks():
> - Suggest a new comment that mentions that it's only called from
> zero_blocks()? Or if it's meant to be more general then add comments
> about the requirements for base alignment, and that cnt has to be >=
> 2*zva_length.
OK.
> - zero_blocks DOES align base to 16-bytes, so we don't need to
> check here? Or make it a runtime assert?
I did wonder about that, but left it in as a safety net because its
cost is immeasurably small.
Thanks for such a detailed review.
Andrew.
More information about the hotspot-dev
mailing list