RFR: 8179444: AArch64: Put zero_words on a diet

Andrew Haley aph at redhat.com
Sun May 7 08:41:13 UTC 2017


Hi,

On 05/05/17 23:54, White, Derek wrote:

> src/cpu/aarch64/vm/macroAssembler_aarch64.cpp:
>  - zero_words(reg, reg):
>    - Comment mentions that it's used by the ClearArray pattern, but
> it's also used by generate_fill().

OK, but there's no contradiction there.  The comment just explains why
zero_words must be small.

>    - The old zero_words(), via block_zero() and fill_words(), would
> align base to 16-bytes before doing stps, the new code doesn't. It
> may be worth conditionally pre-aligning if AvoidUnalignedAcesses is
> true. And could conditionally remove pre-aligning in zero_blocks.

But that'd bloat zero_words, surely.  And I haven't see the new code
running any slower on any hardware,

> Line 4999: This is pre-existing, but it's confusing - could you
> rename "ShortArraySize" to "SmallArraySize"?

OK.

> - zero_words(reg, int):
>  - I don't see a great way to handle unaligned accesses without
> blowing up the code size. So no real request here.
>    - It looks like worst case is 15 unaligned stp instructions. 
>    - With some benchmarking someday I might argue for calling this
> constant count version for smaller copies.

I don't understand this.

>    - Unless ClearArray only zero's from the first array element on -
> then we could guess if base is 16-byte unaligned by looking at the
> array header size.
> 
> -  zero_dcache_blocks():

>   - Suggest a new comment that mentions that it's only called from
> zero_blocks()? Or if it's meant to be more general then add comments
> about the requirements for base alignment, and that cnt has to be >=
> 2*zva_length.

OK.

>   - zero_blocks DOES align base to 16-bytes, so we don't need to
> check here? Or make it a runtime assert?

I did wonder about that, but left it in as a safety net because its
cost is immeasurably small.

Thanks for such a detailed review.

Andrew.


More information about the hotspot-dev mailing list