RFR: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions

Jatin Bhateja jbhateja at openjdk.java.net
Mon Sep 14 13:18:39 UTC 2020


On Mon, 14 Sep 2020 05:01:24 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> /csr needed
>> 
>> Adding a new product flag requires a CSR request to be filed.
> 
> @dholmes-ora, with https://github.com/openjdk/jdk/commit/5144190e there has been a cleanup of options, and product
> options now accept DIAGNOSTIC as an additional parameter. The newly added flag is a DIAGNOSTIC flag.

> Mailing list message from Andrew Haley on hotspot-dev:
> On 13/09/2020 20:12, Jatin Bhateja wrote:
> 
> 1) Partial in-lining technique avoids call overhead penalty for
> sub-word type small array copy operations with size less than 32
> bytes. 2) At runtime, a conditional check based on copy length
> either calls an array-copy stub or executes an optimized instruction
> sequence using AVX-512 masked instructions emitted at the call site.
> 
> This may not be a good idea. See my reply at
> https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-September/043114.html
> https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-September/043155.html

Frequency level switchover is sensitive to vector size; this has been taken care of by using 32-byte masked vector
operations in the default mode.

The default value of ArrayCopyPartialInlineSize is 32, i.e. copy sizes between 1 and 32 bytes are partially inlined at
the call site using masked vector moves operating over YMM registers. Only if the user sets it to 64 do we use ZMM
registers, which forces a frequency level switchover to a lower frequency level (LVL1).

So an AVX512 lite instruction working over a 32-byte vector (YMM) will operate at the maximum frequency level (LVL0).
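
For readers following along, the length check and masked move described above can be modelled roughly in plain Java.
This is only an illustrative sketch, not the code C2 emits: the class PartialInlineSketch, the constant
PARTIAL_INLINE_SIZE (standing in for ArrayCopyPartialInlineSize), and the methods maskedCopy and copy are all
hypothetical names, and the masked vector move is emulated with a scalar loop over 32 byte lanes.

```java
// Illustrative model only: emulates a masked 32-byte vector copy in scalar Java.
final class PartialInlineSketch {
    // Hypothetical stand-in for the ArrayCopyPartialInlineSize flag (default 32 per the message).
    static final int PARTIAL_INLINE_SIZE = 32;

    // Emulates a masked load followed by a masked store:
    // lane i is touched only if bit i of the mask is set.
    static void maskedCopy(byte[] src, int srcPos, byte[] dst, int dstPos, long mask) {
        byte[] vec = new byte[PARTIAL_INLINE_SIZE]; // models one YMM register
        for (int lane = 0; lane < PARTIAL_INLINE_SIZE; lane++) {
            if ((mask & (1L << lane)) != 0) {
                vec[lane] = src[srcPos + lane];     // masked load
            }
        }
        for (int lane = 0; lane < PARTIAL_INLINE_SIZE; lane++) {
            if ((mask & (1L << lane)) != 0) {
                dst[dstPos + lane] = vec[lane];     // masked store
            }
        }
    }

    // Runtime check on the copy length: small copies take the inlined masked
    // path, everything else falls back to the regular array-copy routine.
    // Returns the name of the path taken, purely for illustration.
    static String copy(byte[] src, int srcPos, byte[] dst, int dstPos, int length) {
        if (length > 0 && length <= PARTIAL_INLINE_SIZE) {
            long mask = (1L << length) - 1;         // low `length` bits set
            maskedCopy(src, srcPos, dst, dstPos, mask);
            return "inline";
        }
        System.arraycopy(src, srcPos, dst, dstPos, length);
        return "stub";
    }
}
```

In this model a 5-byte copy builds the mask 0b11111 and touches only five lanes, while a 33-byte copy takes the
fallback path. Raising the hypothetical PARTIAL_INLINE_SIZE to 64 would correspond to the ZMM case mentioned above.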

> --
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

-------------

PR: https://git.openjdk.java.net/jdk/pull/144

