RFR: 8264762: ByteBuffer.byteOrder(BIG_ENDIAN).asXBuffer.put(Xarray) and ByteBuffer.byteOrder(nativeOrder()).asXBuffer.put(Xarray) are slow

Mon Apr 26 17:25:06 UTC 2021

On Fri, 23 Apr 2021 19:23:13 GMT, Brian Burkhalter <bpb at openjdk.org> wrote:

> Please consider this request to accelerate absolute and relative bulk array transfer on views of heap byte buffers where the element size is greater than one. What currently happens is that the transfer devolves to a “loopy” element-by-element copy such as
> 
>         int end = offset + length;
>         for (int i = offset, j = index; i < end; i++, j++)
>             dst[i] = get(j);
> 
> for `get()`, and
> 
>         int end = offset + length;
>         for (int i = offset, j = index; i < end; i++, j++)
>             this.put(j, src[i]);
> 
> for `put()`. This is of course relatively slow.
> 
> The proposed change hoists the accelerated versions of these methods, which use the `ScopedMemoryAccess` methods `copyMemory()` and `copySwapMemory()`, from `Direct-X-Buffer` to `X-Buffer`. The array bulk transfer methods are removed from `Direct-X-Buffer` itself. The number of lines of code in the templates decreases by 87.
> 
> With this change the throughput of array bulk `put()` and `get()` for heap view buffers is increased by a factor of 6 to 11 compared with the current code. The performance of direct view buffers does not appear to be affected.
> 
> No tests are modified or added as existing tests already cover these methods. All tests in CI tiers 1-3 passed on all platforms.
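
For anyone following along, the case being accelerated can be sketched from the outside with public API only. This is a rough illustration, not the template code: the class name `ViewBulkPut` and its helper methods are made up for the example, and it only shows that the element-by-element path and the bulk `put(int[])` path on a heap view buffer produce the same bytes for either byte order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;
import java.util.Arrays;

// Hypothetical demo class; the names here are not part of the JDK or the PR.
public class ViewBulkPut {

    // Element-by-element absolute put through an int view of a heap byte buffer.
    static byte[] loopPut(int[] src, ByteOrder order) {
        ByteBuffer bb = ByteBuffer.allocate(src.length * Integer.BYTES).order(order);
        IntBuffer view = bb.asIntBuffer();
        for (int i = 0; i < src.length; i++) {
            view.put(i, src[i]);
        }
        return bb.array();
    }

    // Bulk relative put of the whole array through the same kind of view.
    static byte[] bulkPut(int[] src, ByteOrder order) {
        ByteBuffer bb = ByteBuffer.allocate(src.length * Integer.BYTES).order(order);
        bb.asIntBuffer().put(src);
        return bb.array();
    }

    public static void main(String[] args) {
        int[] src = {1, 2, 3, 0x01020304};
        for (ByteOrder order : new ByteOrder[] {ByteOrder.BIG_ENDIAN, ByteOrder.LITTLE_ENDIAN}) {
            boolean same = Arrays.equals(loopPut(src, order), bulkPut(src, order));
            System.out.println(order + ": loop and bulk agree = " + same);
        }
    }
}
```

The two paths must stay byte-for-byte equivalent; only the speed of the bulk path is expected to change.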

src/java.base/share/classes/java/nio/X-Buffer.java.template line 275:

> 273: 
> 274:     // Number of bytes per $type$
> 275:     private static final long ELEMENT_SIZE = 1L << $LG_BYTES_PER_VALUE$;

IIRC you could use the constant `$Type$.BYTES` rather than declare a new static field.
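
A quick sanity check of the equivalence, as a standalone sketch rather than template code. The class name `ElementSizeCheck` and the `LG_*` constants are made up for illustration; the log2 values are my assumption of what `$LG_BYTES_PER_VALUE$` expands to for each type. Note that for `char` the wrapper class is `Character`, so the template variable would have to expand to the wrapper name.

```java
// Hypothetical check; the LG_* values are assumed, not read from the build.
public class ElementSizeCheck {

    // Assumed log2 of the element size for each non-byte primitive type.
    static final int LG_CHAR = 1, LG_SHORT = 1, LG_INT = 2,
                     LG_LONG = 3, LG_FLOAT = 2, LG_DOUBLE = 3;

    // True if shifting by the log2 value gives the wrapper's BYTES constant.
    static boolean matches(int lgBytes, int bytes) {
        return (1L << lgBytes) == bytes;
    }

    static boolean allMatch() {
        return matches(LG_CHAR, Character.BYTES)
            && matches(LG_SHORT, Short.BYTES)
            && matches(LG_INT, Integer.BYTES)
            && matches(LG_LONG, Long.BYTES)
            && matches(LG_FLOAT, Float.BYTES)
            && matches(LG_DOUBLE, Double.BYTES);
    }

    public static void main(String[] args) {
        System.out.println("1L << lg == BYTES for all types: " + allMatch());
    }
}
```

One difference to keep in mind: the `BYTES` constants are `int`, whereas the proposed field is `long`, so any arithmetic that relies on `long` width would need a widening at the use site.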

-------------

PR: https://git.openjdk.java.net/jdk/pull/3660