RFR: 8274243: Implement fast-path for ASCII-compatible CharsetEncoders on aarch64

Tue Jan 4 12:37:16 UTC 2022

On Mon, 3 Jan 2022 12:25:21 GMT, Patric Hedlin <phedlin at openjdk.org> wrote:

> Implementation of ISO/ASCII charset encoding, extending the current implementation with ASCII encoding support.
> 
> The motivation is found in the original x86 issue ([JDK-8274242](https://bugs.openjdk.java.net/browse/JDK-8274242)).
> 
> The implementation aims for a balance between footprint and efficiency, trying to utilise a dual SIMD path (e.g. Neoverse N1) for the additional ASCII check while avoiding performance loss in the ISO-only case.
> 
> - Interleave ISO and ASCII check code.
> - Avoid 'umaxv' in the ISO main flow.
> - Use post-increment in the main loop.
> - Retain the 8-char loop.
> - Remove the conditional prefetch (no upside).
> - Add ISO-8859-1 to the encode-decode benchmark.
> 
> Testing (Linux): tier1-6
> 
> The revised version compares like this (master first, then update).
> 
> Benchmark                   (size)       (type)  Mode  Cnt    Score   Error  Units
> CharsetEncodeDecode.encode   16384        UTF-8  avgt   30   17.920 ± 0.229  us/op
> CharsetEncodeDecode.encode   16384         BIG5  avgt   30   18.867 ± 0.356  us/op
> CharsetEncodeDecode.encode   16384  ISO-8859-15  avgt   30   17.419 ± 0.220  us/op
> CharsetEncodeDecode.encode   16384   ISO-8859-1  avgt   30    6.200 ± 0.134  us/op
> CharsetEncodeDecode.encode   16384        ASCII  avgt   30   17.149 ± 0.219  us/op
> CharsetEncodeDecode.encode   16384       UTF-16  avgt   30  135.115 ± 1.440  us/op
> 
> 
> Benchmark                   (size)       (type)  Mode  Cnt    Score   Error  Units
> CharsetEncodeDecode.encode   16384        UTF-8  avgt   30    9.018 ± 0.179  us/op
> CharsetEncodeDecode.encode   16384         BIG5  avgt   30   10.550 ± 0.470  us/op
> CharsetEncodeDecode.encode   16384  ISO-8859-15  avgt   30    8.843 ± 0.187  us/op
> CharsetEncodeDecode.encode   16384   ISO-8859-1  avgt   30    6.406 ± 0.155  us/op
> CharsetEncodeDecode.encode   16384        ASCII  avgt   30    8.822 ± 0.173  us/op
> CharsetEncodeDecode.encode   16384       UTF-16  avgt   30  135.195 ± 1.432  us/op

This looks good. The acceleration evidently does little for typical short Java strings (about 32 chars on average), but it is very effective once strings grow beyond a few kbytes. The patch doesn't slow down the encoding of even very short strings, and the generated code is pretty small, so it's a reasonable compromise.
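
For readers following along, here is a minimal scalar sketch in Java of the check the fast path performs: copy chars to bytes until one exceeds the charset's limit (0x7F for ASCII, 0xFF for ISO-8859-1). The class and method names are hypothetical and chosen for illustration; this is not the code in the patch, which interleaves the two checks in SIMD assembly.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public final class AsciiFastPathSketch {

    // Hypothetical helper (not a JDK method): copies chars from src to dst
    // until a char above 'limit' is seen, and returns how many were copied.
    // limit = 0x7F models the ASCII check, limit = 0xFF the ISO-8859-1 check.
    static int encodeWhileAtMost(char[] src, int sp, byte[] dst, int dp,
                                 int len, int limit) {
        int i = 0;
        for (; i < len; i++) {
            char c = src[sp + i];
            if (c > limit) {
                break; // first char that is not encodable as a single byte
            }
            dst[dp + i] = (byte) c;
        }
        return i;
    }

    public static void main(String[] args) {
        // 'é' (U+00E9) fits in ISO-8859-1 but not in ASCII.
        char[] src = "abc\u00e9def".toCharArray();
        byte[] dst = new byte[src.length];

        int asciiPrefix = encodeWhileAtMost(src, 0, dst, 0, src.length, 0x7F);
        int isoPrefix = encodeWhileAtMost(src, 0, dst, 0, src.length, 0xFF);
        System.out.println("ASCII prefix: " + asciiPrefix
                + ", ISO-8859-1 prefix: " + isoPrefix);

        // Cross-check the ISO result against the JDK's own encoder.
        byte[] expected = new String(src).getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("matches JDK encoding: " + Arrays.equals(dst, expected));
    }
}
```

The caller handles whatever follows the returned prefix (an unmappable char, or the end of input), which is why a vectorised version only has to detect that some lane exceeds the limit before falling back to a shorter loop.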

-------------

PR: https://git.openjdk.java.net/jdk/pull/6945