RFR: 8274242: Implement fast-path for ASCII-compatible CharsetEncoders on x86

Claes Redestad redestad at openjdk.java.net
Fri Sep 24 16:04:08 UTC 2021


On Tue, 21 Sep 2021 21:58:48 GMT, Claes Redestad <redestad at openjdk.org> wrote:

> This patch extends the `ISO_8859_1.implEncodeISOArray` intrinsic on x86 to also handle ASCII encoding, so that, for example, the `UTF_8$Encoder` performs on par with (or outperforms) getting the same charset-encoded bytes from a String. The former took a small performance hit in JDK 9, and the latter improved greatly in the same release.
> 
> Extending the `EncodeIsoArray` intrinsics on other platforms should be possible, but I'm unfamiliar with the macro assembler in general, and unlike the x86 intrinsic they don't use a simple vectorized mask to implement the Latin-1 check. For example, aarch64 seems to filter out the low bytes and then check whether any bits are set in the high bytes. Clever, but very different from the 0xFF80 2-byte mask that an ASCII test wants.
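
To illustrate the mask mentioned in the quoted text, here is a minimal scalar sketch in Java. It only shows the per-char bit test; the real intrinsic applies the same kind of mask to many chars at once in vector registers. The class and method names are hypothetical and not part of the JDK, and the 0xFF00 value for the Latin-1 case is my own reading of "any bits set in the high byte", not something taken from the patch.

```java
// Hypothetical helper, not JDK code: scalar equivalent of the mask tests.
final class MaskCheck {
    // ASCII fits in 7 bits, so none of the bits in 0xFF80 may be set.
    static boolean isAscii(char c) {
        return (c & 0xFF80) == 0;
    }

    // ISO-8859-1 fits in 8 bits, so none of the bits in 0xFF00 may be set
    // (assumed mask, see the note above).
    static boolean isLatin1(char c) {
        return (c & 0xFF00) == 0;
    }

    public static void main(String[] args) {
        char[] samples = {'A', '\u007f', '\u0080', '\u00e9', '\u20ac'};
        for (char c : samples) {
            System.out.println(Integer.toHexString(c)
                    + " ascii=" + isAscii(c)
                    + " latin1=" + isLatin1(c));
        }
    }
}
```

The only difference between the two tests is bit 7, which is why a single vectorized mask check can in principle serve both encodings.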

The current version (cef05f4) copies the ISO_8859_1.implEncodeISOArray intrinsic and adapts it to work on ASCII encoding, which makes the UTF_8$Encoder perform on par with (or outperform) encoding from a String. The numbers below use microbenchmarks provided by @carterkozak: https://github.com/carterkozak/stringbuilder-encoding-performance

Baseline:


Benchmark                                                      (charsetName)                          (message)  (timesToAppend)  Mode  Cnt     Score    Error  Units
EncoderBenchmarks.charsetEncoder                                       UTF-8     This is a simple ASCII message                3  avgt    8   270.237 ± 10.504  ns/op
EncoderBenchmarks.charsetEncoder                                       UTF-8  This is a message with unicode ��                3  avgt    8   568.353 ±  2.331  ns/op
EncoderBenchmarks.charsetEncoderWithAllocation                         UTF-8     This is a simple ASCII message                3  avgt    8   324.889 ± 17.466  ns/op
EncoderBenchmarks.charsetEncoderWithAllocation                         UTF-8  This is a message with unicode ��                3  avgt    8   633.720 ± 22.703  ns/op
EncoderBenchmarks.charsetEncoderWithAllocationWrappingBuilder          UTF-8     This is a simple ASCII message                3  avgt    8  1132.436 ± 30.661  ns/op
EncoderBenchmarks.charsetEncoderWithAllocationWrappingBuilder          UTF-8  This is a message with unicode ��                3  avgt    8  1379.207 ± 66.982  ns/op
EncoderBenchmarks.toStringGetBytes                                     UTF-8     This is a simple ASCII message                3  avgt    8    91.253 ±  3.848  ns/op
EncoderBenchmarks.toStringGetBytes                                     UTF-8  This is a message with unicode ��                3  avgt    8   519.489 ± 12.516  ns/op


Patch:

Benchmark                                                      (charsetName)                          (message)  (timesToAppend)  Mode  Cnt     Score     Error  Units
EncoderBenchmarks.charsetEncoder                                       UTF-8     This is a simple ASCII message                3  avgt    4    82.535 ±  20.310  ns/op
EncoderBenchmarks.charsetEncoder                                       UTF-8  This is a message with unicode ��                3  avgt    4   522.679 ±  13.456  ns/op
EncoderBenchmarks.charsetEncoderWithAllocation                         UTF-8     This is a simple ASCII message                3  avgt    4   127.831 ±  32.612  ns/op
EncoderBenchmarks.charsetEncoderWithAllocation                         UTF-8  This is a message with unicode ��                3  avgt    4   549.343 ±  59.899  ns/op
EncoderBenchmarks.charsetEncoderWithAllocationWrappingBuilder          UTF-8     This is a simple ASCII message                3  avgt    4  1182.835 ± 153.735  ns/op
EncoderBenchmarks.charsetEncoderWithAllocationWrappingBuilder          UTF-8  This is a message with unicode ��                3  avgt    4  1416.407 ± 130.551  ns/op
EncoderBenchmarks.toStringGetBytes                                     UTF-8     This is a simple ASCII message                3  avgt    4    97.770 ±  15.742  ns/op
EncoderBenchmarks.toStringGetBytes                                     UTF-8  This is a message with unicode ��                3  avgt    4   516.351 ±  58.580  ns/op


This can probably be simplified further, say by adding a flag to the intrinsic indicating whether we're encoding ASCII only or ISO-8859-1. It also needs to be implemented and tested on all architectures.
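
As a rough sketch of what such a flag could look like at the Java level, the scalar loop below takes a boolean and picks the mask from it. This is an illustration under my own assumptions, not the code in the patch: the class name, the method name `encodeArray`, and the `ascii` parameter are all hypothetical, and the mask values are assumed as 0xFF80 for ASCII and 0xFF00 for ISO-8859-1.

```java
// Hypothetical sketch, not JDK code: one encode loop shared by both charsets.
final class EncodeSketch {
    /**
     * Copies chars from sa to da until a char falls outside the target
     * range, and returns the number of chars copied.
     */
    static int encodeArray(char[] sa, int sp, byte[] da, int dp, int len, boolean ascii) {
        // Assumed masks: bits that must be clear for the char to be encodable.
        int mask = ascii ? 0xFF80 : 0xFF00;
        int i = 0;
        for (; i < len; i++) {
            char c = sa[sp + i];
            if ((c & mask) != 0) {
                break; // first unencodable char, caller falls back to the slow path
            }
            da[dp + i] = (byte) c;
        }
        return i;
    }

    public static void main(String[] args) {
        char[] src = "h\u00e9llo".toCharArray();
        byte[] dst = new byte[src.length];
        System.out.println(encodeArray(src, 0, dst, 0, src.length, true));
        System.out.println(encodeArray(src, 0, dst, 0, src.length, false));
    }
}
```

With the input "h\u00e9llo", the ASCII variant stops after the first char, while the ISO-8859-1 variant copies all five.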

(edit: accidentally edited rather than quote-replied; restored original comment)

On the JDK-included `CharsetEncodeDecode.encode` microbenchmark, I get these numbers in the baseline (18-b09):


Benchmark                   (size)       (type)  Mode  Cnt    Score   Error  Units
CharsetEncodeDecode.encode   16384        UTF-8  avgt   30   39.962 ± 1.703  us/op
CharsetEncodeDecode.encode   16384         BIG5  avgt   30  153.282 ± 4.521  us/op
CharsetEncodeDecode.encode   16384  ISO-8859-15  avgt   30  192.040 ± 4.543  us/op
CharsetEncodeDecode.encode   16384        ASCII  avgt   30   40.051 ± 1.210  us/op
CharsetEncodeDecode.encode   16384       UTF-16  avgt   30  302.815 ± 9.490  us/op


With the proposed patch:

Benchmark                   (size)       (type)  Mode  Cnt    Score    Error  Units
CharsetEncodeDecode.encode   16384        UTF-8  avgt   30    4.081 ±  0.182  us/op
CharsetEncodeDecode.encode   16384         BIG5  avgt   30  150.374 ±  3.579  us/op
CharsetEncodeDecode.encode   16384  ISO-8859-15  avgt   30    4.010 ±  0.179  us/op
CharsetEncodeDecode.encode   16384        ASCII  avgt   30    3.961 ±  0.176  us/op
CharsetEncodeDecode.encode   16384       UTF-16  avgt   30  302.235 ± 11.395  us/op


That is: on my system, encoding 16K chars of ASCII data is about 10x faster for UTF-8 and ASCII, and roughly 48x faster for ASCII-compatible charsets like ISO-8859-15. On the third-party microbenchmarks, performance for non-ASCII input either doesn't change or improves when messages have an ASCII prefix.
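
For reference, the speedup figures follow directly from dividing the baseline score by the patched score for each charset. The short Java snippet below redoes that arithmetic with the `CharsetEncodeDecode.encode` numbers from the two tables above; the class and method names are only illustrative.

```java
import java.util.Locale;

// Illustrative only: recomputes speedups from the benchmark tables above.
final class Speedup {
    // Both scores are average times in us/op, so lower is better and the
    // speedup is baseline divided by patched.
    static double speedup(double baselineUs, double patchedUs) {
        return baselineUs / patchedUs;
    }

    public static void main(String[] args) {
        String[] types = {"UTF-8", "BIG5", "ISO-8859-15", "ASCII", "UTF-16"};
        double[] baseline = {39.962, 153.282, 192.040, 40.051, 302.815};
        double[] patched = {4.081, 150.374, 4.010, 3.961, 302.235};
        for (int i = 0; i < types.length; i++) {
            System.out.println(String.format(Locale.ROOT, "%-12s %.1fx",
                    types[i], speedup(baseline[i], patched[i])));
        }
    }
}
```

UTF-8 and ASCII come out at roughly 10x and ISO-8859-15 at just under 48x, while BIG5 and UTF-16 stay at about 1.0x, within the reported error.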

-------------

PR: https://git.openjdk.java.net/jdk/pull/5621

