RFR: 8274242: Implement fast-path for ASCII-compatible CharsetEncoders on x86
Claes Redestad
redestad at openjdk.java.net
Fri Sep 24 16:04:08 UTC 2021
On Tue, 21 Sep 2021 21:58:48 GMT, Claes Redestad <redestad at openjdk.org> wrote:
> This patch extends the `ISO_8859_1.implEncodeISOArray` intrinsic on x86 to also work for ASCII encoding. This makes, for example, the `UTF_8$Encoder` perform on par with (or outperform) getting the charset-encoded bytes from a String. The former took a small performance hit in JDK 9, and the latter improved greatly in the same release.
>
> Extending the `EncodeIsoArray` intrinsics on other platforms should be possible, but I'm unfamiliar with the macro assembler in general, and unlike the x86 intrinsic they don't use a simple vectorized mask to implement the latin-1 check. For example, aarch64 seems to filter out the low bytes and then check whether any bits are set in the high bytes. Clever, but very different from the 0xFF80 2-byte mask that an ASCII test wants.
The current version (cef05f4) copies the `ISO_8859_1.implEncodeISOArray` intrinsic and adapts it to work on ASCII encoding, which makes the `UTF_8$Encoder` perform on par with (or outperform) encoding from a String. The numbers below use the microbenchmarks provided by @carterkozak here: https://github.com/carterkozak/stringbuilder-encoding-performance
Baseline:
Benchmark                                                      (charsetName)  (message)                       (timesToAppend)  Mode  Cnt     Score      Error  Units
EncoderBenchmarks.charsetEncoder                               UTF-8          This is a simple ASCII message  3                avgt    8   270.237  ±  10.504  ns/op
EncoderBenchmarks.charsetEncoder                               UTF-8          This is a message with unicode  3                avgt    8   568.353  ±   2.331  ns/op
EncoderBenchmarks.charsetEncoderWithAllocation                 UTF-8          This is a simple ASCII message  3                avgt    8   324.889  ±  17.466  ns/op
EncoderBenchmarks.charsetEncoderWithAllocation                 UTF-8          This is a message with unicode  3                avgt    8   633.720  ±  22.703  ns/op
EncoderBenchmarks.charsetEncoderWithAllocationWrappingBuilder  UTF-8          This is a simple ASCII message  3                avgt    8  1132.436  ±  30.661  ns/op
EncoderBenchmarks.charsetEncoderWithAllocationWrappingBuilder  UTF-8          This is a message with unicode  3                avgt    8  1379.207  ±  66.982  ns/op
EncoderBenchmarks.toStringGetBytes                             UTF-8          This is a simple ASCII message  3                avgt    8    91.253  ±   3.848  ns/op
EncoderBenchmarks.toStringGetBytes                             UTF-8          This is a message with unicode  3                avgt    8   519.489  ±  12.516  ns/op
Patch:
Benchmark                                                      (charsetName)  (message)                       (timesToAppend)  Mode  Cnt     Score      Error  Units
EncoderBenchmarks.charsetEncoder                               UTF-8          This is a simple ASCII message  3                avgt    4    82.535  ±  20.310  ns/op
EncoderBenchmarks.charsetEncoder                               UTF-8          This is a message with unicode  3                avgt    4   522.679  ±  13.456  ns/op
EncoderBenchmarks.charsetEncoderWithAllocation                 UTF-8          This is a simple ASCII message  3                avgt    4   127.831  ±  32.612  ns/op
EncoderBenchmarks.charsetEncoderWithAllocation                 UTF-8          This is a message with unicode  3                avgt    4   549.343  ±  59.899  ns/op
EncoderBenchmarks.charsetEncoderWithAllocationWrappingBuilder  UTF-8          This is a simple ASCII message  3                avgt    4  1182.835  ± 153.735  ns/op
EncoderBenchmarks.charsetEncoderWithAllocationWrappingBuilder  UTF-8          This is a message with unicode  3                avgt    4  1416.407  ± 130.551  ns/op
EncoderBenchmarks.toStringGetBytes                             UTF-8          This is a simple ASCII message  3                avgt    4    97.770  ±  15.742  ns/op
EncoderBenchmarks.toStringGetBytes                             UTF-8          This is a message with unicode  3                avgt    4   516.351  ±  58.580  ns/op
This can probably be simplified further, say by adding a flag to the intrinsic that indicates whether we're encoding ASCII only or ISO-8859-1. It also needs to be implemented and tested on all architectures.
(edit: accidentally edited rather than quote-replied; restored the original comment)
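To make the flag idea concrete, here is a minimal scalar sketch in Java. It is not the intrinsic and not code from the patch: the class name `EncodeSketch`, the method `encodeArray` and the `ascii` parameter are hypothetical, chosen only for illustration. It shows how a single routine could pick the 0xFF80 mask for ASCII or the 0xFF00 mask for ISO-8859-1 and return the number of chars it managed to encode.

```java
// Hypothetical scalar stand-in for the vectorized intrinsic; all names are illustrative.
final class EncodeSketch {

    /**
     * Copies chars from src to dst as single bytes until a char outside the
     * target range is found. Returns the number of chars encoded.
     */
    static int encodeArray(char[] src, int sp, byte[] dst, int dp, int len, boolean ascii) {
        // 0xFF80 has a bit set for every char above 0x7F (non-ASCII);
        // 0xFF00 has a bit set for every char above 0xFF (non-ISO-8859-1).
        final int mask = ascii ? 0xFF80 : 0xFF00;
        int i = 0;
        for (; i < len; i++) {
            char c = src[sp + i];
            if ((c & mask) != 0) {
                break; // the caller would continue on its slower general path from here
            }
            dst[dp + i] = (byte) c;
        }
        return i;
    }
}
```

For the input "abc\u00e9d", this sketch stops after 3 chars when `ascii` is true and encodes all 5 when it is false, since U+00E9 fits in ISO-8859-1 but not in ASCII.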
On the JDK-included `CharsetEncodeDecode.encode` microbenchmark, I get these numbers in the baseline (18-b09):
Benchmark                   (size)  (type)       Mode  Cnt    Score     Error  Units
CharsetEncodeDecode.encode   16384  UTF-8        avgt   30   39.962  ±  1.703  us/op
CharsetEncodeDecode.encode   16384  BIG5         avgt   30  153.282  ±  4.521  us/op
CharsetEncodeDecode.encode   16384  ISO-8859-15  avgt   30  192.040  ±  4.543  us/op
CharsetEncodeDecode.encode   16384  ASCII        avgt   30   40.051  ±  1.210  us/op
CharsetEncodeDecode.encode   16384  UTF-16       avgt   30  302.815  ±  9.490  us/op
With the proposed patch:
Benchmark                   (size)  (type)       Mode  Cnt    Score     Error  Units
CharsetEncodeDecode.encode   16384  UTF-8        avgt   30    4.081  ±  0.182  us/op
CharsetEncodeDecode.encode   16384  BIG5         avgt   30  150.374  ±  3.579  us/op
CharsetEncodeDecode.encode   16384  ISO-8859-15  avgt   30    4.010  ±  0.179  us/op
CharsetEncodeDecode.encode   16384  ASCII        avgt   30    3.961  ±  0.176  us/op
CharsetEncodeDecode.encode   16384  UTF-16       avgt   30  302.235  ± 11.395  us/op
That is: on my system, encoding 16K chars of ASCII data is roughly 10x faster for UTF-8 and ASCII, and roughly 48x faster for ASCII-compatible charsets like ISO-8859-15. On the 3rd party microbenchmarks, performance for non-ASCII input either doesn't change or improves when messages have an ASCII prefix.
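For readers who want to see the two code paths being compared, here is a small Java sketch. The class and method names (`EncodePaths`, `viaEncoder`, `viaString`) are made up for this example and are not taken from the benchmarks. It wraps a `char[]` on the assumption that an array-backed `CharBuffer` is what lets the encoder reach its array loop; the unchanged `WrappingBuilder` numbers above would be consistent with that, but treat it as an assumption rather than a statement about the patch.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

// Illustrative only: the two ways of getting UTF-8 bytes discussed in this thread.
final class EncodePaths {

    /** Encodes through a CharsetEncoder, using an array-backed CharBuffer. */
    static byte[] viaEncoder(char[] chars) {
        try {
            ByteBuffer bb = StandardCharsets.UTF_8.newEncoder().encode(CharBuffer.wrap(chars));
            byte[] out = new byte[bb.remaining()];
            bb.get(out);
            return out;
        } catch (CharacterCodingException e) {
            // Malformed input such as an unpaired surrogate; rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
    }

    /** Encodes by materializing a String and calling getBytes. */
    static byte[] viaString(char[] chars) {
        return new String(chars).getBytes(StandardCharsets.UTF_8);
    }
}
```

Both methods should produce identical bytes for well-formed input; only the cost of getting there differs.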
-------------
PR: https://git.openjdk.java.net/jdk/pull/5621