<i18n dev> Integrated: 8275863: Use encodeASCII for ASCII-compatible DoubleByte encodings

Claes Redestad redestad at openjdk.java.net
Wed Oct 27 10:11:16 UTC 2021


On Mon, 25 Oct 2021 10:16:23 GMT, Claes Redestad <redestad at openjdk.org> wrote:

> Enhance ASCII-compatible `DoubleByte` encodings to use the `StringCoding.implEncodeAsciiArray` intrinsic. This lets many such `CharsetEncoder`s encode ASCII text at speeds comparable to most single-byte encoders, and brings them more in line with how well these charsets already perform when calling `String.getBytes`:
> 
> Before:
> 
> Benchmark                   (size)      (type)  Mode  Cnt   Score   Error  Units
> CharsetEncodeDecode.encode   16384  ISO-8859-1  avgt   30   3.021 ± 0.120  us/op
> CharsetEncodeDecode.encode   16384   Shift-JIS  avgt   30  47.793 ± 1.942  us/op
> CharsetEncodeDecode.encode   16384      GB2312  avgt   30  49.598 ± 2.006  us/op
> CharsetEncodeDecode.encode   16384      EUC-JP  avgt   30  68.709 ± 5.019  us/op
> CharsetEncodeDecode.encode   16384      EUC-KR  avgt   30  48.033 ± 1.651  us/op
> 
> 
> Patched:
> 
> Benchmark                   (size)      (type)  Mode  Cnt  Score   Error  Units
> CharsetEncodeDecode.encode   16384  ISO-8859-1  avgt   30  2.856 ± 0.078  us/op
> CharsetEncodeDecode.encode   16384   Shift-JIS  avgt   30  5.287 ± 0.209  us/op
> CharsetEncodeDecode.encode   16384      GB2312  avgt   30  5.490 ± 0.251  us/op
> CharsetEncodeDecode.encode   16384      EUC-JP  avgt   30  7.657 ± 0.368  us/op
> CharsetEncodeDecode.encode   16384      EUC-KR  avgt   30  5.718 ± 0.267  us/op
> 
> 
> The `isASCIICompatible` predicate on `DoubleByte` was added in JEP 254 for the purpose of implementing such ASCII fast-paths, but so far it is only used in what is now `String.encodeWithEncoder`, where it speeds up `String.getBytes(...)` operations.
> 
> Testing: tier1-3
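
For readers unfamiliar with the idea, the ASCII fast-path described in the quoted message can be sketched roughly as follows. This is an illustration only, not the code from the changeset: the class `AsciiFastPathSketch` and the helper `encodeAsciiPrefix` are hypothetical names, and the helper is a plain-Java stand-in for the intrinsic, not its actual implementation. The sketch assumes a stateless, ASCII-compatible charset in which every char below 0x80 maps to the identical single byte.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.Arrays;

// Hypothetical sketch of an ASCII fast-path in front of a general encoder.
final class AsciiFastPathSketch {

    // Plain-Java stand-in for an intrinsic (assumption, not the JDK method):
    // copies leading ASCII chars into dst and returns how many were copied.
    static int encodeAsciiPrefix(char[] src, int sp, byte[] dst, int dp, int len) {
        int i = 0;
        while (i < len && src[sp + i] < 0x80) {
            dst[dp + i] = (byte) src[sp + i];
            i++;
        }
        return i;
    }

    // Encodes s with cs, handling the leading ASCII run in the fast loop and
    // delegating only the remainder to the regular CharsetEncoder.
    // Assumes cs is stateless and ASCII-compatible.
    static byte[] encode(String s, Charset cs) {
        char[] src = s.toCharArray();
        CharsetEncoder enc = cs.newEncoder();
        int cap = (int) Math.ceil(src.length * (double) enc.maxBytesPerChar());
        byte[] dst = new byte[cap];

        int n = encodeAsciiPrefix(src, 0, dst, 0, src.length);
        if (n == src.length) {
            return Arrays.copyOf(dst, n);   // pure ASCII: slow path never runs
        }
        try {
            // Slow path: the general encoder handles everything after the prefix.
            ByteBuffer rest = enc.encode(CharBuffer.wrap(src, n, src.length - n));
            int restLen = rest.remaining();
            rest.get(dst, n, restLen);
            return Arrays.copyOf(dst, n + restLen);
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("input not encodable in " + cs, e);
        }
    }
}
```

Calling `AsciiFastPathSketch.encode(text, charset)` on mappable input should yield the same bytes as `text.getBytes(charset)` for charsets that meet the stated assumptions. The sketch only handles a leading ASCII run; a real encoder loop could re-enter the fast path after each non-ASCII character.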

This pull request has now been integrated.

Changeset: 6c05cc9d
Author:    Claes Redestad <redestad at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/6c05cc9d15fb6014b8293a66ef132f3461badca1
Stats:     35 lines in 5 files changed: 24 ins; 4 del; 7 mod

8275863: Use encodeASCII for ASCII-compatible DoubleByte encodings

Reviewed-by: naoto, rriggs, alanb

-------------

PR: https://git.openjdk.java.net/jdk/pull/6102

