Integrated: 8376226: CharsetEncoder.canEncode(CharSequence) is much slower than necessary

Daniel Gredler dgredler at openjdk.org
Tue Jan 27 13:23:50 UTC 2026


On Fri, 23 Jan 2026 18:41:37 GMT, Daniel Gredler <dgredler at openjdk.org> wrote:

> Subclasses of `CharsetEncoder` often override `canEncode(char)` to make it very fast. This is not the case for `canEncode(CharSequence)`, which currently usually has to perform the full encoding process. As a result, `canEncode(CharSequence)` is about 20x slower than `canEncode(char)` when the input is encodable, and about 1600x slower than `canEncode(char)` when the input is not encodable.
> 
> Performance is even worse for un-encodable input because the internal logic relies on a thrown exception to determine that the input cannot be encoded (requiring stack trace setup, etc.).
> 
> This PR overrides `canEncode(CharSequence)` to simply check `canEncode(char)` on each character in the sequence when the encoding allows this (ASCII, ISO-8859-1, etc.). Where this is not possible (e.g. UTF-8, UTF-16), this PR removes the exception-based flow control in `CharsetEncoder` so that the un-encodable scenario is at least improved.
> 
> Regression tests run locally:
> - `make test TEST="jtreg:test/jdk/java/nio/charset"`
> - `make test TEST="jtreg:test/jdk/sun/nio/cs"`
> 
> The included benchmark can be run via `make test TEST="micro:java.nio.CharsetCanEncode"`.
> 
> JMH benchmark results **before** changes:
> 
> 
> Benchmark                                    Mode  Cnt    Score    Error  Units
> CharsetCanEncode.asciiCanEncodeCharNo        avgt   30    0.502 ±  0.004  ns/op
> CharsetCanEncode.asciiCanEncodeCharYes       avgt   30    0.503 ±  0.003  ns/op
> CharsetCanEncode.asciiCanEncodeStringNo      avgt   30  821.635 ±  7.055  ns/op
> CharsetCanEncode.asciiCanEncodeStringYes     avgt   30    8.875 ±  0.115  ns/op
> CharsetCanEncode.iso88591CanEncodeCharNo     avgt   30    0.508 ±  0.006  ns/op
> CharsetCanEncode.iso88591CanEncodeCharYes    avgt   30    0.506 ±  0.004  ns/op
> CharsetCanEncode.iso88591CanEncodeStringNo   avgt   30  833.165 ±  7.315  ns/op
> CharsetCanEncode.iso88591CanEncodeStringYes  avgt   30   10.357 ±  1.427  ns/op
> CharsetCanEncode.iso88592CanEncodeCharNo     avgt   30    0.957 ±  0.009  ns/op
> CharsetCanEncode.iso88592CanEncodeCharYes    avgt   30    1.407 ±  0.010  ns/op
> CharsetCanEncode.iso88592CanEncodeStringNo   avgt   30  826.478 ±  4.409  ns/op
> CharsetCanEncode.iso88592CanEncodeStringYes  avgt   30   13.223 ±  1.479  ns/op
> CharsetCanEncode.shiftjisCanEncodeCharNo     avgt   30    1.370 ±  0.012  ns/op
> CharsetCanEncode.shiftjisCanEncodeCharYes    avgt   30    1.386 ±  0.010  ns/op
> CharsetCanEncode.shiftjisCanEncodeStringNo   avgt   30  850.336 ± 20.107  ns/op
> C...
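
For readers following along, the two strategies described in the quoted text can be sketched roughly as below. This is an illustrative sketch only, not the code from the changeset: the class name `CanEncodeSketch` and both method names are hypothetical, and it uses only the public `java.nio.charset` API. The first method assumes a charset in which every encodable character is a single `char` (ASCII, ISO-8859-1 and similar). The second asks the encoder to report problems through `CoderResult` instead of relying on a thrown exception.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class CanEncodeSketch {

    // Strategy 1: check each char individually.
    // Assumption: the charset never needs more than one char (no surrogate
    // pairs) to form an encodable character, e.g. ASCII or ISO-8859-1.
    static boolean canEncodePerChar(CharsetEncoder enc, CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            if (!enc.canEncode(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Strategy 2: run the encoder, but inspect the returned CoderResult
    // rather than catching a CharacterCodingException.
    static boolean canEncodeWithoutExceptions(Charset cs, CharSequence s) {
        CharsetEncoder enc = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        CharBuffer in = CharBuffer.wrap(s);
        ByteBuffer out = ByteBuffer.allocate(64); // scratch space, contents discarded
        while (true) {
            CoderResult cr = enc.encode(in, out, true);
            if (cr.isError()) {
                return false;      // malformed or unmappable input
            }
            if (cr.isUnderflow()) {
                break;             // all input consumed
            }
            out.clear();           // overflow: reuse the scratch buffer
        }
        while (true) {
            CoderResult cr = enc.flush(out);
            if (cr.isError()) {
                return false;
            }
            if (cr.isUnderflow()) {
                return true;       // flush complete
            }
            out.clear();           // overflow while flushing
        }
    }

    public static void main(String[] args) {
        CharsetEncoder ascii = StandardCharsets.US_ASCII.newEncoder();
        System.out.println(canEncodePerChar(ascii, "hello"));
        System.out.println(canEncodePerChar(ascii, "h\u00e9llo"));
        System.out.println(canEncodeWithoutExceptions(StandardCharsets.UTF_8, "h\u00e9llo"));
        System.out.println(canEncodeWithoutExceptions(StandardCharsets.UTF_8, "\uD800"));
    }
}
```

Note that `canEncode` throws `IllegalStateException` if an encoding operation is already in progress on the encoder, so the per-character variant should be given an encoder that is not mid-operation.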

This pull request has now been integrated.

Changeset: 992a8ef4
Author:    Daniel Gredler <dgredler at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/992a8ef46bc0a06c70fd5f4f307dbd20e402ed33
Stats:     246 lines in 6 files changed: 238 ins; 1 del; 7 mod

8376226: CharsetEncoder.canEncode(CharSequence) is much slower than necessary

Reviewed-by: alanb, naoto

-------------

PR: https://git.openjdk.org/jdk/pull/29391

