RFR: 8376483: Avoid loading java.nio.charset.StandardCharsets in java.util.zip.ZipCoder
Please review this class which removes a dependency on `java.nio.charset.StandardCharsets` introduced in [JDK-8365703](https://bugs.openjdk.org/browse/JDK-8365703). We can use `UTF_8.INSTANCE` here as elsewhere in this class to prevent startup regression loading otherwise unused classes and charsets. Verified that running `java -cp hello.jar Hello" observes the following diff in loaded classes when run with this PR: < java.nio.charset.StandardCharsets < sun.nio.cs.US_ASCII < sun.nio.cs.ISO_8859_1 < sun.nio.cs.UTF_16BE < sun.nio.cs.UTF_16LE < sun.nio.cs.UTF_16 < sun.nio.cs.UTF_32BE < sun.nio.cs.UTF_32LE < sun.nio.cs.UTF_32 This is a strict refactoring, no tests updated, `noreg-trivial`. ------------- Commit messages: - Avoid loading java.nio.charset.StandardCharsets Changes: https://git.openjdk.org/jdk/pull/29443/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29443&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8376483 Stats: 3 lines in 1 file changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/29443.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/29443/head:pull/29443 PR: https://git.openjdk.org/jdk/pull/29443
On Tue, 27 Jan 2026 10:44:42 GMT, Eirik Bjørsnøs <eirbjo@openjdk.org> wrote:
Please review this PR which removes a dependency on `java.nio.charset.StandardCharsets` introduced in [JDK-8365703](https://bugs.openjdk.org/browse/JDK-8365703).
We can use `UTF_8.INSTANCE` here as elsewhere in this class to prevent startup regression loading otherwise unused classes and charsets.
Verified that running `java -cp hello.jar Hello` observes the following diff in loaded classes when run with this PR:
< java.nio.charset.StandardCharsets < sun.nio.cs.US_ASCII < sun.nio.cs.ISO_8859_1 < sun.nio.cs.UTF_16BE < sun.nio.cs.UTF_16LE < sun.nio.cs.UTF_16 < sun.nio.cs.UTF_32BE < sun.nio.cs.UTF_32LE < sun.nio.cs.UTF_32
This is a strict refactoring, no tests updated, `noreg-trivial`.
I don't think this is worth it - most user programs will refer to UTF_8 as StandardCharsets.UTF_8, so this doesn't really optimize anything. Also in the ArrayDeque to ArrayList case, ArrayList has other allocation benefits in addition to avoiding ArrayDeque class load. So I personally recommend withdrawing this patch, especially that AOT preimage should be able to capture the linked or even initialized state of StandardCharsets class. ------------- PR Comment: https://git.openjdk.org/jdk/pull/29443#issuecomment-3805526018
On Tue, 27 Jan 2026 14:26:45 GMT, Chen Liang <liach@openjdk.org> wrote:
I don't think this is worth it - most user programs will refer to UTF_8 as StandardCharsets.UTF_8, so this doesn't really optimize anything. Also in the ArrayDeque to ArrayList case, ArrayList has other allocation benefits in addition to avoiding ArrayDeque class load. So I personally recommend withdrawing this patch, especially that AOT preimage should be able to capture the linked or even initialized state of StandardCharsets class.
Some context: I spotted this because I've gotten review feedback in the past (from @cl4es IIRC) about avoiding `StandardCharsets` in this class. The current version of the class has three occurrences of `UTF_8.INSTANCE` besides this one recently introduced `StandardCharsets.UTF_8` instance. So there is a consistency issue here as well, besides any (valid or invalid) performance concerns. See also the following note from `StandardCharsets`: // To avoid accidental eager initialization of often unused Charsets // from happening while the VM is booting up, which may delay // initialization of VM components, we should generally avoid depending // on this class from elsewhere in java.base. @cl4es may have opinions about the present validity of this note. ------------- PR Comment: https://git.openjdk.org/jdk/pull/29443#issuecomment-3805613881
On Tue, 27 Jan 2026 10:44:42 GMT, Eirik Bjørsnøs <eirbjo@openjdk.org> wrote:
Please review this PR which removes a dependency on `java.nio.charset.StandardCharsets` introduced in [JDK-8365703](https://bugs.openjdk.org/browse/JDK-8365703).
We can use `UTF_8.INSTANCE` here as elsewhere in this class to prevent startup regression loading otherwise unused classes and charsets.
Verified that running `java -cp hello.jar Hello` observes the following diff in loaded classes when run with this PR:
< java.nio.charset.StandardCharsets < sun.nio.cs.US_ASCII < sun.nio.cs.ISO_8859_1 < sun.nio.cs.UTF_16BE < sun.nio.cs.UTF_16LE < sun.nio.cs.UTF_16 < sun.nio.cs.UTF_32BE < sun.nio.cs.UTF_32LE < sun.nio.cs.UTF_32
This is a strict refactoring, no tests updated, `noreg-trivial`.
There was effort a few years ago to push out the first use of StandardCharsets as its initializer loaded all the standard charsets. So this is why you will have seen usages replaced with (mostly) UTF_8.INSTANCE. This was all before Project Leyden and the recent work on AOT. It would be useful to do some measurements to see if this change shows any benefit. ------------- PR Comment: https://git.openjdk.org/jdk/pull/29443#issuecomment-3949805217
On Tue, 24 Feb 2026 07:39:10 GMT, Alan Bateman <alanb@openjdk.org> wrote:
There was effort a few years ago to push out the first use of StandardCharsets as its initializer loaded all the standard charsets. So this is why you will have seen usages replaced with (mostly) UTF_8.INSTANCE. This was all before Project Leyden and the recent work on AOT. It would be useful to do some measurements to see if this change shows any benefit.
We could certainly do measurements. However; the use of `StandardCharsets` was introduced last August via #26822. Before that change `ZipCoder` used strictly `UTF_8.INSTANCE`. That PR did not mention `StandardCharsets` or any motivation for introducing it. To me, this looked like a simple oversight. I don't have strong opinions of whether we should use `UTF_8.INSTANCE` or `StandardCharsets`, but would prefer if we can be consistent in this class/area and that any choice is deliberate and not just something slipped in sideways. In other words, I think the burden of proof should be on #26822 for introducing `StandardCharsets.UTF_8.INSTANCE` in a class that otherwise uses `UTF_8.INSTANCE`, not for this PR reestablishing consistency and long time practise. But again, opinions loosely help, I can surely withdraw this PR if that's consenus, even if I don't quite understand why :-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/29443#issuecomment-3952119241
participants (3)
-
Alan Bateman
-
Chen Liang
-
Eirik Bjørsnøs