RFR: 8321373: Build should use LC_ALL=C.UTF-8 [v2]

Magnus Ihse Bursie ihse at openjdk.org
Thu Feb 1 14:18:05 UTC 2024


On Thu, 1 Feb 2024 13:53:56 GMT, Magnus Ihse Bursie <ihse at openjdk.org> wrote:

>> We're currently setting LC_ALL=C. Not all tools will default to UTF-8 as their encoding of choice when they see this locale; some use an arbitrary encoding, which might not properly handle all UTF-8 characters. Since in practice all our encoding is UTF-8, we should tell our tools this as well.
>> 
>> This will at least have an effect on how Java treats path names that include Unicode characters.
>
> Magnus Ihse Bursie has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
> 
>  - Explicitly load StandardCharsets ascii/utf-8 in HelloClasslist
>  - Merge branch 'master' into c.utf-8
>  - 8321373: Build should use LC_ALL=C.UTF-8
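
One way to see how a locale setting such as LC_ALL=C versus LC_ALL=C.UTF-8 reaches the JVM is to print the encoding-related system properties. The following is only a sketch: the class name `EncodingReport` is hypothetical, `native.encoding` is only defined on JDK 17 and later, and `sun.jnu.encoding` is an internal, unsupported property that is assumed here to reflect the encoding used for path names.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the JDK build.
class EncodingReport {
    // Collect encoding-related settings; absent properties show up as "null".
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("file.encoding", String.valueOf(System.getProperty("file.encoding")));
        report.put("native.encoding", String.valueOf(System.getProperty("native.encoding")));
        // Internal property; assumed to describe the path name encoding.
        report.put("sun.jnu.encoding", String.valueOf(System.getProperty("sun.jnu.encoding")));
        report.put("defaultCharset", Charset.defaultCharset().name());
        return report;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Running this once under each locale setting and comparing the output should show whether the JVM picked up a different native encoding.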

With the changes in `HelloClasslist.java`, the following classes are added in `classlist` compared to mainline, and none are removed:


java/nio/StringCharBuffer
java/nio/charset/StandardCharsets
sun/nio/cs/ISO_8859_1
sun/nio/cs/ThreadLocalCoders
sun/nio/cs/ThreadLocalCoders$1
sun/nio/cs/ThreadLocalCoders$2
sun/nio/cs/ThreadLocalCoders$Cache
sun/nio/cs/UTF_16
sun/nio/cs/UTF_16BE
sun/nio/cs/UTF_16LE
sun/nio/cs/UTF_32
sun/nio/cs/UTF_32BE
sun/nio/cs/UTF_32LE
sun/nio/cs/UTF_8$Encoder


This seems highly reasonable to me. @cl4es Can you confirm that this is okay?
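
For illustration, here is a minimal sketch of what explicitly touching the ASCII and UTF-8 charsets can look like. This is not the actual code in `HelloClasslist.java`; the class name `CharsetWarmup` and the method `roundTrip` are hypothetical. The assumption is that referencing `StandardCharsets` and encoding/decoding through it is what pulls in classes like the ones listed above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical example class, not taken from the JDK sources.
class CharsetWarmup {
    // Encode the string with the given charset and decode it again.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        String sample = "h\u00e9llo"; // contains one non-ASCII character

        // Referencing StandardCharsets initializes the class and its charset constants.
        System.out.println(roundTrip(sample, StandardCharsets.UTF_8));
        // US-ASCII cannot represent U+00E9, so it is replaced during encoding.
        System.out.println(roundTrip(sample, StandardCharsets.US_ASCII));
    }
}
```

The UTF-8 round trip preserves the string, while the US-ASCII one does not, which is the kind of difference the locale change is meant to avoid in the build tools.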

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16971#issuecomment-1921436996

