RFR: 8321373: Build should use LC_ALL=C.UTF-8
Magnus Ihse Bursie
ihse at openjdk.org
Thu Feb 1 13:53:56 UTC 2024
On Tue, 5 Dec 2023 10:35:05 GMT, Magnus Ihse Bursie <ihse at openjdk.org> wrote:
> We're currently setting LC_ALL=C. Not all tools default to UTF-8 as their encoding of choice when they see this locale; some use an arbitrary encoding, which might not properly handle all UTF-8 characters. Since in practice all our encoding is UTF-8, we should tell our tools this as well.
>
> This will at least affect how Java treats path names that include Unicode characters.
Of course, this was not that easy. One does not simply add "utf8".
I got a diff in ./lib/classlist:
401d400
< java/nio/charset/StandardCharsets
1182d1180
< sun/nio/cs/ISO_8859_1
1184,1185d1181
< sun/nio/cs/StandardCharsets$Aliases
< sun/nio/cs/StandardCharsets$Cache
1187,1196d1182
< sun/nio/cs/Surrogate
< sun/nio/cs/Surrogate$Parser
< sun/nio/cs/US_ASCII
< sun/nio/cs/US_ASCII$Encoder
< sun/nio/cs/UTF_16
< sun/nio/cs/UTF_16BE
< sun/nio/cs/UTF_16LE
< sun/nio/cs/UTF_32
< sun/nio/cs/UTF_32BE
< sun/nio/cs/UTF_32LE
1197a1184
> sun/nio/cs/UTF_8$Encoder
1232d1218
< sun/util/PreHashedMap
The PreHashedMap thing looks weird; the others definitely seem character set related. I'll have to investigate this.
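One way to start investigating might be to look at which encodings the JVM picks up under each locale. The following is only a minimal sketch, not part of the build: the class name EncodingReport and the method collect() are made up for illustration, and sun.jnu.encoding is an implementation-specific property, so its presence and value may vary between JDK versions and platforms.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: reports the encoding-related settings the running JVM derived
// from its environment (for example from LC_ALL).
class EncodingReport {

    // Collects encoding-related system properties plus the default charset.
    // A property that is not set is reported as the string "null".
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        String[] keys = {"sun.jnu.encoding", "native.encoding", "file.encoding"};
        for (String key : keys) {
            report.put(key, String.valueOf(System.getProperty(key)));
        }
        report.put("defaultCharset", Charset.defaultCharset().name());
        return report;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Running this once as `LC_ALL=C java EncodingReport.java` and once as `LC_ALL=C.UTF-8 java EncodingReport.java` should show which of these settings actually change with the locale. Adding `-Xlog:class+load` to both runs gives a list of loaded classes that can be compared in the same way as the classlist diff above.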
Oh, shut up Wesley!
-------------
PR Comment: https://git.openjdk.org/jdk/pull/16971#issuecomment-1842838066
PR Comment: https://git.openjdk.org/jdk/pull/16971#issuecomment-1920847508