RFR: 8234835 Use UTF-8 charset in make support Java code

Wed Dec 4 00:24:08 UTC 2019

Hi Dan,

I think it's a combination of oral tradition and long-standing precedent.

Earlier this year, I raised this general issue, partly because of 
inconsistent use of -encoding in the build system.  The response was 
that there was some concern that not all tools in the tool chain could 
handle UTF-8 files.

$ find open/make -name \*.gmk | xargs grep -o -e '-encoding [^ ]*'
open/make/Docs.gmk:-encoding ISO-8859-1
open/make/Docs.gmk:-encoding ISO-8859-1
open/make/common/SetupJavaCompilers.gmk:-encoding ascii
open/make/common/SetupJavaCompilers.gmk:-encoding ascii

I think we should be consistent, but (at the time) it did not seem worth 
pushing for UTF-8 everywhere.
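
To make the explicit-versus-implicit distinction in the question quoted below concrete, here is a minimal sketch. It is not taken from the changeset; the class name CharsetSketch and its helper methods are hypothetical. It shows how Java tool code could name its charset explicitly and decode strictly, so that a non-ASCII byte is reported as an error rather than silently replaced, independent of the platform default.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of the JDK build tools.
public class CharsetSketch {

    // Decode bytes with an explicitly named charset, throwing on any
    // malformed or unmappable input instead of substituting U+FFFD.
    public static String decodeStrict(byte[] bytes, Charset cs)
            throws CharacterCodingException {
        CharsetDecoder decoder = cs.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    // True if every byte is valid US-ASCII (that is, in the range 0x00-0x7F).
    public static boolean isPureAscii(byte[] bytes) {
        try {
            decodeStrict(bytes, StandardCharsets.US_ASCII);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // "café" encoded as UTF-8: the final character takes two bytes.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Explicit UTF-8 decodes to four characters on every platform.
        System.out.println(decodeStrict(utf8, StandardCharsets.UTF_8).length());

        // The same bytes are rejected when the charset is pinned to US-ASCII.
        System.out.println(isPureAscii(utf8));

        // The implicit default depends on the platform and on how the JVM
        // was launched, which is the source of the inconsistency.
        System.out.println(Charset.defaultCharset());
    }
}
```

Under these assumptions, pinning the charset in code gives the same result regardless of how the JVM was launched, whereas relying on the default ties the behaviour to the command line.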

-- Jon

On 11/27/2019 07:23 PM, Dan Smith wrote:
>> For the other files, it seems strange to force the use of a charset
>> which is different from the charset of record for all our source files
>> (i.e. US-ASCII).
> Can you clarify where this "charset of record" rule comes from? Is this written down somewhere, or more of an oral tradition?
>
> The non-ASCII characters I'm working with are, in fact, in the original Markdown sources. If it's really important to avoid those in all sources, I could (reluctantly) use a different strategy.
>
> If the consensus is that the build tools should standardize on US-ASCII, I guess there's a separate question about whether we're willing to rely on the implicit platform default (now uniformly US-ASCII via command-line args), or whether it's better to be explicit about it (s/UTF_8/US_ASCII/ in my changeset).