Making the source code utf-8

Jonathan Gibbons jonathan.gibbons at oracle.com
Wed Feb 8 18:54:36 UTC 2023


There are places in doc comments where entities need to be used
for non-ASCII characters, such as accented letters.

-- Jon

On 2/7/23 7:49 PM, Yasumasa Suenaga wrote:
> I give big +1 to this idea, thanks Magnus!
>
>
> On 2023-02-07 21:28, Magnus Ihse Bursie wrote:
>> Currently, the source code in the JDK is in an ill-defined encoding.
>> There is no official declaration of the encoding used. It is "mostly
>> ASCII", but the relatively few non-ASCII characters used are not
>> well-defined. In many cases, it is latin-1, but I am pretty certain
>> other encodings are used for e.g. Asian translations.
>>
>> This is creating unnecessary problems when working with the JDK
>> code base, while providing no benefit. We ended up here not by choice,
>> but by historical accident. Most recently, this issue has surfaced in
>> JDK-8301853, JDK-8301854 and JDK-8301855, but related issues have
>> popped up from time to time, e.g. JDK-8263028.
>>
>> As JEP 400[1] confirms, UTF-8 is the way to go. We should follow up on
>> this by converting our code base to UTF-8.
>>
>> I have created JDK-8301971[2] with the intention of converting all
>> files to UTF-8, and updating all infrastructure to recognize this
>> fact.
>>
>> Even though 99.9% of all text in the JDK repository is ASCII only,
>> with a code base the size of the JDK there are of course many, many
>> instances that need to be checked and/or converted. I can take care
>> of the overarching issues, like updating compiler flags and developing
>> tooling to detect and try to convert non-ASCII files based on my best
>> guesses, but in the end, there are likely to be many files which need
>> to be verified by their respective teams, to make sure I did not
>> assume an incorrect source encoding.
>>
>> So, before I go ahead and start doing this, I want to check:
>>
>> * Is everyone on board with this idea? I do assume that in 2023, having
>> UTF-8 encoding for text files is (or should be) a no-brainer, but I
>> want to verify that no one opposes this.
>>
>> * Should I open a JEP for this? On the one hand, it is likely to
>> require a non-trivial amount of work, but on the other hand, there is
>> no change visible for the end user, so it will be kind of pointless to
>> announce. For my part, I could go either way, so I'm interested in
>> hearing opinions, preferably with good rationales, for one way or the
>> other.
>>
>> /Magnus
>>
>> [1] https://openjdk.org/jeps/400
>> [2] https://bugs.openjdk.org/browse/JDK-8301971
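
A minimal sketch of the kind of detection step described in the quoted message, assuming Java and hypothetical names (`EncodingCheck`, `classify`, `Kind`); it is not the tooling tracked in JDK-8301971. It sorts a file's bytes into pure ASCII, well-formed UTF-8, or anything else (for example Latin-1), where the last group is what would need a best guess plus review by the owning team:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
class EncodingCheck {
    enum Kind { ASCII, UTF8, OTHER }

    // Classify raw file contents by how they decode.
    static Kind classify(byte[] bytes) {
        boolean ascii = true;
        for (byte b : bytes) {
            if (b < 0) {          // high bit set: not 7-bit ASCII
                ascii = false;
                break;
            }
        }
        if (ascii) {
            return Kind.ASCII;
        }
        try {
            // A strict decoder throws instead of silently substituting U+FFFD.
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return Kind.UTF8;
        } catch (CharacterCodingException e) {
            // Not well-formed UTF-8: some legacy encoding, needs manual review.
            return Kind.OTHER;
        }
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            Kind kind = classify(Files.readAllBytes(Path.of(arg)));
            if (kind != Kind.ASCII) {
                System.out.println(kind + "\t" + arg);
            }
        }
    }
}
```

With JDK 11 or later this can be run directly as `java EncodingCheck.java <files...>`. Bytes that happen to decode as UTF-8 are only very likely, not guaranteed, to have been written as UTF-8, so the output is a starting point for review rather than a verdict.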


More information about the jdk-dev mailing list