RFC: Set UTF-8 as source file encoding on Windows

David Holmes david.holmes at oracle.com
Wed Feb 8 01:39:43 UTC 2023


Hi Yasumasa,

Please see:

https://mail.openjdk.org/pipermail/jdk-dev/2023-February/007342.html

Cheers,
David

On 7/02/2023 9:25 pm, Yasumasa Suenaga wrote:
> Hi all,
> 
> We are discussing source file encoding in PR #12436 [1].
> 
> I saw some C4819 warnings when I tried to build OpenJDK on Windows
> with the Japanese locale (CP932). C4819 means the source file contains
> characters which cl.exe cannot handle in the current code page (CP932
> in my case).
> 
> I proposed suppressing C4819 in PR #12436, #12437 [2], and #12435 [3].
> I heard that JDK folks have discussed source file encoding several
> times, and it looks like we expect UTF-8.
> So I want to propose adding `-utf-8` to CFLAGS for Windows. What do
> you think?
> 
> The change is here: 
> https://github.com/YaSuenag/jdk/commit/272678f8f0a74d893d98b507f2c0562bff900b9d
> 
> 
> GCC expects UTF-8 as the source file encoding [4].
> OTOH, on Windows cl.exe uses the current user code page when the source
> does not have a BOM [5]. So I think we should think about Linux (I
> guess we can ignore other platforms, e.g. macOS, because we haven't
> seen any reports related to the locale, and the locale can be set
> directly there, which WSL cannot do).
> 
> This proposal affects all native components in the JDK, so I want to
> discuss this topic before filing an issue in JBS and sending a PR.
> 
> 
> I also think we should document the source file encoding somewhere.
> It could go under "Operating System Requirements" in building.md. Let
> me know if there is a better place.
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> 
> [1] https://github.com/openjdk/jdk/pull/12436
> [2] https://github.com/openjdk/jdk/pull/12437
> [3] https://github.com/openjdk/jdk/pull/12435
> [4] https://gcc.gnu.org/onlinedocs/gcc-12.2.0/cpp/Character-sets.html
> [5] 
> https://learn.microsoft.com/en-us/cpp/build/reference/utf-8-set-source-and-executable-character-sets-to-utf-8?view=msvc-170


