RFR 8124977 cmdline encoding challenges on Windows

Kumar Srinivasan kumar.x.srinivasan at oracle.com
Fri Aug 7 12:04:36 UTC 2015


Hi Kirk,

We don't allow any shell tests. There are existing shell-based tests, but
I have not had a chance to port all of them.

Your test must be rewritten to use the TestHelper framework; please see
the other tests in jdk/test/tools/launcher for patterns.
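
For readers unfamiliar with the idea, a Java-based launcher test replaces the
shell script with a small program that starts a child JVM and checks the result.
The sketch below is only an illustration using ProcessBuilder from the standard
library. It does not use TestHelper itself, whose API is not shown in this
thread, and the class and method names (LauncherArgSketch, javaCommand, run)
are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the TestHelper framework.
class LauncherArgSketch {

    // Build a command line that starts the java launcher of the running JDK.
    static List<String> javaCommand(String... args) {
        String javaExe =
            Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        List<String> cmd = new ArrayList<>();
        cmd.add(javaExe);
        cmd.addAll(Arrays.asList(args));
        return cmd;
    }

    // Run the command, forwarding its output, and return the exit code.
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        int exit = run(javaCommand("-version"));
        if (exit != 0) {
            throw new AssertionError("java -version exited with " + exit);
        }
    }
}
```

A real test would pass the arguments under test instead of -version and
compare what the child JVM received against what was sent.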

Kumar

>> -----Original Message-----
>> From: Xueming Shen [mailto:xueming.shen at oracle.com]
>> Sent: Monday, July 20, 2015 11:50 AM
>>
>> On 07/20/2015 10:22 AM, Kirk Shoop wrote:
>>> So when default system locale differs from the active one, we have
>>> different behavior on Linux and Windows. The new options allow a windows
>>> user to select the same behavior that one would expect on unix. The
>>> switches can certainly be removed, if the compatibility impact is acceptable.
>>
>> Kirk, on Windows file.encoding is from the user locale and
>> sun.jnu.encoding is from the system locale setting. sun.jnu.encoding is
>> purely for those text-encoding-sensitive jnu functions to communicate with
>> the underlying Windows system API when the system locale and the user
>> locale are set to different values. On unix/linux/osx, these two are always set
>> to the same value. Yes, there might be input/output issues if the encoding
>> used by the console (OEM codepage) is not compatible with the encoding
>> used by the "user locale"
>> and you are trying to use System.in/out/err for the input/output to the
>> console.
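
To see which values a given JVM actually picked up, the two properties can be
printed next to the default charset. This is a minimal sketch; the class name
EncodingReport is hypothetical, sun.jnu.encoding is an implementation-specific
property, and the values depend on the platform, locale settings and JDK
version.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that collects the encoding-related settings of this JVM.
class EncodingReport {

    static Map<String, String> describe() {
        Map<String, String> m = new LinkedHashMap<>();
        // Encoding associated with the user locale, per the explanation above.
        m.put("file.encoding", System.getProperty("file.encoding"));
        // Encoding used when talking to the platform API (implementation-specific).
        m.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        // Charset the JVM uses by default for byte/char conversion.
        m.put("defaultCharset", Charset.defaultCharset().name());
        return m;
    }

    public static void main(String[] args) {
        describe().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```
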
>>
>> Here is the original CCC request regarding the sun.jnu.encoding, which might
>> provide some background info.
>>
>> http://cr.openjdk.java.net/~sherman/4958170.html
>>
>> If you/we are NOT going to change the encoding used by the underlying
>> console, I don't think we need/should change the "encoding" used by the
>> java.io.Console. As I suggested in my previous email, the
>> Java_java_io_Console_encoding() implementation probably needs to be updated
>> to return utf8 if the cp == 65001 (that was 10 years ago, I'm not sure if the
>> 65001 was really used back then when we wrote this code). My
>> understanding of the issue here is that if you continue to use the "A" version
>> of the API to parse/get the arguments, and try to solve the possible issue
>> triggered by the "incompatibility" of the oem encoding used by the console
>> and the user locale encoding used by the System.in/out/err, it's fine to
>> define a new system property to specify a preferred encoding for the
>> launcher to use, but this "preferred" encoding should not be used by
>> java.io.Console.
>> But isn't it more reasonable to simply always use the "W" version for this
>> purpose in launcher?
>>
>> -Sherman
>>
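
The suggested special case for code page 65001 can be sketched in Java as
follows. The real function is native code and is not reproduced here; the
class and method names are hypothetical, and the "cp" prefix used for all
other code pages is an assumption made only for illustration.

```java
// Hypothetical sketch of mapping a console code page number to a charset name.
class ConsoleCodePage {

    // Windows code page identifier for UTF-8 (selected with "chcp 65001").
    static final int CP_UTF8 = 65001;

    static String charsetNameFor(int codePage) {
        if (codePage == CP_UTF8) {
            return "UTF-8";
        }
        // Assumption: every other code page is named "cp<number>".
        return "cp" + codePage;
    }

    public static void main(String[] args) {
        System.out.println(charsetNameFor(437));
        System.out.println(charsetNameFor(CP_UTF8));
    }
}
```
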
> Thank you for the valuable feedback. We have vastly simplified the original patch.
> The new webrev is here:
>    http://cr.openjdk.java.net/~kshoop/8124977/webrev.02/
>
> This webrev uses GetCommandLineW on Windows to retrieve the UTF-16 command line and also supports the 65001 (UTF-8) codepage (set by chcp 65001), so that when -Dsun.jnu.encoding="UTF-8" is supplied the console output (stdout & stderr) will be in UTF-8.
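
Why the wide-character route matters can be shown without any Windows API: an
argument survives only if the encoding it passes through can represent every
character in it. The sketch below uses ISO-8859-1 as a stand-in for a
single-byte legacy code page, which is an assumption for illustration; the
class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: does a string survive encoding and decoding in a charset?
class LossyEncodingSketch {

    static boolean roundTrips(String s, Charset cs) {
        // getBytes substitutes a replacement byte for unmappable characters,
        // so a lossy charset yields a different string after decoding.
        byte[] bytes = s.getBytes(cs);
        return new String(bytes, cs).equals(s);
    }

    public static void main(String[] args) {
        String arg = "\u043f\u0440\u0438\u0432\u0435\u0442"; // Cyrillic sample argument
        System.out.println("UTF-8 keeps it:      " + roundTrips(arg, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1 keeps it: " + roundTrips(arg, StandardCharsets.ISO_8859_1));
    }
}
```

With the Cyrillic sample, the UTF-8 round trip preserves the string and the
ISO-8859-1 one does not, which mirrors an argument being damaged when it is
fetched through a narrow code page that cannot represent it.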
>
> There are no new commandline switches.
>
> Please let us know if there is anything else that needs improvement.
>
> Thanks!
> Kirk and Valery
>



