Java cmdline encoding challenges on Windows
Martin Buchholz
martinrb at google.com
Wed Jan 28 20:54:51 UTC 2015
Hi Martin,
This was vaguely on my TODO list 10 years ago.
It makes sense for Microsoft to be funding this sort of improvement.
I'm your cheerleader (but not your reviewer).
See also incomplete work here:
http://bugs.java.com/view_bug.do?bug_id=4519026
http://mail.openjdk.java.net/pipermail/core-libs-dev/2009-March/001231.html
I once wrote,
I give Microsoft credit for doing the Right Thing, namely treating things
like filenames as TEXT with a fixed Unicode encoding instead of an
ambiguous bag of bytes. But the cost was a 30-year migration for their own
code base and a completely incompatible API from the POSIX world. Where
Win32 does offer a POSIX API (and this includes main(int argc, char **argv)!),
it is generally an imperfect emulation. All code interacting with the OS
needs a Windows-specific version, e.g. a wmain entry point just for
Windows.
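To make the "imperfect emulation" point concrete, here is a minimal sketch of what happens when Unicode text is squeezed through a byte-oriented charset that cannot represent it. ISO-8859-1 is only a stand-in for a Western legacy code page (real Windows narrow APIs may apply different best-fit mappings), and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class NarrowArgsDemo {
    // Simulates an argument passing through a narrow (byte-oriented) API:
    // encode to the given charset, then decode the bytes back to a String.
    // String.getBytes(Charset) replaces unmappable characters with the
    // charset's replacement byte ('?' for ISO-8859-1).
    public static String roundTrip(String arg, Charset legacy) {
        byte[] narrow = arg.getBytes(legacy);
        return new String(narrow, legacy);
    }

    public static void main(String[] args) {
        // Two CJK characters followed by an ASCII suffix.
        String arg = "\u4F60\u597D.txt";

        // Assumption: ISO-8859-1 stands in for a legacy "A" code page.
        System.out.println(roundTrip(arg, StandardCharsets.ISO_8859_1)); // ??.txt

        // A Unicode encoding round-trips without loss.
        System.out.println(roundTrip(arg, StandardCharsets.UTF_8).equals(arg)); // true
    }
}
```

The ASCII suffix survives and the rest is irreversibly replaced, which is why reading the command line as wide characters matters.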
On Tue, Jan 27, 2015 at 9:21 AM, Martin Sawicki (MS OPEN TECH) <
marcins at microsoft.com> wrote:
> Hello,
> We're proposing an improvement to the OpenJDK intended to fix the
> existing problem with handling Unicode parameters on the command line in
> Windows (via cmd.exe), which prevents users, for example in China, from
> properly passing text strings in their own language via the java.exe
> command line.
>
> We have a code submission figured out and tested internally. I've uploaded
> our webrev package here:
>
> https://openjdkcontrib.blob.core.windows.net/unicodecmd/webrev-20150114.zip
>
> The crux of the change lies in using the "W" (wide character) version of
> the Windows APIs for fetching the command line parameters, rather than the
> "A" (ANSI) version. But this code path is taken only when the following
> options are set:
>
> -Dwindows.UnicodeConsole=true - switches on Unicode support in the Windows
> console
> -Dfile.encoding.unicode="UTF-8" - identifies the Unicode charset to use;
> if not specified, UTF-8 is used by default. Ignored when
> windows.UnicodeConsole is not set to true.
>
> We'd appreciate a review and acceptance of this improvement.
>
> And, as this is our first contribution to this sub-project within the
> OpenJDK, I apologize for any steps in the submission process that I may
> have missed here and would appreciate guidance as needed.
>
> Best regards
>
> Martin Sawicki (and Kirk Shoop, and Valeriy Kopylov)
> Microsoft Open Technologies, Inc.
> A subsidiary of Microsoft Corp.
>
>
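The way the two options in the quoted proposal interact can be sketched as follows. This is not the code from the webrev, only an illustration of the semantics described in the message; the class and method names are hypothetical, and the property names are taken from the proposal, not from any released JDK.

```java
import java.util.Properties;

public class UnicodeConsoleOptions {
    // Returns the charset name the Unicode code path would use, or null
    // when that code path is not enabled.
    public static String resolveCharset(Properties props) {
        // Boolean.parseBoolean(null) is false, so an unset property
        // leaves the Unicode code path disabled.
        boolean enabled =
            Boolean.parseBoolean(props.getProperty("windows.UnicodeConsole"));
        if (!enabled) {
            // file.encoding.unicode is ignored in this case.
            return null;
        }
        // UTF-8 is the default when no charset is specified.
        return props.getProperty("file.encoding.unicode", "UTF-8");
    }

    public static void main(String[] args) {
        // Try: java -Dwindows.UnicodeConsole=true UnicodeConsoleOptions
        System.out.println(resolveCharset(System.getProperties()));
    }
}
```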