RFR: JDK-8233829: javac cannot find non-ASCII module name under non-UTF8 environment

Thu May 14 15:20:30 UTC 2020

Hi Jon,

Thank you for your comments.

Could you check webrev.02, which contains a test case?
It is not a direct test of the original problem, since reproducing that
requires non-English Windows.
However, I realized the patch also fixes the Unicode surrogate pair case.
That failure is caused by the difference between standard UTF-8 and
modified UTF-8, and it can be checked in an English environment.
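
To illustrate the surrogate pair difference, here is a minimal sketch. It
is not taken from the webrev: the class name ModifiedUtf8Demo and the sample
code point U+20BB7 are chosen only for illustration, and
DataOutputStream.writeUTF is used as a convenient way to produce modified
UTF-8 bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ModifiedUtf8Demo {
    // Encode a string as standard UTF-8.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Encode a string as modified UTF-8 using DataOutputStream.writeUTF,
    // then drop the 2-byte length prefix that writeUTF emits.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(bos)) {
            dos.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = bos.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    public static void main(String[] args) {
        // U+20BB7 lies outside the BMP, so Java stores it as a surrogate pair.
        String s = new String(Character.toChars(0x20BB7));

        byte[] std = standardUtf8(s);   // one 4-byte sequence
        byte[] mod = modifiedUtf8(s);   // two 3-byte sequences, one per surrogate

        System.out.println("standard UTF-8 bytes: " + std.length);
        System.out.println("modified UTF-8 bytes: " + mod.length);

        // Decoding modified UTF-8 bytes with the standard UTF-8 decoder
        // does not give back the original string.
        String decoded = new String(mod, StandardCharsets.UTF_8);
        System.out.println("round trip equal: " + decoded.equals(s));
    }
}
```

With these inputs the standard encoding is 4 bytes and the modified encoding
is 6 bytes, so a standard UTF-8 decoder cannot recover a supplementary
character from bytes written in the modified form.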

I confirmed the test failed without the patch and passed with the patch.
Tier1 tests also pass on Linux and Windows.

Webrev.02: http://cr.openjdk.java.net/~tnakamura/8233829/webrev.02/

Best regards,

Toshio Nakamura

Jonathan Gibbons <jonathan.gibbons at oracle.com> wrote on 2020/05/14
11:50:03:

> From: Jonathan Gibbons <jonathan.gibbons at oracle.com>
> To: Toshio 5 Nakamura <TOSHIONA at jp.ibm.com>, compiler-dev at openjdk.java.net
> Date: 2020/05/14 11:50
> Subject: [EXTERNAL] Re: RFR: JDK-8233829: javac cannot find
> non-ASCII module name under non-UTF8 environment
>
> Hi,
> Normally, bug fixes like this should be accompanied by a
> corresponding regression test. While it may be a bit tricky to write
> a test for this situation, it seems like it would be worth having
> the test if possible.
> -- Jon
> On 5/13/20 7:11 PM, Toshio 5 Nakamura wrote:
> Hi,
>
> Can anyone please review this fix?
> I revised the patch to make it simpler. In my understanding, the
> encoding in this case is modified UTF-8 rather than standard UTF-8.
> So, the fix uses the Convert utility class.
>
> Webrev.01: http://cr.openjdk.java.net/~tnakamura/8233829/webrev.01/
>
> Best regards,
> Toshio Nakamura
>
> > From: Toshio 5 Nakamura/Japan/IBM
> > To: compiler-dev at openjdk.java.net
> > Date: 2020/04/16 21:39
> > Subject: RFR: JDK-8233829: Non-ASCII module name cannot be handled
> > under non-UTF8 environment
> >
> > Hi all,
> >
> > Could you review this fix? Also, I'd like to ask for a sponsor for
> > the fix, since I'm not a committer.
> >
> > Issue: https://bugs.openjdk.java.net/browse/JDK-8233829
> > Webrev: http://cr.openjdk.java.net/~tnakamura/8233829/webrev.00/
> >
> > If a module name contains non-ASCII characters and the environment's
> > default encoding is not UTF-8, javac's "--add-modules" option cannot
> > find the module.
> >
> > com.sun.tools.javac.jvm.ModuleNameReader.utf8Mapper uses
> > String(byte[], int, int).
> > In the problematic case, the String was created with the default
> > encoding, which wasn't UTF-8.
> > For example, Japanese Windows uses the MS932 (Shift_JIS) encoding.
> > The byte[] in the utf8Mapper method should always be decoded as UTF-8.
> >
> > Tier1 tests on Linux and Windows passed.
> >
> > Best Regards,
> > Toshio Nakamura
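
The default-encoding problem described in the original report can be
sketched as follows. This is only an illustration, not javac code: the class
name DefaultCharsetDemo and the sample module name are hypothetical, and
ISO-8859-1 is used as a stand-in for a non-UTF-8 platform default because it
is guaranteed to be available on every Java runtime. MS932 would be expected
to mis-decode the bytes in a similar way.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {
    // Decode bytes with an explicit charset. Passing a non-UTF-8 charset
    // here mimics what String(byte[], int, int) does when the platform
    // default encoding is not UTF-8.
    static String decodeAs(byte[] bytes, Charset cs) {
        return new String(bytes, 0, bytes.length, cs);
    }

    public static void main(String[] args) {
        // Hypothetical non-ASCII module name (katakana for "module"),
        // written with escapes so the source file encoding does not matter.
        String name = "\u30e2\u30b8\u30e5\u30fc\u30eb";
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);

        // Stand-in for a non-UTF-8 default encoding (assumption, see above).
        String wrong = decodeAs(utf8, StandardCharsets.ISO_8859_1);
        // Explicit UTF-8 decoding, independent of the platform default.
        String right = decodeAs(utf8, StandardCharsets.UTF_8);

        System.out.println("wrong charset matches: " + wrong.equals(name));
        System.out.println("UTF-8 matches:         " + right.equals(name));
    }
}
```

Decoding with an explicit charset removes the dependency on the platform
default. As noted in the follow-up above, the class file data is actually
modified UTF-8, which is why the revised patch uses the Convert utility
class rather than a plain UTF-8 decode.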