RFR: JDK-8233829: javac cannot find non-ASCII module name under non-UTF8 environment

Toshio 5 Nakamura TOSHIONA at jp.ibm.com
Mon Jun 1 09:05:29 UTC 2020


Hi Jon,

Could you look at this fix, if possible?
Any comments or suggestions are welcome.

Webrev.02: http://cr.openjdk.java.net/~tnakamura/8233829/webrev.02/
Issue: https://bugs.openjdk.java.net/browse/JDK-8233829

Best regards,

Toshio Nakamura

> From: "Toshio 5 Nakamura" <TOSHIONA at jp.ibm.com>
> To: Jonathan Gibbons <jonathan.gibbons at oracle.com>
> Cc: compiler-dev at openjdk.java.net
> Date: 2020/05/15 00:23
> Subject: [EXTERNAL] Re: RFR: JDK-8233829: javac cannot find
> non-ASCII module name under non-UTF8 environment
> Sent by: "compiler-dev" <compiler-dev-bounces at openjdk.java.net>
>
> Hi Jon,
>
> Thank you for the comments.
>
> Could you check webrev.02, which contains a test case?
> Actually, this is not a direct test of the original problem, since
> that requires non-English Windows.
> However, I realized the patch also fixes the Unicode surrogate pair case.
> That failure is caused by the difference between standard UTF-8 and
> modified UTF-8, and it can be checked in an English environment.
>
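To illustrate the surrogate pair point, here is a minimal sketch, not taken from the webrev. It assumes that DataOutputStream.writeUTF and DataInputStream.readUTF are acceptable stand-ins for a modified UTF-8 codec (the fix itself uses javac's internal Convert utility class, which is not shown here). The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of javac or of the patch.
final class ModifiedUtf8Demo {

    // Encode as modified UTF-8 and drop the 2-byte length prefix that writeUTF adds.
    static byte[] toModifiedUtf8(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Decode modified UTF-8 by restoring the length prefix that readUTF expects.
    static String fromModifiedUtf8(byte[] bytes) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        buf.write(bytes.length >>> 8);
        buf.write(bytes.length);
        buf.write(bytes, 0, bytes.length);
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
            return in.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        // U+1F600, a supplementary character stored as a surrogate pair in a Java String.
        String name = "\uD83D\uDE00";

        byte[] standard = name.getBytes(StandardCharsets.UTF_8);
        byte[] modified = toModifiedUtf8(name);

        // Standard UTF-8 uses one 4-byte sequence; modified UTF-8 encodes
        // each surrogate separately as a 3-byte sequence.
        System.out.println("standard UTF-8 bytes: " + standard.length);
        System.out.println("modified UTF-8 bytes: " + modified.length);

        // A standard UTF-8 decoder treats encoded surrogates as malformed input.
        String wrong = new String(modified, StandardCharsets.UTF_8);
        System.out.println("standard decoder round-trips: " + wrong.equals(name));
        System.out.println("modified decoder round-trips: "
                + fromModifiedUtf8(modified).equals(name));
    }
}
```

Because the two encodings agree for characters in the Basic Multilingual Plane (apart from U+0000), a check like this only shows a difference for supplementary characters, which is why it can run in an English environment.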
> I confirmed the test failed without the patch and passed with the patch.
> Tier1 tests also pass on Linux and Windows.
>
> Webrev.02: http://cr.openjdk.java.net/~tnakamura/8233829/webrev.02/
>
> Best regards,
>
> Toshio Nakamura
>
> Jonathan Gibbons <jonathan.gibbons at oracle.com> wrote on 2020/05/14 11:50:03:
>
> > From: Jonathan Gibbons <jonathan.gibbons at oracle.com>
> > To: Toshio 5 Nakamura <TOSHIONA at jp.ibm.com>, compiler-dev at openjdk.java.net
> > Date: 2020/05/14 11:50
> > Subject: [EXTERNAL] Re: RFR: JDK-8233829: javac cannot find
> > non-ASCII module name under non-UTF8 environment
> >
> > Hi,
> > Normally, bug fixes like this should be accompanied by a
> > corresponding regression test. While it may be a bit tricky to write
> > a test for this situation, it seems like it would be worth having
> > the test if possible.
> > -- Jon
> > On 5/13/20 7:11 PM, Toshio 5 Nakamura wrote:
> > Hi,
> >
> > Can anyone please review this fix?
> > I revised the patch to make it simpler. As I understand it, the encoding
> > in this case is modified UTF-8 rather than standard UTF-8, so the fix
> > uses the Convert utility class.
> >
> > Webrev.01: http://cr.openjdk.java.net/~tnakamura/8233829/webrev.01/
> >
> > Best regards,
> > Toshio Nakamura
> >
> > > From: Toshio 5 Nakamura/Japan/IBM
> > > To: compiler-dev at openjdk.java.net
> > > Date: 2020/04/16 21:39
> > > Subject: RFR: JDK-8233829: Non-ASCII module name cannot be handled
> > > under non-UTF8 environment
> > >
> > > Hi all,
> > >
> > > > Could you review this fix? Also, I'd like to ask for a sponsor for the
> > > > fix, since I'm not a committer.
> > >
> > > Issue: https://bugs.openjdk.java.net/browse/JDK-8233829
> > > Webrev: http://cr.openjdk.java.net/~tnakamura/8233829/webrev.00/
> > >
> > > > If a module name contains non-ASCII characters and the environment's
> > > > default encoding is not UTF-8, javac's "--add-modules" option cannot
> > > > find the module.
> > >
> > > com.sun.tools.javac.jvm.ModuleNameReader.utf8Mapper uses
> > > String(byte[], int, int).
> > > > In the problematic case, the String is created using the default
> > > > encoding, which is not UTF-8.
> > > > For example, Japanese Windows uses the MS932 (Shift_JIS) encoding.
> > > > The byte[] in the utf8Mapper method should always be decoded as UTF-8.
> > >
> > > Tier1 tests on Linux and Windows passed.
> > >
> > > Best Regards,
> > > Toshio Nakamura