[External] : Re: JDK-8268622 - Performance issues in javac `Name` class

Archie Cobbs archie.cobbs at gmail.com
Tue Mar 7 00:57:08 UTC 2023


On Mon, Mar 6, 2023 at 4:15 PM Jonathan Gibbons <jonathan.gibbons at oracle.com>
wrote:

> Yes, as a general rule, the compiler and runtime should be mutually
> consistent.
>
I've updated the PR to check the classfile major version: when it is < 48,
longer-than-necessary encodings are allowed.
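
To illustrate what a version-gated check could look like, here is a minimal
sketch. It is not the code in the PR: the class name, method name, and
exception type are made up for illustration, and it handles only two-byte
sequences. Only the threshold of 48 comes from the discussion above.

```java
public class OverlongCheck {
    // Threshold from the discussion: class files with major version < 48
    // are decoded leniently. The constant name is hypothetical.
    static final int STRICT_FROM_MAJOR_VERSION = 48;

    /**
     * Decodes one two-byte sequence (110xxxxx 10xxxxxx). Rejects a
     * longer-than-necessary encoding unless the class file is old enough.
     */
    static char decodeTwoByte(int b1, int b2, int majorVersion) {
        if ((b1 & 0xE0) != 0xC0 || (b2 & 0xC0) != 0x80) {
            throw new IllegalArgumentException("not a two-byte sequence");
        }
        int ch = ((b1 & 0x1F) << 6) | (b2 & 0x3F);
        // Two bytes are needed only for U+0080..U+07FF, plus U+0000,
        // which the class file format writes as c0 80.
        boolean overlong = ch < 0x80 && ch != 0;
        if (overlong && majorVersion >= STRICT_FROM_MAJOR_VERSION) {
            throw new IllegalArgumentException("longer-than-necessary encoding");
        }
        return (char) ch;
    }
}
```

For example, the bytes c1 81 are an overlong encoding of 'A': this sketch
accepts them for major version 47 and rejects them for major version 52.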

> This discussion probably also applies to javac reading names in source
> files and having those names propagate to class files.
>
Agreed. Though I think we're good here because the Lexer/Scanner uses a
CharsetDecoder that detects errors on malformed input. As a simple test I
verified that a StandardCharsets.UTF_8 decoder reports "MALFORMED" on input
with "è" encoded as c3 e8 (the standard encoding is c3 a8).
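
A check along these lines can be reproduced with the standard
java.nio.charset API. This is a sketch, not the test that was actually run;
the class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictUtf8Check {
    /** Returns true if the bytes decode as UTF-8 without any reported error. */
    static boolean isWellFormedUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] nonStandard = {(byte) 0xC3, (byte) 0xE8}; // second byte is not 10xxxxxx
        byte[] standard = {(byte) 0xC3, (byte) 0xA8};    // standard encoding of U+00E8
        System.out.println("c3 e8 well-formed: " + isWellFormedUtf8(nonStandard));
        System.out.println("c3 a8 well-formed: " + isWellFormedUtf8(standard));
    }
}
```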

And after the lexer step, you're going from char[] to byte[], and those
conversions are already being done correctly in the compiler code.

It's the byte[] to char[] step in which "non-standard" encodings can creep
in.
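
As a rough illustration of how that can happen, consider a hand-written
decoder that masks out the payload bits without validating the marker bits.
This is a hypothetical sketch, not javac's actual conversion code; both
method names are made up.

```java
public class LenientDecode {
    /** Naive two-byte decode: extracts payload bits, never checks marker bits. */
    static char naiveTwoByte(int b1, int b2) {
        return (char) (((b1 & 0x1F) << 6) | (b2 & 0x3F));
    }

    /** Same decode, but rejects bytes that are not of the form 110xxxxx 10xxxxxx. */
    static char checkedTwoByte(int b1, int b2) {
        if ((b1 & 0xE0) != 0xC0 || (b2 & 0xC0) != 0x80) {
            throw new IllegalArgumentException("malformed two-byte sequence");
        }
        return naiveTwoByte(b1, b2);
    }
}
```

With the naive version, both c3 a8 and c3 e8 decode to "è", so two different
byte sequences map to the same name. The checked version accepts only c3 a8.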

-Archie

-- 
Archie L. Cobbs