JDK-8254073, unicode escape preprocessing, and \u005C
Liam Miller-Cushon
cushon at google.com
Mon Jun 21 21:28:00 UTC 2021
class T {
    public static void main(String[] args) {
        System.err.println("\u005C\\u005D");
    }
}
Before JDK-8254073, this prints `\]`.
After JDK-8254073, Unicode escape processing produces `\\\u005D`, and the
`\u` left over after the `\\` escape is then rejected with an 'invalid
escape' error. Was that deliberate?
JLS 3.3 says:
> for each raw input character that is a backslash \, input processing must
> consider how many other \ characters contiguously precede it, separating it
> from a non-\ character or the start of the input stream. If this number is
> even, then the \ is eligible to begin a Unicode escape; if the number is
> odd, then the \ is not eligible to begin a Unicode escape.
The difference is in whether `\u005C` (the unicode escape for `\`) counts
as one of the `\` preceding a valid unicode escape.
Does "how many other \ characters contiguously precede it" refer to
preceding raw input characters, or does it refer to preceding characters
after unicode escape processing is performed on them?
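To make the two readings concrete, here is a minimal sketch of the
preprocessing step under each one. It is only a model, not javac's actual
implementation: the class `UnicodeEscapeSketch`, the method `translate`, and
the `countTranslated` flag are names made up for illustration, and malformed
escapes are simply copied through instead of being rejected.

```java
final class UnicodeEscapeSketch {
    private static final char BACKSLASH = '\\';

    /**
     * Translates Unicode escapes in the raw input (hypothetical model).
     *
     * countTranslated == true:  a backslash produced by an escape counts
     *                           toward the run of preceding backslashes.
     * countTranslated == false: only raw backslash characters count.
     */
    static String translate(String raw, boolean countTranslated) {
        StringBuilder out = new StringBuilder();
        int run = 0; // contiguous backslashes immediately before position i
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            if (c == BACKSLASH && run % 2 == 0) {
                int end = escapeEnd(raw, i);
                if (end >= 0) {
                    char decoded =
                        (char) Integer.parseInt(raw.substring(end - 4, end), 16);
                    out.append(decoded);
                    // The last raw character consumed is a hex digit, so the
                    // raw-only reading resets the run; the other reading
                    // extends it when the decoded character is a backslash.
                    run = (countTranslated && decoded == BACKSLASH) ? run + 1 : 0;
                    i = end;
                    continue;
                }
            }
            out.append(c);
            run = (c == BACKSLASH) ? run + 1 : 0;
            i++;
        }
        return out.toString();
    }

    /**
     * Returns the index just past a well-formed escape (backslash, one or
     * more 'u', four hex digits) starting at 'start', or -1 if none.
     */
    static int escapeEnd(String raw, int start) {
        int j = start + 1;
        if (j >= raw.length() || raw.charAt(j) != 'u') {
            return -1;
        }
        while (j < raw.length() && raw.charAt(j) == 'u') {
            j++;
        }
        if (j + 4 > raw.length()) {
            return -1;
        }
        for (int k = j; k < j + 4; k++) {
            if (Character.digit(raw.charAt(k), 16) < 0) {
                return -1;
            }
        }
        return j + 4;
    }

    public static void main(String[] args) {
        // Build the raw literal body from pieces so that this file's own
        // escape preprocessing cannot affect the example.
        String bs = "\\";
        String raw = bs + "u005C" + bs + bs + "u005D";
        System.out.println(translate(raw, true));
        System.out.println(translate(raw, false));
    }
}
```

Run on the string literal body from the example above, the first call
yields `\\]` (which as a literal prints `\]`, the old behaviour) and the
second yields `\\\u005D` (the new behaviour, where `\u` is left behind as an
invalid escape).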
JLS 3.3 also mentions that a "character produced by a Unicode escape does
not participate in further Unicode escapes", but I'm not sure if that
applies here, since in the pre-JDK-8254073 interpretation the
unicode-escaped backslash isn't really 'participating' in the second
unicode escape.
Thanks,
Liam