JDK-8254073, unicode escape preprocessing, and \u005C

Jim Laskey james.laskey at oracle.com
Mon Jun 21 21:56:11 UTC 2021


"\u005C" should have been treated as a backslash. Will check into it.

Cheers,

— Jim


> On Jun 21, 2021, at 6:28 PM, Liam Miller-Cushon <cushon at google.com> wrote:
> 
> 
> class T {
>   public static void main(String[] args) {
>     System.err.println("\u005C\\u005D");
>   }
> }
> 
> Before JDK-8254073, this prints `\]`.
> 
> After JDK-8254073, unicode escape processing yields `\\\u005D`, which then fails with an 'invalid escape' error for `\u`. Was that deliberate?
> 
> JLS 3.3 says
> 
> > for each raw input character that is a backslash \, input processing must consider how many other \ characters contiguously precede it, separating it from a non-\ character or the start of the input stream. If this number is even, then the \ is eligible to begin a Unicode escape; if the number is odd, then the \ is not eligible to begin a Unicode escape.
> 
> The difference is in whether `\u005C` (the unicode escape for `\`) counts as one of the `\` preceding a valid unicode escape.
> 
> Does "how many other \ characters contiguously precede it" refer to preceding raw input characters, or does it refer to preceding characters after unicode escape processing is performed on them?
> 
> JLS 3.3 also mentions that a "character produced by a Unicode escape does not participate in further Unicode escapes", but I'm not sure if that applies here, since in the pre-JDK-8254073 interpretation the unicode-escaped backslash isn't really 'participating' in the second unicode escape.
> 
> Thanks,
> Liam
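
The two readings of JLS 3.3 raised in the quoted message can be illustrated with a small sketch. This is a toy translator written for this example, not javac's implementation; the class name `UnicodeEscapeSketch`, the method `translate`, and the flag `countTranslated` are all hypothetical. With the flag off, only raw backslashes count toward the even/odd check; with it on, a backslash produced by a Unicode escape counts as well.

```java
public class UnicodeEscapeSketch {

    // True if the four characters starting at index 'from' are all hex digits.
    static boolean isHex4(String s, int from) {
        if (from + 4 > s.length()) return false;
        for (int k = from; k < from + 4; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) return false;
        }
        return true;
    }

    // Translates Unicode escapes in 'raw'. Assumption of this sketch: a
    // malformed escape is copied through unchanged instead of being rejected.
    static String translate(String raw, boolean countTranslated) {
        StringBuilder out = new StringBuilder();
        int run = 0; // contiguous backslashes counted just before position i
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            if (c != '\\') {
                out.append(c);
                run = 0;
                i++;
                continue;
            }
            // Skip one or more 'u' markers after the backslash.
            int j = i + 1;
            while (j < raw.length() && raw.charAt(j) == 'u') j++;
            boolean eligible = run % 2 == 0;
            if (eligible && j > i + 1 && isHex4(raw, j)) {
                char decoded = (char) Integer.parseInt(raw.substring(j, j + 4), 16);
                out.append(decoded);
                // The two readings differ only here: does a decoded
                // backslash count as a preceding backslash or not?
                run = (countTranslated && decoded == '\\') ? run + 1 : 0;
                i = j + 4;
            } else {
                out.append('\\');
                run++;
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The characters between the quotes of the literal in the example,
        // written here with every backslash doubled.
        String raw = "\\u005C\\\\u005D";
        System.out.println(translate(raw, false)); // raw backslashes only
        System.out.println(translate(raw, true));  // decoded backslashes count too
    }
}
```

For the raw input `\u005C\\u005D`, the raw-only reading yields `\\\u005D`, which matches the post-JDK-8254073 result reported above. The other reading yields `\\]`, which string-literal escape processing then turns into the `\]` printed before the change.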

