From alex.buckley at oracle.com  Tue Sep 22 17:41:37 2020
From: alex.buckley at oracle.com (Alex Buckley)
Date: Tue, 22 Sep 2020 10:41:37 -0700
Subject: Identifier Ignorable characters in keywords and literals
In-Reply-To: <CAMJX6f7OuKjqE7jaAiCCCnZEgc0sSt7HUPZizAnY0kuHk6R5hA@mail.gmail.com>
References: <CAMJX6f7OuKjqE7jaAiCCCnZEgc0sSt7HUPZizAnY0kuHk6R5hA@mail.gmail.com>
Message-ID: <087b829b-7568-6845-7d5c-64bf4d5fd064@oracle.com>

// Adding Dan explicitly

On 9/21/2020 10:39 PM, Pravin Jain wrote:
> The following code compiles and executes successfully.
> 
> public cl\u0001ass Identifier\u0002Ignorable {
>      public sta\u0003tic vo\u0004id ma\u0005in(String[] args) {
>          System.out.println("Hello world");
>      }
> }
> 
> The JLS mentions that Identifier-Ignorable characters are allowed
> in an identifier, but it does not mention their use in a keyword or
> a literal. From the specification, one would not gather that these
> characters are ignored when used inside a keyword or a literal. Is
> this a compiler error, or has the JLS omitted to clarify this point?

It would be legitimate for JLS 3.3 to acknowledge that some `\uxxxx` 
Unicode escapes represent UTF-16 code units which denote "ignorable" 
code points; such UTF-16 code units are _not_ included in the sequence 
of Unicode input characters resulting from this translation step.

Dan, is it possible to make this small clarification in the JLS ch.3 
update for contextual keywords?

The text in 3.8 -- "Two identifiers are the same only if, after ignoring 
characters that are ignorable, the identifiers have the same Unicode 
character for each letter or digit." -- would be slightly redundant in 
calling out ignorable characters, but it should not be changed because 
it states a clear, easy-to-understand rule for Java programmers looking 
to go beyond ASCII in their identifiers.
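
To make the 3.8 rule concrete, here is a minimal sketch that compares two 
identifiers after dropping ignorable characters. The class and method names 
(`IgnorableIdentifiers`, `stripIgnorable`, `sameIdentifier`) are hypothetical 
and only illustrate the rule; the one library call relied on is 
`Character.isIdentifierIgnorable`.

```java
class IgnorableIdentifiers {
    /** Returns the text with identifier-ignorable code points removed. */
    static String stripIgnorable(String identifier) {
        StringBuilder sb = new StringBuilder();
        identifier.codePoints()
                  .filter(cp -> !Character.isIdentifierIgnorable(cp))
                  .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    /** Illustrative reading of the 3.8 rule: compare after ignoring ignorable characters. */
    static boolean sameIdentifier(String a, String b) {
        return stripIgnorable(a).equals(stripIgnorable(b));
    }

    public static void main(String[] args) {
        // U+0002 is built with a cast so the source file itself stays plain ASCII.
        String withIgnorable = "Identifier" + (char) 2 + "Ignorable";

        System.out.println(Character.isIdentifierIgnorable(2));                    // true
        System.out.println(Character.isIdentifierIgnorable('a'));                  // false
        System.out.println(sameIdentifier(withIgnorable, "IdentifierIgnorable"));  // true
        System.out.println(sameIdentifier("main", "Main"));                        // false
    }
}
```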

Alex

From alex.buckley at oracle.com  Wed Sep 23 00:10:11 2020
From: alex.buckley at oracle.com (Alex Buckley)
Date: Tue, 22 Sep 2020 17:10:11 -0700
Subject: Identifier Ignorable characters in keywords and literals
In-Reply-To: <CAMJX6f7RB6wBj0KN5tkAR67nis+SSuA9OQCXgrTN72vKYVgEeQ@mail.gmail.com>
References: <CAMJX6f7OuKjqE7jaAiCCCnZEgc0sSt7HUPZizAnY0kuHk6R5hA@mail.gmail.com>
 <087b829b-7568-6845-7d5c-64bf4d5fd064@oracle.com>
 <CAMJX6f7RB6wBj0KN5tkAR67nis+SSuA9OQCXgrTN72vKYVgEeQ@mail.gmail.com>
Message-ID: <ecf9e640-6ce1-61e3-11f3-ac3bf8c0124c@oracle.com>

An ignorable Unicode escape such as `\u0001` is a legitimate character 
in a character literal, string literal, or text block, so javac accepts 
and translates it there. In contrast, it seems that javac accepts _and 
discards_ an ignorable Unicode escape:

1. in the body of a comment;
2. as a Java-letter-or-digit in an identifier (i.e., not as the first 
character of an identifier, but as any subsequent character);
3. in a position to the right of a non-ignorable character within a 
keyword (thus allowing for appearance at the end of a keyword, and for 
consecutive ignorable escapes: `class\u0001\u0001`);
4. in a position to the right of a non-ignorable character within a 
boolean literal or null literal.

1 and 2 are to spec. 3 and 4 are new to the spec. There seems to be a 
connection between 2 and 3+4: javac is expecting keywords to follow the 
same Java-letter-followed-by-Java-letters-or-digits format as identifiers.
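
The connection between 2 and 3+4 can be modeled roughly as follows. This is 
not javac's scanner; it is a hypothetical sketch (`IgnorableScanSketch`, 
`scanWord`, `classify`, and the small `RESERVED` sample are all made up for 
illustration) that assumes a word is first read in the identifier format, 
with ignorable characters dropped, and only afterwards checked against the 
reserved words.

```java
import java.util.Set;

class IgnorableScanSketch {
    // A deliberately incomplete sample of keywords plus the boolean and null literals.
    static final Set<String> RESERVED =
            Set.of("class", "public", "static", "void", "true", "false", "null");

    /** Reads one identifier-shaped word from the start of input, discarding ignorable characters. */
    static String scanWord(String input) {
        // An ignorable character is not a Java letter, so it cannot start a word.
        if (input.isEmpty() || !Character.isJavaIdentifierStart(input.charAt(0))) {
            return "";
        }
        StringBuilder word = new StringBuilder().append(input.charAt(0));
        for (int i = 1; i < input.length(); i++) {
            char c = input.charAt(i);
            if (!Character.isJavaIdentifierPart(c)) {
                break;                      // end of the word
            }
            if (!Character.isIdentifierIgnorable(c)) {
                word.append(c);             // keep letters and digits only
            }
        }
        return word.toString();
    }

    /** Classifies the scanned word only after ignorable characters are gone. */
    static String classify(String input) {
        String word = scanWord(input);
        if (word.isEmpty()) {
            return "not a word";
        }
        return RESERVED.contains(word) ? "reserved: " + word : "identifier: " + word;
    }

    public static void main(String[] args) {
        char u1 = (char) 1;
        System.out.println(classify("cl" + u1 + "ass"));   // reserved: class
        System.out.println(classify("class" + u1 + u1));   // reserved: class
        System.out.println(classify("tr" + u1 + "ue"));    // reserved: true
        System.out.println(classify("ma" + u1 + "in"));    // identifier: main
        System.out.println(classify(u1 + "class"));        // not a word
    }
}
```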

Alex

On 9/22/2020 4:07 PM, Pravin Jain wrote:
> Thanks for the clarifications.
> But let me point out that the Identifier Ignorable characters are
> ignored not only in keywords but also in the three literals "true",
> "false" and "null"
> 
> Thanks and Regards,
> Pravin

From alex.buckley at oracle.com  Wed Sep 30 16:50:22 2020
From: alex.buckley at oracle.com (Alex Buckley)
Date: Wed, 30 Sep 2020 09:50:22 -0700
Subject: related to compile error in preview feature 'record'
In-Reply-To: <CAMJX6f6E_8nuJXaq8GQC+gLSu4GX8J+749DeQNjFS7kbDDJHSQ@mail.gmail.com>
References: <CAMJX6f6E_8nuJXaq8GQC+gLSu4GX8J+749DeQNjFS7kbDDJHSQ@mail.gmail.com>
Message-ID: <5ac0af92-a3f3-7a3a-e6aa-0c29dbb025cb@oracle.com>

In the compact constructor below, `low` refers to an implicit parameter 
of the constructor, whereas `this.low` refers to a field corresponding 
to the record component `low`.

IIRC, the field is not definitely assigned before the end of the 
constructor body, so referring to `this.low` anywhere in the constructor 
body is an error by the traditional rule at the beginning of JLS ch.16.

There is no interprocedural analysis in ch.16, so m1's independent 
reference to a field via `this.low` is allowed on the basis that the 
field was definitely assigned by the end of every constructor body.

In an ordinary class, a constructor body that calls m1() but forgets to 
explicitly initialize the field would cause a compile-time error. There 
is no corresponding compile-time error here because the compact 
constructor body implicitly initializes the field.
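
A small sketch of what this means at run time, assuming a JDK where records 
are a final feature (16 or later). The record name `TestRecordSketch`, the 
`LOG` list, and `logField` are hypothetical additions. Because the implicit 
field assignment happens at the end of the compact constructor body, a 
method called from inside the body can read the field, but it sees the 
default value rather than the argument.

```java
import java.util.ArrayList;
import java.util.List;

record TestRecordSketch(int low, int high) {
    // Hypothetical log used to capture what each access observes.
    static final List<String> LOG = new ArrayList<>();

    TestRecordSketch {
        // `low` here is the implicit constructor parameter.
        LOG.add("parameter low = " + low);
        // Reading this.low directly at this point is rejected at compile time,
        // because the field is not definitely assigned yet.
        logField();   // allowed: ch.16 does no interprocedural analysis
    }

    void logField() {
        // Called from the constructor body, before the implicit field assignment.
        LOG.add("field low = " + this.low);
    }

    public static void main(String[] args) {
        TestRecordSketch r = new TestRecordSketch(7, 12);
        LOG.forEach(System.out::println);
        // parameter low = 7
        // field low = 0
        System.out.println("after construction, low() = " + r.low());
        // after construction, low() = 7
    }
}
```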

Alex

On 9/30/2020 8:45 AM, Pravin Jain wrote:
> Dear sir,
> In the following code, I have marked the error with a comment.
> This error seems unnecessary; it could have been a warning instead.
> In the record's constructor, accessing the instance variable before
> explicit initialization gives an error, whereas the same access is
> allowed when made by invoking a method.
> 
> public record TestRecord(int low, int high) {
>      public TestRecord {
>          System.out.println(low);
> //        System.out.println(this.low); // error, variable not initialized
>          m1(); // but this works, why?
>      }
>      public void m1() {
>          System.out.println(this.low);
>      }
>      public static void main(String[] args) {
>          TestRecord r1 = new TestRecord(7,12);
>      }
> }
> 
>