Valid characters in a module name

Peter Levart peter.levart at gmail.com
Mon Jan 9 08:48:46 UTC 2017


Hi Ess,

On 01/09/2017 01:55 AM, Ess Kay wrote:
> > If this sequence of characters appears in source at a position where 
> an identifier is expected:
> > #"\\u0022\\\""
> > then it is interpreted as an identifier with the following characters:
> > \u0022\"
> Then what happens when a user wants to specify the valid 14 character 
> class name #"\\u0022\\\"" ?

The user would write it in source like this:

#"#\"\\\\u0022\\\\\\\"\""

...this is hard to read, but doable.
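
The escaping above can be reproduced mechanically. What follows is only a 
minimal sketch, assuming the exotic form uses the same backslash escapes as 
Java string literals; the class ExoticEncoder and the method toExotic are 
hypothetical names for illustration, not part of any JDK or proposal API:

```java
final class ExoticEncoder {
    // Hypothetical helper: wraps a raw name in #"..." and escapes it the way
    // a Java string literal would be escaped. Only backslash, double quote,
    // CR and LF are handled in this sketch.
    static String toExotic(String rawName) {
        StringBuilder sb = new StringBuilder("#\"");
        for (int i = 0; i < rawName.length(); i++) {
            char c = rawName.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\r': sb.append("\\r");  break;
                case '\n': sb.append("\\n");  break;
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // The 14-character name from the question, written as a string literal.
        String raw = "#\"\\\\u0022\\\\\\\"\"";
        System.out.println(raw.length());   // 14
        System.out.println(toExotic(raw));  // the exotic spelling of that name
    }
}
```

Every backslash and double quote in the raw name is doubled up or prefixed, 
which is why the result is long but never ambiguous.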

> Perhaps I am misunderstanding you?  Do you accept that according to 
> the JVM specification a module, package, class, field or method name 
> in a Java class file can legally start with the two characters #" and 
> end with a single double quote?

By the JVM specification, yes.

> For an escape character sequence to work it is essential that it is 
> not otherwise legal in a particular string.  That is not the case with 
> the #"..." sequence.

We are talking about identifiers, remember?

Normally in Java, an identifier can contain 
(https://en.wikipedia.org/wiki/Java_syntax#Identifier):

     Any Unicode character that is a letter (including numeric letters 
like Roman numerals) or digit.
     Currency sign (such as $).
     Connecting punctuation character (such as _).

An identifier cannot:

     Start with a digit.
     Be equal to a reserved keyword, null literal or boolean literal.
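
These rules can be checked programmatically. The sketch below is an 
approximation, assuming that Character.isJavaIdentifierStart, 
Character.isJavaIdentifierPart and javax.lang.model.SourceVersion.isKeyword 
are close enough to the rules listed above (the Character methods accept a 
few more characters than the summary mentions). The class name 
PlainIdentifierCheck and the method isPlainIdentifier are made up for 
illustration:

```java
import javax.lang.model.SourceVersion;

final class PlainIdentifierCheck {
    // Returns true if the name could be written as a plain Java identifier.
    static boolean isPlainIdentifier(String name) {
        if (name.isEmpty()) {
            return false;
        }
        int first = name.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false; // e.g. starts with a digit or with '#'
        }
        for (int i = Character.charCount(first); i < name.length(); ) {
            int cp = name.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        // Keywords and the literals null, true and false are excluded.
        return !SourceVersion.isKeyword(name);
    }

    public static void main(String[] args) {
        System.out.println(isPlainIdentifier("foo$bar_1")); // true
        System.out.println(isPlainIdentifier("1abc"));      // false
        System.out.println(isPlainIdentifier("null"));      // false
        System.out.println(isPlainIdentifier("#foo"));      // false
    }
}
```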


Therefore, you cannot start a Java (the language) identifier with the 
character #. The "exotic" identifier syntax was devised to enable Java 
(the language) to express any identifier that the JVM specification 
otherwise allows, mainly to enable interoperation between Java and 
other JVM-based languages that might use different rules for 
identifiers. Because the proposal was withdrawn, the status quo is that 
if you want to interoperate with Java, you have to play by Java's rules, 
at least in the parts where Java and the other language interoperate.

> I don't think there is any character that is invalid across module, 
> package, class, field and method names. So there is no one character 
> or character sequence that can act as an escape sequence across all 
> names.

I still don't see your problem. I showed that any sequence of characters 
is expressible using the exotic identifier syntax, because it borrows 
from the syntax of Java string literals. Even CR and LF are expressible 
with \r and \n. I showed that the exotic identifier syntax is not 
ambiguous, because plain identifiers cannot contain the character #, so 
a token that starts with # at a position where an identifier is expected 
can only be an exotic identifier.
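
The disambiguation argument can be sketched in code. This is only an 
illustration under stated assumptions: the token has already been isolated 
by a lexer, only the escapes \\, \", \r and \n are supported, and the names 
ExoticDecoder, isExotic and decode are hypothetical:

```java
final class ExoticDecoder {
    // A token is treated as exotic only if it has the #"..." shape. A plain
    // identifier can never match, because it cannot contain '#'.
    static boolean isExotic(String token) {
        return token.length() >= 3 && token.startsWith("#\"") && token.endsWith("\"");
    }

    // Returns the identifier's actual characters: plain tokens are returned
    // unchanged, exotic tokens are unescaped like a string literal body.
    static String decode(String token) {
        if (!isExotic(token)) {
            return token;
        }
        StringBuilder name = new StringBuilder();
        int end = token.length() - 1; // index of the closing quote
        for (int i = 2; i < end; i++) {
            char c = token.charAt(i);
            if (c == '"') {
                throw new IllegalArgumentException("unescaped quote at " + i);
            }
            if (c != '\\') {
                name.append(c);
                continue;
            }
            i++;
            if (i >= end) {
                throw new IllegalArgumentException("dangling backslash");
            }
            char e = token.charAt(i);
            switch (e) {
                case '\\': name.append('\\'); break;
                case '"':  name.append('"');  break;
                case 'r':  name.append('\r'); break;
                case 'n':  name.append('\n'); break;
                default:
                    throw new IllegalArgumentException("unsupported escape: \\" + e);
            }
        }
        return name.toString();
    }

    public static void main(String[] args) {
        // The 14-character source token from the first example in this thread.
        String token = "#\"\\\\u0022\\\\\\\"\"";
        System.out.println(decode(token).length()); // 8
        System.out.println(decode("plainName"));    // plainName
    }
}
```

Because the decoder only looks at the leading #" to decide which branch to 
take, it never has to guess whether a token is plain or exotic.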


Regards, Peter



More information about the jigsaw-dev mailing list