Valid characters in a module name
Peter Levart
peter.levart at gmail.com
Mon Jan 9 08:48:46 UTC 2017
Hi Ess,
On 01/09/2017 01:55 AM, Ess Kay wrote:
> > If this sequence of characters appears in source at a position where
> > an identifier is expected:
> > #"\\u0022\\\""
> > then it is interpreted as an identifier with the following characters:
> > \u0022\"
> Then what happens when a user wants to specify the valid 14 character
> class name #"\\u0022\\\"" ?
He would write it in source like:
#"#\"\\\\u0022\\\\\\\"\""
...this is hard to read, but doable.
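
To make the double escaping concrete, here is a minimal sketch of how such an encoder and decoder could look. The class and method names (ExoticIdentifiers, encode, decode) are made up for illustration, and only the escapes \\, \", \r and \n are handled. The exotic identifier syntax was only ever a proposal, so this is not the behaviour of any real compiler.

```java
// Hypothetical helper, not part of any JDK API.
final class ExoticIdentifiers {

    // Wraps an arbitrary name in #"...", escaping it like a string literal.
    static String encode(String name) {
        StringBuilder sb = new StringBuilder("#\"");
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\r': sb.append("\\r");  break;
                case '\n': sb.append("\\n");  break;
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Recovers the name from the #"..." source form produced by encode.
    static String decode(String source) {
        if (source.length() < 3 || !source.startsWith("#\"") || !source.endsWith("\"")) {
            throw new IllegalArgumentException("not of the form #\"...\"");
        }
        String body = source.substring(2, source.length() - 1);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '"') {
                throw new IllegalArgumentException("unescaped quote inside body");
            }
            if (c != '\\') {
                sb.append(c);
                continue;
            }
            if (++i == body.length()) {
                throw new IllegalArgumentException("dangling backslash");
            }
            char e = body.charAt(i);
            switch (e) {
                case '\\': sb.append('\\'); break;
                case '"':  sb.append('"');  break;
                case 'r':  sb.append('\r'); break;
                case 'n':  sb.append('\n'); break;
                default:
                    throw new IllegalArgumentException("unsupported escape: \\" + e);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String bs = "\\"; // one backslash
        String q = "\"";  // one double quote

        // The 8-character name from this thread.
        String name = bs + "u0022" + bs + q;

        // Its source form, which is itself a legal 14-character JVM name.
        String once = encode(name);
        System.out.println(once + " (" + once.length() + " chars)");

        // The source form of that 14-character name.
        String twice = encode(once);
        System.out.println(twice);
        System.out.println(decode(twice).equals(once));
    }
}
```

Each level of encoding only adds escapes for backslash and double quote, so decoding always gives back exactly the name that was encoded, however many times the name itself looks like an exotic identifier.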
> Perhaps I am misunderstanding you? Do you accept that according to
> the JVM specification a module, package, class, field or method name
> in a Java class file can legally start with the two characters #" and
> end with a single double quote?
By JVM specification, yes.
> For an escape character sequence to work it is essential that it is
> not otherwise legal in a particular string. That is not the case with
> the #"..." sequence.
We are talking about identifiers, remember?
Normally in Java, an identifier can contain
(https://en.wikipedia.org/wiki/Java_syntax#Identifier):

  * Any Unicode character that is a letter (including numeric letters
    like Roman numerals) or digit.
  * Currency sign (such as $).
  * Connecting punctuation character (such as _).

An identifier cannot:

  * Start with a digit.
  * Be equal to a reserved keyword, null literal or boolean literal.

Therefore, you cannot start a Java (the language) identifier with the
character #. The "exotic" identifiers syntax was devised to enable Java
(the language) to express any identifier that is otherwise possible by
JVM specification, mainly to enable inter-operation between Java and
other JVM based languages that might use different rules as far as
identifiers are concerned. Because the proposal was withdrawn, the status
quo now is that if you want to be inter-operable with Java, you have to
play by Java rules, at least in the parts where Java and any other
language inter-operate.
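
The rule about plain identifiers can be checked with the standard java.lang.Character methods. The following is only a sketch: the class and method names are made up for illustration, and the check for keywords and literals is left out to keep it short.

```java
// Hypothetical helper: tests the character rules for plain Java identifiers.
// It does not reject keywords or the null/boolean literals.
final class PlainIdentifierCheck {

    static boolean isPlainIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Walk by code point so supplementary characters are handled.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPlainIdentifier("foo$_1")); // letters, $, _ and a digit
        System.out.println(isPlainIdentifier("1abc"));   // starts with a digit
        System.out.println(isPlainIdentifier("#foo"));   // starts with #
        System.out.println(isPlainIdentifier("a#b"));    // contains #
    }
}
```

Since # is neither a valid start nor a valid part of a plain identifier, a token beginning with #" cannot be confused with one.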
> I don't think there is any character that is invalid across module,
> package, class, field and method names. So there is no one character
> or character sequence that can act as an escape sequence across all
> names.
I still don't see your problem. I showed that any sequence of characters
is expressible using the exotic identifiers syntax, because it borrows
from the syntax of Java string literals. Even CR and LF are expressible
with \r and \n. I showed that the exotic identifiers syntax is not
ambiguous, because normally, using the plain identifiers syntax,
identifiers cannot contain the character #.
Regards, Peter