Valid characters in a module name
Peter Levart
peter.levart at gmail.com
Sat Jan 7 06:44:04 UTC 2017
Hi Ess,
I have been reminded that the syntax for "Exotic identifiers" in the Java
language, as proposed for JDK 7 but then withdrawn, used the '#' character
as a prefix in front of a classical string literal:
http://mail.openjdk.java.net/pipermail/coin-dev/2009-March/001131.html
I accidentally replaced it with the Obj-C syntax for NSString literals,
which uses '@'.
On 01/07/2017 01:40 AM, Ess Kay wrote:
> As far as I can tell, the complete string @"What a wonderful world!"
> is itself a valid module, package, class, field and method name.
And using the syntax for exotic identifiers, it would be expressed as:
#"@\"What a wonderful world!\""
> The '@' character has a reserved status in a module name but the JVM
> spec says that it may appear with some yet to be published meaning.
> Almost every possible string of printable characters is a valid
> module, package, class, field and method name. For example, the
> string \u0022\" is a valid 8 character Java field or method name.
Written with exotic identifier syntax as:
#"\\u0022\\\""
> The string \\u0022\\" is a valid 10 character Java field or method
> name. So a solution that uses escape characters is not as obvious as
> it may appear at first glance. You could even throw in a leading,
> embedded and trailing space and it would still be valid.
No problem. Any sequence of Unicode characters is expressible as a
string literal and consequently as an exotic identifier when prefixed
with #.
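
To illustrate the escaping used in the examples above, here is a minimal sketch in Java. The class and method names (ExoticIdentifiers, toExotic) are hypothetical and exist only for this illustration; the #"..." syntax itself was only ever a proposal and is not part of Java. The sketch escapes backslashes, double quotes and control characters the way a Java string literal would, then adds the '#' prefix.

```java
// Hypothetical helper: renders a raw JVM-level name in the proposed
// (never adopted) exotic identifier form, i.e. '#' followed by a string literal.
final class ExoticIdentifiers {

    static String toExotic(String rawName) {
        StringBuilder sb = new StringBuilder("#\"");
        for (int i = 0; i < rawName.length(); i++) {
            char c = rawName.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break; // backslash becomes two backslashes
                case '"':  sb.append("\\\""); break; // quote becomes backslash + quote
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                case '\b': sb.append("\\b");  break;
                case '\f': sb.append("\\f");  break;
                default:
                    if (c < 0x20 || c == 0x7f) {
                        // Remaining control characters: backslash, 'u', four hex digits.
                        sb.append('\\').append('u')
                          .append(String.format("%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // The 8-character name from the quoted message: backslash, u0022, backslash, quote.
        String eightChars = "\\u0022\\\"";
        System.out.println(eightChars.length());
        System.out.println(toExotic(eightChars));
        System.out.println(toExotic("@\"What a wonderful world!\""));
    }
}
```

Leading, embedded and trailing spaces need no special treatment here, since they are ordinary characters inside the quotes.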
>
> I haven't yet tested this but, prima facie, even non-printable
> characters such as backspaces and carriage returns are permitted in
> package, class, field and method names (but not module names.) Does
> the JVM support some escaping scheme to allow such characters in JAR
> manifests and service provider specifications? If the answer is yes
> then what is it? If the answer is no then doesn't that demonstrate
> the absurdity of the situation?
It appears that NUL, CR, and LF can't be part of header values in JAR
manifests, but other characters can:
http://docs.oracle.com/javase/7/docs/technotes/guides/jar/jar.html#Manifest_Specification
  Notes on Manifest and Signature Files

  Line length:
    No line may be longer than 72 bytes (not characters), in its
    UTF8-encoded form. If a value would make the initial line longer than
    this, it should be continued on extra lines (each starting with a
    single SPACE).

  Limitations:
    Because header names cannot be continued, the maximum length
    of a header name is 70 bytes (there must be a colon and a SPACE after
    the name).
    NUL, CR, and LF can't be embedded in header values, and NUL,
    CR, LF and ":" can't be embedded in header names.
    Implementations should support 65535-byte (not character)
    header values, and 65535 headers per file. They might run out of
    memory, but there should not be hard-coded limits below these values.
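
The limitations quoted above can be checked mechanically. The following is only a sketch: the class and method names (ManifestHeaderSketch, isValidHeaderValue, isValidHeaderName) are made up for this illustration, and it encodes nothing beyond the restrictions quoted in these notes. It does not check the stricter header-name grammar defined elsewhere in the JAR specification, and it does not do line folding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical checks that mirror only the limitations quoted from the notes.
final class ManifestHeaderSketch {

    /** A header value may contain anything except NUL, CR and LF. */
    static boolean isValidHeaderValue(String value) {
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\0' || c == '\r' || c == '\n') {
                return false;
            }
        }
        return true;
    }

    /** A header name additionally excludes ':' and is limited to 70 UTF-8 bytes. */
    static boolean isValidHeaderName(String name) {
        if (name.isEmpty() || name.indexOf(':') >= 0 || !isValidHeaderValue(name)) {
            return false;
        }
        return name.getBytes(StandardCharsets.UTF_8).length <= 70;
    }

    public static void main(String[] args) {
        // A backspace is accepted in a value, a carriage return is not.
        System.out.println(isValidHeaderValue("funky\bname"));
        System.out.println(isValidHeaderValue("funky\rname"));
    }
}
```

So under these notes a package or class name containing a backspace could appear in a header value as-is, while one containing CR, LF or NUL could not.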
Regards, Peter
>
> So at this point Alan's suggested initial 'do nothing' approach is
> attractive. At this point the flexibility that the JVM spec gives is
> totally gratuitous in that no one as yet appears to have had any
> reason to make use of it.
>
> On Sat, Jan 7, 2017 at 8:45 AM, Peter Levart
> <peter.levart at gmail.com> wrote:
>
> Hi Ess,
>
>
> On 01/06/2017 05:27 AM, Ess Kay wrote:
>>> chances of meeting a module-info.class with funky module names is low
>> When I raised the initial question, I had no idea that the Java verifier
>> had been changed (with Java 6?) to allow "funky" package, class, field and
>> method names. Somehow that change passed right under the radar. Yes - a
>> possible option would be to simply ignore the broad character range allowed
>> by the JVM specification and trust that in practice no one would actually
>> use the unusual characters in package, class, field, method or module names.
>> A downside to that option is that we will no longer be able to say to our
>> users that we fully support the JVM specification which in some cases can
>> be a problem. Anyway, I guess it is time to accept the overwhelming inertia
>> of the status quo and move on to the next problem.
>
> If I remember correctly, there was a crazy proposal in the past to
> specify a syntax for arbitrary symbol names in Java. It went
> roughly like:
>
> @"the syntax of Java string in here"
>
>
> So you could write code like:
>
>
> public class @"What a wonderful world!" {
>     public static void @"Let's party..."() {
>     }
> }
>
> //
> @"What a wonderful world!".@"Let's party..."();
>
>
> You could adopt this in your tool, what do you think?
>
> Regards, Peter
>
>