Proposal: #ModuleNameCharacters

forax at univ-mlv.fr
Thu Dec 8 10:26:12 UTC 2016


----- Original Message -----
> From: "mark reinhold" <mark.reinhold at oracle.com>
> To: forax at univ-mlv.fr
> Cc: jpms-spec-experts at openjdk.java.net
> Sent: Thursday, December 8, 2016 00:40:17
> Subject: Re: Proposal: #ModuleNameCharacters

> 2016/12/6 0:08:58 -0800, forax at univ-mlv.fr:
>> 2016/11/29 16:11:02 -0800, mark.reinhold at oracle.com:
>>> ...
>>> 
>>> As I wrote in my reply to David, I'm open to lifting the traditional
>>> restrictions on the class-file representation of qualified names in the
>>> case of module names.
>> 
>> Ok, cool.
>> 
>>>                         Given the weight of tradition and the past value
>>> of the existing constraints, however, I'd like to have a more compelling
>>> reason than "some future hypothetical module system might need this
>>> flexibility".
>> 
>> Existing constraints exist because a package name is a part of a
>> qualified class name. There is no tradition for module names. Module
>> names in the class file are not mixed with other constrained names, so
>> I see no compelling reason to add arbitrary rules to try to restrict
>> module names.
> 
> Okay, okay ... taken together with David's examples, I get the point.
> 
> (Personally I've always considered the whole `.`-to-`/` mapping kind
> of archaic anyway.)
> 
>> Note that JLS module names have to be parsed by the compiler, so for
>> JLS module names, having the same constraints as any other qualified
>> identifier makes sense. But here we're talking about module names in
>> the JVM spec, not in the JLS.
> 
> Correct.
> 
>> Now, the constant pool is typed and structured. If we want to have
>> constraints on module names then, in my opinion, we should introduce a
>> new constant pool item to make it clear that module names are not plain
>> names but specific names, exactly as there is a Class constant pool
>> item.
> 
> Agreed.  This is, in fact, an inconsistency in the present proposal,
> since it imposes constraints on otherwise untagged CONSTANT_Utf8
> structures.  If we're going to impose constraints on free-standing
> module and package names then we should introduce the obvious new
> `CONSTANT_Module_info` and `CONSTANT_Package_info` structures.
> 
>> And with my ASM hat on, having to add replace('.', '/') and
>> replace('/', '.') at the right places is error-prone. If we can avoid
>> that, it is a big win in terms of usability.
> 
> Yep.
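
As a rough illustration of the conversion burden described above, here is a minimal Java sketch of the two mappings a bytecode tool has to apply whenever a name crosses between source-level (dotted) form and class-file internal (slashed) form. The class and method names are hypothetical and are not part of ASM's API.

```java
// Hypothetical helper, for illustration only (not ASM's API).
final class InternalNames {
    private InternalNames() {}

    // Dotted binary name -> class-file internal form.
    static String toInternalName(String binaryName) {
        return binaryName.replace('.', '/');
    }

    // Class-file internal form -> dotted binary name.
    static String toBinaryName(String internalName) {
        return internalName.replace('/', '.');
    }

    public static void main(String[] args) {
        String internal = toInternalName("java.util.Map");
        System.out.println(internal);               // java/util/Map
        System.out.println(toBinaryName(internal)); // java.util.Map
    }
}
```

Every call site has to know which of the two forms it currently holds. If module names were stored in the class file exactly as written, neither conversion would be needed for them.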
> 
>>> In trying to think about the future I do wonder if, today, we should
>>> reserve a character or two just in case we discover five or ten years
>>> from now that we need to add more structure to module names.  Should
>>> we set aside `:`, or perhaps some other character, just in case?
>> 
>> If we want structure, we will add another constant pool item. It's
>> what Valhalla does for parameterized types.
> 
> So the question is, then: Which, if any, characters should we reserve?
> 
> Peering into the myriad alternate visions swirling around in my cloudy
> crystal ball, I can see:
> 
>  - A structured namespace of modules.  `:` is a logical separator here,
>    even in the source language if need be, so let's reserve it now in
>    class files.
> 
>  - Module names encoded in class files together with specific version
>    strings, to form compound module identifiers.  We already use `@` to
>    separate module names from version strings in the module-system API
>    (e.g., the result of `ModuleDescriptor::toString`), so let's reserve
>    that in class files now.
> 
> (This is just my imagination, not specific suggestions for the future!)
> 
> Additionally we should reserve the universal escape character (`\`) and
> for sanity also forbid any character whose code point is less than 0x20
> (` `).  (Ideally we'd forbid all Unicode non-printing characters, but
> it's best not to have the JVMS depend upon the Unicode specification.)
> 
> To sum up: Reserve `:`, `@`, and `\` for future use, and forbid the ASCII
> non-printing characters (< 0x20).
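
The rule summarized just above can be sketched as a small check in Java. This is only an illustration of the proposal as worded in that summary (reserve `:`, `@`, and `\`, forbid anything below 0x20); the class and method names are hypothetical, and rejecting the empty string is an added assumption that the message does not state.

```java
// Hypothetical checker for the proposed class-file module-name rule.
final class ModuleNameCheck {
    // Characters the summary proposes to reserve for future use.
    private static final String RESERVED = ":@\\";

    private ModuleNameCheck() {}

    static boolean isAcceptable(String name) {
        if (name.isEmpty()) {
            return false; // assumption: the message does not mention empty names
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            // Surrogate chars are all >= 0xD800, so a per-char test is
            // enough to catch every code point below 0x20.
            if (c < 0x20 || RESERVED.indexOf(c) >= 0) {
                return false;
            }
        }
        return true;
    }
}
```

Under this sketch a name such as `my-app_1.0` is accepted, while `org:foo`, `foo@1.0`, and any name containing a tab or a backslash are rejected. A space (0x20) is accepted, since only code points strictly below 0x20 are forbidden.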

You also need to reserve '/' because the java launcher (-m) uses '/' to separate the module name from the main class.
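
To make the '/' problem concrete, here is a minimal sketch of how an argument of the form `module[/mainclass]` might be split. The class below is hypothetical and is not the launcher's actual code; splitting at the first '/' is an assumption made for the example.

```java
// Hypothetical parser for a "-m module[/mainclass]" style argument.
final class LauncherArg {
    final String module;
    final String mainClass; // null when the argument contains no '/'

    private LauncherArg(String module, String mainClass) {
        this.module = module;
        this.mainClass = mainClass;
    }

    // Assumption: split at the first '/'.
    static LauncherArg parse(String arg) {
        int slash = arg.indexOf('/');
        if (slash < 0) {
            return new LauncherArg(arg, null);
        }
        return new LauncherArg(arg.substring(0, slash), arg.substring(slash + 1));
    }
}
```

With this sketch, `com.example.app/com.example.Main` splits as expected, but a module named `a/b` followed by `/c.Main` would be read as module `a` with main class `b/c.Main`. Any fixed splitting rule has the same ambiguity as soon as '/' is allowed inside module names.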

Rémi

> 
> David -- Are these restrictions acceptable in your use cases, or if not
> then at least tolerable?  I'm pretty sure I've never seen any of these
> characters in Java EE module names, JAR file base names, Maven group or
> artifact names, or the other examples you mentioned.
> 
> - Mark

