Proposal: #ModuleNameCharacters

David M. Lloyd david.lloyd at redhat.com
Tue Dec 6 00:12:06 UTC 2016


On 12/05/2016 05:05 PM, mark.reinhold at oracle.com wrote:
> 2016/11/29 18:47:33 -0800, david.lloyd at redhat.com:
>> On 11/29/2016 06:08 PM, mark.reinhold at oracle.com wrote:
>>> ...
>>>
>>> I'm concerned mostly about the complexity of the language specification,
>>> which affects all users of this module system.  If we change the language
>>> of module declarations to allow arbitrary module names, possibly via some
>>> sort of quoting scheme, then that's something that every developer would
>>> have to understand even though it would most likely be of benefit to few.
>>
>> You don't need a quoting scheme to allow arbitrary names - you just
>> allow them - and I don't see how this complicates the language
>> specification in any way: in fact it should simplify it.  There's just
>> no reason to use "internal form".
>
> At present the JLS does not allow identifiers to be composed of arbitrary
> characters.  If that's what you really want in the source language then
> you'd have to introduce some sort of quoting scheme to allow whitespace,
> all punctuation characters, and whatever else isn't already allowed by
> `Character::isJavaIdentifierPart`.  That would add significant complexity
> to the JLS and, more importantly, would be something that every single
> Java developer would have to learn.
>
> Perhaps you don't really mean "arbitrary", however, just "something more
> than is allowed in a Java qualified identifier".
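
As an illustration of the restriction described above, here is a minimal sketch of a check for "Java qualified identifier" built on `Character.isJavaIdentifierStart` and `Character.isJavaIdentifierPart`. The class name, the helper name, and the sample module names are hypothetical and chosen only for illustration. The sketch does not reject reserved keywords, which the JLS would also exclude.

```java
public class ModuleNameCheck {
    // Hypothetical helper: true if every dot-separated segment of the name
    // is made up of Java identifier characters. Keywords are NOT rejected
    // here, so this is looser than the JLS definition.
    static boolean isQualifiedIdentifier(String name) {
        if (name.isEmpty()) {
            return false;
        }
        // limit -1 keeps trailing empty segments, so "a.b." is rejected
        for (String segment : name.split("\\.", -1)) {
            if (segment.isEmpty()) {
                return false;
            }
            if (!Character.isJavaIdentifierStart(segment.codePointAt(0))) {
                return false;
            }
            for (int i = 0; i < segment.length(); ) {
                int cp = segment.codePointAt(i);
                if (!Character.isJavaIdentifierPart(cp)) {
                    return false;
                }
                i += Character.charCount(cp);
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Sample names are invented for illustration only.
        String[] samples = {
            "java.base", "my-artifact", "deployment.app one.war", "/opt/apps/foo.jar"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isQualifiedIdentifier(s));
        }
    }
}
```

Under this check, names containing `-`, whitespace, or `/` fail, which is the gap between JLS-legal names and names mapped from other module schemes.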

See below...

>>> I'm concerned also, though to a lesser degree, about the complexity of
>>> the class-file specification (i.e., the JVMS), the long-term evolvability
>>> of that specification, and the complexity of code that reads and writes
>>> class files, though the latter is second-order since it mostly affects
>>> maintainers of IDEs, compilers, and other kinds of tools rather than
>>> developers in general.  If there's a compelling reason to lift the usual
>>> restrictions on the representation of qualified names in class files, at
>>> least in the case of module names, then I'd like to hear it.
>>
>> I'm just really confused at this line of justification.  Making module
>> names be a qualified name is the very thing that I'm arguing against.
>> Don't make any change to the restrictions on qualified names, and don't
>> make module names be qualified names; that's all there is to it.
>
> Okay, but in the language if an identifier is not a qualified name then
> it cannot contain `.` characters, so using simple identifiers for module
> names is a non-starter.
>
> We could revise the JLS to introduce a brand-new kind of identifier,
> just for module names, that allows `.` and maybe a few other additional
> special characters, and encode these without using internal form in
> `module-info.class` files.  This new kind of identifier would, however,
> be something that every Java developer must learn about.

OK, I think I see the disconnect here.  Here's what I'm reading from what
you're saying: The JLS says that names have to be one of: a qualified
name, an identifier, or some new invented thing that would have to be
added to the JLS, and if it's the last, that's a problem for new
developers and probably for a few other reasons as well.

I have no disagreement with this point at all.  I'm only requesting that
the backing representation, on the JVM/class file spec side and/or the
ModuleDescriptor.Builder side, have no particular restrictions.  Can
these two concepts be decoupled somehow?  I recognize that this means
that javac-generated descriptors can never reference modules with 
"invalid" (according to the JLS) names, but I don't see that as a 
problem: this is solely for modules which are generated and referenced 
by our container code from some other existing module scheme.

The reason I am opposed to using an obfuscation scheme in this case is 
simple: it makes the diagnostic output and calls to get the name of your 
current module have a confusing result, and this mechanism is one of 
only two value propositions of Jigsaw that cannot currently be otherwise 
achieved without special JDK support (the other being the new 
security/encapsulation feature).  The .-to-/ mapping of internal names 
is particularly irksome.

> Who, in practical terms, would benefit from this added complexity?
> You've said that JBoss Modules allows arbitrary characters in module
> names, but do you or your users actually use such names in practice?
> If so, can you show us some examples?

The names can be mapped (exactly) from another module system or
structure, including names that have either different limitations or no
limitations in practice, such as Java EE names, old-style Extension-list
names, JAR file base names, Maven artifacts, etc.

We generally apply name restrictions on a per-loader basis, i.e. it's up 
to the loader to understand names, and if you have (for example) a 
filesystem-backed module loader and you look in it for a module with a 
name that cannot be mapped to the filesystem, you're not going to find 
anything as the invalid name will be ignored (but otherwise there's no 
harm from trying it).

So in practice our filesystem names tend to be package-ish (though we 
allow '-' in this case as well), and that loader does in fact use a 
.-to-/ mapping as well as forbidding potential filesystem separators 
(because they are specifically relevant to how the module is located on 
the filesystem).  But our container module names are considerably looser 
and could contain any valid UTF-8 text, and in fact in some cases 
consist of a literal filesystem path name (which in turn generally has 
few restrictions).
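
The per-loader behavior described above can be sketched roughly as follows. This is not the actual JBoss Modules code; the class and method names are hypothetical, and treating only `/` and `\` as filesystem separators is an assumption made to keep the example platform-independent. A name that cannot be mapped simply yields no result, mirroring the "you're not going to find anything" behavior.

```java
import java.util.Optional;

public class FsNameMapping {
    // Hypothetical sketch: map a module name to a relative directory path
    // using a .-to-/ mapping. Names that contain a filesystem separator,
    // or that have empty segments, are treated as unmappable.
    static Optional<String> toRelativePath(String moduleName) {
        if (moduleName.isEmpty()) {
            return Optional.empty();
        }
        // Assumption: only '/' and '\' count as separators here.
        if (moduleName.indexOf('/') >= 0 || moduleName.indexOf('\\') >= 0) {
            return Optional.empty();
        }
        for (String segment : moduleName.split("\\.", -1)) {
            if (segment.isEmpty()) {
                return Optional.empty();
            }
        }
        // '-' and other non-separator characters pass through unchanged.
        return Optional.of(moduleName.replace('.', '/'));
    }

    public static void main(String[] args) {
        // Sample names are invented for illustration only.
        System.out.println(toRelativePath("org.example.my-module"));
        System.out.println(toRelativePath("/opt/apps/foo.jar"));
    }
}
```

A container-level loader, by contrast, would skip this mapping entirely and look names up as opaque strings.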

>>> I don't expect Java EE modules to map directly to Java SE modules, nor
>>> do any of the Java EE spec leads with whom I've discussed this issue.
>>> EE modules and SE modules are completely different kinds of things.
>>
>> On March 11 of this year (and other occasions) I specifically asked
>> exactly that, and you said [1] "Of course we have that expectation --
>> that's why the requirements include an entire section on dynamic
>> configuration".  Did I misunderstand you then or now, or has something
>> changed?
>
> I don't think anything has changed.
>
> I didn't mean for that statement to imply that I thought that every EE
> module would map directly to an SE module, though I see how it can be
> read that way.  Apologies for the confusion.

OK, that's definitely a relief... but I do think we need to be very 
clear about exactly what the expectations are in terms of EE 9 support 
for modules, with the participation of that expert group, so that we 
don't get blindsided if/when that spec is being prepared.  Is there some 
way we can (reasonably quickly) come to a public agreement with the Java 
EE 9 expert group on this topic?  I realize this is rapidly becoming a 
tangent, so maybe this should be spun off into a separate discussion thread.

-- 
- DML


More information about the jpms-spec-observers mailing list