Proposal: #ModuleNameCharacters
mark.reinhold at oracle.com
Wed Nov 30 00:08:02 UTC 2016

2016/11/22 13:05:13 -0800, david.lloyd at redhat.com:
> On 11/22/2016 10:49 AM, mark.reinhold at oracle.com wrote:
>> Proposal
>> --------
>>
>> Make no changes here.
>
> TL;DR: we can't accept this proposal as-is. Expounding more below.
>
>> Modules are a new construct of the Java programming language in the
>> present design. In the source language they are hence identified by
>> qualified names [2] in the same manner as the existing structural
>> constructs, i.e., packages and classes. As such these names do allow
>> some unusual characters, though not hyphens or slashes [3].
>>
>> Module names in compiled module-declaration class files are recorded in
>> `CONSTANT_Utf8_info` structures, but with additional constraints.
>
> I believe that according to the JVM spec (until now anyway), this field
> type by itself has no constraints at all, beyond being a valid
> "modified" UTF-8 string.

It's true that a CONSTANT_Utf8_info can, as such, contain arbitrary UTF-8
text. The additional restrictions I listed are the same as those already
in place for other kinds of qualified names, i.e., the names of packages
and classes. A module name is, after all, just another kind of qualified
name in the present design.

>> They replace periods (`'.'`) with forward slashes (`'/'`), and disallow
>> periods, semicolons (`';'`), and left square brackets (`'['`) [4].
>
> These name manglings are really just plain weird in this context, and
> are clearly an implementation artifact. How did we arrive at this
> place? I feel like it springs from the long-maligned conflation of
> module descriptors with class files. But modules are not types and
> AFAICT these restrictions really make no sense from any other
> perspective than one of implementation.

These restrictions make perfect sense from the perspective of tools that
manipulate class files, which likely already assume that these
restrictions are observed uniformly for all kinds of qualified names.
I'm reluctant to violate that expectation without good reason.

These restrictions have, moreover, ensured a useful degree of freedom for
the evolution of the platform over the last twenty years. I'm reluctant
to give that up without good reason.
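
As a rough illustration of the class-file rules quoted above, the
following sketch converts a source-level qualified name to the
slash-separated form and checks a name in that form for the disallowed
characters. The class and method names are hypothetical and not part of
any JDK API, and rejecting the empty string is an added assumption; the
sketch mirrors only the rules as stated in this thread, not the full
JVMS text.

```java
public class InternalNames {

    // Convert a source-level qualified name such as "com.example.app" to
    // the slash-separated class-file form by replacing '.' with '/'.
    static String toInternalForm(String qualifiedName) {
        return qualifiedName.replace('.', '/');
    }

    // Check the restrictions quoted above on a name in class-file form:
    // no periods, semicolons, or left square brackets.
    // Assumption: an empty name is also treated as invalid.
    static boolean isValidInternalForm(String internalName) {
        if (internalName.isEmpty()) {
            return false;
        }
        for (int i = 0; i < internalName.length(); i++) {
            char c = internalName.charAt(i);
            if (c == '.' || c == ';' || c == '[') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String internal = toInternalForm("com.example.app");
        System.out.println(internal);                                 // com/example/app
        System.out.println(isValidInternalForm(internal));            // true
        System.out.println(isValidInternalForm("com/example;app"));   // false
    }
}
```

A tool that already applies a check of this kind to package and class
names could apply it unchanged to module names, which is the uniformity
referred to above.
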
> ...
>
>> The present design is, then, consistent with the existing treatment of
>> qualified names in the language, in class files, and in the Java SE API.
>
> I do not believe that any of these statements is sufficient to justify
> the constraint... more below.
>
>> A different module system with a more-flexible naming scheme can easily
>> refer to JPMS modules, per the agreed interoperation requirement [5].
>> The requirements do not mandate bidirectional interoperation, which for
>> this issue would mean that JPMS modules must be able to refer to non-JPMS
>> modules with non-JPMS names.
>
> This is also not sufficient to justify the constraint, though it does
> help to explain the reasoning why the constraint existed in the first place.

There is no agreed requirement for bidirectional interoperation, nor any
other agreed requirement that mandates that module names be arbitrary
strings. There is thus no need, per the requirements, to support such
module names.

In the absence of a requirement to support arbitrary module names I've
chosen in the present design to be consistent with the other kinds of
qualified names already used in the language. (I acknowledge that you
don't think that modules should be a language construct in the first
place, but that's not a decision that I intend to revisit.) This will
be the least surprising choice for the vast majority of developers who
will use this module system.
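
A minimal sketch of what this consistency means at the source level,
assuming a module name is checked as a dot-separated sequence of Java
identifiers. The class and method names are hypothetical, and the check
is simplified: it does not reject reserved words, which a real compiler
would.

```java
public class QualifiedNames {

    // Return true if the name is a dot-separated sequence of Java identifiers.
    // Simplification: reserved words such as "class" are not rejected here.
    static boolean isQualifiedName(String name) {
        // Limit -1 keeps trailing empty segments, so "a." is rejected below.
        for (String segment : name.split("\\.", -1)) {
            if (segment.isEmpty()) {
                return false;
            }
            int cp = segment.codePointAt(0);
            if (!Character.isJavaIdentifierStart(cp)) {
                return false;
            }
            // Walk the rest of the segment by code point, not by char.
            for (int i = Character.charCount(cp); i < segment.length();
                    i += Character.charCount(cp)) {
                cp = segment.codePointAt(i);
                if (!Character.isJavaIdentifierPart(cp)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isQualifiedName("com.example.app"));        // true
        System.out.println(isQualifiedName("com.example.$odd_name"));  // true
        System.out.println(isQualifiedName("my-module"));              // false
        System.out.println(isQualifiedName("my/module"));              // false
    }
}
```

Names such as `$odd_name` pass because `$` and `_` are legal identifier
characters, while hyphens and slashes are not, so a hyphenated name
would need some form of mangling or quoting.
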
>> To support that would add significant
>> complexity to this specification and its implementations.
>
> I am sympathetic to this, but I think it needs more discussion,
> particularly as the proposed complexities have not been explained.

I'm concerned mostly about the complexity of the language specification,
which affects all users of this module system. If we change the language
of module declarations to allow arbitrary module names, possibly via some
sort of quoting scheme, then that's something that every developer would
have to understand even though it would most likely be of benefit to few.

I'm concerned also, though to a lesser degree, about the complexity of
the class-file specification (i.e., the JVMS), the long-term evolvability
of that specification, and the complexity of code that reads and writes
class files, though the latter is second-order since it mostly affects
maintainers of IDEs, compilers, and other kinds of tools rather than
developers in general. If there's a compelling reason to lift the usual
restrictions on the representation of qualified names in class files, at
least in the case of module names, then I'd like to hear it.

> ...
>
>
> But the other implicit requirement that we have is to ensure that it's
> possible to adapt Java EE in some reasonable manner. It's my belief
> that in order to do so we need to be able to create modules with names
> that match the current constraints for Java EE module names (i.e.
> effectively no constraint, just valid UTF-8). Name mangling to
> accommodate this (which would be our only other option) is unnecessarily
> user-unfriendly at best, and outright incompatible at worst.

I don't expect Java EE modules to map directly to Java SE modules, nor
do any of the Java EE spec leads with whom I've discussed this issue.
EE modules and SE modules are completely different kinds of things.

- Mark