Proposal: #ModuleNameCharacters (revised)

Remi Forax forax at univ-mlv.fr
Sat Dec 10 07:50:39 UTC 2016


No, I hope it's rather that ModuleDescriptor will become an interface,
so that we can have our own module descriptor builder.

Rémi 



On December 9, 2016 11:05:02 PM GMT+01:00, "David M. Lloyd" <david.lloyd at redhat.com> wrote:
>Whoops, hang on... one problem I didn't spot on my first read-through:
>
>>> We will therefore retain the present constraints on module names in the
>>> source language and also continue to enforce those constraints in the
>>> `ModuleDescriptor.Builder` API, which is intended to be consistent with
>>> the language.  (The `ModuleDescriptor` API will continue to be able to
>>> read class files that contain module names not expressible in the source
>>> language.)
>
>So... essentially a custom module system has to generate binary 
>descriptors?  That's going to be a real pain.
>
>
>On 12/09/2016 03:48 PM, David M. Lloyd wrote:
>> +1 here
>>
>> On 12/09/2016 03:45 PM, mark.reinhold at oracle.com wrote:
>>> Issue summary
>>> -------------
>>>
>>>   #ModuleNameCharacters --- Module names are presently constrained to
>>>   be Java identifiers.  Some existing module systems allow additional
>>>   characters in module names, such as hyphens and slashes.  Should this
>>>   restriction be lifted or, perhaps, should it somehow be made
>>>   layer-specific? [1]
>>>
>>> Proposal
>>> --------
>>>
>>> Do not change the treatment of module names in source code; they will
>>> remain qualified names.  Revise the encoding of module names in compiled
>>> module-declaration class files to lift the current constraints but adopt
>>> new, less onerous constraints that still provide for the future evolution
>>> of the platform.  Revise the format of class files to structure module
>>> and package names in a manner consistent with that already used for other
>>> kinds of constrained names.
>>>
>>>                                   * * *
>>>
>>> Modules are a new construct of the Java programming language in the
>>> present design.  In the source language they are hence identified by
>>> qualified names [2] in the same manner as the existing structural
>>> constructs, i.e., packages and classes.  As such these names do allow
>>> some unusual characters, though not hyphens or slashes [3].
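
To make the source-level rule concrete, here is a minimal sketch. It
assumes that `javax.lang.model.SourceVersion.isName` is an acceptable
stand-in for the qualified-name rules cited in [2] and [3]; the class
name `QualifiedNameCheck` and the sample names are invented for
illustration only.

```java
import javax.lang.model.SourceVersion;

final class QualifiedNameCheck {
    private QualifiedNameCheck() {}

    /** True if the name is a syntactically valid qualified name in source code. */
    static boolean isQualifiedName(String name) {
        // isName() checks each dot-separated segment against the identifier
        // rules and also rejects keywords and the null/boolean literals.
        return SourceVersion.isName(name);
    }

    public static void main(String[] args) {
        // Hypothetical sample names, not taken from any real module system.
        String[] samples = { "com.example.app", "com.example.$util", "my-module", "org/example" };
        for (String s : samples) {
            System.out.println(s + " -> " + isQualifiedName(s));
        }
    }
}
```

A dollar sign is one of the "unusual" characters that passes, while the
hyphen and the slash are rejected.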
>>>
>>> In the very long term a future version of the language may well support
>>> not just the declaration of modules, and of relationships between them,
>>> but also the expression of operations upon them as is possible in, e.g.,
>>> Standard ML [4], or qualified references in code to a type in some other
>>> named module, or yet some other kind of use that we do not imagine today.
>>> It would hence be unwise at this point to allow module names in source
>>> code to be any different in nature than the other kinds of qualified
>>> names already in the language.
>>>
>>> We will therefore retain the present constraints on module names in the
>>> source language and also continue to enforce those constraints in the
>>> `ModuleDescriptor.Builder` API, which is intended to be consistent with
>>> the language.  (The `ModuleDescriptor` API will continue to be able to
>>> read class files that contain module names not expressible in the source
>>> language.)
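
A small sketch of what "enforce those constraints in the builder" means
in practice. It is written against the API as it later shipped in
Java 9 (`ModuleDescriptor.newModule`), which postdates this thread and
may differ from the builder entry point in the draft under discussion.
The helper `builderAccepts` and the sample names are hypothetical.

```java
import java.lang.module.ModuleDescriptor;

final class BuilderNameCheck {
    private BuilderNameCheck() {}

    /** True if the builder accepts the name, false if it rejects it. */
    static boolean builderAccepts(String name) {
        try {
            // newModule() validates the name eagerly and throws
            // IllegalArgumentException for names that are not legal
            // in the source language.
            ModuleDescriptor.newModule(name).build();
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(builderAccepts("com.example.app"));
        System.out.println(builderAccepts("my-module"));
    }
}
```

A module system whose names contain hyphens or slashes cannot go through
this builder, which is the pain point raised in the reply above.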
>>>
>>>                                   * * *
>>>
>>> Module names in compiled module-declaration class files are presently
>>> encoded in the internal form traditionally used for qualified names:
>>> Periods (`.`) are replaced with forward slashes (`/`), and periods,
>>> semicolons (`;`), and left square brackets (`[`) are forbidden [5].
>>> This encoding is inconvenient for other module systems that may
>>> interoperate with JPMS, so we will abandon it for module names despite
>>> the fact that doing so will increase the complexity of any code that
>>> parses class files.
>>>
>>> To allow for the future evolution of the platform we propose a different,
>>> less onerous encoding of module names in class files:
>>>
>>>   - If at some future point we find that we need to add structure to
>>>     module names, or combine module names with qualified type names,
>>>     then the `:` character would be a good candidate, even in the
>>>     source language if need be, so we reserve that character now.
>>>
>>>   - We presently use `@` in the API to separate module names from
>>>     version strings, where available, so it is prudent to reserve
>>>     that character in module names in class files also, just in case
>>>     we someday decide to introduce compound module identifiers that
>>>     combine module names with version strings.
>>>
>>>   - In further support of interoperation we will reserve the universal
>>>     escape character (`\`) and define the sequences `\\`, `\:`, and
>>>     `\@` to stand for `\`, `:`, and `@`, respectively.
>>>
>>>   - We will finally, for sanity, forbid any character whose Unicode code
>>>     point is less than 0x20 (` `).  (Ideally we'd forbid all Unicode
>>>     non-printing characters, but it's best not to have the JVMS depend
>>>     too deeply upon details of the Unicode specification.)
>>>
>>> To sum up: In module names in class files reserve `:` and `@` for future
>>> use; reserve `\` as an escape character and use it to quote itself, `:`,
>>> and `@`; and forbid the non-printing ASCII characters (< 0x20).
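
The summary above can be sketched as a pair of encode/decode helpers.
This is only an illustration of the proposed rules as I read them, not
code from the JDK; the class `ModuleNameCodec` and its method names are
hypothetical.

```java
final class ModuleNameCodec {
    private ModuleNameCodec() {}

    /** Turns a raw module name into the escaped form proposed for class files. */
    static String encode(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 4);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c < 0x20) {
                throw new IllegalArgumentException("control character at index " + i);
            }
            // The escape character and the two reserved characters get a '\' prefix.
            if (c == '\\' || c == ':' || c == '@') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Validates an escaped name from a class file and returns the raw name. */
    static String decode(String encoded) {
        StringBuilder sb = new StringBuilder(encoded.length());
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c < 0x20) {
                throw new IllegalArgumentException("control character at index " + i);
            }
            if (c == ':' || c == '@') {
                throw new IllegalArgumentException("unescaped reserved character at index " + i);
            }
            if (c == '\\') {
                if (++i == encoded.length()) {
                    throw new IllegalArgumentException("dangling escape at end of name");
                }
                char next = encoded.charAt(i);
                // Only the three sequences defined in the proposal are accepted.
                if (next != '\\' && next != ':' && next != '@') {
                    throw new IllegalArgumentException("undefined escape at index " + (i - 1));
                }
                sb.append(next);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Under these rules hyphens and slashes pass through unchanged, so a name
such as `my-module/core` needs no escaping at all.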
>>>
>>>                                   * * *
>>>
>>> The first version of this proposal [6] claimed that the present design is
>>> consistent with the existing treatment of qualified names in class files.
>>> That is, in fact, not true, since qualified names in class files today
>>> are always wrapped in tagged constant-pool structures rather than simple
>>> `CONSTANT_Utf8_info` structures.  Class names, e.g., are wrapped in
>>> `CONSTANT_Class_info` structures, which in turn reference the `Utf8`
>>> structures that represent the actual class names [7].
>>>
>>> To address this inconsistency, and particularly in light of the new
>>> encoding of module names described above, we propose to use consistent
>>> kinds of class-file structures for module and package names.
>>>
>>> Module names in a compiled module-declaration class file will be encoded
>>> as above and wrapped in tagged `CONSTANT_Module_info` structures:
>>>
>>>     CONSTANT_Module_info {
>>>         u1 tag;                 // == CONSTANT_Module == 19
>>>         u2 name_index;          // Index of a CONSTANT_Utf8_info
>>>     }
>>>
>>> Package names in class files will be encoded in the traditional internal
>>> form and wrapped in tagged `CONSTANT_Package_info` structures:
>>>
>>>     CONSTANT_Package_info {
>>>         u1 tag;                 // == CONSTANT_Package == 20
>>>         u2 name_index;          // Index of a CONSTANT_Utf8_info
>>>     }
>>>
>>> Existing references in the class-file format to module and package names
>>> will be adjusted to refer to these new kinds of tagged structures.
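
For anyone writing a class-file parser, reading the two new entries
could look roughly like this. It is a sketch that assumes the layout
and tag values given above (19 and 20) and the usual big-endian
class-file encoding; the class `NameRefEntry` and its members are
hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class NameRefEntry {
    static final int CONSTANT_MODULE = 19;   // tag from the proposal
    static final int CONSTANT_PACKAGE = 20;  // tag from the proposal

    final int tag;        // u1 tag
    final int nameIndex;  // u2 name_index, points at a CONSTANT_Utf8_info

    private NameRefEntry(int tag, int nameIndex) {
        this.tag = tag;
        this.nameIndex = nameIndex;
    }

    /** Reads one entry: a one-byte tag followed by a big-endian u2 index. */
    static NameRefEntry read(DataInputStream in) throws IOException {
        int tag = in.readUnsignedByte();
        if (tag != CONSTANT_MODULE && tag != CONSTANT_PACKAGE) {
            throw new IOException("not a module or package entry, tag=" + tag);
        }
        return new NameRefEntry(tag, in.readUnsignedShort());
    }

    /** Convenience wrapper for an in-memory entry. */
    static NameRefEntry parse(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean isModule() {
        return tag == CONSTANT_MODULE;
    }
}
```

The referenced `Utf8` entry would then be decoded with the module-name
rules for tag 19 and with the traditional internal form for tag 20.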
>>>
>>>
>>> [1] http://openjdk.java.net/projects/jigsaw/spec/issues/#ModuleNameCharacters
>>> [2] http://docs.oracle.com/javase/specs/jls/se8/html/jls-6.html#jls-6.2
>>> [3] http://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html#jls-3.8
>>> [4] https://en.wikipedia.org/wiki/Standard_ML#Module_system
>>> [5] http://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.2.1
>>> [6] http://mail.openjdk.java.net/pipermail/jpms-spec-experts/2016-November/000468.html
>>> [7] http://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.4.1
>>>
>>
>
>-- 
>- DML



More information about the jpms-spec-experts mailing list