How to name modules, automatic and otherwise
Robert Scholte
rfscholte at apache.org
Mon Feb 20 17:50:38 UTC 2017
On Fri, 17 Feb 2017 11:45:14 +0100, Remi Forax <forax at univ-mlv.fr> wrote:
> I agree that
> - we should not bake the Maven way of specifying identifiers into the
> language.
> - we should not drop automatic modules, this is a practical way to do
> the migration if you are maintaining an application.
>
> On the addition of a Module-Name attribute, I can understand that it's a
> simple way to reserve a name. I will agree to introduce that in the spec
> only if Robert and Brian find that an automatic module with a
> Module-Name attribute is something that can be pushed to Maven Central.
> Otherwise, specifying a Module-Name is not very different from writing a
> module-info.java with everything exported (I see the fact that a regular
> module cannot access the classpath, unlike an automatic module, as a
> bonus).
This looks fine to me. It closes the gap where you want to give your
project a module name for others to use, but cannot write a module-info
file because it would depend on automodules.
I'm investigating some ideas that should prevent projects that reference
automodules from being published to Central.
>
> I fully disagree with the idea of using short names for modules. The Java
> community is too big for this kind of shortcut, the reverse DNS prefix
> has served us well, and counting the number of keystrokes is not a good
> metric.
> I spent some time with Herve Boutemy last Tuesday night (and early
> Wednesday) discussing interop with existing modules in Maven Central.
> The Maven group id is something you have to ask for before being able to
> publish on Maven Central, so it's more a Maven Central group id than a
> Maven group id. We should recommend that existing modules on Maven
> Central use a name that starts with the Maven Central group id.
> For example, Google's Guava should be com.google.guava or
> com.google.guava.core (or anything else that starts with
> com.google.guava).
>
> This will solve the name-collision issues between artifacts on Maven
> Central at least.
From JPMS I understand that in the end this is simply an identifier. What
it looks like doesn't matter, as long as it resolves uniquely on the
module path.
There is a kind of trust involved: developers should pick their module
name wisely, otherwise their project won't work in a modular world.
Classes have the same issue, but there collisions could be worked around
with class path ordering or package relocation; that's not possible with
modules.
Choosing the right name will be very important, and it had better not
match only the artifactId.
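
To illustrate why a name that matches only the artifactId is fragile,
here is a minimal sketch of a uniqueness check over a set of artifacts.
It is not part of JPMS or Maven; the class name, the method name and the
`com.example.*` coordinates are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ModuleNameCollisions {

    /**
     * Groups artifacts by the module name they would claim and keeps only
     * the names claimed by more than one artifact.
     * Keys of the input are artifact coordinates, values are module names.
     */
    public static Map<String, List<String>> collisions(Map<String, String> moduleNameByArtifact) {
        Map<String, List<String>> byName = new TreeMap<>();
        for (Map.Entry<String, String> e : moduleNameByArtifact.entrySet()) {
            byName.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        // A name claimed by a single artifact resolves uniquely, so drop it.
        byName.values().removeIf(artifacts -> artifacts.size() < 2);
        return byName;
    }

    public static void main(String[] args) {
        // Hypothetical artifacts: two unrelated projects both named "core".
        Map<String, String> candidates = new TreeMap<>();
        candidates.put("com.example.a:core:1.0", "core");
        candidates.put("com.example.b:core:2.3", "core");
        candidates.put("com.google.guava:guava:21.0", "com.google.guava");

        // prints {core=[com.example.a:core:1.0, com.example.b:core:2.3]}
        System.out.println(collisions(candidates));
    }
}
```

With a name that starts with the group id, as Rémi suggests for Guava,
the third artifact stays out of the collision list.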
thanks,
Robert
>
> Rémi
>
> ----- Original Message -----
>> From: "mark reinhold" <mark.reinhold at oracle.com>
>> To: jpms-spec-experts at openjdk.java.net
>> Sent: Thursday, 16 February 2017 17:48:27
>> Subject: How to name modules, automatic and otherwise
>
>> This note is in reply to the concerns about automatic modules raised by
>> Robert Scholte and Brian Fox [1], and by Stephen Colebourne and others
>> [2]. I've collected my conclusions here rather than in separate
>> messages because there are several distinct yet intertwined issues.
>>
>> Summary:
>>
>> - Module names should not include Maven group identifiers, because
>> modules are more abstract than the artifacts that define them.
>>
>> - Module names should use the reverse-domain-name-prefix convention
>> or, preferably, the project-name-prefix convention.
>>
>> - We should not abandon automatic modules, since they are a key tool
>> for migration and adoption.
>>
>> - We can address the problems of automatic modules with two fairly
>> minor technical enhancements.
>>
>> If any of these points strikes you as controversial, please read on!
>>
>> * * *
>>
>> Module names should not include Maven group identifiers, as Robert
>> Scholte and Brian Fox suggest [1], even for modules declared explicitly
>> in `module-info.java` files. Modules in JPMS are a construct of the
>> Java programming language, implemented in both the compiler and the
>> virtual machine. As such, they are more abstract entities than the
>> artifacts that define them. This distinction is useful, both
>> conceptually and practically, hence module names should remain more
>> abstract.
>>
>> This distinction is useful conceptually because it makes it easier, as
>> we read source code, to think clearly about the nature of a module. We
>> can reason about a module's dependences, exports, services, and so forth
>> without cluttering our minds with the details of group identifiers and
>> version constraints. Today, e.g., we can write, and read:
>>
>>     module foo.data {
>>         exports com.bar.foo.data;
>>         requires hibernate.core;
>>         requires hibernate.jcache;
>>         requires hibernate.validator;
>>     }
>>
>> If we were to extend the syntax of module names to include group
>> identifiers, and encourage people to use them, then we'd be faced with
>> something much more verbose:
>>
>>     module com.bar:foo.data {
>>         exports com.bar.foo.data;
>>         requires org.hibernate:hibernate.core;
>>         requires org.hibernate:hibernate.jcache;
>>         requires org.hibernate:hibernate.validator;
>>     }
>>
>> Group identifiers make perfect sense in the context of a build system
>> such as Maven, where they bring necessary structure to the names of the
>> millions of artifacts available across different repositories. Such
>> structure is superfluous and distracting in the context of a module
>> system, where the number of relevant modules in any particular situation
>> is more likely to be in the tens, or hundreds, or (rarely) thousands.
>> All else being equal, simpler names are better.
>>
>> At a practical level, the distinction between modules and artifacts is
>> useful because it leaves the entire problem of artifact selection to the
>> build system. This allows us to switch from one artifact to another
>> simply by editing a `pom.xml` file to adjust a version constraint or a
>> group identifier; if module names included group identifiers then we'd
>> also have to edit the `module-info.java` file. This flexibility can be
>> helpful if, e.g., a project is forked and a new module with the same
>> name and artifact identifier is published under a different group
>> identifier.
>> We long ago decided not to do version selection in the module system,
>> which surprised some people but has worked out fairly well. We should
>> treat group selection in the same manner.
>>
>> Another practical benefit of the module/artifact distinction is that it
>> keeps the module system independent of any particular build system, so
>> that build systems can continue to improve and evolve independently over
>> time. Maven-style coordinates are the most popular way to name
>> artifacts in repositories today, but that might not be true ten years
>> from now. It would be unwise to adopt Maven's naming convention for
>> module names just because it's popular now, and doubly so to bake
>> Maven's group-identifier concept into the Java programming language.
>>
>> * * *
>>
>> If module names don't include group identifiers, then how should modules
>> be named? What advice should we give to someone who's creating a new
>> module from scratch, or modularizing an existing component by writing a
>> `module-info.java` file for it? (Continue to set aside, for the moment,
>> the problems of automatic modules.)
>>
>> In structuring any particular space of names we must balance (at least)
>> three fundamental tensions: We want names that are long enough to be
>> descriptive, short enough to be memorable, and unique enough to avoid
>> needless conflicts.
>>
>> If you control all of the modules upon which your module depends, and
>> all of the modules that depend upon it, then you can of course name your
>> module whatever you want, and change its name at any time. If, however,
>> you're going to publish your module for use by others -- whether just
>> within your own organization or to a global repository such as Maven
>> Central -- then you should take more care. There are two well-known
>> ways to go about this.
>>
>> - Choose module names that start with the reversed form of an Internet
>> domain name that you control, or are at least associated with. The
>> Java Language Specification has long suggested this convention as a
>> way to minimize conflicts amongst package names, and it has been
>> widely though not universally adopted for that purpose.
>>
>> - Choose module names that start with the name of your project or
>> product. Module (and package) names that start with reversed domain
>> names are less likely to conflict but they're unnecessarily verbose,
>> they start with the least-important information (e.g., `com`, `org`,
>> or `net`), and they don't read well after exogenous changes such as
>> open-source donations or corporate acquisitions (e.g., `com.sun.*`).
>>
>> The reversed domain-name approach was sensible in the early days of
>> Java, before we had development tools sophisticated enough to help us
>> deal with the occasional conflict. We have such tools now, so going
>> forward the superior readability of short module and package names
>> that start with project or product names is preferable to the onerous
>> verbosity of those that start with reversed domain names.
>>
>> This advice will strike some readers as controversial. I respect those
>> who will choose, for the sake of tradition or an abundance of caution,
>> to use the reversed domain-name convention for module names and
>> continue to use that convention for package names. I do know, however,
>> of at least one major, well-known project whose developers intend to
>> adopt the project-name-prefix convention for their module names.
>>
>> * * *
>>
>> If module names don't include group identifiers, then how should
>> automatic modules be named? Or are automatic modules so troublesome
>> that we should remove them from the design?
>>
>> To answer the second question first: It would be a tragic shame to drop
>> automatic modules, since otherwise top-down migration is impossible if
>> you're not willing to modify artifacts that you don't maintain, which
>> most people (quite sensibly) aren't. Even if you limit your use of
>> automatic modules to closed systems, as Stephen Colebourne suggests [2],
>> they're still of significant value. Let's see if we can rescue them.
>>
>> The present algorithm for naming automatic modules has two problems:
>>
>> (A) Conflicts are possible across large artifact repositories, since
>> the name of an automatic module is computed from the name of the
>> artifact that defines it. [1]
>>
>> (B) It's risky to publish a module that `requires` some other module
>> that has not yet been modularized, and hence must be used as an
>> automatic module. If the maintainer of that module later chooses
>> an explicit name different from the automatic name then you must
>> publish a new version of your module with an updated `requires`
>> directive. [2]
>>
>> As to (A), yes, conflicts exist, though it's worth observing that many
>> of the conflicts in the Maven Central data are due to poorly-chosen
>> artifact names: `parent`, `library`, `core`, and `common` top the
>> list, which then falls off in a long-tail distribution. When conflicts
>> are detected then build tools can rename artifacts either
>> automatically or, preferably, to user-specified names that map to
>> sensible automatic-module names. If renaming artifacts in the
>> filesystem proves impractical then we could extend the syntax of the
>> `--module-path` option to allow a module name to be specified for each
>> specifically-named artifact, though strictly speaking that would be a
>> feature of the JDK rather than JPMS.
>>
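
For reference, the way an automatic module name falls out of an
artifact's file name can be sketched roughly as follows. This is an
approximation of the rule documented for
`java.lang.module.ModuleFinder.of`, not the JDK's own code, and the
exact rule may differ between JDK builds. The class and method names are
made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AutomaticModuleNames {

    // A hyphen followed by digits and then a dot or end of input is
    // treated as the start of the version part of the file name.
    private static final Pattern VERSION_START = Pattern.compile("-(\\d+(\\.|$))");

    /** Derives a module name from a JAR file name (sketch, see assumptions above). */
    public static String deriveName(String jarFileName) {
        // 1. Drop the ".jar" suffix if present.
        String name = jarFileName.endsWith(".jar")
                ? jarFileName.substring(0, jarFileName.length() - 4)
                : jarFileName;

        // 2. Cut everything from the first version-like suffix onwards.
        Matcher m = VERSION_START.matcher(name);
        if (m.find()) {
            name = name.substring(0, m.start());
        }

        // 3. Non-alphanumerics become dots, runs of dots collapse to one,
        //    leading and trailing dots are removed.
        return name.replaceAll("[^A-Za-z0-9]", ".")
                   .replaceAll("\\.{2,}", ".")
                   .replaceAll("^\\.|\\.$", "");
    }

    public static void main(String[] args) {
        System.out.println(deriveName("hibernate-core-5.2.8.Final.jar")); // prints hibernate.core
        System.out.println(deriveName("guava-21.0.jar"));                 // prints guava
        System.out.println(deriveName("core-1.0.jar"));                   // prints core
    }
}
```

The last line shows problem (A) directly: any artifact called
`core-<version>.jar` ends up as module `core`, whatever its group
identifier is.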
>> We can address (B) by enabling the maintainers of existing components to
>> specify the module names that should be given to their components when
>> used as automatic modules, without having to write `module-info.java`
>> files. This can be done very simply, with a single new JAR-file
>> manifest `Module-Name` attribute, as first suggested almost a year ago
>> [3].
>>
>> If we add this one feature then the maintainer of an existing component
>> that, e.g., must still build and run on JDK 7 can choose a module name
>> for that component, record it in the manifest by adding a few lines to
>> the `pom.xml` file, and tell users that they can use it as an automatic
>> module on JDK 9 without fear that the module name will change when the
>> component is properly modularized some years from now. The actual
>> change to the component is small and low-risk, so it can reasonably be
>> done in a patch release. There's no need to write a `module-info.java`
>> file, and in fact doing so may be inadvisable at this point if the
>> component depends on other components that have not yet been given
>> module names.
>>
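
A tool that wants to honour such an attribute could read it along these
lines. This is a minimal sketch using `java.util.jar.Manifest`; the
attribute name `Module-Name` is taken from the proposal as written in
this thread, so check which name the JDK release you target actually
recognizes. The class and method names are made up for illustration.

```java
import java.util.Optional;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class ManifestModuleName {

    // Attribute name as proposed in this thread (an assumption, see above).
    static final Attributes.Name MODULE_NAME = new Attributes.Name("Module-Name");

    /** Returns the module name declared in the main manifest section, if any. */
    public static Optional<String> declaredName(Manifest manifest) {
        return Optional.ofNullable(manifest.getMainAttributes().getValue(MODULE_NAME));
    }

    public static void main(String[] args) {
        // Build a manifest in memory instead of opening a real JAR file.
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        manifest.getMainAttributes().put(MODULE_NAME, "com.google.guava");

        System.out.println(declaredName(manifest).orElse("(none)"));       // prints com.google.guava
        System.out.println(declaredName(new Manifest()).orElse("(none)")); // prints (none)
    }
}
```

When the attribute is absent, the caller would fall back to the name
computed from the artifact's file name.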
>> This approach for (B) does add one more (optional) step to the migration
>> path, but it will hopefully lead to a larger number of explicitly-named
>> modules in the world -- and in particular in Maven Central -- sooner
>> rather than later.
>>
>> - Mark
>>
>>
>> [1] http://mail.openjdk.java.net/pipermail/jpms-spec-experts/2017-January/000537.html
>> [2] http://mail.openjdk.java.net/pipermail/jigsaw-dev/2017-January/011106.html
>> [3] http://openjdk.java.net/projects/jigsaw/spec/issues/#ModuleNameInManifest