Advice + proposals regarding automodule naming

Robert Scholte rfscholte at
Wed Jan 18 21:14:33 UTC 2017

Hi Rémi,

I'm getting a JavaOne 2015 déjà vu :)

It seems like you expect there will be a new POM definition to support
this kind of extra information.
The current POM modelVersion (4.0.0) is not only used by Maven but by a  
lot of tools, probably even more than we know of. We wonder if they do XSD  
checking, so we must be very, very careful with every adjustment. So  
pom-4.0.0 is a fact with all its restrictions. We are working on pom-5.0.0  
but we will always make sure there will also be a pom-4.0.0 available  
(either pre-generated or runtime transformed) for the current tools. Also,  
its definition should work for any software technology, not just for Java.
In the beginning I had the idea of working with new scopes to decide if a
dependency belongs to the modulepath or classpath, but there's a strict
set of scopes in pom-4.0.0, so again no option. And by now I know this is
not required: the info is already there once I can read all module-info
files. It would have helped if a modular jar had a different extension, so
everyone could see from the *outside* what kind of jar it is.

There's no such thing as a Maven4 artifact: any artifact is a file (often
a jar) with a coordinate and an extra file with dependency declarations.
During dependency resolution all build-information is ignored! The problem
with the module-info file is comparable with the Java bytecode version:
you have to go into the jar to get this information.

At the moment I'm pretty far with the maven-compiler-plugin, but now every  
dependency acts like an automodule. My next step would probably be to  
analyze every module-info file and decide if jars belong to the classpath  
or modulepath, only allowing modular jars on the module path because of  
our concerns.
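
A minimal sketch of that next step, assuming a jar counts as modular when it has a module-info.class entry in its root. The class and method names are hypothetical and are not part of the maven-compiler-plugin; multi-release jars that keep the descriptor under META-INF/versions are not handled here.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

/** Hypothetical helper; not part of Maven or the maven-compiler-plugin. */
final class JarPathSorter {

    /** True if the jar has a module descriptor entry in its root directory. */
    static boolean isModularJar(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getEntry("module-info.class") != null;
        }
    }

    /** Returns two lists: index 0 = module path candidates, index 1 = class path. */
    static List<List<Path>> partition(List<Path> jars) throws IOException {
        List<Path> modulePath = new ArrayList<>();
        List<Path> classPath = new ArrayList<>();
        for (Path jar : jars) {
            // Only jars that carry their own descriptor go on the module path.
            (isModularJar(jar) ? modulePath : classPath).add(jar);
        }
        return List.of(modulePath, classPath);
    }
}
```

With this split, jars without a descriptor never become automodules; they simply stay on the classpath.
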


On Tue, 17 Jan 2017 23:11:11 +0100, <forax at> wrote:

> Robert,
> I fully agree with you that Maven cannot use automatic modules.
> Automatic modules have weird naming rules, export everything and have
> no dependencies themselves*, so they are useless if you already have a
> trove of info like the Maven POM.
> In my opinion, the real question is not how to map existing Maven
> artifacts to Java modules but rather
> how Maven 4 artifacts are mapped to Java modules, and then how to make
> the transition from Maven 3 artifacts to Maven 4 artifacts as smooth
> as possible.
> Here is my take on what a Maven 4 artifact can be:
>  - a Maven 4 artifact can only depend on other Maven 4 artifacts (and
>    there is some way to see a Maven 3 artifact as a Maven 4 artifact if
>    the POM is simple enough),
>  - a Maven 4 artifact does not allow split packages (a lot of Maven 3
>    artifacts use split packages because it's a cool way to do an
>    after-the-fact modularisation without changing the name of the module),
>  - a Maven 4 artifact's info is specified with info extracted from the
>    module-info and from the POM
>    (version is in the POM, exported packages are in the module-info, ...)
>  etc.
> Once you have the precise rules, it will be easier to see how to map a
> Maven 3 artifact to a Maven 4 one and what the compatibility rules are.
> regards,
> Rémi
> * unless you want to play with configurations that mix modulepath and
> classpath, but this kind of configuration is really hard to debug.
> ----- Original Message -----
>> From: "Robert Scholte" <rfscholte at>
>> To: "Remi Forax" <forax at>
>> Cc: jpms-spec-experts at, "Brian Fox" <brianf at>
>> Sent: Tuesday, 17 January 2017 13:04:08
>> Subject: Re: Advice + proposals regarding automodule naming
>> Hi Rémi,
>> In the end every non-jdk.* and non-java.* module in the module-info will
>> be a dependency in your build tool descriptor. Such a module must match
>> exactly one versionless dependency, or conflictId as we call it, which is
>> in general the groupId + artifactId (type and classifier are not relevant
>> for this story).
>> By ignoring the groupId, a module can be referred to by multiple
>> dependencies, so we can expect collisions. For that reason Brian did a
>> quick scan over Maven Central to count the number of duplicate
>> artifactIds.
>> Here are the artifactIds with 100+ groupIds:
>> maven_artifact_id  count(DISTINCT maven_group_id)  count(maven_group_id)
>> library            391                             6854
>> core               312                             8188
>> common             142                             5084
>> ui                 138                             1414
>> In theory I could have a Maven project with 391 'library'-jars on the
>> classpath without any problem. And as long as they are direct
>> dependencies I have control over this by simply not adding 'library' as a
>> requirement to module-info. The issues start when different
>> 'library'-jars are transitive dependencies and when they are marked as
>> required in the module-info file of my direct or transitive dependencies.
>> Developers of the 'library'-jars cannot use library as the module name
>> and are forced to pick another name. As the developer of my project, in
>> the end I decide which versions of dependencies are used. If the
>> 'library'-jar gets a different module name and my dependency is still
>> referring to the old module name, the project can't be built.
>> What I expect is that developers are forced to remove the requirements
>> from their module-info because of the mentioned issues. So instead of
>> increasing, the number of requirements will be reduced. For that reason
>> we say either use a unique module name from the beginning (GA) or wait
>> until a dependency has its own module name before adding it as a
>> requirement.
>> As far as I know this is the first time the JDK/JRE decides (proposes) a
>> name for an entity based on another entity. There are no relations  
>> between
>> method-, class-, or package-names and there doesn't have to be a  
>> relation
>> between the module name and the filename, so please don't try to do so.
>> regards,
>> Robert
>> On Mon, 16 Jan 2017 16:44:03 +0100, Remi Forax <forax at>  
>> wrote:
>>> Hi Robert,
>>> the problem with automatic modules is more general than just the name;
>>> automatic modules also create a flat hierarchy which doesn't map well
>>> to the Maven artifact descriptor.
>>> I wonder why you want Maven to use automatic modules, or said
>>> differently: Maven has a lot of information about the artifact, so why
>>> do you want to forget all this information when fetching a Maven
>>> artifact?
>>> I think that one problem is that you do not want to create a
>>> module-info.class from the Maven POM and insert it into the jar because
>>> it will change the artifact*.
>>> This kind of module is supported by jigsaw under the name of synthetic
>>> modules. A synthetic module is a module with a module descriptor not
>>> created by javac but by another tool.
>>> In my opinion, automatic modules are interesting when you have a jar
>>> that does not come from Maven Central but from an ad-hoc build tool and
>>> will be considered a leaf of the dependency DAG.
>>> Otherwise, for an existing module system, using a synthetic module seems
>>> to be a better idea.
>>> regards,
>>> Rémi
>>> * given you have also the problem of split packages, you also need a  
>>> way
>>> to merge several artifacts into one modular jar because it's the easy
>>> way to solve the split package problem.
>>> ----- Original Message -----
>>>> From: "Robert Scholte" <rfscholte at>
>>>> To: jpms-spec-experts at
>>>> Cc: "Apache Maven Dev" <dev at>
>>>> Sent: Monday, 16 January 2017 10:37:08
>>>> Subject: Advice + proposals regarding automodule naming
>>>> This is a message from Robert Scholte and Brian Fox. We both have been
>>>> talking about this topic for several weeks with other Maven developers
>>>> and came to the conclusion that we should warn the jigsaw team about
>>>> their current approach regarding auto modules. We will share our
>>>> experiences, thoughts and conclusions, and will suggest two proposals.
>>>> Traditionally, the Java ecosystem has been very mature in terms of
>>>> naming and namespacing. The reverse fqdn introduced into the Java
>>>> package was a great choice to ensure classes don’t conflict. Popular
>>>> build tools such as Maven and nearly all those that followed built upon
>>>> this key concept with the introduction of “GroupId”, also using the fqdn
>>>> as part of the name to ensure the coordinates were properly namespaced.
>>>> We’ve seen some ecosystems diverge from this, leading to new challenges
>>>> that ultimately had to be reversed. A great example can be seen in the
>>>> “tragic mistake from npm creators” [1], which was to launch without a
>>>> namespace concept. Eventually, NPM started running out of useful names
>>>> and had to backtrack to introduce “scopes”, which is really just a
>>>> namespace [2]. The real problem here is that the major change in
>>>> namespace was baked in after several years of momentum without it. It’s
>>>> taken a long time for tooling and best practice to catch up to scopes,
>>>> and in the interim people have been left with a dual-mode situation,
>>>> some namespaced, some not, that has created chaos. [3]
>>>> The real issue at hand here as we consider behaviors in the jigsaw
>>>> automodule revolves around two well-studied concepts.
>>>> The most important is the “Default effect” [3], which states that
>>>> whatever the default behavior is will become the most prominent best
>>>> practice. A default that uses a filename to generate a very short,
>>>> un-namespaced module id effectively sets the behavior to create generic
>>>> names that will eventually conflict... exactly what we’ve seen in npm.
>>>> Additionally, the switching costs introduced in moving from a default
>>>> un-namespaced module id to one with a unique namespace are also
>>>> significant once you consider all the potential users. This is why API
>>>> change is hard, and changing the module id after the fact from the
>>>> default is effectively an API change.
>>>> The second principle at hand is the “Principle of least astonishment”.
>>>> We want to find a default that doesn’t violate what most users would
>>>> consider to be the most obvious. One could argue the current auto module
>>>> algorithm doesn’t violate this principle, but it’s important to consider
>>>> alternate suggestions in this light.
>>>> First, let’s explore the potential downsides if the default effect takes
>>>> hold with the currently generated auto module id. In Apache Maven, the
>>>> artifact id is the part of the coordinate that generates the filename.
>>>> This means that com.somecompany:artifact:version will become
>>>> artifact-version.jar, which would result in automodule id “artifact”.
>>>> Armed with this understanding, what does an analysis of the Maven
>>>> ecosystem have to say about potential conflicts in the automodule id?
>>>> If we ignore the groupid and version of all the components in the Maven
>>>> Central repository, we end up with over 13,500 conflicts (7% of the
>>>> total group:artifact combinations). This does not consider conflicts
>>>> across other repositories or within customer portfolios, yet it is
>>>> pretty telling. Conflicts will happen. In some cases, the number of
>>>> conflicts on the same common names is well above 100. The list of
>>>> conflicts as of October, 2016 can be seen here. [6]
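
The counting described above can be sketched as follows. This is an illustration only: the class name is hypothetical, the input is assumed to be plain groupId:artifactId[:version] strings, and it is not the query that was actually run against Maven Central.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper for illustration; not the scan used on Maven Central. */
final class ArtifactIdConflicts {

    /** Maps each artifactId to the distinct groupIds that publish it. */
    static Map<String, Set<String>> groupIdsByArtifactId(List<String> coordinates) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (String coordinate : coordinates) {
            // Assumed input shape: groupId:artifactId[:version]
            String[] parts = coordinate.split(":");
            result.computeIfAbsent(parts[1], k -> new TreeSet<>()).add(parts[0]);
        }
        return result;
    }

    /** Keeps only the artifactIds that are used by more than one groupId. */
    static Map<String, Set<String>> conflicts(List<String> coordinates) {
        Map<String, Set<String>> all = groupIdsByArtifactId(coordinates);
        all.values().removeIf(groupIds -> groupIds.size() < 2);
        return all;
    }
}
```

For made-up coordinates such as com.example:core:1.0 and org.sample:core:2.1, the artifactId "core" would be reported with both groupIds, since a filename-based module id cannot tell the two apart.
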
>>>> At this point, hopefully we’ve made the case for at least establishing a
>>>> default module id that
>>>> 1. Uses namespaces to minimize id conflicts when possible
>>>> 2. Leverages the default effect to create a de facto best practice
>>>> 3. Follows the principle of least astonishment
>>>> We have two potential proposals that meet these goals.
>>>> Proposal 1: Leverage existing coordinates when available.
>>>> Maven is inarguably the most popular build system for Java components,
>>>> with Maven Central being the default and largest repository of Java
>>>> components in the world. By default, every jar built by Maven
>>>> automatically gets a simple properties file inserted into it with its
>>>> unique coordinates. Now, not every jar in Central was built with Maven;
>>>> however, 94% of them were, as we can find the file in
>>>> 1,806,023 of the 1,913,561 Central components. Talk about the default
>>>> effect in action!
>>>> It’s further important to recognize that given a jar with a
>>>> declaring coordinates, it means that the project itself has chosen  
>>>> those
>>>> coordinates as their own name. In other words, this is how they refer  
>>>> to
>>>> themselves, even if other consumers may not be using Maven directly.
>>>> If automodule were able to peek inside a jar and generate the default id
>>>> using the groupid and artifactid present in the file, this would nearly
>>>> eliminate all instances of id conflict, because a significant portion of
>>>> the Java ecosystem is in fact built with Maven. Additionally, the fact
>>>> that 1.8 million (and counting) modules would have a namespace as the
>>>> default behavior means we’ve taken a huge step in setting the best
>>>> practice of picking module ids with a namespace. Furthermore, since the
>>>> project itself has chosen these coordinates and uses them as its primary
>>>> distribution mechanism, this follows the principle of least astonishment
>>>> for consumers regardless of their chosen build system. Finally, since
>>>> all of the above are true, it’s unlikely the project would need to
>>>> migrate to a new module id when it adopts jigsaw natively, thus avoiding
>>>> an API switching cost for its users.
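
A rough sketch of what Proposal 1 could look like, under stated assumptions. Maven's default packaging writes a pom.properties file under META-INF/maven/ with groupId, artifactId and version keys. Joining groupId and artifactId with a dot is only an assumption made here for illustration; the proposal does not fix a format, and the result is not checked for being a legal module name. The class name is hypothetical, and a jar holding several such files (for example a shaded jar) simply yields the first one found.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.Optional;
import java.util.Properties;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper; not an API of the JDK or of Maven. */
final class CoordinateModuleId {

    /** Derives "groupId.artifactId" from the first pom.properties in the jar, if any. */
    static Optional<String> fromJar(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                if (name.startsWith("META-INF/maven/") && name.endsWith("/pom.properties")) {
                    Properties props = new Properties();
                    try (InputStream in = jarFile.getInputStream(entry)) {
                        props.load(in);
                    }
                    String groupId = props.getProperty("groupId");
                    String artifactId = props.getProperty("artifactId");
                    if (groupId != null && artifactId != null) {
                        // Assumed id format: groupId and artifactId joined by a dot.
                        return Optional.of(groupId + "." + artifactId);
                    }
                }
            }
            // No coordinates found: the jar was probably not built by Maven.
            return Optional.empty();
        }
    }
}
```

Jars without the file return an empty result, which leaves room for whatever fallback (or no name at all, as in Proposal 2) is chosen.
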
>>>> Proposal 2: Drop automodules
>>>> Right now Jigsaw tries to calculate a module name solely based on the
>>>> name of the jar file, which already causes issues. Besides the fact that
>>>> the module name is not guaranteed to be unique compared with its Maven
>>>> coordinate, there are extra transformations which make it even less
>>>> likely to be unique; e.g. dashes are replaced by dots (which are both
>>>> valid artifactId characters), and in some cases a number and the
>>>> characters following it are stripped off. For artifacts like
>>>> jboss-servlet-api_4.0_spec it makes sense; however, we already see
>>>> issues here where commons-lang, commons-lang2 and commons-lang3 get the
>>>> same module name, even though they have different artifactIds and
>>>> contain different packages. Choosing different artifactIds and packages
>>>> was a very wise decision because it made it possible for these jars to
>>>> live next to each other. Removing that separation made by the authors is
>>>> a very unwise decision.
>>>> Another known example is the jsrNNN jars, which now all get jsr as the
>>>> module name.
>>>> It is highly unlikely there is one single rule that captures all the use
>>>> cases and always results in a module name we can work with.
>>>> For that reason the other proposal is to simply drop automodules. Don’t
>>>> try to come up with a name for unnamed jars. It might look like the
>>>> feature of automodules makes migrating easier because every dependency
>>>> will get a name, so you can complete your module-info for all
>>>> requirements, but we expect that once Jigsaw gets up to speed the
>>>> invalid module names will actually block further development due to name
>>>> collisions or forced renaming by transitive modular jars.
>>>> The advantage of this proposal is that library builders are not forced
>>>> to keep the proposed module name in order to maintain backwards
>>>> compatibility with the default. Instead library builders can pick a more
>>>> suitable module name. The modular system doesn’t allow the same package
>>>> to be exported by multiple jars (and automodules export every package).
>>>> Library builders can fix this in their new jars; however, if end users
>>>> require both jars because they were specified as requirements in
>>>> different transitive jars, the project cannot be compiled. There’s just
>>>> no dependency-excludes like Maven has, because “requires” in the
>>>> module-info really means requires. Dropping automodules will prevent
>>>> this kind of issue, because a package can only be exported by a named
>>>> module.
>>>> Sure, this means that end users cannot refer to every jar in
>>>> their module-info. But at least if they add a “requires” to their
>>>> module-info, they can ensure that it’ll always refer to the intended
>>>> modular jar. With build tools like Maven the chance of missing artifacts
>>>> on the classpath has already been reduced a lot. In general builds have
>>>> become quite stable, so we don’t expect that developers will translate
>>>> all dependencies to the module-info file, especially if we warn them
>>>> about the possible consequences of depending on automodules. Only
>>>> referring to named modules, even with a single “requires”, is already a
>>>> gain. There’s no reason to try to speed this up and give the developer
>>>> the false impression that it’ll keep working when upgrading to real
>>>> modular jars. Focus should be on the target, not on the path to reach it.
>>>> Dropping the automodules will prevent a lot of discussions about what  
>>>> is
>>>> the correct way to select a module name and will give the  
>>>> responsibility
>>>> for the name back to the place where it belongs: the developer.
>>>> [1]
>>>> [2]
>>>> [3] The fact that so much of the npm ecosystem is effectively
>>>> not-namespaced has actually created potential build time malware
>>>> injection possibilities. If I know of a package in use by a company
>>>> through log analysis, bug report analysis etc, I could potentially go
>>>> register the same name in the default repo with a very high semver and
>>>> know that it’s very likely this would be picked up over the intended
>>>> internally developed module because there’s no namespace.
>>>> [4]
>>>> [5]
>>>> [6]
>>>> Q5M/edit?usp=sharing
>>>> [7] #Risk and assumptions
>>>> [8]

More information about the jpms-spec-observers mailing list