From forax at univ-mlv.fr Sun Jan 1 23:43:10 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 2 Jan 2017 00:43:10 +0100 (CET)
Subject: Proposal: #ModuleNameCharacters (revised)
In-Reply-To: <20161209214546.D23A721F6C@eggemoggin.niobe.net>
References: <20161209214546.D23A721F6C@eggemoggin.niobe.net>
Message-ID: <1185843677.6305.1483314190501.JavaMail.zimbra@u-pem.fr>

Re-reading this thread after a message sent privately by Ess Kay asking
why spaces are supported, I think we should disallow 0x20 (space) too:
having a module not found because a module name has a trailing space
will not be fun.

Rémi

----- Original message -----
> From: "mark reinhold"
> To: jpms-spec-experts at openjdk.java.net
> Sent: Friday, 9 December 2016 22:45:46
> Subject: Proposal: #ModuleNameCharacters (revised)

> Issue summary
> -------------
>
> #ModuleNameCharacters --- Module names are presently constrained to
> be Java identifiers. Some existing module systems allow additional
> characters in module names, such as hyphens and slashes. Should this
> restriction be lifted or, perhaps, should it somehow be made
> layer-specific? [1]
>
> Proposal
> --------
>
> Do not change the treatment of module names in source code; they will
> remain qualified names. Revise the encoding of module names in compiled
> module-declaration class files to lift the current constraints but adopt
> new, less onerous constraints that still provide for the future evolution
> of the platform. Revise the format of class files to structure module
> and package names in a manner consistent with that already used for other
> kinds of constrained names.
>
> * * *
>
> Modules are a new construct of the Java programming language in the
> present design. In the source language they are hence identified by
> qualified names [2] in the same manner as the existing structural
> constructs, i.e., packages and classes. As such these names do allow
> some unusual characters, though not hyphens or slashes [3].
>
> In the very long term a future version of the language may well support
> not just the declaration of modules, and of relationships between them,
> but also the expression of operations upon them as is possible in, e.g.,
> Standard ML [4], or qualified references in code to a type in some other
> named module, or yet some other kind of use that we do not imagine today.
> It would hence be unwise at this point to allow module names in source
> code to be any different in nature than the other kinds of qualified
> names already in the language.
>
> We will therefore retain the present constraints on module names in the
> source language and also continue to enforce those constraints in the
> `ModuleDescriptor.Builder` API, which is intended to be consistent with
> the language. (The `ModuleDescriptor` API will continue to be able to
> read class files that contain module names not expressible in the source
> language.)
>
> * * *
>
> Module names in compiled module-declaration class files are presently
> encoded in the internal form traditionally used for qualified names:
> Periods (`.`) are replaced with forward slashes (`/`), and periods,
> semicolons (`;`), and left square brackets (`[`) are forbidden [5].
> This encoding is inconvenient for other module systems that may
> interoperate with JPMS, so we will abandon it for module names despite
> the fact that doing so will increase the complexity of any code that
> parses class files.
>
> To allow for the future evolution of the platform we propose a different,
> less onerous encoding of module names in class files:
>
>   - If at some future point we find that we need to add structure to
>     module names, or combine module names with qualified type names,
>     then the `:` character would be a good candidate, even in the
>     source language if need be, so we reserve that character now.
>
>   - We presently use `@` in the API to separate module names from
>     version strings, where available, so it is prudent to reserve
>     that character in module names in class files also, just in case
>     we someday decide to introduce compound module identifiers that
>     combine module names with version strings.
>
>   - In further support of interoperation we will reserve the universal
>     escape character (`\`) and define the sequences `\\`, `\:`, and
>     `\@` to stand for `\`, `:`, and `@`, respectively.
>
>   - We will finally, for sanity, forbid any character whose Unicode code
>     point is less than 0x20 (` `). (Ideally we'd forbid all Unicode
>     non-printing characters, but it's best not to have the JVMS depend
>     too deeply upon details of the Unicode specification.)
>
> To sum up: In module names in class files reserve `:` and `@` for future
> use; reserve `\` as an escape character and use it to quote itself, `:`,
> and `@`; and forbid the non-printing ASCII characters (< 0x20).
>
> * * *
>
> The first version of this proposal [6] claimed that the present design is
> consistent with the existing treatment of qualified names in class files.
> That is, in fact, not true, since qualified names in class files today
> are always wrapped in tagged constant-pool structures rather than simple
> `CONSTANT_Utf8_info` structures. Class names, e.g., are wrapped in
> `CONSTANT_Class_info` structures, which in turn reference the `Utf8`
> structures that represent the actual class names [7].
>
> To address this inconsistency, and particularly in light of the new
> encoding of module names described above, we propose to use consistent
> kinds of class-file structures for module and package names.
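
The escaping rules summarized in the quoted proposal (reserve `:` and `@`, use `\` to quote itself and those two characters, forbid code points below 0x20) can be sketched as a pair of helper methods. This is only an illustration under stated assumptions: the class name `ModuleNameEscapes` and its methods are hypothetical, are not part of any JDK API, and follow the proposal text as quoted rather than any final specification.

```java
/** Hypothetical helper illustrating the escaping rules quoted above; not a JDK API. */
final class ModuleNameEscapes {
    private ModuleNameEscapes() {}

    // The three characters the proposal reserves in class-file module names.
    private static boolean isReserved(int cp) {
        return cp == '\\' || cp == ':' || cp == '@';
    }

    /** Escapes `\`, `:` and `@` with a backslash; rejects code points below 0x20. */
    static String encode(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 4);
        for (int i = 0; i < raw.length(); ) {
            int cp = raw.codePointAt(i);
            i += Character.charCount(cp);
            if (cp < 0x20) {
                throw new IllegalArgumentException("control character in module name");
            }
            if (isReserved(cp)) {
                out.append('\\');
            }
            out.appendCodePoint(cp);
        }
        return out.toString();
    }

    /** Reverses encode(); rejects unescaped reserved characters and unknown escapes. */
    static String decode(String encoded) {
        StringBuilder out = new StringBuilder(encoded.length());
        for (int i = 0; i < encoded.length(); ) {
            int cp = encoded.codePointAt(i);
            i += Character.charCount(cp);
            if (cp < 0x20) {
                throw new IllegalArgumentException("control character in module name");
            }
            if (cp == '\\') {
                // Only the sequences \\, \: and \@ are defined by the proposal.
                if (i == encoded.length() || !isReserved(encoded.charAt(i))) {
                    throw new IllegalArgumentException("invalid escape sequence");
                }
                out.append(encoded.charAt(i++));
            } else if (isReserved(cp)) {
                throw new IllegalArgumentException("unescaped reserved character");
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }
}
```

Under this sketch a space (0x20) is still accepted, as in the proposal text; the suggestion at the top of this message to disallow it as well would amount to changing the `cp < 0x20` checks to `cp <= 0x20`.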
>
> Module names in a compiled module-declaration class file will be encoded
> as above and wrapped in tagged `CONSTANT_Module_info` structures:
>
>     CONSTANT_Module_info {
>         u1 tag;           // == CONSTANT_Module == 19
>         u2 name_index;    // Index of a CONSTANT_Utf8_info
>     }
>
> Package names in class files will be encoded in the traditional internal
> form and wrapped in tagged `CONSTANT_Package_info` structures:
>
>     CONSTANT_Package_info {
>         u1 tag;           // == CONSTANT_Package == 20
>         u2 name_index;    // Index of a CONSTANT_Utf8_info
>     }
>
> Existing references in the class-file format to module and package names
> will be adjusted to refer to these new kinds of tagged structures.
>
>
> [1] http://openjdk.java.net/projects/jigsaw/spec/issues/#ModuleNameCharacters
> [2] http://docs.oracle.com/javase/specs/jls/se8/html/jls-6.html#jls-6.2
> [3] http://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html#jls-3.8
> [4] https://en.wikipedia.org/wiki/Standard_ML#Module_system
> [5] http://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.2.1
> [6] http://mail.openjdk.java.net/pipermail/jpms-spec-experts/2016-November/000468.html
> [7] http://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.4.1

From david.lloyd at redhat.com Tue Jan 3 16:06:53 2017
From: david.lloyd at redhat.com (David M. Lloyd)
Date: Tue, 3 Jan 2017 10:06:53 -0600
Subject: Proposal: #ModuleNameCharacters (revised)
In-Reply-To: <1185843677.6305.1483314190501.JavaMail.zimbra@u-pem.fr>
References: <20161209214546.D23A721F6C@eggemoggin.niobe.net> <1185843677.6305.1483314190501.JavaMail.zimbra@u-pem.fr>
Message-ID: <0ada8080-ddc3-fc0c-912a-fb54ee0777b2@redhat.com>

I think that this should be a policy decision that is made by the module
system in question. That's really the point of having the JVM use general
rules.
And if javac itself is enforcing strict parsing rules, I'd say the risk
is minimal, especially compared with the inconvenience that module
systems like ours will encounter from such restrictions.

I do think that control characters are a different story though.

On 01/01/2017 05:43 PM, Remi Forax wrote:
> Re-reading this thread after a message sent privately by Ess Kay asking
> why spaces are supported, I think we should disallow 0x20 (space) too:
> having a module not found because a module name has a trailing space
> will not be fun.
>
> Rémi

--
- DML

From mark.reinhold at oracle.com Thu Jan 12 15:55:04 2017
From: mark.reinhold at oracle.com (mark.reinhold at oracle.com)
Date: Thu, 12 Jan 2017 07:55:04 -0800
Subject: Please welcome Robert Scholte to JSR 376
Message-ID: <20170112075504.551189961@eggemoggin.niobe.net>

Robert Scholte, a long-term member of the Apache Maven Project and current
PMC chair, replaces Jason van Zyl, who recently resigned. Robert leads
the effort to enhance Maven to support JPMS, and in that role has already
contributed significant feedback via other channels. I look forward to
his direct contributions as a member of this EG as we finish up our work.
- Mark

From forax at univ-mlv.fr Mon Jan 16 15:44:03 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 16 Jan 2017 16:44:03 +0100 (CET)
Subject: Advice + proposals regarding automodule naming
In-Reply-To:
References:
Message-ID: <652339338.395860.1484581443472.JavaMail.zimbra@u-pem.fr>

Hi Robert,
the problem with automatic modules is more general than just the name:
automatic modules also create a flat hierarchy, which doesn't map well
onto the Maven artifact descriptor.

I wonder why you want Maven to use automatic modules or, said differently,
Maven has a lot of information about the artifact, so why do you want to
forget all this information when fetching a Maven artifact?

I think that one problem is that you do not want to create a
module-info.class from the Maven POM and insert it into the jar because
it will change the artifact*.
This kind of module is supported by jigsaw under the name of synthetic
modules. A synthetic module is a module with a module descriptor created
not by javac but by another tool.

In my opinion, automatic modules are interesting when you have a jar that
does not come from Maven Central but from an ad-hoc build tool and that
will be considered as a leaf of the dependency DAG. Otherwise, for
existing module systems, using a synthetic module seems to be a better
idea.

regards,
Rémi

* given that you also have the problem of split packages, you also need a
way to merge several artifacts into one modular jar, because it's the easy
way to solve the split package problem.

----- Original message -----
> From: "Robert Scholte"
> To: jpms-spec-experts at openjdk.java.net
> Cc: "Apache Maven Dev"
> Sent: Monday, 16 January 2017 10:37:08
> Subject: Advice + proposals regarding automodule naming

From rfscholte at apache.org Mon Jan 16 09:37:08 2017
From: rfscholte at apache.org (Robert Scholte)
Date: Mon, 16 Jan 2017 10:37:08 +0100
Subject: Advice + proposals regarding automodule naming
Message-ID:

This is a message from Robert Scholte and Brian Fox. We have both been
discussing this topic for several weeks with other Maven developers and
came to the conclusion that we should warn the jigsaw team about their
current approach regarding auto modules. We will share our experiences,
thoughts, and conclusions, and will suggest two proposals.

Traditionally, the Java ecosystem has been very mature in terms of naming
and namespacing. The reverse FQDN introduced into the Java package was a
great choice to ensure classes don't conflict. Popular build tools such as
Maven and nearly all those that followed built upon this key concept
with the introduction of "GroupId", also using the FQDN as part of the
name to ensure the coordinates were properly namespaced.

We've seen some ecosystems diverge from this, leading to new challenges
that ultimately had to be reversed. A great example can be seen in the
"tragic mistake from npm creators" [1], which was to launch without a
namespace concept.
Eventually, NPM started running out of useful names and had to backtrack
to introduce "scopes", which is really just a namespace [2]. The real
problem here is that the major change in namespace was baked in after
several years of momentum without it. It's taken a long time for tooling
and best practice to catch up to scopes and, in the interim, people have
been left with a dual-mode situation, some namespaced, some not, that has
created chaos. [3]

The real issue at hand here as we consider behaviors in the jigsaw
automodule revolves around two well-studied concepts.

The most important is the "Default effect" [4], which states that whatever
the default behavior is will become the most prominent best practice. A
default that uses a filename to generate a very short, un-namespaced
module id effectively sets the behavior to create generic names that will
eventually conflict... exactly what we've seen in npm.

Additionally, the switching cost introduced in moving from a default
un-namespaced module id to one with a unique namespace is also significant
once you consider all the potential users. This is why API change is hard,
and changing the module id after the fact from the default is effectively
an API change.

The second principle at hand is the "Principle of least astonishment" [5].
We want to find a default that doesn't violate what most users would
consider to be the most obvious. One could argue the current auto module
algorithm doesn't violate this principle, but it's important to consider
alternate suggestions in this light.

First, let's explore the potential downsides if the default effect takes
hold with the currently generated auto module id. In Apache Maven, the
artifact id is the part of the coordinate that generates the filename.
This means that com.somecompany:artifact:version will become
artifact-version.jar, which would result in automodule id "artifact".
Armed with this understanding, what does an analysis of the Maven
ecosystem have to say about potential conflicts in the automodule id?

If we ignore the groupid and version of all the components in the Maven
Central repository, we end up with over 13,500 conflicts (7% of the total
group:artifact combinations). This does not consider conflicts across
other repositories, or within customer portfolios, yet it is pretty
telling. Conflicts will happen. In some cases, the number of conflicts on
the same common names is well above 100. The list of conflicts as of
October 2016 can be seen here. [6]

At this point, hopefully we've made the case for at least establishing a
default module id that
1. Uses namespaces to minimize id conflicts when possible
2. Leverages the default effect to create a de facto best practice
3. Follows the principle of least astonishment

We have two potential proposals that meet these goals.

Proposal 1: Leverage existing coordinates when available.

Maven is inarguably the most popular build system for Java components,
with Maven Central being the default and largest repository of Java
components in the world. By default, every jar built by Maven
automatically gets a simple properties file inserted into it with its
unique coordinates. Now, not every jar in Central was built with Maven;
however, 94% of them were, as we can find the pom.properties file in
1,806,023 of the 1,913,561 Central components. Talk about the default
effect in action!

It's further important to recognize that when a jar carries a
pom.properties declaring coordinates, the project itself has chosen those
coordinates as its own name. In other words, this is how they refer to
themselves, even if other consumers may not be using Maven directly.
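
As an illustration of Proposal 1, the following is a minimal sketch of how a tool might read the coordinates Maven embeds in a jar (under `META-INF/maven/<groupId>/<artifactId>/pom.properties`, with `groupId`, `artifactId` and `version` keys) and compose a namespaced id from them. The class `PomCoordinates`, its method names, and the composition rule (groupId, a dot, then the artifactId with dashes mapped to dots) are assumptions made for this example; the message does not specify how the two values would be combined.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.util.Optional;
import java.util.Properties;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper: derives a namespaced id from Maven's pom.properties. */
final class PomCoordinates {
    private PomCoordinates() {}

    /** Assumed rule: groupId + "." + artifactId, with dashes mapped to dots. */
    static Optional<String> namespacedId(Properties pom) {
        String groupId = pom.getProperty("groupId");
        String artifactId = pom.getProperty("artifactId");
        if (groupId == null || artifactId == null) {
            return Optional.empty();
        }
        return Optional.of(groupId.trim() + "." + artifactId.trim().replace('-', '.'));
    }

    /** Looks for the first pom.properties under META-INF/maven/ in the given jar. */
    static Optional<String> namespacedId(Path jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            // Shaded jars may carry several pom.properties files; this sketch
            // simply takes the first one it finds.
            Optional<JarEntry> entry = jar.stream()
                    .filter(e -> e.getName().startsWith("META-INF/maven/")
                            && e.getName().endsWith("/pom.properties"))
                    .findFirst();
            if (!entry.isPresent()) {
                return Optional.empty();
            }
            Properties pom = new Properties();
            try (InputStream in = jar.getInputStream(entry.get())) {
                pom.load(in);
            }
            return namespacedId(pom);
        }
    }
}
```

A real implementation would also have to decide what to do when a segment of the resulting name is not a legal Java identifier, which this sketch does not attempt.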

If automodule were able to peek inside a jar and generate the default id
using the groupid and artifactid present in the file, this would
eliminate nearly all instances of id conflict, because a significant
portion of the Java ecosystem is in fact built with Maven. Additionally,
the fact that 1.8 million (and counting) modules would have a namespace as
the default behavior means we've taken a huge step in setting the best
practice of picking module ids with a namespace. Additionally, since the
project itself has chosen these coordinates and uses them as its primary
distribution mechanism, this follows the principle of least astonishment
for consumers regardless of their chosen build system. Finally, since all
of the above are true, it's unlikely the project would need to migrate to
a new module id when it adopts jigsaw natively, thus avoiding an API
switching cost for its users.

Proposal 2: Drop automodules
Right now Jigsaw tries to calculate a module name solely based on the name
of the jar file, which already causes issues. Besides the fact that the
module name is not guaranteed to be unique compared with its Maven
coordinate, there are extra transformations which make it even less
likely to be unique; e.g. dashes are replaced by dots (both are valid
artifactId characters), and in some cases a number and the characters
following it are stripped off. For artifacts like
jboss-servlet-api_4.0_spec it makes sense; however, we already see issues
here where commons-lang, commons-lang2 and commons-lang3 get the same
module name, even though they have different artifactIds and contain
different packages. Choosing different artifactIds and packages was a very
wise decision because it made it possible for these jars to live next to
each other. Removing the separation chosen by the authors is a very unwise
decision.

Another known example is the jsrNNN jars, which now all get jsr as the
module name.
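
The collisions described above can be reproduced with a deliberately simplified approximation of a filename-based rule. This is not the actual Jigsaw algorithm: the class `NaiveAutoModuleName` and its rule (drop the first digit and everything after it, then turn runs of non-letters into dots) are assumptions inferred only from the examples given in this message.

```java
/** Hypothetical, simplified filename-to-module-name rule; NOT the JDK's algorithm. */
final class NaiveAutoModuleName {
    private NaiveAutoModuleName() {}

    static String fromJarFileName(String fileName) {
        // Strip the ".jar" extension if present.
        String base = fileName.endsWith(".jar")
                ? fileName.substring(0, fileName.length() - ".jar".length())
                : fileName;
        // Assumed rule: cut at the first digit, treating it as the start of a version.
        int cut = 0;
        while (cut < base.length() && !Character.isDigit(base.charAt(cut))) {
            cut++;
        }
        // Map runs of non-letters (dashes, underscores, dots) to single dots,
        // then trim a leading or trailing dot.
        return base.substring(0, cut)
                .replaceAll("[^A-Za-z]+", ".")
                .replaceAll("^\\.|\\.$", "");
    }
}
```

With this rule, `commons-lang-2.6.jar` and `commons-lang3-3.5.jar` both map to `commons.lang`, and `jsr305-3.0.2.jar` and `jsr250-api-1.0.jar` both map to `jsr`, while none of the groupIds that distinguish these artifacts in Maven survive.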
It is highly unlikely that there is one single rule that captures all the use cases and always results in a module name we can work with.

For that reason the other proposal is to simply drop automodules. Don't try to come up with a name for unnamed jars. It might look like the feature of automodules makes migrating easier, because every dependency will get a name so you can complete your module-info for all requirements, but we expect that once Jigsaw gets up to speed the invalid module names will actually block further development due to name collisions or forced renaming by transitive modular jars.

The advantage of this proposal is that library builders are not forced to keep the proposed module name in order to maintain backwards compatibility with the default. Instead library builders can pick a more suitable module name. The module system doesn't allow the same package to be exported by multiple jars (and automodules export every package). Library builders can fix this in their new jars, however if end users require both jars because they were specified as requirements in different transitive jars, the project cannot be compiled. There are simply no dependency excludes like Maven has, because "requires" in the module-info really means requires. Dropping automodules will prevent these kinds of issues, because a package can only be exported by a named module.

Sure, this means that end users cannot refer to every jar in their module-info. But at least if they add a "requires" to their module-info, they can ensure that it will always refer to the intended modular jar. With build tools like Maven the chance of missing artifacts on the classpath has already been reduced a lot. In general builds have become quite stable, so we don't expect that developers will translate all dependencies to the module-info file, especially if we warn them about the possible consequences of depending on automodules. Only referring to named modules and even a single "requires"
is already a gain. There's no reason to try to speed this up and give the developer the false impression that it will keep working when upgrading to real modular jars. Focus should be on the target, not on the path to reach it.

Dropping automodules will prevent a lot of discussions about what the correct way to select a module name is, and will give the responsibility for the name back to the place where it belongs: the developer.

[1] http://stackoverflow.com/questions/22053381/lack-of-available-module-names-on-npm
[2] http://blog.npmjs.org/post/116936804365/solving-npms-hard-problem-naming-packages
[3] The fact that so much of the npm ecosystem is effectively not namespaced has actually created potential build-time malware injection possibilities. If I know of a package in use by a company through log analysis, bug report analysis etc., I could potentially go register the same name in the default repo with a very high semver and know that it's very likely this would be picked up over the intended internally developed module because there's no namespace.
[4] https://en.wikipedia.org/wiki/Default_effect_(psychology)
[5] https://en.wikipedia.org/wiki/Principle_of_least_astonishment
[6] https://docs.google.com/spreadsheets/d/1TVR5uTpDYw0827AlvPRu8l95zHnFPL_g61TdPtnjQ5M/edit?usp=sharing
[7] http://openjdk.java.net/jeps/261 #Risk and assumptions
[8] https://www.mail-archive.com/jigsaw-dev@openjdk.java.net/msg06623.html

From rfscholte at apache.org Tue Jan 17 12:04:08 2017
From: rfscholte at apache.org (Robert Scholte)
Date: Tue, 17 Jan 2017 13:04:08 +0100
Subject: Advice + proposals regarding automodule naming
In-Reply-To: <652339338.395860.1484581443472.JavaMail.zimbra@u-pem.fr>
References: <652339338.395860.1484581443472.JavaMail.zimbra@u-pem.fr>
Message-ID:

Hi Rémi,

In the end every non-jdk.* and non-java.* module in the module-info will be a dependency in your build tool descriptor.
Such a module must match exactly one versionless dependency, or conflictId as we call it, which is in general the groupId + artifactId (type and classifier are not relevant for this story). By ignoring the groupId, a module can be referred to by multiple dependencies, so we can expect collisions. For that reason Brian did a quick scan over Maven Central to count the number of duplicate artifactIds.

Here are the artifactIds with 100+ groupIds:

maven_artifact_id  count(DISTINCT maven_group_id)  count(maven_group_id)
library            391                             6854
core               312                             8188
common             142                             5084
ui                 138                             1414

In theory I could have a Maven project with 391 'library' jars on the classpath without any problem. And as long as they are direct dependencies I have control over this by simply not adding 'library' as a requirement to module-info. The issues start when different 'library' jars are transitive dependencies and when they are marked as required in the module-info file of my direct or transitive dependencies.

Developers of the 'library' jars cannot use library as the module name and are forced to pick another name. As the developer of my project, in the end I decide which versions of dependencies are used. If the 'library' jar gets a different module name and my dependency is still referring to the old module name, the project can't be built.

What I expect is that developers will be forced to remove the requirements from their module-info because of the mentioned issues. So instead of increasing, the number of requirements will be reduced. For that reason we say: either use a unique module name from the beginning (GA) or wait until a dependency has its own module name before adding it as a requirement.

As far as I know this is the first time the JDK/JRE decides (proposes) a name for an entity based on another entity. There are no relations between method, class, or package names, and there doesn't have to be a relation between the module name and the filename, so please don't try to create one.
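The kind of scan described above can be approximated in a few lines of Java. This is only a sketch under stated assumptions: the input is a list of "groupId:artifactId" strings, and the class and method names are made up for the example; the query Brian actually ran against Maven Central is not shown in this thread.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class ArtifactIdConflicts {

    // Maps each artifactId to the number of distinct groupIds that publish it,
    // keeping only artifactIds that are claimed by more than one groupId.
    static Map<String, Long> conflicts(List<String> groupArtifactPairs) {
        Map<String, Long> groupsPerArtifact = groupArtifactPairs.stream()
                .distinct()                              // one entry per group:artifact pair
                .map(ga -> ga.split(":", 2))             // [groupId, artifactId]
                .collect(Collectors.groupingBy(
                        parts -> parts[1], TreeMap::new, Collectors.counting()));
        groupsPerArtifact.values().removeIf(count -> count < 2);
        return groupsPerArtifact;
    }
}
```

Every key in the returned map is an artifactId that would become an ambiguous module name as soon as the groupId is ignored.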
regards,
Robert

On Mon, 16 Jan 2017 16:44:03 +0100, Remi Forax wrote:

> Hi Robert,
> the problem with automatic modules is more general than just the name:
> automatic modules also create a flat hierarchy which doesn't map well
> to the Maven artifact descriptor.
>
> I wonder why you want Maven to use automatic modules, or said
> differently, Maven has a lot of information about the artifact, so why do
> you want to forget all this information when fetching a Maven artifact?
>
> I think that one problem is that you do not want to create a
> module-info.class from the Maven POM and insert it into the jar because
> it will change the artifact*.
> This kind of module is supported by jigsaw under the name of synthetic
> modules. A synthetic module is a module with a module descriptor not
> created by javac but by another tool.
>
> In my opinion, automatic modules are interesting when you have a jar that
> does not come from Maven Central but from an ad-hoc build tool and
> will be considered as a leaf of the dependency DAG.
> Otherwise, for an existing module system, using a synthetic module seems to
> be a better idea.
>
> regards,
> Rémi
>
> * given you also have the problem of split packages, you also need a way
> to merge several artifacts into one modular jar because it's the easy
> way to solve the split package problem.

From forax at univ-mlv.fr Tue Jan 17 22:11:11 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 17 Jan 2017 23:11:11 +0100 (CET)
Subject: Advice + proposals regarding automodule naming
In-Reply-To:
References: <652339338.395860.1484581443472.JavaMail.zimbra@u-pem.fr>
Message-ID: <1798601919.942643.1484691071823.JavaMail.zimbra@u-pem.fr>

Robert,
I fully agree with you that Maven cannot use automatic modules. Automatic modules have weird naming rules, export everything and have no dependencies of their own*, so they are useless if you already have a trove of info like the Maven POM.

In my opinion, the real question is not how to map existing Maven artifacts to Java modules but rather how Maven 4 artifacts are mapped to Java modules, and then how to make the transition from Maven 3 artifacts to Maven 4 artifacts as smooth as possible.
Here is my take on what can be a Maven 4 artifact, - a Maven 4 artifact can only depends other Maven 4 artifact (and their are some way to see a Maven3 artifact as a Maven 4 artifact if the POM is siple enough), - a Maven 4 artifact do not allow split packages (a lot of Maven 3 artifact uses split packages because it's a cool way to do an after the fact modularisation without changing the name of the module) - a Maven 4 artifact info is specified with info extracted from the module-info and from the POM (version is in the POM, exported packages are in the module-info, ...) etc. once you have the precise rules, it will be easier to see how to map a Maven 3 artifact to a Maven 4 and what are the compatibility rules. regards, R?mi * apart if you want to play with configurations that mix modulepath and classpath but these kind of configurations are really hard to debug. ----- Mail original ----- > De: "Robert Scholte" > ?: "Remi Forax" > Cc: jpms-spec-experts at openjdk.java.net, "Brian Fox" > Envoy?: Mardi 17 Janvier 2017 13:04:08 > Objet: Re: Advice + proposals regarding automodule naming > Hi R?mi, > > In the end every non-jdk.* and non-java.* module in the module-info will > be a dependency in your buildtool descriptor. Such module must match > exactly one versionless dependency, or conflictId as we call it, which is > in general the groupId + artifactId (type and classifier are not relevant > for this story). > By ignoring the groupId a module can referred by multiple dependencies. So > we can expect collissions. For that reason Brian did a quick scan over > Maven Central to count the number of duplicate artifactIds. > > Here's the artifactIds with 100+ groupIds: > maven_artifact_id count(DISTINCT maven_group_id) count(maven_group_id) > library 391 6854 > core 312 8188 > common 142 5084 > ui 138 1414 > > In theory I could have a Maven project with 391 'library'-jars on the > classpath without any problem. 
And as long as they are direct dependencies > I have control over this by simply not adding 'library' as requirement to > module-info. The issues start when different 'library'-jars are transitive > dependencies and when they are marked are required in the module-info file > of my direct or transitive dependencies. > > Developers of the 'library'-jars cannot use library as the module name and > are forced to pick another name. As developer of my project in the end I > decide which versions of dependencies are used. If the 'library'-jar gets > a different module name and my dependency is still referring to the old > module name, the project can't be built. > > What I expect is that developers are forced to remove the requirements > from their module-info because of the mentioned issues. So instead of > increasing the number requirements it will be reduced. For that reason we > say either use a unique module name from the beginning (GA) or wait until > a dependency has its own module name before adding it as requirement. > > As far as I know this is the first time the JDK/JRE decides (proposes) a > name for an entity based on another entity. There are no relations between > method-, class-, or package-names and there doesn't have to be a relation > between the module name and the filename, so please don't try to do so. > > regards, > Robert > > On Mon, 16 Jan 2017 16:44:03 +0100, Remi Forax wrote: >> Hi Robert, >> the problem with automatic modules is more general that just the name, >> automatics modules also creates a flat hierarchy which doesn't map well >> with the Maven artifact descriptor. >> >> I wonder why you want Maven to use automatic modules, or said >> differently Maven has a lot of information about the artifact, why do >> you want to forget all these information when fetching a Maven artifact. 
>> >> I think that one problem is that you do not want to create a >> module-info.class from the Maven POM and insert it into the jar because >> it will change the artifact*. >> This kind of modules is supported by jigsaw under the name of synthetic >> modules. A synthetic module is a module with a module descriptor not >> created by javac but by another tool. >> >> In my opinion, automatic modules are interesting when you have jar that >> do not come from Maven central but comes from an ad-hoc build tool and >> will be considered as a leaf of the dependency DAG. >> Otherwise, for existing module system, using a synthetic module seem to >> be a better idea. >> >> regards, >> R?mi >> >> * given you have also the problem of split packages, you also need a way >> to merge several artifacts into one modular jar because it's the easy >> way to solve the split package problem. >> >> ----- Mail original ----- >>> De: "Robert Scholte" >>> ?: jpms-spec-experts at openjdk.java.net >>> Cc: "Apache Maven Dev" >>> Envoy?: Lundi 16 Janvier 2017 10:37:08 >>> Objet: Advice + proposals regarding automodule naming >> >>> This is a message from Robert Scholte and Brian Fox. We both have been >>> talking about this topic several weeks with other Maven developers and >>> came to the conclusion that we should warn the jigsaw team with their >>> current approach regarding auto modules. We will share our experiences, >>> thoughts, conclusions and will suggest two proposals. >>> >>> Traditionally, the Java ecosystem has been very mature in terms of >>> naming >>> and namespacing. The reverse fqdn introduced into the java package was a >>> great choice to ensure classes don?t conflict. Popular build tools such >>> as >>> Maven and nearly all those that followed built upon that this key >>> concept >>> with the introduction of ?GroupId? also using the fqdn as part of the >>> name >>> to ensure the coordinates were properly namespaced. 
>>> >>> We?ve seen some ecosystems diverge from this leading to new challenges >>> that ultimately had to be reversed. A great example can be seen in the ? >>> tragic mistake from npm creators ? [1] which was to launch without a >>> namespace concept. Eventually, NPM started running out of useful names >>> and >>> had to backtrack to introduce ?scopes? which is really just a namespace >>> [2]. The real problem here is that the major change in namespace was >>> backed in after several years of momentum without it. It?s taken a long >>> time for tooling and best practice to catch up to scopes and in the >>> interim, people have been left with a dual mode, some namespaced, some >>> not >>> namespaced situation that has created chaos. [3] >>> >>> The real issue at hand here as we consider behaviors in the jigsaw >>> automodule revolves around two well studied concepts. >>> >>> The most important is the ?Default effect? [3] which states that >>> whatever >>> the default behavior is will become the most prominent best practice. A >>> default that uses a filename to generate a very short, un-namespaced >>> module id effectively sets the behavior to create generic names that >>> will >>> eventually conflict...exactly what we?ve seen in npm. >>> >>> Additionally, The switching costs introduced in overcoming a default >>> un-namespaced module id to one with a unique namespace is also >>> significant >>> once you consider all the potential users. This is why API change is >>> hard, >>> and changing the module id after the fact from the default is >>> effectively >>> an API change. >>> >>> The second principal at hand is the ?Principle of least astonishment?. >>> We >>> want to find a default that doesn?t violate what most users would >>> consider >>> to be the most obvious. One could argue the current auto module >>> algorithm >>> doesn?t violate this principle, but it?s important to consider alternate >>> suggestions in this light. 
>>> >>> First, lets explore the potential downsides if the default effect takes >>> hold with the currently generated auto module id. In Apache Maven, the >>> artifact id is the part of the coordinate that generates the filename. >>> This means that com.somecompany:artifact:version will become >>> artifact-version.jar, which would result in automodule id ?artifact?. >>> Armed with this understanding, that does an analysis of the Maven >>> ecosystem have to say about potential conflicts in the automodule id? >>> >>> If we ignore the groupid and version of all the components in the Maven >>> Central repository, we end up with over 13,500 (7% of the total >>> group:artifact combinations) conflicts. This does not consider conflicts >>> across other repositories, or within customer portfolios yet it is >>> pretty >>> telling. Conflicts will happen. In some cases, the number of conflicts >>> on >>> the same common names is well above 100. The list of conflicts as of >>> October, 2016 can be seen here. [6] >>> >>> At this point, hopefully we?ve made the case for at least establishing a >>> default module id that >>> 1. Uses namespaces to minimizes id conflicts when possible >>> 2. Leverages the default effect to create a de facto best practice >>> 3. Follows the principle of least astonishment >>> >>> We have two potential proposals that solve these goals. >>> >>> Proposal 1: Leverage existing coordinates when available. >>> >>> Maven is inarguably the most popular build system for Java components, >>> with Maven Central being the default and largest repository of Java >>> components in the world. By default, every jar built by Maven >>> automatically gets a simple properties file inserted into it with its >>> unique coordinates. Now, not every jar in Central was built with Maven, >>> however 94% of them were, as we can find the pom.properties file in >>> 1,806,023 of the 1,913,561 central components . Talk about the default >>> effect in action! 
>>> >>> It?s further important to recognize that given a jar with a >>> pom.properties >>> declaring coordinates, it means that the project itself has chosen those >>> coordinates as their own name. In other words, this is how they refer to >>> themselves, even if other consumers may not be using Maven directly. >>> >>> If automodule were able to peek inside a jar and generate the default id >>> using the groupid and artifactid present in the file, this would nearly >>> eliminate all instances of id conflict because a significant portion of >>> the Java ecosystem is in fact built with Maven. Additionally, the fact >>> that 1.8 million (and counting) modules would have namespace as the >>> default behavior means we?ve taken a huge step in setting the best >>> practice of picking module ids with a namepace. Additionally, since the >>> project itself has chosen these coordinates and uses them as their >>> primary >>> distribution mechanism, this follows the principle of least astonishment >>> to consumers regardless of their chosen build system. Finally, since all >>> of the above are true, it?s unlikely the project would need to migrate >>> to >>> a new module id when they adopt jigsaw natively, thus avoiding an API >>> switching cost for their users. >>> >>> Proposal 2: Drop automodules >>> Right now Jigsaw tries to calculate a module name solely based on the >>> name >>> of the jar file, which now already causes issues. Besides the fact that >>> the module name is not guaranteed unique compared with its Maven >>> coordinate, there are extra transformations which makes it even less >>> guaranteed that it is unique; e.g. dashes are replaced by dots (which >>> are >>> both valid artifactId characters), in some cases the number and their >>> following characters are stripped off. 
For artifacts like >>> jboss-servlet-api_4.0_spec it makes sense, however we already see issues >>> here where commons-lang, commons-lang2 and commons-lang3 get the same >>> module name, >>> even though they have different artifactIds and contain different >>> packages. Choosing different artifactIds and packages was a very wise >>> decision because it made it possible that these jars could live next to >>> each other. Removing that separation by the authors is a very unwise >>> decision. >>> >>> Another known example is the jsrNNN jars, which now all get jsr as the >>> module name. >>> >>> Is it highly unlikely there is one single rule to capture all the use >>> cases and which always result in a module name we can work with. >>> >>> For that reason the other proposal is to simply drop automodules. Don?t >>> try to come up with a name for unnamed jars. It might look like the >>> feature of automodules makes migrating easier because every dependency >>> will get a name so can complete your module-info for all requirements, >>> but >>> we expect that once Jigsaw comes to speed the invalid module names are >>> actually blocking further development due to name collisions or forced >>> renaming by transitive modular jars. >>> >>> The advantage of this proposal is that library builders are not forced >>> to >>> keep the proposed module name in order to maintain backwards >>> compatibility >>> with the default.. Instead library builders can pick a more suitable >>> module name. The modular system doesn?t allow the same package to be >>> exported by multiple jars (and automodules exports every package). >>> Library >>> builders can fix this is their new jars, however if end users would >>> require both jars because they were specified as requirements in >>> different >>> transitive jars, you cannot compile this project. There?s just no >>> dependency-excludes like Maven has, because ?requires? in the >>> module-info >>> really means requires. 
>>> Dropping automodules will prevent these kinds of issues, because a package can only be exported by a named module.
>>>
>>> Sure, this means that end users cannot refer to every jar in their module-info. But at least if they add a "requires" to their module-info, they can be sure that it will always refer to the intended modular jar. With build tools like Maven the chance of missing artifacts on the classpath has already been reduced a lot. In general builds have become quite stable, so we don't expect that developers will translate all dependencies to the module-info file, especially if we warn them about the possible consequences of depending on automodules. Referring only to named modules, and even a single "requires", is already a gain. There's no reason to try to speed this up and give the developer the false impression that it'll keep working when upgrading to real modular jars. Focus should be on the target, not on the path to reach it.
>>>
>>> Dropping the automodules will prevent a lot of discussions about what is the correct way to select a module name and will give the responsibility for the name back to the place where it belongs: the developer.
>>>
>>> [1] http://stackoverflow.com/questions/22053381/lack-of-available-module-names-on-npm
>>> [2] http://blog.npmjs.org/post/116936804365/solving-npms-hard-problem-naming-packages
>>> [3] The fact that so much of the npm ecosystem is effectively not namespaced has actually created potential build-time malware injection possibilities. If I know of a package in use by a company through log analysis, bug report analysis, etc., I could potentially go register the same name in the default repo with a very high semver and know that it's very likely this would be picked up over the intended internally developed module because there's no namespace.
>>> [4] https://en.wikipedia.org/wiki/Default_effect_(psychology)
>>> [5] https://en.wikipedia.org/wiki/Principle_of_least_astonishment
>>> [6] https://docs.google.com/spreadsheets/d/1TVR5uTpDYw0827AlvPRu8l95zHnFPL_g61TdPtnjQ5M/edit?usp=sharing
>>> [7] http://openjdk.java.net/jeps/261 #Risk and assumptions
>>> [8] https://www.mail-archive.com/jigsaw-dev at openjdk.java.net/msg06623.html

From mark.reinhold at oracle.com Wed Jan 18 15:15:15 2017
From: mark.reinhold at oracle.com (mark.reinhold at oracle.com)
Date: Wed, 18 Jan 2017 07:15:15 -0800
Subject: Draft JPMS EDR specification
In-Reply-To: <854291806.1578063.1482584002002.JavaMail.zimbra@u-pem.fr>
References: <20161221211216.71D3D27C41@eggemoggin.niobe.net> <854291806.1578063.1482584002002.JavaMail.zimbra@u-pem.fr>
Message-ID: <20170118071515.344022345@eggemoggin.niobe.net>

2016/12/24 4:53:22 -0800, forax at univ-mlv.fr:
> Minor nit, in the VM spec part, section 2.1, the attribute ModuleVersion is mentioned while it has disappeared (the module version is now a field of the Module attribute).

Thanks. I fixed that in the final version of the EDR that I sent to the JCP, which has now been posted on jcp.org [1] but is more conveniently available at http://cr.openjdk.java.net/~mr/jigsaw/spec/ . The EDR period ends in 30 days, on 17 February 2017.

> I will modify ASM next week to be in sync with the spec, I do not expect any problems.
>
> I still think that encoding a version for the requires inside the Module attribute is no[t] a good idea (cf my previous message).

I'll return to that thread shortly.
- Mark


[1] https://jcp.org/aboutJava/communityprocess/edr/jsr376/index.html

From rfscholte at apache.org Wed Jan 18 21:14:33 2017
From: rfscholte at apache.org (Robert Scholte)
Date: Wed, 18 Jan 2017 22:14:33 +0100
Subject: Advice + proposals regarding automodule naming
In-Reply-To: <1798601919.942643.1484691071823.JavaMail.zimbra@u-pem.fr>
References: <652339338.395860.1484581443472.JavaMail.zimbra@u-pem.fr> <1798601919.942643.1484691071823.JavaMail.zimbra@u-pem.fr>
Message-ID:

Hi Rémi,

I'm getting a JavaOne 2015 déjà vu :)

It seems like you expect there will be a new POM definition to support this kind of extra information. The current POM modelVersion (4.0.0) is used not only by Maven but by a lot of tools, probably even more than we know of. We wonder if they do XSD checking, so we must be very, very careful with every adjustment. So pom-4.0.0 is a fact with all its restrictions. We are working on pom-5.0.0 but we will always make sure there will also be a pom-4.0.0 available (either pre-generated or transformed at runtime) for the current tools. Also, its definition should work for any software technology, not just for Java.

In the beginning I had the idea of working with new scopes to decide if a dependency belongs to the modulepath or classpath, but there's a strict set of scopes in pom-4.0.0, so again no option. And by now I know this is not required; the info is already there once I can read all module-info files. It would have helped if a modular jar had a different extension, so everyone could see from the *outside* what kind of jar it is.

There's no such thing as a Maven 4 artifact: any artifact is a file (often a jar) with a coordinate and an extra file with dependency declarations. During dependency resolution all build information is ignored!

The problem with the module-info file is comparable with the Java bytecode version: you have to go into the jar to get this information.
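
As an illustration of "going into the jar", here is a minimal sketch of how a tool might tell a modular jar from a plain one, assuming the only signal is a `module-info.class` entry at the jar root. The class name `JarKind` and the helper `writeJar` are hypothetical and exist only to make the example self-contained; this is not how the maven-compiler-plugin is known to do it, and the sketch does not look at multi-release layouts.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

public class JarKind {

    /** Returns true if the jar has a module-info.class entry at its root. */
    static boolean isModularJar(Path jar) throws IOException {
        try (JarFile jf = new JarFile(jar.toFile())) {
            return jf.getEntry("module-info.class") != null;
        }
    }

    /** Hypothetical helper: writes a temporary jar with empty entries of the given names. */
    static Path writeJar(String... entryNames) throws IOException {
        Path jar = Files.createTempFile("jarkind", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out)) {
            for (String name : entryNames) {
                jos.putNextEntry(new JarEntry(name));
                jos.closeEntry();
            }
        }
        return jar;
    }

    public static void main(String[] args) throws IOException {
        Path plain = writeJar("com/example/Foo.class");
        Path modular = writeJar("module-info.class", "com/example/Foo.class");
        System.out.println(isModularJar(plain));   // false: would stay on the classpath
        System.out.println(isModularJar(modular)); // true: candidate for the modulepath
    }
}
```

The file name alone cannot answer the question, which is why the check has to open the archive.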
At the moment I'm pretty far with the maven-compiler-plugin, but now every dependency acts like an automodule. My next step would probably be to analyze every module-info file and decide if jars belong to the classpath or modulepath, only allowing modular jars on the module path because of our concerns.

regards,
Robert

On Tue, 17 Jan 2017 23:11:11 +0100, wrote:

> Robert,
> I fully agree with you that Maven cannot use automatic modules. Automatic modules have weird name rules, everything is exported and they have no dependencies themselves*, so they are useless if you already have a trove of info like the Maven POM.
>
> In my opinion, the real question is not how to map existing Maven artifacts to Java modules but rather how Maven 4 artifacts are mapped to Java modules, and then how to make the transition from Maven 3 artifacts to Maven 4 artifacts as smooth as possible.
>
> Here is my take on what a Maven 4 artifact can be:
> - a Maven 4 artifact can only depend on other Maven 4 artifacts (and there is some way to see a Maven 3 artifact as a Maven 4 artifact if the POM is simple enough),
> - a Maven 4 artifact does not allow split packages (a lot of Maven 3 artifacts use split packages because it's a cool way to do an after-the-fact modularisation without changing the name of the module),
> - a Maven 4 artifact's info is specified with info extracted from the module-info and from the POM (version is in the POM, exported packages are in the module-info, ...)
> etc.
>
> Once you have the precise rules, it will be easier to see how to map a Maven 3 artifact to a Maven 4 artifact and what the compatibility rules are.
>
> regards,
> Rémi
>
> * unless you want to play with configurations that mix modulepath and classpath, but these kinds of configurations are really hard to debug.
>
> ----- Original Message -----
>> From: "Robert Scholte"
>> To: "Remi Forax"
>> Cc: jpms-spec-experts at openjdk.java.net, "Brian Fox"
>> Sent: Tuesday, 17 January 2017 13:04:08
>> Subject: Re: Advice + proposals regarding automodule naming
>
>> Hi Rémi,
>>
>> In the end every non-jdk.* and non-java.* module in the module-info will be a dependency in your build tool descriptor. Such a module must match exactly one versionless dependency, or conflictId as we call it, which is in general the groupId + artifactId (type and classifier are not relevant for this story).
>> By ignoring the groupId, a module can be referred to by multiple dependencies, so we can expect collisions. For that reason Brian did a quick scan over Maven Central to count the number of duplicate artifactIds.
>>
>> Here are the artifactIds with 100+ groupIds:
>> maven_artifact_id   count(DISTINCT maven_group_id)   count(maven_group_id)
>> library             391                              6854
>> core                312                              8188
>> common              142                              5084
>> ui                  138                              1414
>>
>> In theory I could have a Maven project with 391 'library'-jars on the classpath without any problem. And as long as they are direct dependencies I have control over this by simply not adding 'library' as a requirement to module-info. The issues start when different 'library'-jars are transitive dependencies and when they are marked as required in the module-info file of my direct or transitive dependencies.
>>
>> Developers of the 'library'-jars cannot use library as the module name and are forced to pick another name. As developer of my project, in the end I decide which versions of dependencies are used. If the 'library'-jar gets a different module name and my dependency is still referring to the old module name, the project can't be built.
>>
>> What I expect is that developers will be forced to remove the requirements from their module-info because of the issues mentioned.
>> So instead of increasing, the number of requirements will be reduced. For that reason we say: either use a unique module name from the beginning (GA) or wait until a dependency has its own module name before adding it as a requirement.
>>
>> As far as I know this is the first time the JDK/JRE decides (proposes) a name for an entity based on another entity. There are no relations between method-, class-, or package-names, and there doesn't have to be a relation between the module name and the filename, so please don't try to create one.
>>
>> regards,
>> Robert
>>
>> On Mon, 16 Jan 2017 16:44:03 +0100, Remi Forax wrote:
>>> Hi Robert,
>>> the problem with automatic modules is more general than just the name: automatic modules also create a flat hierarchy which doesn't map well to the Maven artifact descriptor.
>>>
>>> I wonder why you want Maven to use automatic modules, or said differently, Maven has a lot of information about the artifact, so why do you want to forget all this information when fetching a Maven artifact?
>>>
>>> I think that one problem is that you do not want to create a module-info.class from the Maven POM and insert it into the jar because it will change the artifact*.
>>> This kind of module is supported by Jigsaw under the name of synthetic modules. A synthetic module is a module with a module descriptor not created by javac but by another tool.
>>>
>>> In my opinion, automatic modules are interesting when you have a jar that does not come from Maven Central but from an ad-hoc build tool and will be considered as a leaf of the dependency DAG.
>>> Otherwise, for an existing module system, using a synthetic module seems to be a better idea.
>>>
>>> regards,
>>> Rémi
>>>
>>> * given you also have the problem of split packages, you also need a way to merge several artifacts into one modular jar because it's the easy way to solve the split package problem.
>>>
>>> ----- Original Message -----
>>>> From: "Robert Scholte"
>>>> To: jpms-spec-experts at openjdk.java.net
>>>> Cc: "Apache Maven Dev"
>>>> Sent: Monday, 16 January 2017 10:37:08
>>>> Subject: Advice + proposals regarding automodule naming
>>>
>>>> This is a message from Robert Scholte and Brian Fox. We have both been talking about this topic for several weeks with other Maven developers and came to the conclusion that we should warn the Jigsaw team about their current approach regarding auto modules. We will share our experiences, thoughts and conclusions, and will suggest two proposals.
>>>>
>>>> Traditionally, the Java ecosystem has been very mature in terms of naming and namespacing. The reverse FQDN introduced into the Java package was a great choice to ensure classes don't conflict. Popular build tools such as Maven, and nearly all those that followed, built upon this key concept with the introduction of "GroupId", also using the FQDN as part of the name to ensure the coordinates were properly namespaced.
>>>>
>>>> We've seen some ecosystems diverge from this, leading to new challenges that ultimately had to be reversed. A great example can be seen in the "tragic mistake from npm creators" [1], which was to launch without a namespace concept. Eventually, npm started running out of useful names and had to backtrack to introduce "scopes", which are really just a namespace [2]. The real problem here is that the major change in namespace was baked in after several years of momentum without it. It's taken a long time for tooling and best practice to catch up to scopes, and in the interim people have been left with a dual-mode situation, some namespaced, some not, that has created chaos.
>>>> [3]
>>>>
>>>> The real issue at hand as we consider behaviors in the Jigsaw automodule revolves around two well-studied concepts.
>>>>
>>>> The most important is the "Default effect" [4], which states that whatever the default behavior is will become the most prominent best practice. A default that uses a filename to generate a very short, un-namespaced module id effectively sets the behavior to create generic names that will eventually conflict... exactly what we've seen in npm.
>>>>
>>>> Additionally, the switching cost of moving from a default un-namespaced module id to one with a unique namespace is also significant once you consider all the potential users. This is why API change is hard, and changing the module id after the fact from the default is effectively an API change.
>>>>
>>>> The second principle at hand is the "Principle of least astonishment" [5]. We want to find a default that doesn't violate what most users would consider to be the most obvious. One could argue the current auto module algorithm doesn't violate this principle, but it's important to consider alternate suggestions in this light.
>>>>
>>>> First, let's explore the potential downsides if the default effect takes hold with the currently generated auto module id. In Apache Maven, the artifact id is the part of the coordinate that generates the filename. This means that com.somecompany:artifact:version will become artifact-version.jar, which would result in automodule id "artifact". Armed with this understanding, what does an analysis of the Maven ecosystem have to say about potential conflicts in the automodule id?
>>>>
>>>> If we ignore the groupId and version of all the components in the Maven Central repository, we end up with over 13,500 (7% of the total group:artifact combinations) conflicts.
>>>> This does not consider conflicts across other repositories or within customer portfolios, yet it is pretty telling. Conflicts will happen. In some cases, the number of conflicts on the same common names is well above 100. The list of conflicts as of October 2016 can be seen here. [6]
>>>>
>>>> At this point, hopefully we've made the case for at least establishing a default module id that
>>>> 1. Uses namespaces to minimize id conflicts when possible
>>>> 2. Leverages the default effect to create a de facto best practice
>>>> 3. Follows the principle of least astonishment
>>>>
>>>> We have two potential proposals that meet these goals.
>>>>
>>>> Proposal 1: Leverage existing coordinates when available.
>>>>
>>>> Maven is inarguably the most popular build system for Java components, with Maven Central being the default and largest repository of Java components in the world. By default, every jar built by Maven automatically gets a simple properties file inserted into it with its unique coordinates. Now, not every jar in Central was built with Maven, however 94% of them were, as we can find the pom.properties file in 1,806,023 of the 1,913,561 Central components. Talk about the default effect in action!
>>>>
>>>> It's further important to recognize that a jar with a pom.properties declaring coordinates means the project itself has chosen those coordinates as its own name. In other words, this is how the project refers to itself, even if other consumers may not be using Maven directly.
>>>>
>>>> If automodule were able to peek inside a jar and generate the default id using the groupId and artifactId present in the file, this would eliminate nearly all instances of id conflict, because a significant portion of the Java ecosystem is in fact built with Maven.
>>>> Additionally, the fact that 1.8 million (and counting) modules would have a namespace as the default behavior means we've taken a huge step in setting the best practice of picking module ids with a namespace. Additionally, since the project itself has chosen these coordinates and uses them as its primary distribution mechanism, this follows the principle of least astonishment for consumers regardless of their chosen build system. Finally, since all of the above are true, it's unlikely the project would need to migrate to a new module id when it adopts Jigsaw natively, thus avoiding an API switching cost for its users.
>>>>
>>>> Proposal 2: Drop automodules
>>>> Right now Jigsaw tries to calculate a module name solely based on the name of the jar file, which already causes issues. Besides the fact that the module name is not guaranteed to be unique compared with its Maven coordinate, there are extra transformations which make it even less likely to be unique; e.g. dashes are replaced by dots (which are both valid artifactId characters), and in some cases a number and the characters following it are stripped off. For artifacts like jboss-servlet-api_4.0_spec it makes sense, however we already see issues here where commons-lang, commons-lang2 and commons-lang3 get the same module name, even though they have different artifactIds and contain different packages. Choosing different artifactIds and packages was a very wise decision because it made it possible for these jars to live next to each other. Removing the separation chosen by the authors is a very unwise decision.
>>>>
>>>> Another known example is the jsrNNN jars, which now all get jsr as the module name.
>>>>
>>>> It is highly unlikely there is one single rule that captures all the use cases and always results in a module name we can work with.
>>>>
>>>> For that reason the other proposal is to simply drop automodules. Don't try to come up with a name for unnamed jars. It might look like the feature of automodules makes migrating easier because every dependency will get a name, so you can complete your module-info for all requirements, but we expect that once Jigsaw comes up to speed the invalid module names will actually block further development due to name collisions or forced renaming by transitive modular jars.
>>>>
>>>> The advantage of this proposal is that library builders are not forced to keep the proposed module name in order to maintain backwards compatibility with the default. Instead library builders can pick a more suitable module name. The module system doesn't allow the same package to be exported by multiple jars (and automodules export every package). Library builders can fix this in their new jars, however if end users require both jars because they were specified as requirements in different transitive jars, the project cannot be compiled. There's just no dependency-excludes like Maven has, because "requires" in the module-info really means requires. Dropping automodules will prevent these kinds of issues, because a package can only be exported by a named module.
>>>>
>>>> Sure, this means that end users cannot refer to every jar in their module-info. But at least if they add a "requires" to their module-info, they can be sure that it will always refer to the intended modular jar. With build tools like Maven the chance of missing artifacts on the classpath has already been reduced a lot.
>>>> In general builds have become quite stable, so we don't expect that developers will translate all dependencies to the module-info file, especially if we warn them about the possible consequences of depending on automodules. Referring only to named modules, and even a single "requires", is already a gain. There's no reason to try to speed this up and give the developer the false impression that it'll keep working when upgrading to real modular jars. Focus should be on the target, not on the path to reach it.
>>>>
>>>> Dropping the automodules will prevent a lot of discussions about what is the correct way to select a module name and will give the responsibility for the name back to the place where it belongs: the developer.
>>>>
>>>> [1] http://stackoverflow.com/questions/22053381/lack-of-available-module-names-on-npm
>>>> [2] http://blog.npmjs.org/post/116936804365/solving-npms-hard-problem-naming-packages
>>>> [3] The fact that so much of the npm ecosystem is effectively not namespaced has actually created potential build-time malware injection possibilities. If I know of a package in use by a company through log analysis, bug report analysis, etc., I could potentially go register the same name in the default repo with a very high semver and know that it's very likely this would be picked up over the intended internally developed module because there's no namespace.
>>>> [4] https://en.wikipedia.org/wiki/Default_effect_(psychology)
>>>> [5] https://en.wikipedia.org/wiki/Principle_of_least_astonishment
>>>> [6] https://docs.google.com/spreadsheets/d/1TVR5uTpDYw0827AlvPRu8l95zHnFPL_g61TdPtnjQ5M/edit?usp=sharing
>>>> [7] http://openjdk.java.net/jeps/261 #Risk and assumptions
>>>> [8] https://www.mail-archive.com/jigsaw-dev at openjdk.java.net/msg06623.html

From forax at univ-mlv.fr Thu Jan 19 14:07:06 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 19 Jan 2017 15:07:06 +0100 (CET)
Subject: Advice + proposals regarding automodule naming
In-Reply-To:
References: <652339338.395860.1484581443472.JavaMail.zimbra@u-pem.fr> <1798601919.942643.1484691071823.JavaMail.zimbra@u-pem.fr>
Message-ID: <895398189.1781046.1484834826944.JavaMail.zimbra@u-pem.fr>

Hi Brian,
Maven can decide to put modular jars (jars that contain a module-info.class) on the modulepath and plain old jars on the classpath. This will work with all existing applications, and because it does not use the automatic module feature there is no naming issue.

The problem is that this solution does not offer a clean way to upgrade a jar from the classpath to the modulepath, because it requires all dependencies to be modular first.

Rémi

> From: "Brian Fox"
> To: forax at univ-mlv.fr
> Cc: "Robert Scholte" , jpms-spec-experts at openjdk.java.net
> Sent: Thursday, 19 January 2017 14:54:32
> Subject: Re: Advice + proposals regarding automodule naming

> Not sure if this will get through Robert....

> We seem to have diverted away from the main issue. The biggest and most urgent issue is not how Maven will or won't map directly to modules in the future. It's an issue to be sure, but the universe of all previously developed Java components is subject to the auto module behavior and all the issues laid out in the original mail. If we don't get that fixed in the beginning, it will be very difficult to change later.
Reference the NPM scope issue I cited > originally. > On Tue, Jan 17, 2017 at 5:11 PM, < forax at univ-mlv.fr > wrote: >> Robert, >> i fully agree with you that Maven can not use automatic modules. >> Automatic modules have weird name rules, everything is exported and has no >> dependency itself*, so they are useless if you already have already a trove of >> info like the Maven POM. >> In my opinion, the real question is not how to map existing Maven artifacts to >> Java modules but more, >> how Maven 4 artifacts are mapped to Java modules and then how to make the >> transition between Maven 3 artifacts to Maven 4 artifacts as smooth as >> possible. >> Here is my take on what can be a Maven 4 artifact, >> - a Maven 4 artifact can only depends other Maven 4 artifact (and their are some >> way to see a Maven3 artifact as a Maven 4 artifact if the POM is siple enough), >> - a Maven 4 artifact do not allow split packages (a lot of Maven 3 artifact uses >> split packages because it's a cool way to do an after the fact modularisation >> without changing the name of the module) >> - a Maven 4 artifact info is specified with info extracted from the module-info >> and from the POM >> (version is in the POM, exported packages are in the module-info, ...) >> etc. >> once you have the precise rules, it will be easier to see how to map a Maven 3 >> artifact to a Maven 4 and what are the compatibility rules. >> regards, >> R?mi >> * apart if you want to play with configurations that mix modulepath and >> classpath but these kind of configurations are really hard to debug. 
>> ----- Mail original ----- >> > De: "Robert Scholte" < rfscholte at apache.org > >> > ?: "Remi Forax" < forax at univ-mlv.fr > >> > Cc: jpms-spec-experts at openjdk.java.net , "Brian Fox" < brianf at sonatype.com > >> > Envoy?: Mardi 17 Janvier 2017 13:04:08 >> > Objet: Re: Advice + proposals regarding automodule naming >> > Hi R?mi, >> > In the end every non-jdk.* and non-java.* module in the module-info will >> > be a dependency in your buildtool descriptor. Such module must match >> > exactly one versionless dependency, or conflictId as we call it, which is >> > in general the groupId + artifactId (type and classifier are not relevant >> > for this story). >> > By ignoring the groupId a module can referred by multiple dependencies. So >> > we can expect collissions. For that reason Brian did a quick scan over >> > Maven Central to count the number of duplicate artifactIds. >> > Here's the artifactIds with 100+ groupIds: >> > maven_artifact_id count(DISTINCT maven_group_id) count(maven_group_id) >> > library 391 6854 >> > core 312 8188 >> > common 142 5084 >> > ui 138 1414 >> > In theory I could have a Maven project with 391 'library'-jars on the >> > classpath without any problem. And as long as they are direct dependencies >> > I have control over this by simply not adding 'library' as requirement to >> > module-info. The issues start when different 'library'-jars are transitive >> > dependencies and when they are marked are required in the module-info file >> > of my direct or transitive dependencies. >> > Developers of the 'library'-jars cannot use library as the module name and >> > are forced to pick another name. As developer of my project in the end I >> > decide which versions of dependencies are used. If the 'library'-jar gets >> > a different module name and my dependency is still referring to the old >> > module name, the project can't be built. 
>> > What I expect is that developers are forced to remove the requirements >> > from their module-info because of the mentioned issues. So instead of >> > increasing the number requirements it will be reduced. For that reason we >> > say either use a unique module name from the beginning (GA) or wait until >> > a dependency has its own module name before adding it as requirement. >> > As far as I know this is the first time the JDK/JRE decides (proposes) a >> > name for an entity based on another entity. There are no relations between >> > method-, class-, or package-names and there doesn't have to be a relation >> > between the module name and the filename, so please don't try to do so. >> > regards, >> > Robert >> > On Mon, 16 Jan 2017 16:44:03 +0100, Remi Forax < forax at univ-mlv.fr > wrote: >> >> Hi Robert, >> >> the problem with automatic modules is more general that just the name, >> >> automatics modules also creates a flat hierarchy which doesn't map well >> >> with the Maven artifact descriptor. >> >> I wonder why you want Maven to use automatic modules, or said >> >> differently Maven has a lot of information about the artifact, why do >> >> you want to forget all these information when fetching a Maven artifact. >> >> I think that one problem is that you do not want to create a >> >> module-info.class from the Maven POM and insert it into the jar because >> >> it will change the artifact*. >> >> This kind of modules is supported by jigsaw under the name of synthetic >> >> modules. A synthetic module is a module with a module descriptor not >> >> created by javac but by another tool. >> >> In my opinion, automatic modules are interesting when you have jar that >> >> do not come from Maven central but comes from an ad-hoc build tool and >> >> will be considered as a leaf of the dependency DAG. >> >> Otherwise, for existing module system, using a synthetic module seem to >> >> be a better idea. 
>> >> regards, >> >> R?mi >> >> * given you have also the problem of split packages, you also need a way >> >> to merge several artifacts into one modular jar because it's the easy >> >> way to solve the split package problem. >> >> ----- Mail original ----- >> >>> De: "Robert Scholte" < rfscholte at apache.org > >> >>> ?: jpms-spec-experts at openjdk.java.net >> >>> Cc: "Apache Maven Dev" < dev at maven.apache.org > >> >>> Envoy?: Lundi 16 Janvier 2017 10:37:08 >> >>> Objet: Advice + proposals regarding automodule naming >> >>> This is a message from Robert Scholte and Brian Fox. We both have been >> >>> talking about this topic several weeks with other Maven developers and >> >>> came to the conclusion that we should warn the jigsaw team with their >> >>> current approach regarding auto modules. We will share our experiences, >> >>> thoughts, conclusions and will suggest two proposals. >> >>> Traditionally, the Java ecosystem has been very mature in terms of >> >>> naming >> >>> and namespacing. The reverse fqdn introduced into the java package was a >> >>> great choice to ensure classes don?t conflict. Popular build tools such >> >>> as >> >>> Maven and nearly all those that followed built upon that this key >> >>> concept >> >>> with the introduction of ?GroupId? also using the fqdn as part of the >> >>> name >> >>> to ensure the coordinates were properly namespaced. >> >>> We?ve seen some ecosystems diverge from this leading to new challenges >> >>> that ultimately had to be reversed. A great example can be seen in the ? >> >>> tragic mistake from npm creators ? [1] which was to launch without a >> >>> namespace concept. Eventually, NPM started running out of useful names >> >>> and >> >>> had to backtrack to introduce ?scopes? which is really just a namespace >> >>> [2]. The real problem here is that the major change in namespace was >> >>> backed in after several years of momentum without it. 
It?s taken a long >> >>> time for tooling and best practice to catch up to scopes and in the >> >>> interim, people have been left with a dual mode, some namespaced, some >> >>> not >> >>> namespaced situation that has created chaos. [3] >> >>> The real issue at hand here as we consider behaviors in the jigsaw >> >>> automodule revolves around two well studied concepts. >> >>> The most important is the ?Default effect? [3] which states that >> >>> whatever >> >>> the default behavior is will become the most prominent best practice. A >> >>> default that uses a filename to generate a very short, un-namespaced >> >>> module id effectively sets the behavior to create generic names that >> >>> will >> >>> eventually conflict...exactly what we?ve seen in npm. >> >>> Additionally, The switching costs introduced in overcoming a default >> >>> un-namespaced module id to one with a unique namespace is also >> >>> significant >> >>> once you consider all the potential users. This is why API change is >> >>> hard, >> >>> and changing the module id after the fact from the default is >> >>> effectively >> >>> an API change. >> >>> The second principal at hand is the ?Principle of least astonishment?. >> >>> We >> >>> want to find a default that doesn?t violate what most users would >> >>> consider >> >>> to be the most obvious. One could argue the current auto module >> >>> algorithm >> >>> doesn?t violate this principle, but it?s important to consider alternate >> >>> suggestions in this light. >> >>> First, lets explore the potential downsides if the default effect takes >> >>> hold with the currently generated auto module id. In Apache Maven, the >> >>> artifact id is the part of the coordinate that generates the filename. >> >>> This means that com.somecompany:artifact:version will become >> >>> artifact-version.jar, which would result in automodule id ?artifact?. 
>> >>> Armed with this understanding, what does an analysis of the Maven
>> >>> ecosystem have to say about potential conflicts in the automodule id?
>> >>>
>> >>> If we ignore the groupId and version of all the components in the Maven
>> >>> Central repository, we end up with over 13,500 conflicts (7% of the total
>> >>> group:artifact combinations). This does not consider conflicts
>> >>> across other repositories or within customer portfolios, yet it is pretty
>> >>> telling. Conflicts will happen. In some cases, the number of conflicts on
>> >>> the same common name is well above 100. The list of conflicts as of
>> >>> October 2016 can be seen here. [6]
>> >>>
>> >>> At this point, hopefully we've made the case for at least establishing a
>> >>> default module id that
>> >>> 1. Uses namespaces to minimize id conflicts when possible
>> >>> 2. Leverages the default effect to create a de facto best practice
>> >>> 3. Follows the principle of least astonishment
>> >>>
>> >>> We have two potential proposals that meet these goals.
>> >>>
>> >>> Proposal 1: Leverage existing coordinates when available.
>> >>>
>> >>> Maven is inarguably the most popular build system for Java components,
>> >>> with Maven Central being the default and largest repository of Java
>> >>> components in the world. By default, every jar built by Maven
>> >>> automatically gets a simple properties file inserted into it with its
>> >>> unique coordinates. Not every jar in Central was built with Maven,
>> >>> but 94% of them were: we can find the pom.properties file in
>> >>> 1,806,023 of the 1,913,561 Central components. Talk about the default
>> >>> effect in action!
>> >>>
>> >>> It's also important to recognize that when a jar carries a pom.properties
>> >>> declaring coordinates, the project itself has chosen those
>> >>> coordinates as its own name. In other words, this is how the project
>> >>> refers to itself, even if other consumers may not be using Maven directly.
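To make the pom.properties idea concrete, here is a minimal sketch of how a tool might read the Maven coordinates from a jar and turn them into a namespaced id. It assumes the usual Maven layout, META-INF/maven/<groupId>/<artifactId>/pom.properties with groupId and artifactId keys. The class PomCoordinates and the naming rule groupId + "." + artifactId are hypothetical illustrations only; the thread does not specify an exact rule, and mapping characters such as hyphens to a legal module name is left open here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.util.Optional;
import java.util.Properties;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper for illustration; not part of Jigsaw or Maven. */
final class PomCoordinates {

    /** Loads the first META-INF/maven/.../pom.properties entry found in the jar, if any. */
    static Optional<Properties> read(Path jar) {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Optional<JarEntry> entry = jarFile.stream()
                    .filter(e -> e.getName().startsWith("META-INF/maven/")
                            && e.getName().endsWith("/pom.properties"))
                    .findFirst();
            if (!entry.isPresent()) {
                return Optional.empty(); // jar was not built by Maven, or the file was stripped
            }
            try (InputStream in = jarFile.getInputStream(entry.get())) {
                Properties props = new Properties();
                props.load(in);
                return Optional.of(props);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Assumed naming rule (not from the thread): groupId + "." + artifactId. */
    static Optional<String> namespacedId(Properties props) {
        String groupId = props.getProperty("groupId");
        String artifactId = props.getProperty("artifactId");
        if (groupId == null || artifactId == null) {
            return Optional.empty();
        }
        return Optional.of(groupId + "." + artifactId);
    }

    public static void main(String[] args) {
        // Coordinates taken from the example in the message: com.somecompany:artifact:version
        Properties props = new Properties();
        props.setProperty("groupId", "com.somecompany");
        props.setProperty("artifactId", "artifact");
        System.out.println(namespacedId(props).orElse("<no coordinates>"));
    }
}
```

With the coordinates from the message, the filename-based id would be "artifact", while this sketch yields "com.somecompany.artifact", so two artifacts that share an artifactId under different groupIds would no longer collide.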
>> >>> If automodule were able to peek inside a jar and generate the default id
>> >>> using the groupId and artifactId present in the file, this would eliminate
>> >>> nearly all instances of id conflict, because a significant portion of
>> >>> the Java ecosystem is in fact built with Maven. Additionally, the fact
>> >>> that 1.8 million (and counting) modules would have a namespace as the
>> >>> default behavior means we've taken a huge step in setting the best
>> >>> practice of picking module ids with a namespace. Additionally, since the
>> >>> project itself has chosen these coordinates and uses them as its primary
>> >>> distribution mechanism, this follows the principle of least astonishment
>> >>> for consumers regardless of their chosen build system. Finally, since all
>> >>> of the above are true, it's unlikely the project would need to migrate to
>> >>> a new module id when it adopts Jigsaw natively, thus avoiding an API
>> >>> switching cost for its users.
>> >>>
>> >>> Proposal 2: Drop automodules
>> >>>
>> >>> Right now Jigsaw tries to calculate a module name solely based on the name
>> >>> of the jar file, which already causes issues. Besides the fact that
>> >>> the module name is not guaranteed to be unique compared with its Maven
>> >>> coordinate, there are extra transformations which make uniqueness even
>> >>> less likely; e.g. dashes are replaced by dots (both are valid artifactId
>> >>> characters), and in some cases a number and the characters following it
>> >>> are stripped off. For artifacts like jboss-servlet-api_4.0_spec it makes
>> >>> sense, however we already see issues here where commons-lang,
>> >>> commons-lang2 and commons-lang3 get the same module name, even though
>> >>> they have different artifactIds and contain different packages.
>> >>> Choosing different artifactIds and packages was a very wise
>> >>> decision because it made it possible for these jars to live next to
>> >>> each other. Removing the separation that the authors chose is a very
>> >>> unwise decision.
>> >>>
>> >>> Another known example is the jsrNNN jars, which now all get jsr as the
>> >>> module name.
>> >>>
>> >>> It is highly unlikely that there is one single rule that captures all the
>> >>> use cases and always results in a module name we can work with.
>> >>>
>> >>> For that reason the other proposal is to simply drop automodules. Don't
>> >>> try to come up with a name for unnamed jars. It might look like the
>> >>> feature of automodules makes migrating easier, because every dependency
>> >>> will get a name so you can complete your module-info for all requirements,
>> >>> but we expect that once Jigsaw comes up to speed the invalid module names
>> >>> will actually block further development due to name collisions or forced
>> >>> renaming by transitive modular jars.
>> >>>
>> >>> The advantage of this proposal is that library builders are not forced to
>> >>> keep the proposed module name in order to maintain backwards compatibility
>> >>> with the default. Instead, library builders can pick a more suitable
>> >>> module name. The module system doesn't allow the same package to be
>> >>> exported by multiple jars (and an automodule exports every package).
>> >>> Library builders can fix this in their new jars; however, if end users
>> >>> require both jars because they were specified as requirements in different
>> >>> transitive jars, the project cannot be compiled. There's just no
>> >>> dependency exclusion like Maven has, because "requires" in the module-info
>> >>> really means requires. Dropping automodules will prevent these kinds of
>> >>> issues, because a package can only be exported by a named module.
>> >>>
>> >>> Sure, this means that end users cannot refer to every jar in
>> >>> their module-info.
>> >>> But at least if they add a "requires" to their
>> >>> module-info, they can be sure that it'll always refer to the intended
>> >>> modular jar. With build tools like Maven the chance of missing artifacts
>> >>> on the classpath has already been reduced a lot. In general builds have
>> >>> become quite stable, so we don't expect that developers will translate all
>> >>> dependencies to the module-info file, especially if we warn them about the
>> >>> possible consequences of depending on automodules. Referring only to named
>> >>> modules, even with a single "requires", is already a gain. There's no
>> >>> reason to try to speed this up and give the developer the false impression
>> >>> that it'll keep working when upgrading to real modular jars. Focus should
>> >>> be on the target, not on the path to reach it.
>> >>>
>> >>> Dropping automodules will prevent a lot of discussions about what
>> >>> the correct way to select a module name is, and will give the
>> >>> responsibility for the name back to the place where it belongs: the
>> >>> developer.
>> >>>
>> >>> [1]
>> >>> http://stackoverflow.com/questions/22053381/lack-of-available-module-names-on-npm
>> >>> [2]
>> >>> http://blog.npmjs.org/post/116936804365/solving-npms-hard-problem-naming-packages
>> >>> [3] The fact that so much of the npm ecosystem is effectively
>> >>> not namespaced has actually created potential build-time malware
>> >>> injection possibilities. If I know of a package in use by a
>> >>> company through log analysis, bug report analysis etc., I could
>> >>> potentially go register the same name in the default repo with a very
>> >>> high semver and know that it's very likely this would be picked up over
>> >>> the intended internally developed module because there's no namespace.
>> >>> [4] https://en.wikipedia.org/wiki/Default_effect_(psychology)
>> >>> [5] https://en.wikipedia.org/wiki/Principle_of_least_astonishment
>> >>> [6]
>> >>> https://docs.google.com/spreadsheets/d/1TVR5uTpDYw0827AlvPRu8l95zHnFPL_g61TdPtnjQ5M/edit?usp=sharing
>> >>> [7] http://openjdk.java.net/jeps/261 #Risk and assumptions
>> >>> [8]
>> >>> https://www.mail-archive.com/jigsaw-dev at openjdk.java.net/msg06623.html

From forax at univ-mlv.fr Thu Jan 19 14:20:58 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 19 Jan 2017 15:20:58 +0100 (CET)
Subject: Advice + proposals regarding automodule naming
In-Reply-To:
References: <652339338.395860.1484581443472.JavaMail.zimbra@u-pem.fr> <1798601919.942643.1484691071823.JavaMail.zimbra@u-pem.fr>
Message-ID: <626433667.1789430.1484835658855.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Robert Scholte"
> To: forax at univ-mlv.fr
> Cc: jpms-spec-experts at openjdk.java.net, "Brian Fox"
> Sent: Wednesday, 18 January 2017 22:14:33
> Subject: Re: Advice + proposals regarding automodule naming

> Hi Rémi,

Robert,

>
> I'm getting a JavaOne 2015 déjà vu :)

I was not at JavaOne, so let's say we're progressing; at least, I may be starting to understand your problem better ...

>
> It seems like you expect there will be a new pom-definition to support
> this kind of extra information.
> The current POM modelVersion (4.0.0) is not only used by Maven but by a
> lot of tools, probably even more than we know of. We wonder if they do XSD
> checking, so we must be very, very careful with every adjustment. So
> pom-4.0.0 is a fact with all its restrictions. We are working on pom-5.0.0
> but we will always make sure there will also be a pom-4.0.0 available
> (either pre-generated or runtime transformed) for the current tools. Also,
> its definition should work for any software technology, not just for Java.
> In the beginning I had the idea of working with new scopes to decide if a
> dependency belongs to the modulepath or classpath, but there's a strict
> set of scopes in pom-4.0.0, so again no option. And by now I know this is
> not required; the info is already there once I can read all module-info
> files.
> It would have helped if a modular jar had a different extension, so everyone
> could see from the *outside* what kind of jar it is.

Testing whether a jar is a modular jar or not is easy BTW:
ModuleFinder.of(Paths.get("my.jar")).findAll().iterator().next().descriptor().isAutomatic()

true means it's a plain old jar, false means it's a modular jar.

>
> There's no such thing as a Maven4 artifact: any artifact is a file (often
> a jar) with a coordinate and an extra file with dependency declarations.

For me, Maven4 artifact == jar + POM v5

> During dependency resolution all build information is ignored! The problem
> with the module-info file is comparable with the Java bytecode version:
> you have to go into the jar to get this information.

Yes,
but you do not need to know whether it's a modular jar or a plain old jar during dependency resolution; you can trust the Maven Central info, and then, when installing, you can decide which jars should go on the classpath and which ones should go on the modulepath
(or which ones should be upgraded from a plain jar to a modular jar, because you can use the POM info to generate a compatible module-info.class)

>
> At the moment I'm pretty far with the maven-compiler-plugin, but for now every
> dependency acts like an automodule. My next step would probably be to
> analyze every module-info file and decide if jars belong to the classpath
> or modulepath, only allowing modular jars on the module path because of
> our concerns.

Yes,
as I said in the previous paragraph, you can also decide that with the help of the POM info, and you can try to upgrade the jar to make it modular.
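The one-line check quoted above can be tried end to end with a small, self-contained sketch. It assumes a released JDK 9 or later (the early-access builds discussed in this thread may behave differently). The class JarKind and the helper writePlainJar are hypothetical scaffolding for the demo; only ModuleFinder, ModuleDescriptor and the java.util.jar classes are real JDK API.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

/** Hypothetical demo class; not part of Maven or the JDK. */
final class JarKind {

    /** Returns the descriptor of the single module found at the given jar path. */
    static ModuleDescriptor describe(Path jar) {
        return ModuleFinder.of(jar).findAll().iterator().next().descriptor();
    }

    /** Demo scaffolding: writes a jar holding only a manifest, so it has no module-info.class. */
    static Path writePlainJar(String fileName) {
        try {
            Path jar = Files.createTempDirectory("jarkind").resolve(fileName);
            Manifest manifest = new Manifest();
            manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
            try (OutputStream out = Files.newOutputStream(jar);
                 JarOutputStream jos = new JarOutputStream(out, manifest)) {
                // No further entries: the manifest alone makes this a valid plain jar.
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ModuleDescriptor descriptor = describe(writePlainJar("library-1.0.jar"));
        // isAutomatic() == true: a plain old jar treated as an automatic module.
        System.out.println(descriptor.name() + " automatic=" + descriptor.isAutomatic());
    }
}
```

On a released JDK 9+ this should print "library automatic=true": the name comes only from the file name, with the version suffix dropped, which is the groupId-free naming the Maven developers are worried about.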
>
> regards,
> Robert

regards,
Rémi

>
> On Tue, 17 Jan 2017 23:11:11 +0100, wrote:
>
>> Robert,
>> I fully agree with you that Maven cannot use automatic modules.
>> Automatic modules have weird naming rules, export everything and have
>> no dependencies themselves*, so they are useless if you already have a
>> trove of info like the Maven POM.
>>
>> In my opinion, the real question is not how to map existing Maven
>> artifacts to Java modules but rather
>> how Maven 4 artifacts are mapped to Java modules, and then how to make
>> the transition from Maven 3 artifacts to Maven 4 artifacts as smooth
>> as possible.
>>
>> Here is my take on what a Maven 4 artifact can be:
>> - a Maven 4 artifact can only depend on other Maven 4 artifacts (and there
>> is some way to see a Maven 3 artifact as a Maven 4 artifact if the POM
>> is simple enough),
>> - a Maven 4 artifact does not allow split packages (a lot of Maven 3
>> artifacts use split packages because it's a cool way to do an
>> after-the-fact modularisation without changing the name of the module),
>> - a Maven 4 artifact's info is specified with info extracted from the
>> module-info and from the POM
>> (version is in the POM, exported packages are in the module-info, ...)
>> etc.
>>
>> Once you have the precise rules, it will be easier to see how to map a
>> Maven 3 artifact to a Maven 4 one and what the compatibility rules are.
>>
>> regards,
>> Rémi
>>
>> * unless you want to play with configurations that mix modulepath and
>> classpath, but these kinds of configurations are really hard to debug.
>>
>> ----- Original Message -----
>>> From: "Robert Scholte"
>>> To: "Remi Forax"
>>> Cc: jpms-spec-experts at openjdk.java.net, "Brian Fox"
>>> Sent: Tuesday, 17 January 2017 13:04:08
>>> Subject: Re: Advice + proposals regarding automodule naming
>>
>>> Hi Rémi,
>>>
>>> In the end every non-jdk.* and non-java.* module in the module-info will
>>> be a dependency in your build tool descriptor.
>>> Such a module must match
>>> exactly one versionless dependency, or conflictId as we call it, which is
>>> in general the groupId + artifactId (type and classifier are not relevant
>>> for this story).
>>> By ignoring the groupId, a module can be referred to by multiple
>>> dependencies, so we can expect collisions. For that reason Brian did a
>>> quick scan over Maven Central to count the number of duplicate artifactIds.
>>>
>>> Here are the artifactIds with 100+ groupIds:
>>> maven_artifact_id  count(DISTINCT maven_group_id)  count(maven_group_id)
>>> library            391                             6854
>>> core               312                             8188
>>> common             142                             5084
>>> ui                 138                             1414
>>>
>>> In theory I could have a Maven project with 391 'library' jars on the
>>> classpath without any problem. And as long as they are direct dependencies
>>> I have control over this by simply not adding 'library' as a requirement to
>>> module-info. The issues start when different 'library' jars are transitive
>>> dependencies and when they are marked as required in the module-info file
>>> of my direct or transitive dependencies.
>>>
>>> Developers of the 'library' jars cannot use library as the module name and
>>> are forced to pick another name. As developer of my project, in the end I
>>> decide which versions of dependencies are used. If the 'library' jar gets
>>> a different module name and my dependency is still referring to the old
>>> module name, the project can't be built.
>>>
>>> What I expect is that developers will be forced to remove the requirements
>>> from their module-info because of the mentioned issues. So instead of
>>> increasing, the number of requirements will be reduced. For that reason we
>>> say: either use a unique module name from the beginning (GA) or wait until
>>> a dependency has its own module name before adding it as a requirement.
>>>
>>> As far as I know this is the first time the JDK/JRE decides (proposes) a
>>> name for an entity based on another entity.
>>> There are no relations between
>>> method, class, or package names, and there doesn't have to be a relation
>>> between the module name and the filename, so please don't try to create one.
>>>
>>> regards,
>>> Robert
>>>
>>> On Mon, 16 Jan 2017 16:44:03 +0100, Remi Forax wrote:
>>>> Hi Robert,
>>>> the problem with automatic modules is more general than just the name;
>>>> automatic modules also create a flat hierarchy, which doesn't map well
>>>> to the Maven artifact descriptor.
>>>>
>>>> I wonder why you want Maven to use automatic modules or, said
>>>> differently, Maven has a lot of information about the artifact, so why do
>>>> you want to forget all this information when fetching a Maven artifact?
>>>>
>>>> I think that one problem is that you do not want to create a
>>>> module-info.class from the Maven POM and insert it into the jar because
>>>> it will change the artifact*.
>>>> This kind of module is supported by Jigsaw under the name of synthetic
>>>> module. A synthetic module is a module with a module descriptor
>>>> created not by javac but by another tool.
>>>>
>>>> In my opinion, automatic modules are interesting when you have a jar that
>>>> does not come from Maven Central but from an ad hoc build tool and
>>>> will be considered as a leaf of the dependency DAG.
>>>> Otherwise, for an existing module system, using a synthetic module seems
>>>> to be a better idea.
>>>>
>>>> regards,
>>>> Rémi
>>>>
>>>> * given you also have the problem of split packages, you also need a way
>>>> to merge several artifacts into one modular jar because it's the easy
>>>> way to solve the split package problem.
>>>>
>>>> ----- Original Message -----
>>>>> From: "Robert Scholte"
>>>>> To: jpms-spec-experts at openjdk.java.net
>>>>> Cc: "Apache Maven Dev"
>>>>> Sent: Monday, 16 January 2017 10:37:08
>>>>> Subject: Advice + proposals regarding automodule naming
>>>>
>>>>> This is a message from Robert Scholte and Brian Fox.
>>>>> [...]
>>>>> [4] https://en.wikipedia.org/wiki/Default_effect_(psychology)
>>>>> [5] https://en.wikipedia.org/wiki/Principle_of_least_astonishment
>>>>> [6]
>>>>> https://docs.google.com/spreadsheets/d/1TVR5uTpDYw0827AlvPRu8l95zHnFPL_g61TdPtnjQ5M/edit?usp=sharing
>>>>> [7] http://openjdk.java.net/jeps/261 #Risk and assumptions
>>>>> [8]
>>>>> https://www.mail-archive.com/jigsaw-dev at openjdk.java.net/msg06623.html

From rfscholte at apache.org Thu Jan 19 15:43:39 2017
From: rfscholte at apache.org (Robert Scholte)
Date: Thu, 19 Jan 2017 16:43:39 +0100
Subject: Advice + proposals regarding automodule naming
In-Reply-To: <626433667.1789430.1484835658855.JavaMail.zimbra@u-pem.fr>
References: <652339338.395860.1484581443472.JavaMail.zimbra@u-pem.fr> <1798601919.942643.1484691071823.JavaMail.zimbra@u-pem.fr> <626433667.1789430.1484835658855.JavaMail.zimbra@u-pem.fr>
Message-ID:

>> Hi Rémi,
>
> Robert,
>
>>
>> I'm getting a JavaOne 2015 déjà vu :)
>
> I was not at JavaOne, so let's say we're progressing; at least, I may be
> starting to understand your problem better ...
>
>>
>> It seems like you expect there will be a new pom-definition to support
>> this kind of extra information.
>> The current POM modelVersion (4.0.0) is not only used by Maven but by a
>> lot of tools, probably even more than we know of. We wonder if they do
>> XSD checking, so we must be very, very careful with every adjustment. So
>> pom-4.0.0 is a fact with all its restrictions. We are working on
>> pom-5.0.0 but we will always make sure there will also be a pom-4.0.0
>> available (either pre-generated or runtime transformed) for the current
>> tools. Also, its definition should work for any software technology, not
>> just for Java.
>> In the beginning I had the idea of working with new scopes to decide if a
>> dependency belongs to the modulepath or classpath, but there's a strict
>> set of scopes in pom-4.0.0, so again no option.
>> And by now I know this is
>> not required; the info is already there once I can read all module-info
>> files.
>> It would have helped if a modular jar had a different extension, so
>> everyone could see from the *outside* what kind of jar it is.
>
> Testing whether a jar is a modular jar or not is easy BTW:
> ModuleFinder.of(Paths.get("my.jar")).findAll().iterator().next().descriptor().isAutomatic()
>
> true means it's a plain old jar, false means it's a modular jar.
>

I know, I'm already using this trick in the maven-dependency-plugin.

>>
>> There's no such thing as a Maven4 artifact: any artifact is a file
>> (often a jar) with a coordinate and an extra file with dependency
>> declarations.
>
> For me, Maven4 artifact == jar + POM v5
>
>> During dependency resolution all build information is ignored! The
>> problem with the module-info file is comparable with the Java bytecode
>> version: you have to go into the jar to get this information.
>
> Yes,
> but you do not need to know whether it's a modular jar or a plain old jar
> during dependency resolution; you can trust the Maven Central info,
> and then, when installing, you can decide which jars should go on the
> classpath and which ones should go on the modulepath
> (or which ones should be upgraded from a plain jar to a modular jar,
> because you can use the POM info to generate a compatible
> module-info.class)
>

There is absolutely no reason to introduce a new POM version for this. The required information is inside the jar; even when jars are built with other tools, the info is there. It is actually a very good thing that Java 9 doesn't require a brand new Maven. I personally advertised that our challenge was to make it all work with Maven 3.0, and it does.

bq. "i fully agree with you that Maven can not use automatic modules."

Well, Maven could do it, but we don't want to, because we cannot trace it back to the right dependency AND be 100% sure that this was indeed the dependency intended to be the automodule.
But this is how we look at it from a Maven perspective. Any other buildtool is free to decide what their strategy will be. As long as developers can refer to an automatic module in their module-info, such jar can end up in Maven Central or any other repository and all build tools must be able to handle it. In our case Maven Central could think of adding rules to verify that the module-info never refers to auto modules, but that's just one repository. As long as there's support for auto modules, they will show up anywhere and will become another dependency for a Maven project. If such dependency becomes a requirement in the module-info, the build will fail since Maven has detected an auto module and cannot be 100% sure which dependency is related to it. So we must advice: don't "require" that module, which is the opposite of what we want to achieve: best practice should be to add as much *valid* requirements to the module-info as possible. regards, Robert >> >> At the moment I'm pretty far with the maven-compiler-plugin, but now >> every >> dependency acts like an automodule. My next step would probably be to >> analyze every module-info file and decide if jars belong to the >> classpath >> or modulepath, only allowing modular jars on the module path because of >> our concerns. > > yes, > as i said in the previous paragraph, you can also decide that with the > help of the POM info, you can try to upgrade the jar to make it modular. > >> >> regards, >> Robert > > regards, > R?mi > > >> >> On Tue, 17 Jan 2017 23:11:11 +0100, wrote: >> >>> Robert, >>> i fully agree with you that Maven can not use automatic modules. >>> Automatic modules have weird name rules, everything is exported and has >>> no dependency itself*, so they are useless if you already have already >>> a >>> trove of info like the Maven POM. 
>>> >>> In my opinion, the real question is not how to map existing Maven >>> artifacts to Java modules but more, >>> how Maven 4 artifacts are mapped to Java modules and then how to make >>> the transition between Maven 3 artifacts to Maven 4 artifacts as smooth >>> as possible. >>> >>> Here is my take on what can be a Maven 4 artifact, >>> - a Maven 4 artifact can only depends other Maven 4 artifact (and >>> their >>> are some way to see a Maven3 artifact as a Maven 4 artifact if the POM >>> is siple enough), >>> - a Maven 4 artifact do not allow split packages (a lot of Maven 3 >>> artifact uses split packages because it's a cool way to do an after the >>> fact modularisation >>> without changing the name of the module) >>> - a Maven 4 artifact info is specified with info extracted from the >>> module-info and from the POM >>> (version is in the POM, exported packages are in the module-info, >>> ...) >>> etc. >>> >>> once you have the precise rules, it will be easier to see how to map a >>> Maven 3 artifact to a Maven 4 and what are the compatibility rules. >>> >>> regards, >>> R?mi >>> >>> * apart if you want to play with configurations that mix modulepath and >>> classpath but these kind of configurations are really hard to debug. >>> >>> ----- Mail original ----- >>>> De: "Robert Scholte" >>>> ?: "Remi Forax" >>>> Cc: jpms-spec-experts at openjdk.java.net, "Brian Fox" >>>> >>>> Envoy?: Mardi 17 Janvier 2017 13:04:08 >>>> Objet: Re: Advice + proposals regarding automodule naming >>> >>>> Hi R?mi, >>>> >>>> In the end every non-jdk.* and non-java.* module in the module-info >>>> will >>>> be a dependency in your buildtool descriptor. Such module must match >>>> exactly one versionless dependency, or conflictId as we call it, which >>>> is >>>> in general the groupId + artifactId (type and classifier are not >>>> relevant >>>> for this story). >>>> By ignoring the groupId a module can referred by multiple >>>> dependencies. 
>>>> So >>>> we can expect collissions. For that reason Brian did a quick scan over >>>> Maven Central to count the number of duplicate artifactIds. >>>> >>>> Here's the artifactIds with 100+ groupIds: >>>> maven_artifact_id count(DISTINCT maven_group_id) count(maven_group_id) >>>> library 391 6854 >>>> core 312 8188 >>>> common 142 5084 >>>> ui 138 1414 >>>> >>>> In theory I could have a Maven project with 391 'library'-jars on the >>>> classpath without any problem. And as long as they are direct >>>> dependencies >>>> I have control over this by simply not adding 'library' as requirement >>>> to >>>> module-info. The issues start when different 'library'-jars are >>>> transitive >>>> dependencies and when they are marked are required in the module-info >>>> file >>>> of my direct or transitive dependencies. >>>> >>>> Developers of the 'library'-jars cannot use library as the module name >>>> and >>>> are forced to pick another name. As developer of my project in the >>>> end I >>>> decide which versions of dependencies are used. If the 'library'-jar >>>> gets >>>> a different module name and my dependency is still referring to the >>>> old >>>> module name, the project can't be built. >>>> >>>> What I expect is that developers are forced to remove the requirements >>>> from their module-info because of the mentioned issues. So instead of >>>> increasing the number requirements it will be reduced. For that reason >>>> we >>>> say either use a unique module name from the beginning (GA) or wait >>>> until >>>> a dependency has its own module name before adding it as requirement. >>>> >>>> As far as I know this is the first time the JDK/JRE decides >>>> (proposes) a >>>> name for an entity based on another entity. There are no relations >>>> between >>>> method-, class-, or package-names and there doesn't have to be a >>>> relation >>>> between the module name and the filename, so please don't try to do >>>> so. 
>>>> >>>> regards, >>>> Robert >>>> >>>> On Mon, 16 Jan 2017 16:44:03 +0100, Remi Forax >>>> wrote: >>>>> Hi Robert, >>>>> the problem with automatic modules is more general that just the >>>>> name, >>>>> automatics modules also creates a flat hierarchy which doesn't map >>>>> well >>>>> with the Maven artifact descriptor. >>>>> >>>>> I wonder why you want Maven to use automatic modules, or said >>>>> differently Maven has a lot of information about the artifact, why do >>>>> you want to forget all these information when fetching a Maven >>>>> artifact. >>>>> >>>>> I think that one problem is that you do not want to create a >>>>> module-info.class from the Maven POM and insert it into the jar >>>>> because >>>>> it will change the artifact*. >>>>> This kind of modules is supported by jigsaw under the name of >>>>> synthetic >>>>> modules. A synthetic module is a module with a module descriptor not >>>>> created by javac but by another tool. >>>>> >>>>> In my opinion, automatic modules are interesting when you have jar >>>>> that >>>>> do not come from Maven central but comes from an ad-hoc build tool >>>>> and >>>>> will be considered as a leaf of the dependency DAG. >>>>> Otherwise, for existing module system, using a synthetic module seem >>>>> to >>>>> be a better idea. >>>>> >>>>> regards, >>>>> R?mi >>>>> >>>>> * given you have also the problem of split packages, you also need a >>>>> way >>>>> to merge several artifacts into one modular jar because it's the easy >>>>> way to solve the split package problem. >>>>> >>>>> ----- Mail original ----- >>>>>> De: "Robert Scholte" >>>>>> ?: jpms-spec-experts at openjdk.java.net >>>>>> Cc: "Apache Maven Dev" >>>>>> Envoy?: Lundi 16 Janvier 2017 10:37:08 >>>>>> Objet: Advice + proposals regarding automodule naming >>>>> >>>>>> This is a message from Robert Scholte and Brian Fox. 
We both have >>>>>> been >>>>>> talking about this topic several weeks with other Maven developers >>>>>> and >>>>>> came to the conclusion that we should warn the jigsaw team with >>>>>> their >>>>>> current approach regarding auto modules. We will share our >>>>>> experiences, >>>>>> thoughts, conclusions and will suggest two proposals. >>>>>> >>>>>> Traditionally, the Java ecosystem has been very mature in terms of >>>>>> naming >>>>>> and namespacing. The reverse fqdn introduced into the java package >>>>>> was a >>>>>> great choice to ensure classes don?t conflict. Popular build tools >>>>>> such >>>>>> as >>>>>> Maven and nearly all those that followed built upon that this key >>>>>> concept >>>>>> with the introduction of ?GroupId? also using the fqdn as part of >>>>>> the >>>>>> name >>>>>> to ensure the coordinates were properly namespaced. >>>>>> >>>>>> We?ve seen some ecosystems diverge from this leading to new >>>>>> challenges >>>>>> that ultimately had to be reversed. A great example can be seen in >>>>>> the ? >>>>>> tragic mistake from npm creators ? [1] which was to launch without a >>>>>> namespace concept. Eventually, NPM started running out of useful >>>>>> names >>>>>> and >>>>>> had to backtrack to introduce ?scopes? which is really just a >>>>>> namespace >>>>>> [2]. The real problem here is that the major change in namespace was >>>>>> backed in after several years of momentum without it. It?s taken a >>>>>> long >>>>>> time for tooling and best practice to catch up to scopes and in the >>>>>> interim, people have been left with a dual mode, some namespaced, >>>>>> some >>>>>> not >>>>>> namespaced situation that has created chaos. [3] >>>>>> >>>>>> The real issue at hand here as we consider behaviors in the jigsaw >>>>>> automodule revolves around two well studied concepts. >>>>>> >>>>>> The most important is the ?Default effect? 
[3] which states that >>>>>> whatever >>>>>> the default behavior is will become the most prominent best >>>>>> practice. >>>>>> A >>>>>> default that uses a filename to generate a very short, un-namespaced >>>>>> module id effectively sets the behavior to create generic names that >>>>>> will >>>>>> eventually conflict...exactly what we?ve seen in npm. >>>>>> >>>>>> Additionally, The switching costs introduced in overcoming a default >>>>>> un-namespaced module id to one with a unique namespace is also >>>>>> significant >>>>>> once you consider all the potential users. This is why API change is >>>>>> hard, >>>>>> and changing the module id after the fact from the default is >>>>>> effectively >>>>>> an API change. >>>>>> >>>>>> The second principal at hand is the ?Principle of least >>>>>> astonishment?. >>>>>> We >>>>>> want to find a default that doesn?t violate what most users would >>>>>> consider >>>>>> to be the most obvious. One could argue the current auto module >>>>>> algorithm >>>>>> doesn?t violate this principle, but it?s important to consider >>>>>> alternate >>>>>> suggestions in this light. >>>>>> >>>>>> First, lets explore the potential downsides if the default effect >>>>>> takes >>>>>> hold with the currently generated auto module id. In Apache Maven, >>>>>> the >>>>>> artifact id is the part of the coordinate that generates the >>>>>> filename. >>>>>> This means that com.somecompany:artifact:version will become >>>>>> artifact-version.jar, which would result in automodule id >>>>>> ?artifact?. >>>>>> Armed with this understanding, that does an analysis of the Maven >>>>>> ecosystem have to say about potential conflicts in the automodule >>>>>> id? >>>>>> >>>>>> If we ignore the groupid and version of all the components in the >>>>>> Maven >>>>>> Central repository, we end up with over 13,500 (7% of the total >>>>>> group:artifact combinations) conflicts. 
This does not consider >>>>>> conflicts >>>>>> across other repositories, or within customer portfolios yet it is >>>>>> pretty >>>>>> telling. Conflicts will happen. In some cases, the number of >>>>>> conflicts >>>>>> on >>>>>> the same common names is well above 100. The list of conflicts as of >>>>>> October, 2016 can be seen here. [6] >>>>>> >>>>>> At this point, hopefully we?ve made the case for at least >>>>>> establishing a >>>>>> default module id that >>>>>> 1. Uses namespaces to minimizes id conflicts when possible >>>>>> 2. Leverages the default effect to create a de facto best practice >>>>>> 3. Follows the principle of least astonishment >>>>>> >>>>>> We have two potential proposals that solve these goals. >>>>>> >>>>>> Proposal 1: Leverage existing coordinates when available. >>>>>> >>>>>> Maven is inarguably the most popular build system for Java >>>>>> components, >>>>>> with Maven Central being the default and largest repository of Java >>>>>> components in the world. By default, every jar built by Maven >>>>>> automatically gets a simple properties file inserted into it with >>>>>> its >>>>>> unique coordinates. Now, not every jar in Central was built with >>>>>> Maven, >>>>>> however 94% of them were, as we can find the pom.properties file in >>>>>> 1,806,023 of the 1,913,561 central components . Talk about the >>>>>> default >>>>>> effect in action! >>>>>> >>>>>> It?s further important to recognize that given a jar with a >>>>>> pom.properties >>>>>> declaring coordinates, it means that the project itself has chosen >>>>>> those >>>>>> coordinates as their own name. In other words, this is how they >>>>>> refer >>>>>> to >>>>>> themselves, even if other consumers may not be using Maven directly. 
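
As an illustration of Proposal 1, the sketch below reads the pom.properties file that Maven places under META-INF/maven/<groupId>/<artifactId>/ in the jars it builds, and combines the two coordinates into a namespaced name. The class name PomCoordinatesSketch, its methods, and the "groupId.artifactId with hyphens turned into dots" mapping are assumptions made for this example; the proposal does not fix an exact mapping.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.Optional;
import java.util.Properties;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of the JDK or of Maven.
final class PomCoordinatesSketch {

    // Returns the first META-INF/maven/**/pom.properties found in the jar.
    // Shaded jars may contain several; picking the first is a simplification.
    static Optional<Properties> readPomProperties(JarFile jar) throws IOException {
        Enumeration<JarEntry> entries = jar.entries();
        while (entries.hasMoreElements()) {
            JarEntry entry = entries.nextElement();
            String name = entry.getName();
            if (name.startsWith("META-INF/maven/") && name.endsWith("/pom.properties")) {
                Properties props = new Properties();
                try (InputStream in = jar.getInputStream(entry)) {
                    props.load(in);
                }
                return Optional.of(props);
            }
        }
        return Optional.empty();
    }

    // Assumed mapping: groupId + "." + artifactId, hyphens replaced by dots
    // because a hyphen is not legal in a module name.
    static Optional<String> namespacedName(Properties pomProperties) {
        String groupId = pomProperties.getProperty("groupId");
        String artifactId = pomProperties.getProperty("artifactId");
        if (groupId == null || artifactId == null) {
            return Optional.empty();
        }
        return Optional.of((groupId + "." + artifactId).replace('-', '.'));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java PomCoordinatesSketch <path-to-jar>");
            return;
        }
        try (JarFile jar = new JarFile(args[0])) {
            Optional<String> name = readPomProperties(jar)
                    .flatMap(PomCoordinatesSketch::namespacedName);
            System.out.println(name.orElse("(no pom.properties found)"));
        }
    }
}
```

A jar without pom.properties yields no name here, so a real implementation would still need a fallback for the remaining jars.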
>>>>>> >>>>>> If automodule were able to peek inside a jar and generate the >>>>>> default >>>>>> id >>>>>> using the groupid and artifactid present in the file, this would >>>>>> nearly >>>>>> eliminate all instances of id conflict because a significant portion >>>>>> of >>>>>> the Java ecosystem is in fact built with Maven. Additionally, the >>>>>> fact >>>>>> that 1.8 million (and counting) modules would have namespace as the >>>>>> default behavior means we?ve taken a huge step in setting the best >>>>>> practice of picking module ids with a namepace. Additionally, since >>>>>> the >>>>>> project itself has chosen these coordinates and uses them as their >>>>>> primary >>>>>> distribution mechanism, this follows the principle of least >>>>>> astonishment >>>>>> to consumers regardless of their chosen build system. Finally, since >>>>>> all >>>>>> of the above are true, it?s unlikely the project would need to >>>>>> migrate >>>>>> to >>>>>> a new module id when they adopt jigsaw natively, thus avoiding an >>>>>> API >>>>>> switching cost for their users. >>>>>> >>>>>> Proposal 2: Drop automodules >>>>>> Right now Jigsaw tries to calculate a module name solely based on >>>>>> the >>>>>> name >>>>>> of the jar file, which now already causes issues. Besides the fact >>>>>> that >>>>>> the module name is not guaranteed unique compared with its Maven >>>>>> coordinate, there are extra transformations which makes it even less >>>>>> guaranteed that it is unique; e.g. dashes are replaced by dots >>>>>> (which >>>>>> are >>>>>> both valid artifactId characters), in some cases the number and >>>>>> their >>>>>> following characters are stripped off. For artifacts like >>>>>> jboss-servlet-api_4.0_spec it makes sense, however we already see >>>>>> issues >>>>>> here where commons-lang, commons-lang2 and commons-lang3 get the >>>>>> same >>>>>> module name, >>>>>> even though they have different artifactIds and contain different >>>>>> packages. 
Choosing different artifactIds and packages was a very >>>>>> wise >>>>>> decision because it made it possible that these jars could live next >>>>>> to >>>>>> each other. Removing that separation by the authors is a very unwise >>>>>> decision. >>>>>> >>>>>> Another known example is the jsrNNN jars, which now all get jsr as >>>>>> the >>>>>> module name. >>>>>> >>>>>> Is it highly unlikely there is one single rule to capture all the >>>>>> use >>>>>> cases and which always result in a module name we can work with. >>>>>> >>>>>> For that reason the other proposal is to simply drop automodules. >>>>>> Don?t >>>>>> try to come up with a name for unnamed jars. It might look like the >>>>>> feature of automodules makes migrating easier because every >>>>>> dependency >>>>>> will get a name so can complete your module-info for all >>>>>> requirements, >>>>>> but >>>>>> we expect that once Jigsaw comes to speed the invalid module names >>>>>> are >>>>>> actually blocking further development due to name collisions or >>>>>> forced >>>>>> renaming by transitive modular jars. >>>>>> >>>>>> The advantage of this proposal is that library builders are not >>>>>> forced >>>>>> to >>>>>> keep the proposed module name in order to maintain backwards >>>>>> compatibility >>>>>> with the default.. Instead library builders can pick a more suitable >>>>>> module name. The modular system doesn?t allow the same package to be >>>>>> exported by multiple jars (and automodules exports every package). >>>>>> Library >>>>>> builders can fix this is their new jars, however if end users would >>>>>> require both jars because they were specified as requirements in >>>>>> different >>>>>> transitive jars, you cannot compile this project. There?s just no >>>>>> dependency-excludes like Maven has, because ?requires? in the >>>>>> module-info >>>>>> really means requires. Dropping automodules will prevent these kind >>>>>> of >>>>>> issues, because a package can only be exported by a named module. 
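
The collision risk described in Proposal 2 can be made concrete with a small sketch of filename-based name derivation. It follows the rule as later documented for ModuleFinder.of (strip ".jar", cut the name at the first "-<digits>" version marker, turn non-alphanumeric characters into dots); the builds discussed in this thread may have used a somewhat different rule. The class name AutoModuleNameSketch and the sample file names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical re-implementation for illustration; not the JDK's own code.
final class AutoModuleNameSketch {

    // A hyphen followed by digits and then a dot or end of string marks the version.
    private static final Pattern VERSION_START = Pattern.compile("-(\\d+(\\.|$))");

    static String deriveName(String jarFileName) {
        String name = jarFileName.endsWith(".jar")
                ? jarFileName.substring(0, jarFileName.length() - ".jar".length())
                : jarFileName;

        // Drop everything from the version marker onwards.
        Matcher m = VERSION_START.matcher(name);
        if (m.find()) {
            name = name.substring(0, m.start());
        }

        // Non-alphanumerics become dots, runs of dots collapse,
        // leading and trailing dots are removed.
        return name.replaceAll("[^A-Za-z0-9]", ".")
                   .replaceAll("\\.{2,}", ".")
                   .replaceAll("^\\.|\\.$", "");
    }

    public static void main(String[] args) {
        // Two unrelated artifacts, e.g. com.foo:library:1.0 and org.bar:library:2.3.1,
        // end up with these file names in a local repository.
        System.out.println(deriveName("library-1.0.jar"));   // library
        System.out.println(deriveName("library-2.3.1.jar")); // library
    }
}
```

Both jars derive the name "library" even though their groupIds differ, because the groupId never appears in the file name.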
>>>>>> >>>>>> Sure, this means that for end users they cannot refer to every jar >>>>>> in >>>>>> their module-info. But at least if they add a ?requires? to their >>>>>> module-info, they can ensure that it?ll always refer to the intended >>>>>> modular jar. With build tools like Maven the chance of missing >>>>>> artifacts >>>>>> on the classpath has already been reduced a lot. In general builds >>>>>> have >>>>>> become quite stable, so we don?t expect that developers will >>>>>> translate >>>>>> all >>>>>> dependencies to the module-info file, especially if we warn them >>>>>> about >>>>>> the >>>>>> possible consequences of depending on automodules. Only referring to >>>>>> named >>>>>> modules and even a single ?requires? is already a gain. There?s no >>>>>> reason >>>>>> to try to speed this up and give the developer the false impression >>>>>> that >>>>>> it?ll keep working when upgrading to real modular jars. Focus should >>>>>> be >>>>>> on >>>>>> the target, not on the path how to reach it. >>>>>> >>>>>> Dropping the automodules will prevent a lot of discussions about >>>>>> what >>>>>> is >>>>>> the correct way to select a module name and will give the >>>>>> responsibility >>>>>> for the name back to the place where it belongs: the developer. >>>>>> >>>>>> [1] >>>>>> http://stackoverflow.com/questions/22053381/lack-of-available-module-names-on-npm >>>>>> [2] >>>>>> http://blog.npmjs.org/post/116936804365/solving-npms-hard-problem-naming-packages >>>>>> [3] The fact that so much of the npm ecosystem is effectively >>>>>> not-namespaced is has actually >>>>>> created potential build time malware injection possibilities. 
>>>>>> If I know of a package in use by a company through log analysis,
>>>>>> bug report analysis etc, I could potentially go register the same
>>>>>> name in the default repo with a very high semver and know that
>>>>>> it's very likely this would be picked up over the intended
>>>>>> internally developed module because there's no namespace.
>>>>>> [4] https://en.wikipedia.org/wiki/Default_effect_(psychology)
>>>>>> [5] https://en.wikipedia.org/wiki/Principle_of_least_astonishment
>>>>>> [6] https://docs.google.com/spreadsheets/d/1TVR5uTpDYw0827AlvPRu8l95zHnFPL_g61TdPtnjQ5M/edit?usp=sharing
>>>>>> [7] http://openjdk.java.net/jeps/261 #Risk and assumptions
>>>>>> [8] https://www.mail-archive.com/jigsaw-dev at openjdk.java.net/msg06623.html

From forax at univ-mlv.fr Thu Jan 19 17:47:12 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 19 Jan 2017 18:47:12 +0100 (CET)
Subject: Advice + proposals regarding automodule naming
In-Reply-To:
References: <652339338.395860.1484581443472.JavaMail.zimbra@u-pem.fr> <1798601919.942643.1484691071823.JavaMail.zimbra@u-pem.fr> <895398189.1781046.1484834826944.JavaMail.zimbra@u-pem.fr>
Message-ID: <1931717348.1907687.1484848032713.JavaMail.zimbra@u-pem.fr>

Hi Brian,

> From: "Brian Fox"
> To: forax at univ-mlv.fr
> Cc: "Robert Scholte", jpms-spec-experts at openjdk.java.net
> Sent: Thursday, 19 January 2017 15:09:17
> Subject: Re: Advice + proposals regarding automodule naming

> Hi Rémi,
> This isn't a maven problem...it's potentially a problem for _everyone_.

potentially if build tools do nothing.

> The only reason Maven is introduced into the conversation is because for
> all the stuff in Central, it is a source of a sensible default for a
> module name, otherwise it's completely unrelated.

I agree that Maven Central has sensible defaults so these values can be
injected into jars that are not modular.
The other solution, which does not require any extra information, is to
create the module-info directly from the bytecode, with the caveat that
dynamic dependencies (Class.forName, etc) will not be visible.
The latter solution cannot be done by the VM because it's slow: a build
tool can do that once, while the VM would have to do it every time the
application is started.

Rémi

> On Thu, Jan 19, 2017 at 9:07 AM, < forax at univ-mlv.fr > wrote:

>> Hi Brian,
>> Maven can decide to put modular jars (a jar that contains a
>> module-info.class) in the modulepath and plain old jars in the
>> classpath,
>> it will work with all existing applications, and because it does not
>> use the automatic module feature there is no naming issue.

>> The problem is that this solution does not offer a clean way to upgrade
>> a jar from the classpath to the modulepath because this solution
>> requires all dependencies to be modular first.

>> Rémi

>>> From: "Brian Fox" < brianf at sonatype.com >
>>> To: forax at univ-mlv.fr
>>> Cc: "Robert Scholte" < rfscholte at apache.org >,
>>> jpms-spec-experts at openjdk.java.net
>>> Sent: Thursday, 19 January 2017 14:54:32
>>> Subject: Re: Advice + proposals regarding automodule naming

>>> Not sure if this will get through Robert....

>>> We seem to have diverted away from the main issue. The biggest and most
>>> urgent issue is not how Maven will/won't map directly to modules in the
>>> future. It's an issue to be sure, but the universe of all previously
>>> developed Java components is subject to the auto module behavior and
>>> all the issues laid out in the original mail. If we don't get that
>>> fixed in the beginning, it will be very difficult to change later.
>>> Reference the NPM scope issue I cited originally.

>>> On Tue, Jan 17, 2017 at 5:11 PM, < forax at univ-mlv.fr > wrote:

>>>> Robert,
>>>> i fully agree with you that Maven can not use automatic modules.
>>>> Automatic modules have weird name rules, everything is exported and has no >>>> dependency itself*, so they are useless if you already have already a trove of >>>> info like the Maven POM. >>>> In my opinion, the real question is not how to map existing Maven artifacts to >>>> Java modules but more, >>>> how Maven 4 artifacts are mapped to Java modules and then how to make the >>>> transition between Maven 3 artifacts to Maven 4 artifacts as smooth as >>>> possible. >>>> Here is my take on what can be a Maven 4 artifact, >>>> - a Maven 4 artifact can only depends other Maven 4 artifact (and their are some >>>> way to see a Maven3 artifact as a Maven 4 artifact if the POM is siple enough), >>>> - a Maven 4 artifact do not allow split packages (a lot of Maven 3 artifact uses >>>> split packages because it's a cool way to do an after the fact modularisation >>>> without changing the name of the module) >>>> - a Maven 4 artifact info is specified with info extracted from the module-info >>>> and from the POM >>>> (version is in the POM, exported packages are in the module-info, ...) >>>> etc. >>>> once you have the precise rules, it will be easier to see how to map a Maven 3 >>>> artifact to a Maven 4 and what are the compatibility rules. >>>> regards, >>>> R?mi >>>> * apart if you want to play with configurations that mix modulepath and >>>> classpath but these kind of configurations are really hard to debug. >>>> ----- Mail original ----- >>>> > De: "Robert Scholte" < rfscholte at apache.org > >>>> > ?: "Remi Forax" < forax at univ-mlv.fr > >>>> > Cc: jpms-spec-experts at openjdk.java.net , "Brian Fox" < brianf at sonatype.com > >>>> > Envoy?: Mardi 17 Janvier 2017 13:04:08 >>>> > Objet: Re: Advice + proposals regarding automodule naming >>>> > Hi R?mi, >>>> > In the end every non-jdk.* and non-java.* module in the module-info will >>>> > be a dependency in your buildtool descriptor. 
Such module must match >>>> > exactly one versionless dependency, or conflictId as we call it, which is >>>> > in general the groupId + artifactId (type and classifier are not relevant >>>> > for this story). >>>> > By ignoring the groupId a module can referred by multiple dependencies. So >>>> > we can expect collissions. For that reason Brian did a quick scan over >>>> > Maven Central to count the number of duplicate artifactIds. >>>> > Here's the artifactIds with 100+ groupIds: >>>> > maven_artifact_id count(DISTINCT maven_group_id) count(maven_group_id) >>>> > library 391 6854 >>>> > core 312 8188 >>>> > common 142 5084 >>>> > ui 138 1414 >>>> > In theory I could have a Maven project with 391 'library'-jars on the >>>> > classpath without any problem. And as long as they are direct dependencies >>>> > I have control over this by simply not adding 'library' as requirement to >>>> > module-info. The issues start when different 'library'-jars are transitive >>>> > dependencies and when they are marked are required in the module-info file >>>> > of my direct or transitive dependencies. >>>> > Developers of the 'library'-jars cannot use library as the module name and >>>> > are forced to pick another name. As developer of my project in the end I >>>> > decide which versions of dependencies are used. If the 'library'-jar gets >>>> > a different module name and my dependency is still referring to the old >>>> > module name, the project can't be built. >>>> > What I expect is that developers are forced to remove the requirements >>>> > from their module-info because of the mentioned issues. So instead of >>>> > increasing the number requirements it will be reduced. For that reason we >>>> > say either use a unique module name from the beginning (GA) or wait until >>>> > a dependency has its own module name before adding it as requirement. >>>> > As far as I know this is the first time the JDK/JRE decides (proposes) a >>>> > name for an entity based on another entity. 
There are no relations between >>>> > method-, class-, or package-names and there doesn't have to be a relation >>>> > between the module name and the filename, so please don't try to do so. >>>> > regards, >>>> > Robert >>>> > On Mon, 16 Jan 2017 16:44:03 +0100, Remi Forax < forax at univ-mlv.fr > wrote: >>>> >> Hi Robert, >>>> >> the problem with automatic modules is more general that just the name, >>>> >> automatics modules also creates a flat hierarchy which doesn't map well >>>> >> with the Maven artifact descriptor. >>>> >> I wonder why you want Maven to use automatic modules, or said >>>> >> differently Maven has a lot of information about the artifact, why do >>>> >> you want to forget all these information when fetching a Maven artifact. >>>> >> I think that one problem is that you do not want to create a >>>> >> module-info.class from the Maven POM and insert it into the jar because >>>> >> it will change the artifact*. >>>> >> This kind of modules is supported by jigsaw under the name of synthetic >>>> >> modules. A synthetic module is a module with a module descriptor not >>>> >> created by javac but by another tool. >>>> >> In my opinion, automatic modules are interesting when you have jar that >>>> >> do not come from Maven central but comes from an ad-hoc build tool and >>>> >> will be considered as a leaf of the dependency DAG. >>>> >> Otherwise, for existing module system, using a synthetic module seem to >>>> >> be a better idea. >>>> >> regards, >>>> >> R?mi >>>> >> * given you have also the problem of split packages, you also need a way >>>> >> to merge several artifacts into one modular jar because it's the easy >>>> >> way to solve the split package problem. 
>>>> >> ----- Mail original ----- >>>> >>> De: "Robert Scholte" < rfscholte at apache.org > >>>> >>> ?: jpms-spec-experts at openjdk.java.net >>>> >>> Cc: "Apache Maven Dev" < dev at maven.apache.org > >>>> >>> Envoy?: Lundi 16 Janvier 2017 10:37:08 >>>> >>> Objet: Advice + proposals regarding automodule naming >>>> >>> This is a message from Robert Scholte and Brian Fox. We both have been >>>> >>> talking about this topic several weeks with other Maven developers and >>>> >>> came to the conclusion that we should warn the jigsaw team with their >>>> >>> current approach regarding auto modules. We will share our experiences, >>>> >>> thoughts, conclusions and will suggest two proposals. >>>> >>> Traditionally, the Java ecosystem has been very mature in terms of >>>> >>> naming >>>> >>> and namespacing. The reverse fqdn introduced into the java package was a >>>> >>> great choice to ensure classes don?t conflict. Popular build tools such >>>> >>> as >>>> >>> Maven and nearly all those that followed built upon that this key >>>> >>> concept >>>> >>> with the introduction of ?GroupId? also using the fqdn as part of the >>>> >>> name >>>> >>> to ensure the coordinates were properly namespaced. >>>> >>> We?ve seen some ecosystems diverge from this leading to new challenges >>>> >>> that ultimately had to be reversed. A great example can be seen in the ? >>>> >>> tragic mistake from npm creators ? [1] which was to launch without a >>>> >>> namespace concept. Eventually, NPM started running out of useful names >>>> >>> and >>>> >>> had to backtrack to introduce ?scopes? which is really just a namespace >>>> >>> [2]. The real problem here is that the major change in namespace was >>>> >>> backed in after several years of momentum without it. 
It?s taken a long >>>> >>> time for tooling and best practice to catch up to scopes and in the >>>> >>> interim, people have been left with a dual mode, some namespaced, some >>>> >>> not >>>> >>> namespaced situation that has created chaos. [3] >>>> >>> The real issue at hand here as we consider behaviors in the jigsaw >>>> >>> automodule revolves around two well studied concepts. >>>> >>> The most important is the ?Default effect? [3] which states that >>>> >>> whatever >>>> >>> the default behavior is will become the most prominent best practice. A >>>> >>> default that uses a filename to generate a very short, un-namespaced >>>> >>> module id effectively sets the behavior to create generic names that >>>> >>> will >>>> >>> eventually conflict...exactly what we?ve seen in npm. >>>> >>> Additionally, The switching costs introduced in overcoming a default >>>> >>> un-namespaced module id to one with a unique namespace is also >>>> >>> significant >>>> >>> once you consider all the potential users. This is why API change is >>>> >>> hard, >>>> >>> and changing the module id after the fact from the default is >>>> >>> effectively >>>> >>> an API change. >>>> >>> The second principal at hand is the ?Principle of least astonishment?. >>>> >>> We >>>> >>> want to find a default that doesn?t violate what most users would >>>> >>> consider >>>> >>> to be the most obvious. One could argue the current auto module >>>> >>> algorithm >>>> >>> doesn?t violate this principle, but it?s important to consider alternate >>>> >>> suggestions in this light. >>>> >>> First, lets explore the potential downsides if the default effect takes >>>> >>> hold with the currently generated auto module id. In Apache Maven, the >>>> >>> artifact id is the part of the coordinate that generates the filename. >>>> >>> This means that com.somecompany:artifact:version will become >>>> >>> artifact-version.jar, which would result in automodule id ?artifact?. 
>>>> >>> Armed with this understanding, that does an analysis of the Maven >>>> >>> ecosystem have to say about potential conflicts in the automodule id? >>>> >>> If we ignore the groupid and version of all the components in the Maven >>>> >>> Central repository, we end up with over 13,500 (7% of the total >>>> >>> group:artifact combinations) conflicts. This does not consider conflicts >>>> >>> across other repositories, or within customer portfolios yet it is >>>> >>> pretty >>>> >>> telling. Conflicts will happen. In some cases, the number of conflicts >>>> >>> on >>>> >>> the same common names is well above 100. The list of conflicts as of >>>> >>> October, 2016 can be seen here. [6] >>>> >>> At this point, hopefully we?ve made the case for at least establishing a >>>> >>> default module id that >>>> >>> 1. Uses namespaces to minimizes id conflicts when possible >>>> >>> 2. Leverages the default effect to create a de facto best practice >>>> >>> 3. Follows the principle of least astonishment >>>> >>> We have two potential proposals that solve these goals. >>>> >>> Proposal 1: Leverage existing coordinates when available. >>>> >>> Maven is inarguably the most popular build system for Java components, >>>> >>> with Maven Central being the default and largest repository of Java >>>> >>> components in the world. By default, every jar built by Maven >>>> >>> automatically gets a simple properties file inserted into it with its >>>> >>> unique coordinates. Now, not every jar in Central was built with Maven, >>>> >>> however 94% of them were, as we can find the pom.properties file in >>>> >>> 1,806,023 of the 1,913,561 central components . Talk about the default >>>> >>> effect in action! >>>> >>> It?s further important to recognize that given a jar with a >>>> >>> pom.properties >>>> >>> declaring coordinates, it means that the project itself has chosen those >>>> >>> coordinates as their own name. 
In other words, this is how they refer to
>>>> >>> themselves, even if other consumers may not be using Maven directly.
>>>> >>>
>>>> >>> If automodule were able to peek inside a jar and generate the default
>>>> >>> id using the groupId and artifactId present in the file, this would
>>>> >>> eliminate nearly all instances of id conflict, because a significant
>>>> >>> portion of the Java ecosystem is in fact built with Maven.
>>>> >>> Additionally, the fact that 1.8 million (and counting) modules would
>>>> >>> have a namespace by default means we've taken a huge step in setting
>>>> >>> the best practice of picking module ids with a namespace. Since the
>>>> >>> project itself has chosen these coordinates and uses them as its
>>>> >>> primary distribution mechanism, this also follows the principle of
>>>> >>> least astonishment for consumers, regardless of their chosen build
>>>> >>> system. Finally, since all of the above are true, it's unlikely the
>>>> >>> project would need to migrate to a new module id when it adopts
>>>> >>> Jigsaw natively, thus avoiding an API switching cost for its users.
>>>> >>>
>>>> >>> Proposal 2: Drop automodules
>>>> >>>
>>>> >>> Right now Jigsaw tries to calculate a module name solely based on the
>>>> >>> name of the jar file, which already causes issues. Besides the fact
>>>> >>> that the module name, unlike its Maven coordinate, is not guaranteed
>>>> >>> to be unique, there are extra transformations which make uniqueness
>>>> >>> even less likely; e.g. dashes are replaced by dots (both are valid
>>>> >>> artifactId characters), and in some cases a number and the characters
>>>> >>> following it are stripped off.
For artifacts like
>>>> >>> jboss-servlet-api_4.0_spec it makes sense; however, we already see
>>>> >>> issues here where commons-lang, commons-lang2 and commons-lang3 get
>>>> >>> the same module name, even though they have different artifactIds and
>>>> >>> contain different packages. Choosing different artifactIds and
>>>> >>> packages was a very wise decision, because it made it possible for
>>>> >>> these jars to live next to each other. Removing the separation that
>>>> >>> the authors chose is a very unwise decision.
>>>> >>>
>>>> >>> Another known example is the jsrNNN jars, which now all get jsr as
>>>> >>> the module name.
>>>> >>>
>>>> >>> It is highly unlikely that there is one single rule that captures all
>>>> >>> the use cases and always results in a module name we can work with.
>>>> >>>
>>>> >>> For that reason the other proposal is to simply drop automodules.
>>>> >>> Don't try to come up with a name for unnamed jars. It might look like
>>>> >>> the automodule feature makes migrating easier, because every
>>>> >>> dependency will get a name so you can complete your module-info for
>>>> >>> all requirements, but we expect that once Jigsaw gets up to speed the
>>>> >>> invalid module names will actually block further development, due to
>>>> >>> name collisions or forced renaming by transitive modular jars.
>>>> >>>
>>>> >>> The advantage of this proposal is that library builders are not
>>>> >>> forced to keep the proposed module name in order to maintain
>>>> >>> backwards compatibility with the default. Instead, library builders
>>>> >>> can pick a more suitable module name. The module system doesn't allow
>>>> >>> the same package to be exported by multiple jars (and automodules
>>>> >>> export every package). Library builders can fix this in their new
>>>> >>> jars; however, if end users require both jars because they were
>>>> >>> specified as requirements in different transitive jars, they cannot
>>>> >>> compile the project.
There's just no
>>>> >>> dependency-excludes like Maven has, because "requires" in the
>>>> >>> module-info really means requires. Dropping automodules will prevent
>>>> >>> these kinds of issues, because a package can only be exported by a
>>>> >>> named module.
>>>> >>>
>>>> >>> Sure, this means that end users cannot refer to every jar in their
>>>> >>> module-info. But at least if they add a "requires" to their
>>>> >>> module-info, they can be sure that it'll always refer to the intended
>>>> >>> modular jar. With build tools like Maven the chance of missing
>>>> >>> artifacts on the classpath has already been reduced a lot. In general
>>>> >>> builds have become quite stable, so we don't expect that developers
>>>> >>> will translate all dependencies to the module-info file, especially if
>>>> >>> we warn them about the possible consequences of depending on
>>>> >>> automodules. Referring only to named modules, even with a single
>>>> >>> "requires", is already a gain. There's no reason to try to speed this
>>>> >>> up and give the developer the false impression that it'll keep working
>>>> >>> when upgrading to real modular jars. Focus should be on the target,
>>>> >>> not on the path to reach it.
>>>> >>>
>>>> >>> Dropping automodules will prevent a lot of discussions about what
>>>> >>> the correct way to select a module name is, and will give the
>>>> >>> responsibility for the name back to the place where it belongs: the
>>>> >>> developer.
>>>> >>>
>>>> >>> [1] http://stackoverflow.com/questions/22053381/lack-of-available-module-names-on-npm
>>>> >>> [2] http://blog.npmjs.org/post/116936804365/solving-npms-hard-problem-naming-packages
>>>> >>> [3] The fact that so much of the npm ecosystem is effectively
>>>> >>> not namespaced has actually created potential build-time malware
>>>> >>> injection possibilities.
If I know
>>>> >>> of a package in use by a company through log analysis, bug report
>>>> >>> analysis, etc., I could potentially go register the same name in the
>>>> >>> default repo with a very high semver and know that it's very likely
>>>> >>> this would be picked up over the intended internally developed module,
>>>> >>> because there's no namespace.
>>>> >>> [4] https://en.wikipedia.org/wiki/Default_effect_(psychology)
>>>> >>> [5] https://en.wikipedia.org/wiki/Principle_of_least_astonishment
>>>> >>> [6] https://docs.google.com/spreadsheets/d/1TVR5uTpDYw0827AlvPRu8l95zHnFPL_g61TdPtnjQ5M/edit?usp=sharing
>>>> >>> [7] http://openjdk.java.net/jeps/261 #Risk and assumptions
>>>> >>> [8] https://www.mail-archive.com/jigsaw-dev at openjdk.java.net/msg06623.html

From forax at univ-mlv.fr Thu Jan 19 18:14:17 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 19 Jan 2017 19:14:17 +0100 (CET)
Subject: Advice + proposals regarding automodule naming
In-Reply-To:
References: <652339338.395860.1484581443472.JavaMail.zimbra@u-pem.fr>
 <1798601919.942643.1484691071823.JavaMail.zimbra@u-pem.fr>
 <626433667.1789430.1484835658855.JavaMail.zimbra@u-pem.fr>
Message-ID: <419778179.1911914.1484849657427.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Robert Scholte"
> To: forax at univ-mlv.fr
> Cc: jpms-spec-experts at openjdk.java.net, "Brian Fox"
> Sent: Thursday, 19 January 2017 16:43:39
> Subject: Re: Advice + proposals regarding automodule naming

>>> Hi Rémi,
>>
>> Robert,
>>
>>>
>>> I'm getting a JavaOne 2015 déjà vu :)
>>
>> I was not at JavaOne, so let's say we're progressing; at least, I may
>> be starting to understand your problem better ...
>>
>>>
>>> It seems like you expect there will be a new POM definition to support
>>> this kind of extra information.
>>> The current POM modelVersion (4.0.0) is used not only by Maven but by a
>>> lot of tools, probably even more than we know of.
We wonder if they do
>>> XSD checking, so we must be very, very careful with every adjustment.
>>> So pom-4.0.0 is a fact, with all its restrictions. We are working on
>>> pom-5.0.0, but we will always make sure there is also a pom-4.0.0
>>> available (either pre-generated or transformed at runtime) for the
>>> current tools. Also, its definition should work for any software
>>> technology, not just for Java.
>>> In the beginning I had the idea of working with new scopes to decide
>>> whether a dependency belongs on the modulepath or the classpath, but
>>> there's a strict set of scopes in pom-4.0.0, so again that is not an
>>> option. And by now I know this is not required: the info is already
>>> there once I can read all module-info files.
>>> It would have helped if a modular jar had a different extension, so
>>> everyone could see from the *outside* what kind of jar it is.
>>
>> Testing whether a jar is a modular jar or not is easy, BTW:
>> ModuleFinder.of(Paths.get("my.jar")).findAll().iterator().next().descriptor().isAutomatic()
>>
>> true means it's a plain old jar, false means it's a modular jar.
>>
>
> I know, I'm already using this trick in the maven-dependency-plugin
>
>>>
>>> There's no such thing as a Maven4 artifact: any artifact is a file
>>> (often a jar) with a coordinate and an extra file with dependency
>>> declarations.
>>
>> For me, Maven4 artifact == jar + POM v5
>>
>>> During dependency resolution all build information is ignored! The
>>> problem with the module-info file is comparable with the Java bytecode
>>> version: you have to go into the jar to get this information.
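
The one-line check quoted above can be expanded into a small runnable sketch. It uses only the `java.lang.module` API that appears in the one-liner; the class name `JarKind` and the method `isModularJar` are hypothetical names chosen for illustration, and the guard for an empty result is an added assumption, not something stated in the thread.

```java
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReference;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;

public class JarKind {

    /**
     * Returns true when the jar carries an explicit module descriptor
     * (a "modular jar") and false when the module system would treat it
     * as an automatic module (a "plain old jar").
     */
    public static boolean isModularJar(Path jar) {
        // A finder over a single jar yields at most one module reference.
        Set<ModuleReference> refs = ModuleFinder.of(jar).findAll();
        if (refs.isEmpty()) {
            throw new IllegalArgumentException("No module found in " + jar);
        }
        return !refs.iterator().next().descriptor().isAutomatic();
    }

    public static void main(String[] args) {
        for (String arg : args) {
            String kind = isModularJar(Paths.get(arg))
                    ? "modular jar"
                    : "plain jar (automatic module)";
            System.out.println(arg + " -> " + kind);
        }
    }
}
```

Run as `java JarKind my.jar other.jar`; a build tool could use the same test to decide whether a jar goes on the classpath or the modulepath.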
>>
>> Yes,
>> but you do not need to know whether it's a modular jar or a plain old
>> jar during dependency resolution; you can trust the Maven Central info,
>> and then, when installing, you can decide which jars should go on the
>> classpath and which ones should go on the modulepath
>> (or which ones should be upgraded from a plain jar to a modular jar,
>> because you can use the POM info to generate a compatible
>> module-info.class)
>>
>
> There is absolutely no reason to introduce a new POM version for this. The
> required information is inside the jar; even when jars are built with
> other tools, the info is there. It is actually a very good thing that
> Java 9 doesn't require a brand new Maven. I personally advertised that our
> challenge was to make it all work with Maven 3.0, and it does.

I agree that it can work with Maven 3; as you said, the information is
already there.
But for the future, having almost the same dependency metadata encoded in
two places is something that will seem weird to people who are starting to
use Java.

But let's focus on Maven 3 ...

>
> bq. "i fully agree with you that Maven can not use automatic modules."
> Well, Maven could do it, but we don't want to, because we cannot trace it
> back to the right dependency AND be 100% sure that this was indeed the
> dependency intended to be the automodule.
> But this is how we look at it from a Maven perspective. Any other
> build tool is free to decide what its strategy will be. As long as
> developers can refer to an automatic module in their module-info, such a
> jar can end up in Maven Central or any other repository, and all build
> tools must be able to handle it.

The real problem is that you can have jars on the classpath and on the
module path, and it doesn't mean the same thing.
Build tools do not have to manage automatic modules; you have to manage
automatic modules only if you introduce a rule that decides where to put
each jar.
And also do not forget that deciding to put every jar on the modulepath
does not work, because of split packages.

>
> In our case Maven Central could think of adding rules to verify that the
> module-info never refers to auto modules, but that's just one repository.
> As long as there's support for auto modules, they will show up anywhere
> and will become another dependency for a Maven project. If such a
> dependency becomes a requirement in the module-info, the build will fail,
> since Maven has detected an auto module and cannot be 100% sure which
> dependency is related to it. So we must advise: don't "require" that
> module, which is the opposite of what we want to achieve: best practice
> should be to add as many *valid* requirements to the module-info as
> possible.

Do not try to require a non-modularized jar, because you know neither its
name nor its exported packages.
And because we have Maven Central, helping the developers of popular
libraries to add a module-info seems a good strategy. At least, it avoids
the problem we have with generics, which is that we still have libraries
that use raw types.

>
> regards,
> Robert
>

Rémi

>>>
>>> At the moment I'm pretty far with the maven-compiler-plugin, but now
>>> every dependency acts like an automodule. My next step would probably
>>> be to analyze every module-info file and decide whether jars belong on
>>> the classpath or the modulepath, only allowing modular jars on the
>>> module path because of our concerns.
>>
>> Yes,
>> as I said in the previous paragraph, you can also decide that with the
>> help of the POM info: you can try to upgrade the jar to make it modular.
>>
>>>
>>> regards,
>>> Robert
>>
>> regards,
>> Rémi
>>
>>
>>>
>>> On Tue, 17 Jan 2017 23:11:11 +0100, wrote:
>>>
>>>> Robert,
>>>> I fully agree with you that Maven cannot use automatic modules.
>>>> Automatic modules have weird naming rules, everything is exported, and
>>>> they have no dependencies themselves*, so they are useless if you
>>>> already have a trove of info like the Maven POM.
>>>>
>>>> In my opinion, the real question is not how to map existing Maven
>>>> artifacts to Java modules, but rather
>>>> how Maven 4 artifacts are mapped to Java modules, and then how to make
>>>> the transition from Maven 3 artifacts to Maven 4 artifacts as smooth
>>>> as possible.
>>>>
>>>> Here is my take on what a Maven 4 artifact can be:
>>>> - a Maven 4 artifact can only depend on other Maven 4 artifacts (and
>>>>   there are some ways to see a Maven 3 artifact as a Maven 4 artifact
>>>>   if the POM is simple enough),
>>>> - a Maven 4 artifact does not allow split packages (a lot of Maven 3
>>>>   artifacts use split packages because it's a cool way to do an
>>>>   after-the-fact modularisation without changing the name of the
>>>>   module),
>>>> - a Maven 4 artifact's info is specified with info extracted from the
>>>>   module-info and from the POM (version is in the POM, exported
>>>>   packages are in the module-info, ...)
>>>> etc.
>>>>
>>>> Once you have the precise rules, it will be easier to see how to map a
>>>> Maven 3 artifact to a Maven 4 artifact and what the compatibility
>>>> rules are.
>>>>
>>>> regards,
>>>> Rémi
>>>>
>>>> * unless you want to play with configurations that mix modulepath and
>>>> classpath, but these kinds of configurations are really hard to debug.
>>>>
>>>> ----- Original Message -----
>>>>> From: "Robert Scholte"
>>>>> To: "Remi Forax"
>>>>> Cc: jpms-spec-experts at openjdk.java.net, "Brian Fox"
>>>>>
>>>>> Sent: Tuesday, 17 January 2017 13:04:08
>>>>> Subject: Re: Advice + proposals regarding automodule naming
>>>>
>>>>> Hi Rémi,
>>>>>
>>>>> In the end every non-jdk.* and non-java.* module in the module-info
>>>>> will be a dependency in your build tool descriptor.
Such a module must match
>>>>> exactly one versionless dependency, or conflictId as we call it, which
>>>>> is in general the groupId + artifactId (type and classifier are not
>>>>> relevant for this story).
>>>>> By ignoring the groupId, a module can be referred to by multiple
>>>>> dependencies, so we can expect collisions. For that reason Brian did a
>>>>> quick scan over Maven Central to count the number of duplicate
>>>>> artifactIds.
>>>>>
>>>>> Here are the artifactIds with 100+ groupIds:
>>>>>
>>>>> maven_artifact_id  count(DISTINCT maven_group_id)  count(maven_group_id)
>>>>> library            391                             6854
>>>>> core               312                             8188
>>>>> common             142                             5084
>>>>> ui                 138                             1414
>>>>>
>>>>> In theory I could have a Maven project with 391 'library'-jars on the
>>>>> classpath without any problem. And as long as they are direct
>>>>> dependencies, I have control over this by simply not adding 'library'
>>>>> as a requirement to module-info. The issues start when different
>>>>> 'library'-jars are transitive dependencies and when they are marked as
>>>>> required in the module-info file of my direct or transitive
>>>>> dependencies.
>>>>>
>>>>> Developers of the 'library'-jars cannot use library as the module name
>>>>> and are forced to pick another name. As the developer of my project,
>>>>> in the end I decide which versions of dependencies are used. If the
>>>>> 'library'-jar gets a different module name and my dependency is still
>>>>> referring to the old module name, the project can't be built.
>>>>>
>>>>> What I expect is that developers will be forced to remove the
>>>>> requirements from their module-info because of the mentioned issues.
>>>>> So instead of increasing, the number of requirements will be reduced.
>>>>> For that reason we say: either use a unique module name from the
>>>>> beginning (GA), or wait until a dependency has its own module name
>>>>> before adding it as a requirement.
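
The collision described in the quoted message above can be illustrated with a short sketch. It is not the JDK's implementation: `deriveName` is a hypothetical helper that only approximates a filename-based rule (drop a trailing version, turn non-alphanumeric characters into dots), and the coordinates in `main` are made-up examples rather than data from the Maven Central scan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AutoNameCollisions {

    // Assumed start of a version: a hyphen, digits, then a dot or end of input.
    private static final Pattern VERSION_START = Pattern.compile("-(\\d+(\\.|$))");

    /** Hypothetical, simplified filename-to-module-name rule (an approximation). */
    public static String deriveName(String jarFileName) {
        String base = jarFileName.endsWith(".jar")
                ? jarFileName.substring(0, jarFileName.length() - 4)
                : jarFileName;
        Matcher m = VERSION_START.matcher(base);
        if (m.find()) {
            base = base.substring(0, m.start()); // keep only the part before the version
        }
        return base.replaceAll("[^A-Za-z0-9]", ".")   // e.g. dashes become dots
                   .replaceAll("\\.{2,}", ".")        // collapse repeated dots
                   .replaceAll("^\\.|\\.$", "");      // trim leading/trailing dot
    }

    /** Maven names the file artifactId-version.jar; the groupId is not part of it. */
    public static String fileNameOf(String coordinate) {
        String[] parts = coordinate.split(":"); // groupId:artifactId:version
        return parts[1] + "-" + parts[2] + ".jar";
    }

    /** Groups coordinates by derived name and keeps only names shared by 2+ coordinates. */
    public static Map<String, List<String>> collisions(List<String> coordinates) {
        Map<String, List<String>> byName = new TreeMap<>();
        for (String c : coordinates) {
            byName.computeIfAbsent(deriveName(fileNameOf(c)), k -> new ArrayList<>()).add(c);
        }
        byName.values().removeIf(group -> group.size() < 2);
        return byName;
    }

    public static void main(String[] args) {
        List<String> coordinates = Arrays.asList(
                "com.example:library:1.0",
                "org.sample:library:2.3",
                "org.sample:core-utils:1.0");
        collisions(coordinates).forEach((name, group) ->
                System.out.println(name + " <- " + group));
    }
}
```

With these sample coordinates only `library` is reported, because the two groupIds that distinguish the artifacts never reach the file name.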
>>>>>
>>>>> As far as I know this is the first time the JDK/JRE decides
>>>>> (proposes) a name for an entity based on another entity. There are no
>>>>> relations between method, class, or package names, and there doesn't
>>>>> have to be a relation between the module name and the filename, so
>>>>> please don't try to create one.
>>>>>
>>>>> regards,
>>>>> Robert
>>>>>
>>>>> On Mon, 16 Jan 2017 16:44:03 +0100, Remi Forax
>>>>> wrote:
>>>>>> Hi Robert,
>>>>>> the problem with automatic modules is more general than just the
>>>>>> name: automatic modules also create a flat hierarchy, which doesn't
>>>>>> map well to the Maven artifact descriptor.
>>>>>>
>>>>>> I wonder why you want Maven to use automatic modules. Or, said
>>>>>> differently: Maven has a lot of information about the artifact, so
>>>>>> why do you want to forget all this information when fetching a Maven
>>>>>> artifact?
>>>>>>
>>>>>> I think that one problem is that you do not want to create a
>>>>>> module-info.class from the Maven POM and insert it into the jar,
>>>>>> because it will change the artifact*.
>>>>>> This kind of module is supported by Jigsaw under the name of
>>>>>> synthetic modules. A synthetic module is a module with a module
>>>>>> descriptor created not by javac but by another tool.
>>>>>>
>>>>>> In my opinion, automatic modules are interesting when you have a jar
>>>>>> that does not come from Maven Central but from an ad-hoc build tool
>>>>>> and will be considered a leaf of the dependency DAG.
>>>>>> Otherwise, for an existing module system, using a synthetic module
>>>>>> seems to be a better idea.
>>>>>>
>>>>>> regards,
>>>>>> Rémi
>>>>>>
>>>>>> * given that you also have the problem of split packages, you also
>>>>>> need a way to merge several artifacts into one modular jar, because
>>>>>> it's the easy way to solve the split package problem.

From rfscholte at apache.org Thu Jan 19 20:07:51 2017
From: rfscholte at apache.org (Robert Scholte)
Date: Thu, 19 Jan 2017 21:07:51 +0100
Subject: Advice + proposals regarding automodule naming
In-Reply-To: <419778179.1911914.1484849657427.JavaMail.zimbra@u-pem.fr>
References: <652339338.395860.1484581443472.JavaMail.zimbra@u-pem.fr>
 <1798601919.942643.1484691071823.JavaMail.zimbra@u-pem.fr>
 <626433667.1789430.1484835658855.JavaMail.zimbra@u-pem.fr>
 <419778179.1911914.1484849657427.JavaMail.zimbra@u-pem.fr>
Message-ID:

On Thu, 19 Jan 2017 19:14:17 +0100, wrote:

>
>
> ----- Original Message -----
>> From: "Robert Scholte"
>> To: forax at univ-mlv.fr
>> Cc: jpms-spec-experts at openjdk.java.net, "Brian Fox"
>>
>> Sent: Thursday, 19 January 2017
16:43:39
>> Subject: Re: Advice + proposals regarding automodule naming
>
>>>> Hi Rémi,
>>>
>>> Robert,
>>>
>>>> I'm getting a JavaOne 2015 déjà vu :)
>>>
>>> i was not at JavaOne, so let's say we're progressing; at least, i may start to understand your problem better ...
>>>
>>>> It seems like you expect there will be a new POM definition to support this kind of extra information.
>>>> The current POM modelVersion (4.0.0) is not only used by Maven but by a lot of tools, probably even more than we know of. We wonder if they do XSD checking, so we must be very, very careful with every adjustment. So pom-4.0.0 is a fact with all its restrictions. We are working on pom-5.0.0 but we will always make sure there will also be a pom-4.0.0 available (either pre-generated or runtime transformed) for the current tools. Also, its definition should work for any software technology, not just for Java.
>>>> In the beginning I had the idea of working with new scopes to decide if a dependency belongs to the modulepath or classpath, but there's a strict set of scopes in pom-4.0.0, so again no option. And by now I know this is not required; the info is already there once I can read all module-info files.
>>>> It would have helped if a modular jar had a different extension, so everyone can see from the *outside* what kind of jar it is.
>>>
>>> Testing if a jar is a modular jar or not is easy BTW,
>>> ModuleFinder.of(Paths.get("my.jar")).findAll().iterator().next().descriptor().isAutomatic()
>>>
>>> true means it's a plain old jar, false means it's a modular jar.
>>>
>>
>> I know, I'm already using this trick in the maven-dependency-plugin
>>
>>>> There's no such thing as a Maven4 artifact: any artifact is a file (often a jar) with a coordinate and an extra file with dependency declarations.
>>>
>>> for me, Maven4 artifact == jar + POM v5
>>>
>>>> During dependency resolution all build information is ignored! The problem with the module-info file is comparable with the Java bytecode version: you have to go into the jar to get this information.
>>>
>>> yes,
>>> but you do not need to know if it's a modular jar or a plain old jar during the dependency resolution, you can trust the Maven Central info, and then when installing, you can decide which jars should go in the classpath, which ones should go in the modulepath (or which ones should be upgraded from a plain jar to a modular jar because you can use the POM info to generate a compatible module-info.class)
>>>
>>
>> There is absolutely no reason to introduce a new POM version for this. The required information is inside the jar; even when jars are built with other tools, the info is there. It is actually a very good thing that Java 9 doesn't require a brand new Maven. I personally advertised that our challenge was to make it all work with Maven 3.0 and it does.
>
> I agree that it can work with Maven 3, as you said the information is already there.
> But for the future, having almost the same dependency metadata encoded in two places is something that will be weird to people who start to use Java.
>
> But let's focus on Maven 3 ...
>
>>
>> bq. "i fully agree with you that Maven can not use automatic modules."
>> Well, Maven could do it, but we don't want to because we cannot trace it back to the right dependency AND ensure for 100% this was indeed the intended dependency to be the automodule.
>> But this is how we look at it from a Maven perspective. Any other build tool is free to decide what its strategy will be. As long as developers can refer to an automatic module in their module-info, such a jar can end up in Maven Central or any other repository and all build tools must be able to handle it.
>
> The real problem is that you can have jars in the classpath and in the module path, and it doesn't mean the same thing.

Exactly!

> Build tools do not have to manage automatic modules, you have to manage automatic modules only if you introduce a rule that decides where to put each jar.

That rule is already there: it is the "requires M.N" in the module-info file, but it isn't aware whether that module is automatic or not. This file should be the only place where people should care about Jigsaw-related definitions. At conferences THE recurring question is: can you generate the module-info for me, because I've already specified my dependencies in the pom. So there is already some frustration here, so we must be very clear about what belongs where and what each responsibility is:

pom.xml : specify the dependency coordinates you want to use, so Maven can download them and can make them available for the maven-plugins
module-info.java : specify the modules required to compile and/or run; these will be checked upfront to confirm everything is there.
maven-compiler-plugin : build up the correct arguments to call the java compiler.

In the configuration of the maven-compiler-plugin it is of course possible to define which jars belong on which path, but that would make it very hard to use and maintain.
For example: for test-compile I could have added a parameter called "moduleName" and used this value for the -Xmodule: argument. I've decided not to do so, because the name is already specified in the module-info. For me it was kind of frustrating that one assumed the name of the module was a given fact, which implied I had to parse the module-info file. b+148 killed this strategy for a while, but it is fixed again. If only I could have pointed to target/classes ... ;) Anyway, this is solved once the structure of the module-info file is final.

To prevent users from adding their requirements to the module-info file AND specifying which jars belong on which path in either the [...] or the [...] of the maven-compiler-plugin, I think we should "simply" read the module-info.java*, gather all requirements and find the matching jar. And do that recursively.

* I maintain QDox, a Java source parser, as well.

> And also do not forget that deciding to put every jar in the modulepath does not work because of split packages.
>

Exactly!

>>
>> In our case Maven Central could think of adding rules to verify that the module-info never refers to auto modules, but that's just one repository. As long as there's support for auto modules, they will show up anywhere and will become another dependency for a Maven project. If such a dependency becomes a requirement in the module-info, the build will fail since Maven has detected an auto module and cannot be 100% sure which dependency is related to it. So we must advise: don't "require" that module, which is the opposite of what we want to achieve: best practice should be to add as many *valid* requirements to the module-info as possible.
>
> Do not try to require a non-modularized jar because you know neither its name nor its exported packages.
> And because we have Maven Central, helping developers of popular libraries to add a module-info seems a good strategy.
>
> At least, it avoids the problem we have with generics, which is that we still have libraries that have raw types.
>
>>
>> regards,
>> Robert
>>
>
> Rémi
>
>>>> At the moment I'm pretty far with the maven-compiler-plugin, but now every dependency acts like an automodule. My next step would probably be to analyze every module-info file and decide if jars belong to the classpath or modulepath, only allowing modular jars on the module path because of our concerns.
>>>
>>> yes,
>>> as i said in the previous paragraph, you can also decide that with the help of the POM info, you can try to upgrade the jar to make it modular.
>>>
>>>> regards,
>>>> Robert
>>>
>>> regards,
>>> Rémi
>>>
>>>> On Tue, 17 Jan 2017 23:11:11 +0100, wrote:
>>>>
>>>>> Robert,
>>>>> i fully agree with you that Maven can not use automatic modules.
>>>>> Automatic modules have weird name rules, everything is exported and they have no dependencies themselves*, so they are useless if you already have a trove of info like the Maven POM.
>>>>>
>>>>> In my opinion, the real question is not how to map existing Maven artifacts to Java modules but more, how Maven 4 artifacts are mapped to Java modules and then how to make the transition from Maven 3 artifacts to Maven 4 artifacts as smooth as possible.
>>>>>
>>>>> Here is my take on what can be a Maven 4 artifact,
>>>>> - a Maven 4 artifact can only depend on other Maven 4 artifacts (and there is some way to see a Maven 3 artifact as a Maven 4 artifact if the POM is simple enough),
>>>>> - a Maven 4 artifact does not allow split packages (a lot of Maven 3 artifacts use split packages because it's a cool way to do an after-the-fact modularisation without changing the name of the module)
>>>>> - a Maven 4 artifact info is specified with info extracted from the module-info and from the POM (version is in the POM, exported packages are in the module-info, ...)
>>>>> etc.
>>>>>
>>>>> once you have the precise rules, it will be easier to see how to map a Maven 3 artifact to a Maven 4 one and what the compatibility rules are.
>>>>>
>>>>> regards,
>>>>> Rémi
>>>>>
>>>>> * apart if you want to play with configurations that mix modulepath and classpath but these kinds of configurations are really hard to debug.
>>>>>
>>>>> ----- Original Message -----
>>>>>> From: "Robert Scholte"
>>>>>> To: "Remi Forax"
>>>>>> Cc: jpms-spec-experts at openjdk.java.net, "Brian Fox"
>>>>>> Sent: Tuesday, 17 January 2017 13:04:08
>>>>>> Subject: Re: Advice + proposals regarding automodule naming
>>>>>
>>>>>> Hi Rémi,
>>>>>>
>>>>>> In the end every non-jdk.* and non-java.* module in the module-info will be a dependency in your build tool descriptor. Such a module must match exactly one versionless dependency, or conflictId as we call it, which is in general the groupId + artifactId (type and classifier are not relevant for this story).
>>>>>> By ignoring the groupId, a module can be referred to by multiple dependencies. So we can expect collisions. For that reason Brian did a quick scan over Maven Central to count the number of duplicate artifactIds.
>>>>>>
>>>>>> Here are the artifactIds with 100+ groupIds:
>>>>>> maven_artifact_id   count(DISTINCT maven_group_id)   count(maven_group_id)
>>>>>> library             391                              6854
>>>>>> core                312                              8188
>>>>>> common              142                              5084
>>>>>> ui                  138                              1414
>>>>>>
>>>>>> In theory I could have a Maven project with 391 'library'-jars on the classpath without any problem. And as long as they are direct dependencies I have control over this by simply not adding 'library' as a requirement to module-info. The issues start when different 'library'-jars are transitive dependencies and when they are marked as required in the module-info file of my direct or transitive dependencies.
>>>>>>
>>>>>> Developers of the 'library'-jars cannot use library as the module name and are forced to pick another name. As developer of my project, in the end I decide which versions of dependencies are used.
>>>>>> If the 'library'-jar gets a different module name and my dependency is still referring to the old module name, the project can't be built.
>>>>>>
>>>>>> What I expect is that developers are forced to remove the requirements from their module-info because of the mentioned issues. So instead of increasing, the number of requirements will be reduced. For that reason we say either use a unique module name from the beginning (GA) or wait until a dependency has its own module name before adding it as a requirement.
>>>>>>
>>>>>> As far as I know this is the first time the JDK/JRE decides (proposes) a name for an entity based on another entity. There are no relations between method, class, or package names and there doesn't have to be a relation between the module name and the filename, so please don't try to create one.
>>>>>>
>>>>>> regards,
>>>>>> Robert
>>>>>>
>>>>>> On Mon, 16 Jan 2017 16:44:03 +0100, Remi Forax wrote:
>>>>>>> Hi Robert,
>>>>>>> the problem with automatic modules is more general than just the name, automatic modules also create a flat hierarchy which doesn't map well to the Maven artifact descriptor.
>>>>>>>
>>>>>>> I wonder why you want Maven to use automatic modules, or said differently, Maven has a lot of information about the artifact, why do you want to forget all this information when fetching a Maven artifact.
>>>>>>>
>>>>>>> I think that one problem is that you do not want to create a module-info.class from the Maven POM and insert it into the jar because it will change the artifact*.
>>>>>>> This kind of module is supported by jigsaw under the name of synthetic modules. A synthetic module is a module with a module descriptor not created by javac but by another tool.
>>>>>>>
>>>>>>> In my opinion, automatic modules are interesting when you have jars that do not come from Maven Central but come from an ad-hoc build tool and will be considered as a leaf of the dependency DAG.
>>>>>>> Otherwise, for an existing module system, using a synthetic module seems to be a better idea.
>>>>>>>
>>>>>>> regards,
>>>>>>> Rémi
>>>>>>>
>>>>>>> * given you also have the problem of split packages, you also need a way to merge several artifacts into one modular jar because it's the easy way to solve the split package problem.
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>>> From: "Robert Scholte"
>>>>>>>> To: jpms-spec-experts at openjdk.java.net
>>>>>>>> Cc: "Apache Maven Dev"
>>>>>>>> Sent: Monday, 16 January 2017 10:37:08
>>>>>>>> Subject: Advice + proposals regarding automodule naming
>>>>>>>
>>>>>>>> This is a message from Robert Scholte and Brian Fox. We both have been talking about this topic for several weeks with other Maven developers and came to the conclusion that we should warn the jigsaw team about their current approach regarding auto modules. We will share our experiences, thoughts and conclusions, and will suggest two proposals.
>>>>>>>>
>>>>>>>> Traditionally, the Java ecosystem has been very mature in terms of naming and namespacing. The reverse fqdn introduced into the Java package was a great choice to ensure classes don't conflict. Popular build tools such as Maven, and nearly all those that followed, built upon this key concept with the introduction of "GroupId", also using the fqdn as part of the name to ensure the coordinates were properly namespaced.
>>>>>>>>
>>>>>>>> We've seen some ecosystems diverge from this, leading to new challenges that ultimately had to be reversed. A great example can be seen in the "tragic mistake from npm creators" [1], which was to launch without a namespace concept. Eventually, NPM started running out of useful names and had to backtrack to introduce "scopes", which is really just a namespace [2]. The real problem here is that the major change in namespace was baked in after several years of momentum without it. It's taken a long time for tooling and best practice to catch up to scopes and in the interim, people have been left with a dual-mode, some namespaced, some not namespaced situation that has created chaos. [3]
>>>>>>>>
>>>>>>>> The real issue at hand here as we consider behaviors in the jigsaw automodule revolves around two well-studied concepts.
>>>>>>>>
>>>>>>>> The most important is the "Default effect" [4], which states that whatever the default behavior is will become the most prominent best practice. A default that uses a filename to generate a very short, un-namespaced module id effectively sets the behavior to create generic names that will eventually conflict... exactly what we've seen in npm.
>>>>>>>>
>>>>>>>> Additionally, the switching cost introduced in moving from a default un-namespaced module id to one with a unique namespace is also significant once you consider all the potential users. This is why API change is hard, and changing the module id after the fact from the default is effectively an API change.
>>>>>>>>
>>>>>>>> The second principle at hand is the "Principle of least astonishment".
>>>>>>>> We want to find a default that doesn't violate what most users would consider to be the most obvious. One could argue the current auto module algorithm doesn't violate this principle, but it's important to consider alternate suggestions in this light.
>>>>>>>>
>>>>>>>> First, let's explore the potential downsides if the default effect takes hold with the currently generated auto module id. In Apache Maven, the artifact id is the part of the coordinate that generates the filename. This means that com.somecompany:artifact:version will become artifact-version.jar, which would result in automodule id "artifact". Armed with this understanding, what does an analysis of the Maven ecosystem have to say about potential conflicts in the automodule id?
>>>>>>>>
>>>>>>>> If we ignore the groupid and version of all the components in the Maven Central repository, we end up with over 13,500 (7% of the total group:artifact combinations) conflicts. This does not consider conflicts across other repositories, or within customer portfolios, yet it is pretty telling. Conflicts will happen. In some cases, the number of conflicts on the same common names is well above 100. The list of conflicts as of October, 2016 can be seen here. [6]
>>>>>>>>
>>>>>>>> At this point, hopefully we've made the case for at least establishing a default module id that
>>>>>>>> 1. Uses namespaces to minimize id conflicts when possible
>>>>>>>> 2. Leverages the default effect to create a de facto best practice
>>>>>>>> 3. Follows the principle of least astonishment
>>>>>>>>
>>>>>>>> We have two potential proposals that meet these goals.
>>>>>>>>
>>>>>>>> Proposal 1: Leverage existing coordinates when available.
>>>>>>>>
>>>>>>>> Maven is inarguably the most popular build system for Java components, with Maven Central being the default and largest repository of Java components in the world. By default, every jar built by Maven automatically gets a simple properties file inserted into it with its unique coordinates. Now, not every jar in Central was built with Maven, however 94% of them were, as we can find the pom.properties file in 1,806,023 of the 1,913,561 central components. Talk about the default effect in action!
>>>>>>>>
>>>>>>>> It's further important to recognize that given a jar with a pom.properties declaring coordinates, it means that the project itself has chosen those coordinates as its own name. In other words, this is how they refer to themselves, even if other consumers may not be using Maven directly.
>>>>>>>>
>>>>>>>> If automodule were able to peek inside a jar and generate the default id using the groupid and artifactid present in the file, this would nearly eliminate all instances of id conflict because a significant portion of the Java ecosystem is in fact built with Maven. Additionally, the fact that 1.8 million (and counting) modules would have a namespace as the default behavior means we've taken a huge step in setting the best practice of picking module ids with a namespace. Additionally, since the project itself has chosen these coordinates and uses them as its primary distribution mechanism, this follows the principle of least astonishment to consumers regardless of their chosen build system.
>>>>>>>> Finally, since all of the above are true, it's unlikely the project would need to migrate to a new module id when they adopt jigsaw natively, thus avoiding an API switching cost for their users.
>>>>>>>>
>>>>>>>> Proposal 2: Drop automodules
>>>>>>>> Right now Jigsaw tries to calculate a module name solely based on the name of the jar file, which already causes issues. Besides the fact that the module name is not guaranteed to be unique compared with its Maven coordinate, there are extra transformations which make it even less likely to be unique; e.g. dashes are replaced by dots (which are both valid artifactId characters), and in some cases the number and the characters following it are stripped off. For artifacts like jboss-servlet-api_4.0_spec it makes sense, however we already see issues here where commons-lang, commons-lang2 and commons-lang3 get the same module name, even though they have different artifactIds and contain different packages. Choosing different artifactIds and packages was a very wise decision because it made it possible for these jars to live next to each other. Removing that separation made by the authors is a very unwise decision.
>>>>>>>>
>>>>>>>> Another known example is the jsrNNN jars, which now all get jsr as the module name.
>>>>>>>>
>>>>>>>> It is highly unlikely there is one single rule to capture all the use cases and which always results in a module name we can work with.
>>>>>>>>
>>>>>>>> For that reason the other proposal is to simply drop automodules. Don't try to come up with a name for unnamed jars.
>>>>>>>> It might look like the feature of automodules makes migrating easier because every dependency will get a name so you can complete your module-info for all requirements, but we expect that once Jigsaw comes up to speed the invalid module names will actually block further development due to name collisions or forced renaming by transitive modular jars.
>>>>>>>>
>>>>>>>> The advantage of this proposal is that library builders are not forced to keep the proposed module name in order to maintain backwards compatibility with the default. Instead library builders can pick a more suitable module name. The modular system doesn't allow the same package to be exported by multiple jars (and an automodule exports every package). Library builders can fix this in their new jars, however if end users would require both jars because they were specified as requirements in different transitive jars, you cannot compile this project. There are just no dependency excludes like Maven has, because "requires" in the module-info really means requires. Dropping automodules will prevent these kinds of issues, because a package can only be exported by a named module.
>>>>>>>>
>>>>>>>> Sure, this means that end users cannot refer to every jar in their module-info. But at least if they add a "requires" to their module-info, they can ensure that it'll always refer to the intended modular jar. With build tools like Maven the chance of missing artifacts on the classpath has already been reduced a lot.
>>>>>>>> In general builds have become quite stable, so we don't expect that developers will translate all dependencies to the module-info file, especially if we warn them about the possible consequences of depending on automodules. Only referring to named modules, even with a single "requires", is already a gain. There's no reason to try to speed this up and give the developer the false impression that it'll keep working when upgrading to real modular jars. Focus should be on the target, not on the path to reach it.
>>>>>>>>
>>>>>>>> Dropping the automodules will prevent a lot of discussions about what is the correct way to select a module name and will give the responsibility for the name back to the place where it belongs: the developer.
>>>>>>>>
>>>>>>>> [1] http://stackoverflow.com/questions/22053381/lack-of-available-module-names-on-npm
>>>>>>>> [2] http://blog.npmjs.org/post/116936804365/solving-npms-hard-problem-naming-packages
>>>>>>>> [3] The fact that so much of the npm ecosystem is effectively not-namespaced has actually created potential build-time malware injection possibilities. If I know of a package in use by a company through log analysis, bug report analysis etc., I could potentially go register the same name in the default repo with a very high semver and know that it's very likely this would be picked up over the intended internally developed module because there's no namespace.
>>>>>>>> [4] https://en.wikipedia.org/wiki/Default_effect_(psychology)
>>>>>>>> [5] https://en.wikipedia.org/wiki/Principle_of_least_astonishment
>>>>>>>> [6] https://docs.google.com/spreadsheets/d/1TVR5uTpDYw0827AlvPRu8l95zHnFPL_g61TdPtnjQ5M/edit?usp=sharing
>>>>>>>> [7] http://openjdk.java.net/jeps/261 #Risk and assumptions
>>>>>>>> [8] https://www.mail-archive.com/jigsaw-dev at openjdk.java.net/msg06623.html

From forax at univ-mlv.fr Thu Jan 19 22:47:11 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 19 Jan 2017 23:47:11 +0100 (CET)
Subject: Advice + proposals regarding automodule naming
In-Reply-To:
References: <1798601919.942643.1484691071823.JavaMail.zimbra@u-pem.fr> <626433667.1789430.1484835658855.JavaMail.zimbra@u-pem.fr> <419778179.1911914.1484849657427.JavaMail.zimbra@u-pem.fr>
Message-ID: <1126484066.1967528.1484866031037.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Robert Scholte"
> To: forax at univ-mlv.fr
> Cc: jpms-spec-experts at openjdk.java.net, "Brian Fox"
> Sent: Thursday, 19 January 2017 21:07:51
> Subject: Re: Advice + proposals regarding automodule naming

[...]

>>
>> The real problem is that you can have jars in the classpath and in the module path, and it doesn't mean the same thing.
> Exactly!
>
>> Build tools do not have to manage automatic modules, you have to manage automatic modules only if you introduce a rule that decides where to put each jar.
>
> That rule is already there: it is the "requires M.N" in the module-info file, but it isn't aware whether that module is automatic or not. This file should be the only place where people should care about Jigsaw-related definitions.

clever :)

So you start with the current code, check if there is a module-info.java, if so all required modules need to be in the modulepath recursively, everything else is in the classpath.
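
The rule just described (required modules go on the modulepath recursively, everything else stays on the classpath) could be sketched roughly as below. This is only an illustration under assumptions: `PathPartitioner`, its method names and the `requiresByModule` map are hypothetical and come neither from Maven nor from the JDK, the map stands in for whatever the build tool has already read from each module-info, and matching a module name to the jar that provides it is left out. Skipping java.* and jdk.* names follows the earlier remark in this thread that only the other modules are build dependencies.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; it is not part of Maven or of the JDK.
final class PathPartitioner {

    // Modules shipped with the runtime are not build dependencies.
    static boolean isPlatformModule(String name) {
        return name.startsWith("java.") || name.startsWith("jdk.");
    }

    // Follows "requires" edges from the project's own module-info and returns
    // every non-platform module reached. Names without an entry in
    // requiresByModule are treated as leaves.
    static Set<String> modulepath(Set<String> rootRequires,
                                  Map<String, Set<String>> requiresByModule) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>(rootRequires);
        while (!pending.isEmpty()) {
            String name = pending.pop();
            if (!isPlatformModule(name) && visited.add(name)) {
                pending.addAll(requiresByModule.getOrDefault(name, Set.of()));
            }
        }
        return visited;
    }

    // Every dependency that was not reached stays on the classpath.
    static Set<String> classpath(Set<String> allDependencies, Set<String> modulepath) {
        Set<String> rest = new LinkedHashSet<>(allDependencies);
        rest.removeAll(modulepath);
        return rest;
    }
}
```

Working on names keeps the sketch independent of how a tool reads module descriptors; a real implementation would also have to decide what to do when a required name matches no jar or more than one.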

> At conferences THE recurring question is: can you generate the module-info for me, because I've already specified my dependencies in the pom. So there is already some frustration here, so we must be very clear about what belongs where and what each responsibility is:
> pom.xml : specify the dependency coordinates you want to use, so Maven can download them and can make them available for the maven-plugins
> module-info.java : specify the modules required to compile and/or run; these will be checked upfront to confirm everything is there.
> maven-compiler-plugin : build up the correct arguments to call the java compiler.

I agree,
you also need to crosscheck that the rules in the pom.xml match the rules in the module-info.java (or the other way around).

>
> In the configuration of the maven-compiler-plugin it is of course possible to define which jars belong on which path, but that would make it very hard to use and maintain.

but at the same time you need a way to be able to see a plain old jar as a modular jar (without being an automatic module) to ease the transition.
Here i think you need the user to say: Maven, help me to generate a synthetic module (a synthetic module-info.class that will be injected in the jar of the dependency) from the information available in the pom of that plain old jar.
It's not clear to me if this is something that should be in the configuration or in the dependency of the pom.

> For example: for test-compile I could have added a parameter called "moduleName" and used this value for the -Xmodule: argument. I've decided not to do so, because the name is already specified in the module-info. For me it was kind of frustrating that one assumed the name of the module was a given fact, which implied I had to parse the module-info file. b+148 killed this strategy for a while, but it is fixed again. If only I could have pointed to target/classes ... ;) Anyway, this is solved once the structure of the module-info file is final.

i've decided to not use -Xmodule to run the tests but to merge the module-info.java of the test and the module-info.java of the main, which allows me to specify test-only dependencies like junit in the module-info.java of the test.

>
> To prevent users from adding their requirements to the module-info file AND specifying which jars belong on which path in either the [...] or the [...] of the maven-compiler-plugin, I think we should "simply" read the module-info.java*, gather all requirements and find the matching jar. And do that recursively.
>
> * I maintain QDox, a Java source parser, as well.

This is exactly what i'm doing [1] but i use javac instead of writing my own parser :)
(the fact that i can require the module java.compiler makes things easy)

>
>> And also do not forget that deciding to put every jar in the modulepath does not work because of split packages.
>>
> Exactly!
>

[...]

Rémi

[1] https://github.com/forax/pro/blob/master/src/main/java/com.github.forax.pro.helper/com/github/forax/pro/helper/parser/JavacModuleParser.java

From rfscholte at apache.org Fri Jan 20 11:08:03 2017
From: rfscholte at apache.org (Robert Scholte)
Date: Fri, 20 Jan 2017 12:08:03 +0100
Subject: Advice + proposals regarding automodule naming
In-Reply-To: <1126484066.1967528.1484866031037.JavaMail.zimbra@u-pem.fr>
References: <1798601919.942643.1484691071823.JavaMail.zimbra@u-pem.fr> <626433667.1789430.1484835658855.JavaMail.zimbra@u-pem.fr> <419778179.1911914.1484849657427.JavaMail.zimbra@u-pem.fr> <1126484066.1967528.1484866031037.JavaMail.zimbra@u-pem.fr>
Message-ID:

On Thu, 19 Jan 2017 23:47:11 +0100, wrote:

> ----- Original Message -----
>> From: "Robert Scholte"
>> To: forax at univ-mlv.fr
>> Cc: jpms-spec-experts at openjdk.java.net, "Brian Fox"
>> Sent: Thursday, 19 January 2017 21:07:51
>> Subject: Re: Advice + proposals regarding automodule naming
>
> [...]
> >>> >>> The real problem is that you can have jars in the classpath and in the >>> module path, and it doesn't mean the same thing. >> Exactly! >> >>> Build tool do not have to manage automatic modules, you have to manage >>> automatic modules only if you introduce a rule that decide where to put >>> each jar. >> >> That rule is already there, it is the "require M.N" in the module-info >> file, but it isn't aware if that module automatic or not. This file >> should >> be the only place where people should care about Jigsaw related >> definitions. > > clever :) > > So you start with the current code, check if there is a > module-info.java, if so all required modules need to be in the > modulepath recursively, everything else is in the classpath. > >> At conferences THE returning question is: can you generate >> the module-info for me, because I've already specified my dependencies >> in >> the pom. So there is already some frustration here, so we most be very >> clear what belongs where and what every responsibility is: >> pom.xml : specify the dependency coordinates you want to use, so Maven >> can >> download them and can make them available for the maven-plugins >> module-info.java : specify the modules required to compile and/or run; >> these will be checked upfront to confirm everything is there. >> maven-compiler-plugin : build up the correct arguments to call the java >> compiler. > > I agree, > you also need to crosscheck that the rules in the pom.xml match the > rules in the module-info.java (or the other way around). > >> >> In the of the maven-compiler-plugin it is of course >> possible to define which jars belong on which path, but that would make >> it >> very hard to use and maintain. > > but at the same time you have need a way to be able to see a plain old > jar as a modular jar (without being an automatic module) to ease the > transition. 
> Here i think you need the user to say, Maven help me to generate a > synthetic module (a synthetic module-info.class that will be injected in > the jar of the dependency) > from the information available in the pom of that plain old jar. > It's not clear to me if this is something that should be in the > configuration or in the dependency of the pom. > I think we need to clarify the term "ease of transition". What are we expecting and why? It looks to me the expectation is that every current existing Java project should be able to have a module-info where every dependency is specified as a requirement. We all agree (and have accepted) that in case of split packages this will not work, so we drop a little in percentages. Brian and I will go one step beyond: you cannot require unnamed / automatic modules. Even if we are going to consider synthetic modules, that won't solve the problem. This would require the developer to specify the module name for all his requirements AND including the transitive ones. The latter is required because there's no way to pass this kind of information to depending projects. Stability is the keyword here, and the automatic modules cannot guarantee stability. Some developers have asked me: why do we need this? Tools like Maven are already superb in selecting all required dependencies for both compile time and runtime. Results are stable. And those few times there are issues, I was able to fix it by adding or excluding dependencies. With the module-info the result should be at least as stable as done with Maven. In case of automodules you could think of 2 options: #1 add as requires AND configure aliases for this synthetic module AND all its transitive unnamed modules. #2 drop requires for this module. I know which one I would choose. Also keep in mind: Maven Central also started empty. I don't know which jar was the first, but I'm pretty sure it can still be used. Give it time for jars to become modules. 
Ease of transition is already there because applications can refer to jars containing a module-info file, even if they don't use it. >> For example: for test-compile I could have added a parameter called >> "moduleName" and use this value for the -Xmodule: argument. I've decided >> not to do so, because the name is already specified in the module-info. >> For me it was kind of frustrating that one assumed the name of the >> module >> was a given fact, which implied I had to parse the module-info file. >> b+148 >> killed this strategy for a while, but it is fixed again. If only I could >> have point to target/classes ... ;) Anyway, this is solved once the >> structure of the module-info file is final. > > i've decided to not use -Xmodule to run the test but to merge the > module-info.java of the test and the module-info.java of the main, > this allows me to be able to specify test only dependencies like junit > in the module-info.java of the test. > For now I've decided not to support module-info files for tests. It is very unlikely that it will become a dependency for other projects. So it'll only be used within the build lifecycle, and since the user has already specified the dependencies, I see no reason why one should add the module-info as well. If a dependency is missing, the build will fail. Adding module-info here feels paranoid. >> >> To prevent users to add their requirements to the module-info file AND >> specify which jars belong on which path in either the or >> the >> of the maven-compiler-plugin, I think we should "simply" >> read the module-info.java*, gather all requirements and find the >> matching >> jar. And do that recursively. >> >> * I maintain QDox, a Java source parser, as well. 
> > This is exactly what i'm doing [1] but i use javac instead of writing my > own parser :) > (the fact that i can requires the module java.compiler makes the things > easy) > >> >>> And also do not forget that deciding to put every jars in the >>> modulepath >>> do not work because of split packages. >>> >> Exactly! >> > > [...] > > R?mi > > [1] > https://github.com/forax/pro/blob/master/src/main/java/com.github.forax.pro.helper/com/github/forax/pro/helper/parser/JavacModuleParser.java From david.lloyd at redhat.com Fri Jan 20 22:48:54 2017 From: david.lloyd at redhat.com (David M. Lloyd) Date: Fri, 20 Jan 2017 16:48:54 -0600 Subject: Maven Central will never be the universal Jigsaw module repository (Re: Advice + proposals regarding automodule naming) In-Reply-To: References: <1798601919.942643.1484691071823.JavaMail.zimbra@u-pem.fr> <626433667.1789430.1484835658855.JavaMail.zimbra@u-pem.fr> <419778179.1911914.1484849657427.JavaMail.zimbra@u-pem.fr> <1126484066.1967528.1484866031037.JavaMail.zimbra@u-pem.fr> Message-ID: <2c545f26-13fa-c446-d3df-6af0f92c11c9@redhat.com> On 01/20/2017 05:08 AM, Robert Scholte wrote: > I think we need to clarify the term "ease of transition". What are we > expecting and why? > It looks to me the expectation is that every current existing Java > project should be able to have a module-info where every dependency is > specified as a requirement. We all agree (and have accepted) that in > case of split packages this will not work, so we drop a little in > percentages. Brian and I will go one step beyond: you cannot require > unnamed / automatic modules. > Even if we are going to consider synthetic modules, that won't solve the > problem. This would require the developer to specify the module name for > all his requirements AND including the transitive ones. The latter is > required because there's no way to pass this kind of information to > depending projects. 
> Stability is the keyword here, and the automatic modules cannot > guarantee stability. > > Some developers have asked me: why do we need this? Tools like Maven are > already superb in selecting all required dependencies for both compile > time and runtime. Results are stable. And those few times there are > issues, I was able to fix it by adding or excluding dependencies. > > With the module-info the result should be at least as stable as done > with Maven. In case of automodules you could think of 2 options: #1 add > as requires AND configure aliases for this synthetic module AND all its > transitive unnamed modules. #2 drop requires for this module. I know > which one I would choose. > > Also keep in mind: Maven Central also started empty. I don't know which > jar was the first, but I'm pretty sure it can still be used. Give it > time for jars to become modules. Ease of transition is already there > because applications can refer to jars containing a module-info file, > even if they don't use it. The critical flaw in this analogy (which I'm afraid departs a little from the automatic module concept) is that the single, global module namespace that necessarily will have to exist in order for any centralized module repository cannot be met by Maven Central without a fundamental and complex change to the way that submissions are curated The reason for this is that today, a Maven artifact in Maven Central only has to resolve consistently relative to the set of artifacts it consumes, and (to a lesser extent because there's some flexibility here) the set of artifacts it is likely to coexist with. This flexibility and relativity goes most of the way to mitigate the fact that many Maven artifacts have conflicting packages and version requirements. Because of this, most of the time this is invisible to average users. 
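To make the "conflicting packages" point concrete, here is a minimal sketch of a split-package check: given the packages contained in each artifact, it reports every package that shows up in more than one artifact. This is only an illustration under stated assumptions. The class name, the method name and the artifact coordinates are made up, and the package lists are given by hand here, whereas a real tool would have to obtain them by scanning the jars.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Maven or the JDK.
class SplitPackageCheck {

    /** Maps each package found in more than one artifact to the artifacts that contain it. */
    static Map<String, Set<String>> splitPackages(Map<String, List<String>> packagesByArtifact) {
        Map<String, Set<String>> owners = new TreeMap<>();
        packagesByArtifact.forEach((artifact, packages) -> {
            for (String pkg : packages) {
                owners.computeIfAbsent(pkg, k -> new TreeSet<>()).add(artifact);
            }
        });
        // Keep only the packages that are owned by at least two artifacts.
        owners.values().removeIf(artifacts -> artifacts.size() < 2);
        return owners;
    }

    public static void main(String[] args) {
        // Made-up coordinates and packages, for illustration only.
        Map<String, List<String>> artifacts = Map.of(
            "org.example:api:1.0", List.of("org.example.api", "org.example.util"),
            "org.example:impl:1.0", List.of("org.example.impl", "org.example.util"));
        // prints {org.example.util=[org.example:api:1.0, org.example:impl:1.0]}
        System.out.println(splitPackages(artifacts));
    }
}
```

On the classpath such an overlap usually goes unnoticed; on the module path any non-empty result from a check like this means the two artifacts cannot both be resolved as modules.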
In the modular world though, not only must you resolve a set of artifacts that resolve in a mutually consistent way, but they also have to be 100% non-conflicting in terms of module specification, and more problematically, they have to be 100% mutually consistent in terms of dependency mesh. In order to have any sort of guarantee of consistency for any given module artifact, consistency must be guaranteed for *all* artifacts. The Maven Central model for artifacts fails in this regard for the exact same reason that there isn't, for example, one unified Linux package "mega-repository". Packaging issues aside, there are many competing implementations of the same specifications and solutions to the same problems; these things have rippling effects on compatibility. In order to create one, single, unified module repository for *everything* in Maven Central that is internally consistent would be a behemoth undertaking and a major maintenance burden. Thus the alternative is as I've expressed many times before. The ecosystem of artifacts remains an ecosystem of artifacts. Ecosystems of modules will be a new entity, a subset of available artifacts designed to solve a specific problem; some module ecosystems will be produced as single applications and others as development platform distributions targeted at various audiences and maintained by different entities with different goals. As a consequence, build systems which work to contribute these ecosystems necessarily will operate in one of two possible ways. The first way is to consume artifacts like one does today, and within the build environment and using ecosystem-specific metadata, wire it in to the module graph, before (probably) CI testing the resultant combination and committing it into the distribution. The second way is to consume sources as artifacts and use the same ecosystem-specific metadata to compile the sources, wiring it in as above. 
Expecting that we can start from an empty repository and build up The
One Single Central module repository is unrealistic because such a
repository either must be too constrained to be generally useful in the
way that Maven Central is useful, or it must be too inconsistent to be
useful in any nontrivial project.

I think that because a lot of users are still in the beginning or
experimental stages of modularization, these realities are not yet
obvious, and I hate to defeat optimism in this regard, but we have been
modularizing Maven artifacts for many years now with our own module
system, so we know firsthand how difficult it is to mesh hundreds of
artifacts into a single distribution, let alone the many thousands that
exist in Maven Central. I think that anyone developing nontrivial
distributions or applications will encounter these realities sooner or
later, so I hope that the experience we've gained will inform a more
sensible approach to module distribution than "throw it all in Maven
Central, it'll be fine".

--
- DML

From brianf at infinity.nu Sat Jan 21 01:34:33 2017
From: brianf at infinity.nu (Brian Fox)
Date: Fri, 20 Jan 2017 20:34:33 -0500
Subject: Maven Central will never be the universal Jigsaw module repository (Re: Advice + proposals regarding automodule naming)
In-Reply-To: <2c545f26-13fa-c446-d3df-6af0f92c11c9@redhat.com>
References: <1798601919.942643.1484691071823.JavaMail.zimbra@u-pem.fr>
 <626433667.1789430.1484835658855.JavaMail.zimbra@u-pem.fr>
 <419778179.1911914.1484849657427.JavaMail.zimbra@u-pem.fr>
 <1126484066.1967528.1484866031037.JavaMail.zimbra@u-pem.fr>
 <2c545f26-13fa-c446-d3df-6af0f92c11c9@redhat.com>
Message-ID:

I don't see how this is relevant to the proposal? We are talking about
problems with the legacy jars that are out there being assigned an
automatic module name that conflicts with other stuff, where avoiding this
is very simple to do. This has nothing to do specifically with Maven, nor
with Central looking forward.
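To illustrate the kind of conflict meant here, below is a rough sketch of how a name derived from a jar file name can collide. It approximates the file-name rules as I understand them from the `java.lang.module.ModuleFinder.of` javadoc (drop `.jar`, cut at the first hyphen followed by a version number, turn non-alphanumeric characters into dots); it is a simplification and not the JDK code. The class name, method names and jar names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; a simplified approximation, not the JDK implementation.
class AutoModuleNames {

    // Start of a version suffix: a hyphen, digits, then a dot or the end of the name.
    private static final Pattern VERSION = Pattern.compile("-(\\d+(\\.|$))");

    /** Derives a module name from a jar file name such as "guava-19.0.jar". */
    static String deriveName(String jarFileName) {
        String name = jarFileName.endsWith(".jar")
                ? jarFileName.substring(0, jarFileName.length() - 4)
                : jarFileName;
        Matcher m = VERSION.matcher(name);
        if (m.find()) {
            name = name.substring(0, m.start());   // drop the version part
        }
        name = name.replaceAll("[^A-Za-z0-9]", ".") // non-alphanumerics become dots
                   .replaceAll("\\.{2,}", ".");     // collapse repeated dots
        if (name.startsWith(".")) name = name.substring(1);
        if (name.endsWith(".")) name = name.substring(0, name.length() - 1);
        return name;
    }

    /** Groups jar file names by derived name and keeps only the names claimed more than once. */
    static Map<String, List<String>> collisions(List<String> jarFileNames) {
        Map<String, List<String>> byName = new TreeMap<>();
        for (String jar : jarFileNames) {
            byName.computeIfAbsent(deriveName(jar), k -> new ArrayList<>()).add(jar);
        }
        byName.values().removeIf(jars -> jars.size() < 2);
        return byName;
    }

    public static void main(String[] args) {
        // Two unrelated projects that both publish an artifact called "utils" (made-up names).
        System.out.println(collisions(List.of("utils-1.0.jar", "utils-2.1.jar", "guava-19.0.jar")));
    }
}
```

With these inputs both `utils` jars map to the same derived name while `guava-19.0.jar` maps to `guava`, which is the situation where two legacy jars cannot be required side by side.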
On Fri, Jan 20, 2017 at 5:48 PM, David M. Lloyd wrote: > On 01/20/2017 05:08 AM, Robert Scholte wrote: > >> I think we need to clarify the term "ease of transition". What are we >> expecting and why? >> It looks to me the expectation is that every current existing Java >> project should be able to have a module-info where every dependency is >> specified as a requirement. We all agree (and have accepted) that in >> case of split packages this will not work, so we drop a little in >> percentages. Brian and I will go one step beyond: you cannot require >> unnamed / automatic modules. >> Even if we are going to consider synthetic modules, that won't solve the >> problem. This would require the developer to specify the module name for >> all his requirements AND including the transitive ones. The latter is >> required because there's no way to pass this kind of information to >> depending projects. >> Stability is the keyword here, and the automatic modules cannot >> guarantee stability. >> >> Some developers have asked me: why do we need this? Tools like Maven are >> already superb in selecting all required dependencies for both compile >> time and runtime. Results are stable. And those few times there are >> issues, I was able to fix it by adding or excluding dependencies. >> >> With the module-info the result should be at least as stable as done >> with Maven. In case of automodules you could think of 2 options: #1 add >> as requires AND configure aliases for this synthetic module AND all its >> transitive unnamed modules. #2 drop requires for this module. I know >> which one I would choose. >> >> Also keep in mind: Maven Central also started empty. I don't know which >> jar was the first, but I'm pretty sure it can still be used. Give it >> time for jars to become modules. Ease of transition is already there >> because applications can refer to jars containing a module-info file, >> even if they don't use it. 
>> > > The critical flaw in this analogy (which I'm afraid departs a little from > the automatic module concept) is that the single, global module namespace > that necessarily will have to exist in order for any centralized module > repository cannot be met by Maven Central without a fundamental and complex > change to the way that submissions are curated > > The reason for this is that today, a Maven artifact in Maven Central only > has to resolve consistently relative to the set of artifacts it consumes, > and (to a lesser extent because there's some flexibility here) the set of > artifacts it is likely to coexist with. This flexibility and relativity > goes most of the way to mitigate the fact that many Maven artifacts have > conflicting packages and version requirements. Because of this, most of > the time this is invisible to average users. > > In the modular world though, not only must you resolve a set of artifacts > that resolve in a mutually consistent way, but they also have to be 100% > non-conflicting in terms of module specification, and more problematically, > they have to be 100% mutually consistent in terms of dependency mesh. In > order to have any sort of guarantee of consistency for any given module > artifact, consistency must be guaranteed for *all* artifacts. > > The Maven Central model for artifacts fails in this regard for the exact > same reason that there isn't, for example, one unified Linux package > "mega-repository". Packaging issues aside, there are many competing > implementations of the same specifications and solutions to the same > problems; these things have rippling effects on compatibility. In order to > create one, single, unified module repository for *everything* in Maven > Central that is internally consistent would be a behemoth undertaking and a > major maintenance burden. > > Thus the alternative is as I've expressed many times before. The > ecosystem of artifacts remains an ecosystem of artifacts. 
Ecosystems of > modules will be a new entity, a subset of available artifacts designed to > solve a specific problem; some module ecosystems will be produced as single > applications and others as development platform distributions targeted at > various audiences and maintained by different entities with different goals. > > As a consequence, build systems which work to contribute these ecosystems > necessarily will operate in one of two possible ways. > > The first way is to consume artifacts like one does today, and within the > build environment and using ecosystem-specific metadata, wire it in to the > module graph, before (probably) CI testing the resultant combination and > committing it into the distribution. > > The second way is to consume sources as artifacts and use the same > ecosystem-specific metadata to compile the sources, wiring it in as above. > > Expecting that we can start from an empty repository and build up The One > Single Central module repository is unrealistic because such a repository > either must be too constrained to be generally useful in the way that Maven > Central is useful, or it must be too inconsistent to be useful in any > nontrivial project. > > I think that because a lot of users are still in the beginning or > experimental stages of modularization, these realities are not yet obvious, > and I hate to defeat optimism in this regard but we have been modularizing > Maven artifacts for many years now with our own module system, so we know > firsthand how difficult it is to mesh hundreds of artifacts into a single > distribution, let alone the many thousands that exist in Maven Central. I > think that anyone developing nontrivial distributions or applications will > encounter these realities sooner or later so I hope that the experience > we've gained will inform a more sensible approach to module distribution > than "throw it all in Maven Central, it'll be fine". 
> > -- > - DML > From forax at univ-mlv.fr Sat Jan 21 10:07:12 2017 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Sat, 21 Jan 2017 11:07:12 +0100 (CET) Subject: Advice + proposals regarding automodule naming In-Reply-To: References: <626433667.1789430.1484835658855.JavaMail.zimbra@u-pem.fr> <419778179.1911914.1484849657427.JavaMail.zimbra@u-pem.fr> <1126484066.1967528.1484866031037.JavaMail.zimbra@u-pem.fr> Message-ID: <1937844537.2500739.1484993232354.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "Robert Scholte" > ?: forax at univ-mlv.fr > Cc: jpms-spec-experts at openjdk.java.net, "Brian Fox" > Envoy?: Vendredi 20 Janvier 2017 12:08:03 > Objet: Re: Advice + proposals regarding automodule naming [...] >>> >>> In the of the maven-compiler-plugin it is of course >>> possible to define which jars belong on which path, but that would make >>> it >>> very hard to use and maintain. >> >> but at the same time you have need a way to be able to see a plain old >> jar as a modular jar (without being an automatic module) to ease the >> transition. >> Here i think you need the user to say, Maven help me to generate a >> synthetic module (a synthetic module-info.class that will be injected in >> the jar of the dependency) >> from the information available in the pom of that plain old jar. >> It's not clear to me if this is something that should be in the >> configuration or in the dependency of the pom. >> > > I think we need to clarify the term "ease of transition". What are we > expecting and why? > It looks to me the expectation is that every current existing Java project > should be able to have a module-info where every dependency is specified > as a requirement. We all agree (and have accepted) that in case of split > packages this will not work, so we drop a little in percentages. Brian and > I will go one step beyond: you cannot require unnamed / automatic modules. 
> Even if we are going to consider synthetic modules, that won't solve the
> problem. This would require the developer to specify the module name for
> all his requirements AND including the transitive ones. The latter is
> required because there's no way to pass this kind of information to
> depending projects.
> Stability is the keyword here, and the automatic modules cannot guarantee
> stability.

"no way to pass" -> another idea, as a way to help the transition: you can
ask that a module-info that contains the dependencies also carry a mapping
between the Jigsaw module name and the Maven artifact id, something like

  import some.specific.maven.annotation.MavenDependency;

  @MavenDependency("guava=com.google.guava:guava")
  module myapp {
    requires guava;
  }

Having a dependency in the source on some Maven-specific annotations is
maybe not that great; the other solution is to hijack an already existing
annotation like @SuppressWarnings or @Since

  @SuppressWarnings("maven-dependency:guava=com.google.guava:guava")
  module myapp {
    requires guava;
  }

You will still need to require all transitive dependencies, but at least
you have the association between a module name and the corresponding Maven
artifact name.

>
> Some developers have asked me: why do we need this? Tools like Maven are
> already superb in selecting all required dependencies for both compile
> time and runtime. Results are stable. And those few times there are
> issues, I was able to fix it by adding or excluding dependencies.
>
> With the module-info the result should be at least as stable as done with
> Maven. In case of automodules you could think of 2 options: #1 add as
> requires AND configure aliases for this synthetic module AND all its
> transitive unnamed modules. #2 drop requires for this module. I know which
> one I would choose.

#2 is like the Python 2/Python 3 transition, i.e. you have to wait for all
your dependencies to be jigsawified before you can use them in Maven.
Just yesterday, a team of my students had to downgrade their whole software
from Python 3 to Python 2 because they needed to use some specific geometric
objects on top of PostGIS, something that was never ported to Python 3.
I think we (the Java community) can do better.

>
> Also keep in mind: Maven Central also started empty. I don't know which
> jar was the first, but I'm pretty sure it can still be used. Give it time
> for jars to become modules. Ease of transition is already there because
> applications can refer to jars containing a module-info file, even if they
> don't use it.

Yes, very true. The question is more: can we make it easier for someone who
starts a new project just after the release of Java 9 and before all its
dependencies contain a module-info?

>
>>> For example: for test-compile I could have added a parameter called
>>> "moduleName" and use this value for the -Xmodule: argument. I've decided
>>> not to do so, because the name is already specified in the module-info.
>>> For me it was kind of frustrating that one assumed the name of the
>>> module was a given fact, which implied I had to parse the module-info
>>> file. b+148 killed this strategy for a while, but it is fixed again.
>>> If only I could have point to target/classes ... ;) Anyway, this is
>>> solved once the structure of the module-info file is final.
>>
>> i've decided to not use -Xmodule to run the test but to merge the
>> module-info.java of the test and the module-info.java of the main,
>> this allows me to be able to specify test only dependencies like junit
>> in the module-info.java of the test.
>>
> For now I've decided not to support module-info files for tests. It is
> very unlikely that it will become a dependency for other projects. So
> it'll only be used within the build lifecycle, and since the user has
> already specified the dependencies, I see no reason why one should add the
> module-info as well. If a dependency is missing, the build will fail.
> Adding module-info here feels paranoid. > I think it's because i think in term of module-info first and not in term of POM file first :) >>> >>> To prevent users to add their requirements to the module-info file AND >>> specify which jars belong on which path in either the or >>> the >>> of the maven-compiler-plugin, I think we should "simply" >>> read the module-info.java*, gather all requirements and find the >>> matching >>> jar. And do that recursively. >>> >>> * I maintain QDox, a Java source parser, as well. >> >> This is exactly what i'm doing [1] but i use javac instead of writing my >> own parser :) >> (the fact that i can requires the module java.compiler makes the things >> easy) >> >>> >>>> And also do not forget that deciding to put every jars in the >>>> modulepath >>>> do not work because of split packages. >>>> >>> Exactly! >>> >> >> [...] >> >> R?mi >> >> [1] > > https://github.com/forax/pro/blob/master/src/main/java/com.github.forax.pro.helper/com/github/forax/pro/helper/parser/JavacModuleParser.java R?mi From forax at univ-mlv.fr Sat Jan 21 10:29:59 2017 From: forax at univ-mlv.fr (Remi Forax) Date: Sat, 21 Jan 2017 11:29:59 +0100 (CET) Subject: Maven Central will never be the universal Jigsaw module repository (Re: Advice + proposals regarding automodule naming) In-Reply-To: <2c545f26-13fa-c446-d3df-6af0f92c11c9@redhat.com> References: <626433667.1789430.1484835658855.JavaMail.zimbra@u-pem.fr> <419778179.1911914.1484849657427.JavaMail.zimbra@u-pem.fr> <1126484066.1967528.1484866031037.JavaMail.zimbra@u-pem.fr> <2c545f26-13fa-c446-d3df-6af0f92c11c9@redhat.com> Message-ID: <725018526.2502644.1484994599426.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "David M. 
Lloyd" > ?: jpms-spec-experts at openjdk.java.net > Envoy?: Vendredi 20 Janvier 2017 23:48:54 > Objet: Maven Central will never be the universal Jigsaw module repository (Re: Advice + proposals regarding automodule > naming) > On 01/20/2017 05:08 AM, Robert Scholte wrote: >> I think we need to clarify the term "ease of transition". What are we >> expecting and why? >> It looks to me the expectation is that every current existing Java >> project should be able to have a module-info where every dependency is >> specified as a requirement. We all agree (and have accepted) that in >> case of split packages this will not work, so we drop a little in >> percentages. Brian and I will go one step beyond: you cannot require >> unnamed / automatic modules. >> Even if we are going to consider synthetic modules, that won't solve the >> problem. This would require the developer to specify the module name for >> all his requirements AND including the transitive ones. The latter is >> required because there's no way to pass this kind of information to >> depending projects. >> Stability is the keyword here, and the automatic modules cannot >> guarantee stability. >> >> Some developers have asked me: why do we need this? Tools like Maven are >> already superb in selecting all required dependencies for both compile >> time and runtime. Results are stable. And those few times there are >> issues, I was able to fix it by adding or excluding dependencies. >> >> With the module-info the result should be at least as stable as done >> with Maven. In case of automodules you could think of 2 options: #1 add >> as requires AND configure aliases for this synthetic module AND all its >> transitive unnamed modules. #2 drop requires for this module. I know >> which one I would choose. >> >> Also keep in mind: Maven Central also started empty. I don't know which >> jar was the first, but I'm pretty sure it can still be used. Give it >> time for jars to become modules. 
Ease of transition is already there >> because applications can refer to jars containing a module-info file, >> even if they don't use it. > > The critical flaw in this analogy (which I'm afraid departs a little > from the automatic module concept) is that the single, global module > namespace that necessarily will have to exist in order for any > centralized module repository cannot be met by Maven Central without a > fundamental and complex change to the way that submissions are curated > > The reason for this is that today, a Maven artifact in Maven Central > only has to resolve consistently relative to the set of artifacts it > consumes, and (to a lesser extent because there's some flexibility here) > the set of artifacts it is likely to coexist with. This flexibility and > relativity goes most of the way to mitigate the fact that many Maven > artifacts have conflicting packages and version requirements. Because > of this, most of the time this is invisible to average users. > > In the modular world though, not only must you resolve a set of > artifacts that resolve in a mutually consistent way, but they also have > to be 100% non-conflicting in terms of module specification, and more > problematically, they have to be 100% mutually consistent in terms of > dependency mesh. In order to have any sort of guarantee of consistency > for any given module artifact, consistency must be guaranteed for *all* > artifacts. You can imagine island of artifacts with no connection, OSGI bundles are these kind of island, but i agree. > > The Maven Central model for artifacts fails in this regard for the exact > same reason that there isn't, for example, one unified Linux package > "mega-repository". Packaging issues aside, there are many competing > implementations of the same specifications and solutions to the same > problems; these things have rippling effects on compatibility. 
> In order to create one, single, unified module repository for
> *everything* in Maven Central that is internally consistent would be a
> behemoth undertaking and a major maintenance burden.

What saves Java when compared to Linux distributions is that you have only
one Java platform from the module point of view, whereas for Linux there is
no spec, or only a loosely defined one.
But i agree with you that we will have to be stricter about what can be
published in terms of binary compatibility and dependencies.

My dream is to have a build tool that increments my module version
automatically depending on how i break compatibility with the previously
published version.

>
> Thus the alternative is as I've expressed many times before. The
> ecosystem of artifacts remains an ecosystem of artifacts. Ecosystems of
> modules will be a new entity, a subset of available artifacts designed
> to solve a specific problem; some module ecosystems will be produced as
> single applications and others as development platform distributions
> targeted at various audiences and maintained by different entities with
> different goals.
>
> As a consequence, build systems which work to contribute these
> ecosystems necessarily will operate in one of two possible ways.
>
> The first way is to consume artifacts like one does today, and within
> the build environment and using ecosystem-specific metadata, wire it in
> to the module graph, before (probably) CI testing the resultant
> combination and committing it into the distribution.
>
> The second way is to consume sources as artifacts and use the same
> ecosystem-specific metadata to compile the sources, wiring it in as above.

For me, we will be transitioning from 1 to 2.
> > Expecting that we can start from an empty repository and build up The > One Single Central module repository is unrealistic because such a > repository either must be too constrained to be generally useful in the > way that Maven Central is useful, or it must be too inconsistent to be > useful in any nontrivial project.

Given that the module-info is not read by Java < 9, you can consider Maven Central to be currently empty; when Jigsaw-compatible artifacts are committed to Maven Central, a new repository of compatible modules will be built from the bottom up. The fact that these modules will be more or less compatible with Java < 9 (the classpath world) is just a nice bonus.

> > I think that because a lot of users are still in the beginning or > experimental stages of modularization, these realities are not yet > obvious, and I hate to defeat optimism in this regard but we have been > modularizing Maven artifacts for many years now with our own module > system, so we know firsthand how difficult it is to mesh hundreds of > artifacts into a single distribution, let alone the many thousands that > exist in Maven Central. I think that anyone developing nontrivial > distributions or applications will encounter these realities sooner or > later so I hope that the experience we've gained will inform a more > sensible approach to module distribution than "throw it all in Maven > Central, it'll be fine".

And as I said above, from the perspective of Java modules, Maven Central is currently empty (if build tools do not allow mixing the modulepath and the classpath). In my opinion, we will see more backpressure than we have now: you will not be able to push something incompatible with your previous release to Maven Central without a major internet outrage.
> > -- > - DML R?mi From rfscholte at apache.org Sat Jan 21 14:18:42 2017 From: rfscholte at apache.org (Robert Scholte) Date: Sat, 21 Jan 2017 15:18:42 +0100 Subject: Advice + proposals regarding automodule naming In-Reply-To: <1937844537.2500739.1484993232354.JavaMail.zimbra@u-pem.fr> References: <626433667.1789430.1484835658855.JavaMail.zimbra@u-pem.fr> <419778179.1911914.1484849657427.JavaMail.zimbra@u-pem.fr> <1126484066.1967528.1484866031037.JavaMail.zimbra@u-pem.fr> <1937844537.2500739.1484993232354.JavaMail.zimbra@u-pem.fr> Message-ID: On Sat, 21 Jan 2017 11:07:12 +0100, wrote: > ----- Mail original ----- >> De: "Robert Scholte" >> ?: forax at univ-mlv.fr >> Cc: jpms-spec-experts at openjdk.java.net, "Brian Fox" >> >> Envoy?: Vendredi 20 Janvier 2017 12:08:03 >> Objet: Re: Advice + proposals regarding automodule naming > > [...] > >>>> >>>> In the of the maven-compiler-plugin it is of course >>>> possible to define which jars belong on which path, but that would >>>> make >>>> it >>>> very hard to use and maintain. >>> >>> but at the same time you have need a way to be able to see a plain old >>> jar as a modular jar (without being an automatic module) to ease the >>> transition. >>> Here i think you need the user to say, Maven help me to generate a >>> synthetic module (a synthetic module-info.class that will be injected >>> in >>> the jar of the dependency) >>> from the information available in the pom of that plain old jar. >>> It's not clear to me if this is something that should be in the >>> configuration or in the dependency of the pom. >>> >> >> I think we need to clarify the term "ease of transition". What are we >> expecting and why? >> It looks to me the expectation is that every current existing Java >> project >> should be able to have a module-info where every dependency is specified >> as a requirement. We all agree (and have accepted) that in case of split >> packages this will not work, so we drop a little in percentages. 
Brian >> and >> I will go one step beyond: you cannot require unnamed / automatic >> modules. >> Even if we are going to consider synthetic modules, that won't solve the >> problem. This would require the developer to specify the module name for >> all his requirements AND including the transitive ones. The latter is >> required because there's no way to pass this kind of information to >> depending projects. >> Stability is the keyword here, and the automatic modules cannot >> guarantee >> stability.
>
> "no way to pass" -> another idea,
> as a way to help the transition, you can ask module-info that contains
> the dependencies to also have a mapping between the jigsaw name and the
> Maven artifact id,
> something like
>
> import some.specific.maven.annotation.MavenDependency;
>
> @MavenDependency("guava=com.google.guava:guava")
> module myapp {
>   requires guava;
> }
>
> Having a dependency in the source to some Maven specific annotations if
> maybe not that great, the other solution is to hijack an already
> existing annotation like @SuppressWarnings or @Since
>
> @SuppressWarnings("maven-dependency:guava=com.google.guava:guava")
> module myapp {
>   requires guava;
> }
>
> you will still need to requires all transitive dependencies but at least
> you have the association between a module name and the corresponding
> Maven artifact name.
>

This is an interesting suggestion, but it ignores the case of collisions between automatic module names. For example:

module myapp {
  requires dep1;
  requires dep2;
}

dep1 has a dependency on com.foo:library
dep2 has a dependency on com.acme:library
and of course none of the deps or library jars have a module-info.

To Jigsaw they might look like the same 'library', but both jars are required on the module-path, and that's not allowed because they result in the same module name, right?
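
The collision described above can be sketched in a few lines of Java. This is only an approximation of a filename-based naming rule (drop the .jar suffix, cut at the first hyphen that is followed by a version number, turn the remaining non-alphanumeric characters into dots). It is an assumption made for illustration, not the JDK's implementation, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: approximates a filename-based automatic module name.
final class AutoModuleNameSketch {

    // Assumption: the version starts at the first "-<digits>" followed by "." or end of name.
    private static final Pattern VERSION_START = Pattern.compile("-(\\d+(\\.|$))");

    static String fromJarFileName(String fileName) {
        // Drop the ".jar" suffix if present.
        String name = fileName.endsWith(".jar")
                ? fileName.substring(0, fileName.length() - 4)
                : fileName;
        // Cut off everything from the version onwards.
        Matcher m = VERSION_START.matcher(name);
        if (m.find()) {
            name = name.substring(0, m.start());
        }
        // Map non-alphanumerics to dots, collapse repeated dots, trim dots at the ends.
        return name.replaceAll("[^A-Za-z0-9]", ".")
                   .replaceAll("\\.{2,}", ".")
                   .replaceAll("^\\.|\\.$", "");
    }

    // Maven's default jar file name: the groupId is not part of it.
    static String jarFileName(String artifactId, String version) {
        return artifactId + "-" + version + ".jar";
    }

    public static void main(String[] args) {
        // com.foo:library:1.0 and com.acme:library:2.3.1 differ only in groupId and version.
        String fromFoo = fromJarFileName(jarFileName("library", "1.0"));
        String fromAcme = fromJarFileName(jarFileName("library", "2.3.1"));
        System.out.println(fromFoo + " / " + fromAcme + " -> same name: " + fromFoo.equals(fromAcme));
    }
}
```

Because the groupId never reaches the file name, the two distinct coordinates end up with one and the same derived name under this rule.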
I don't think something like this can fix that:

@MavenDependency("dep1:library=com.foo:library","dep2:library=com.acme:library")

Robert

>> >> Some developers have asked me: why do we need this? Tools like Maven are >> already superb in selecting all required dependencies for both compile >> time and runtime. Results are stable. And those few times there are >> issues, I was able to fix it by adding or excluding dependencies. >> >> With the module-info the result should be at least as stable as done >> with >> Maven. In case of automodules you could think of 2 options: #1 add as >> requires AND configure aliases for this synthetic module AND all its >> transitive unnamed modules. #2 drop requires for this module. I know >> which >> one I would choose. > > #2 is like the Python 2/Python 3 transition, i.e. you have to wait that > all your dependencies to be jigsawified before you can using them in > Maven. > Just yesterday, a team of my students have to downgrade their whole soft > from Python 3 to Python 2 because they need to use some specific > geometric object on top of PostGIS, something that was never ported to > Python 3. > > I think we (the Java community) can do better. > >> >> Also keep in mind: Maven Central also started empty. I don't know which >> jar was the first, but I'm pretty sure it can still be used. Give it >> time >> for jars to become modules. Ease of transition is already there because >> applications can refer to jars containing a module-info file, even if >> they >> don't use it. > > yes, very true, > the question is more, can we make it easier for someone that starts a > new project just after the release of the java 9 and before all its > dependencies contains a module-info. > >> >>>> For example: for test-compile I could have added a parameter called >>>> "moduleName" and use this value for the -Xmodule: argument. I've >>>> decided >>>> not to do so, because the name is already specified in the >>>> module-info.
>>>> For me it was kind of frustrating that one assumed the name of the >>>> module >>>> was a given fact, which implied I had to parse the module-info file. >>>> b+148 >>>> killed this strategy for a while, but it is fixed again. If only I >>>> could >>>> have point to target/classes ... ;) Anyway, this is solved once the >>>> structure of the module-info file is final. >>> >>> i've decided to not use -Xmodule to run the test but to merge the >>> module-info.java of the test and the module-info.java of the main, >>> this allows me to be able to specify test only dependencies like junit >>> in the module-info.java of the test. >>> >> For now I've decided not to support module-info files for tests. It is >> very unlikely that it will become a dependency for other projects. So >> it'll only be used within the build lifecycle, and since the user has >> already specified the dependencies, I see no reason why one should add >> the >> module-info as well. If a dependency is missing, the build will fail. >> Adding module-info here feels paranoid. >> > > I think it's because i think in term of module-info first and not in > term of POM file first :) > >>>> >>>> To prevent users to add their requirements to the module-info file AND >>>> specify which jars belong on which path in either the or >>>> the >>>> of the maven-compiler-plugin, I think we should >>>> "simply" >>>> read the module-info.java*, gather all requirements and find the >>>> matching >>>> jar. And do that recursively. >>>> >>>> * I maintain QDox, a Java source parser, as well. >>> >>> This is exactly what i'm doing [1] but i use javac instead of writing >>> my >>> own parser :) >>> (the fact that i can requires the module java.compiler makes the things >>> easy) >>> >>>> >>>>> And also do not forget that deciding to put every jars in the >>>>> modulepath >>>>> do not work because of split packages. >>>>> >>>> Exactly! >>>> >>> >>> [...] 
>>> >>> R?mi >>> >>> [1] >> > >> https://github.com/forax/pro/blob/master/src/main/java/com.github.forax.pro.helper/com/github/forax/pro/helper/parser/JavacModuleParser.java > > R?mi From rfscholte at apache.org Sat Jan 21 14:22:55 2017 From: rfscholte at apache.org (Robert Scholte) Date: Sat, 21 Jan 2017 15:22:55 +0100 Subject: Maven Central will never be the universal Jigsaw module repository (Re: Advice + proposals regarding automodule naming) In-Reply-To: <2c545f26-13fa-c446-d3df-6af0f92c11c9@redhat.com> References: <1798601919.942643.1484691071823.JavaMail.zimbra@u-pem.fr> <626433667.1789430.1484835658855.JavaMail.zimbra@u-pem.fr> <419778179.1911914.1484849657427.JavaMail.zimbra@u-pem.fr> <1126484066.1967528.1484866031037.JavaMail.zimbra@u-pem.fr> <2c545f26-13fa-c446-d3df-6af0f92c11c9@redhat.com> Message-ID: "Maven Central will never be the universal Jigsaw module repository" this is true. From my point of view there's no ambition that Maven Central will ever become the universal Jigsaw module repository. And it would also be wrong: One shouldn't be forced to use Maven Central when developing with the Jigsaw features. On Fri, 20 Jan 2017 23:48:54 +0100, David M. Lloyd wrote: > On 01/20/2017 05:08 AM, Robert Scholte wrote: >> I think we need to clarify the term "ease of transition". What are we >> expecting and why? >> It looks to me the expectation is that every current existing Java >> project should be able to have a module-info where every dependency is >> specified as a requirement. We all agree (and have accepted) that in >> case of split packages this will not work, so we drop a little in >> percentages. Brian and I will go one step beyond: you cannot require >> unnamed / automatic modules. >> Even if we are going to consider synthetic modules, that won't solve the >> problem. This would require the developer to specify the module name for >> all his requirements AND including the transitive ones. 
The latter is >> required because there's no way to pass this kind of information to >> depending projects. >> Stability is the keyword here, and the automatic modules cannot >> guarantee stability. >> >> Some developers have asked me: why do we need this? Tools like Maven are >> already superb in selecting all required dependencies for both compile >> time and runtime. Results are stable. And those few times there are >> issues, I was able to fix it by adding or excluding dependencies. >> >> With the module-info the result should be at least as stable as done >> with Maven. In case of automodules you could think of 2 options: #1 add >> as requires AND configure aliases for this synthetic module AND all its >> transitive unnamed modules. #2 drop requires for this module. I know >> which one I would choose. >> >> Also keep in mind: Maven Central also started empty. I don't know which >> jar was the first, but I'm pretty sure it can still be used. Give it >> time for jars to become modules. Ease of transition is already there >> because applications can refer to jars containing a module-info file, >> even if they don't use it. > > The critical flaw in this analogy (which I'm afraid departs a little > from the automatic module concept) is that the single, global module > namespace that necessarily will have to exist in order for any > centralized module repository cannot be met by Maven Central without a > fundamental and complex change to the way that submissions are curated > > The reason for this is that today, a Maven artifact in Maven Central > only has to resolve consistently relative to the set of artifacts it > consumes, and (to a lesser extent because there's some flexibility here) > the set of artifacts it is likely to coexist with. This flexibility and > relativity goes most of the way to mitigate the fact that many Maven > artifacts have conflicting packages and version requirements. Because > of this, most of the time this is invisible to average users. 
> > In the modular world though, not only must you resolve a set of > artifacts that resolve in a mutually consistent way, but they also have > to be 100% non-conflicting in terms of module specification, and more > problematically, they have to be 100% mutually consistent in terms of > dependency mesh. In order to have any sort of guarantee of consistency > for any given module artifact, consistency must be guaranteed for *all* > artifacts. > > The Maven Central model for artifacts fails in this regard for the exact > same reason that there isn't, for example, one unified Linux package > "mega-repository". Packaging issues aside, there are many competing > implementations of the same specifications and solutions to the same > problems; these things have rippling effects on compatibility. In order > to create one, single, unified module repository for *everything* in > Maven Central that is internally consistent would be a behemoth > undertaking and a major maintenance burden. > > Thus the alternative is as I've expressed many times before. The > ecosystem of artifacts remains an ecosystem of artifacts. Ecosystems of > modules will be a new entity, a subset of available artifacts designed > to solve a specific problem; some module ecosystems will be produced as > single applications and others as development platform distributions > targeted at various audiences and maintained by different entities with > different goals. > > As a consequence, build systems which work to contribute these > ecosystems necessarily will operate in one of two possible ways. > > The first way is to consume artifacts like one does today, and within > the build environment and using ecosystem-specific metadata, wire it in > to the module graph, before (probably) CI testing the resultant > combination and committing it into the distribution. > > The second way is to consume sources as artifacts and use the same > ecosystem-specific metadata to compile the sources, wiring it in as > above. 
> > Expecting that we can start from an empty repository and build up The > One Single Central module repository is unrealistic because such a > repository either must be too constrained to be generally useful in the > way that Maven Central is useful, or it must be too inconsistent to be > useful in any nontrivial project. > > I think that because a lot of users are still in the beginning or > experimental stages of modularization, these realities are not yet > obvious, and I hate to defeat optimism in this regard but we have been > modularizing Maven artifacts for many years now with our own module > system, so we know firsthand how difficult it is to mesh hundreds of > artifacts into a single distribution, let alone the many thousands that > exist in Maven Central. I think that anyone developing nontrivial > distributions or applications will encounter these realities sooner or > later so I hope that the experience we've gained will inform a more > sensible approach to module distribution than "throw it all in Maven > Central, it'll be fine". >

Lack of experience is indeed a key issue. The Java distributions don't have any third-party dependencies, so they are in full control: applying the modular system is easy now that rt.jar, etc. has been split up. I've been working with Red Hat's Camel/Fuse recently and have been using its modular system. There you have a repository you control. If you need third-party dependencies, you can add them to the repository and give them a name *for this physical system*. In both cases you control the complete ecosystem/dependencyManagement.

With Jigsaw we're facing a modular system with a universal scope. This is a completely different environment, and because of its universal character it doesn't allow any mistakes from the beginning. It is probably better to be too strict in the beginning on topics which are still under discussion.
Robert From nipa at codefx.org Wed Jan 25 08:42:50 2017 From: nipa at codefx.org (Nicolai Parlog) Date: Wed, 25 Jan 2017 09:42:50 +0100 Subject: Advice + proposals regarding automodule naming In-Reply-To: References: Message-ID: Hi Robert, I read the entire ensuing thread but I have to admit that I don't understand all the details. I was always looking for an answer to this question: Why would I use the artifactId as an automatic module name? I always assumed that the entire project name, which in a world dominated by Maven Central means groupId:artifactId, would have to be the automatic module name. It would hence also have to be the JAR name. Now, it looks like this would be problematic for Maven but isn't that "just" a problem for some particular build tool's implementation? I don't quite get how Maven's naming strategy for JARs voids the entire approach to automatic modules. (Now I wish I wouldn't have written "world dominated by Maven". ;) ) so long ... Nicolai On 16.01.2017 10:37, Robert Scholte wrote: > This is a message from Robert Scholte and Brian Fox. We both have > been talking about this topic several weeks with other Maven > developers and came to the conclusion that we should warn the > jigsaw team with their current approach regarding auto modules. We > will share our experiences, thoughts, conclusions and will suggest > two proposals. > > Traditionally, the Java ecosystem has been very mature in terms of > naming and namespacing. The reverse fqdn introduced into the java > package was a great choice to ensure classes don?t conflict. > Popular build tools such as Maven and nearly all those that > followed built upon that this key concept with the introduction of > ?GroupId? also using the fqdn as part of the name to ensure the > coordinates were properly namespaced. > > We?ve seen some ecosystems diverge from this leading to new > challenges that ultimately had to be reversed. A great example can > be seen in the ? tragic mistake from npm creators ? 
[1] which was > to launch without a namespace concept. Eventually, NPM started > running out of useful names and had to backtrack to introduce > ?scopes? which is really just a namespace [2]. The real problem > here is that the major change in namespace was backed in after > several years of momentum without it. It?s taken a long time for > tooling and best practice to catch up to scopes and in the interim, > people have been left with a dual mode, some namespaced, some not > namespaced situation that has created chaos. [3] > > The real issue at hand here as we consider behaviors in the jigsaw > automodule revolves around two well studied concepts. > > The most important is the ?Default effect? [3] which states that > whatever the default behavior is will become the most prominent > best practice. A default that uses a filename to generate a very > short, un-namespaced module id effectively sets the behavior to > create generic names that will eventually conflict...exactly what > we?ve seen in npm. > > Additionally, The switching costs introduced in overcoming a > default un-namespaced module id to one with a unique namespace is > also significant once you consider all the potential users. This is > why API change is hard, and changing the module id after the fact > from the default is effectively an API change. > > The second principal at hand is the ?Principle of least > astonishment?. We want to find a default that doesn?t violate what > most users would consider to be the most obvious. One could argue > the current auto module algorithm doesn?t violate this principle, > but it?s important to consider alternate suggestions in this > light. > > First, lets explore the potential downsides if the default effect > takes hold with the currently generated auto module id. In Apache > Maven, the artifact id is the part of the coordinate that generates > the filename. 
This means that com.somecompany:artifact:version will > become artifact-version.jar, which would result in automodule id > ?artifact?. Armed with this understanding, that does an analysis of > the Maven ecosystem have to say about potential conflicts in the > automodule id? > > If we ignore the groupid and version of all the components in the > Maven Central repository, we end up with over 13,500 (7% of the > total group:artifact combinations) conflicts. This does not > consider conflicts across other repositories, or within customer > portfolios yet it is pretty telling. Conflicts will happen. In some > cases, the number of conflicts on the same common names is well > above 100. The list of conflicts as of October, 2016 can be seen > here. [6] > > At this point, hopefully we?ve made the case for at least > establishing a default module id that 1. Uses namespaces to > minimizes id conflicts when possible 2. Leverages the default > effect to create a de facto best practice 3. Follows the principle > of least astonishment > > We have two potential proposals that solve these goals. > > Proposal 1: Leverage existing coordinates when available. > > Maven is inarguably the most popular build system for Java > components, with Maven Central being the default and largest > repository of Java components in the world. By default, every jar > built by Maven automatically gets a simple properties file inserted > into it with its unique coordinates. Now, not every jar in Central > was built with Maven, however 94% of them were, as we can find the > pom.properties file in 1,806,023 of the 1,913,561 central > components . Talk about the default effect in action! > > It?s further important to recognize that given a jar with a > pom.properties declaring coordinates, it means that the project > itself has chosen those coordinates as their own name. In other > words, this is how they refer to themselves, even if other > consumers may not be using Maven directly. 
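
To illustrate the pom.properties file mentioned above: it is a plain Java properties file that Maven places under META-INF/maven/<groupId>/<artifactId>/ inside the jar, holding groupId, artifactId and version. Here is a minimal sketch of how a name could be derived from it. The naming rule (groupId + "." + artifactId) and the class and method names are assumptions made for illustration, not something the JDK does or the proposal specifies. The file content is passed in as text to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: derives a namespaced name candidate from pom.properties content.
final class PomPropertiesNameSketch {

    // Assumed rule: groupId + "." + artifactId, with non-alphanumerics in the artifactId mapped to dots.
    static String candidateName(String pomPropertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(pomPropertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String groupId = props.getProperty("groupId");
        String artifactId = props.getProperty("artifactId");
        if (groupId == null || artifactId == null) {
            // No coordinates available: a caller would have to fall back to another strategy.
            return null;
        }
        return groupId + "." + artifactId.replaceAll("[^A-Za-z0-9]+", ".");
    }

    public static void main(String[] args) {
        // Two jars that would both be named library-<version>.jar on disk.
        System.out.println(candidateName("groupId=com.foo\nartifactId=library\nversion=1.0\n"));
        System.out.println(candidateName("groupId=com.acme\nartifactId=library\nversion=1.0\n"));
    }
}
```

With the groupId included, two artifacts that share an artifactId no longer map to the same candidate name.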
> > If automodule were able to peek inside a jar and generate the > default id using the groupid and artifactid present in the file, > this would nearly eliminate all instances of id conflict because a > significant portion of the Java ecosystem is in fact built with > Maven. Additionally, the fact that 1.8 million (and counting) > modules would have namespace as the default behavior means we?ve > taken a huge step in setting the best practice of picking module > ids with a namepace. Additionally, since the project itself has > chosen these coordinates and uses them as their primary > distribution mechanism, this follows the principle of least > astonishment to consumers regardless of their chosen build system. > Finally, since all of the above are true, it?s unlikely the > project would need to migrate to a new module id when they adopt > jigsaw natively, thus avoiding an API switching cost for their > users. > > Proposal 2: Drop automodules Right now Jigsaw tries to calculate a > module name solely based on the name of the jar file, which now > already causes issues. Besides the fact that the module name is not > guaranteed unique compared with its Maven coordinate, there are > extra transformations which makes it even less guaranteed that it > is unique; e.g. dashes are replaced by dots (which are both valid > artifactId characters), in some cases the number and their > following characters are stripped off. For artifacts like > jboss-servlet-api_4.0_spec it makes sense, however we already see > issues here where commons-lang, commons-lang2 and commons-lang3 get > the same module name, even though they have different artifactIds > and contain different packages. Choosing different artifactIds and > packages was a very wise decision because it made it possible that > these jars could live next to each other. Removing that separation > by the authors is a very unwise decision. 
> > Another known example is the jsrNNN jars, which now all get jsr as > the module name. > > Is it highly unlikely there is one single rule to capture all the > use cases and which always result in a module name we can work > with. > > For that reason the other proposal is to simply drop automodules. > Don?t try to come up with a name for unnamed jars. It might look > like the feature of automodules makes migrating easier because > every dependency will get a name so can complete your module-info > for all requirements, but we expect that once Jigsaw comes to speed > the invalid module names are actually blocking further development > due to name collisions or forced renaming by transitive modular > jars. > > The advantage of this proposal is that library builders are not > forced to keep the proposed module name in order to maintain > backwards compatibility with the default.. Instead library builders > can pick a more suitable module name. The modular system doesn?t > allow the same package to be exported by multiple jars (and > automodules exports every package). Library builders can fix this > is their new jars, however if end users would require both jars > because they were specified as requirements in different transitive > jars, you cannot compile this project. There?s just no > dependency-excludes like Maven has, because ?requires? in the > module-info really means requires. Dropping automodules will > prevent these kind of issues, because a package can only be > exported by a named module. > > Sure, this means that for end users they cannot refer to every jar > in their module-info. But at least if they add a ?requires? to > their module-info, they can ensure that it?ll always refer to the > intended modular jar. With build tools like Maven the chance of > missing artifacts on the classpath has already been reduced a lot. 
> In general builds have become quite stable, so we don?t expect that > developers will translate all dependencies to the module-info file, > especially if we warn them about the possible consequences of > depending on automodules. Only referring to named modules and even > a single ?requires? is already a gain. There?s no reason to try to > speed this up and give the developer the false impression that > it?ll keep working when upgrading to real modular jars. Focus > should be on the target, not on the path how to reach it. > > Dropping the automodules will prevent a lot of discussions about > what is the correct way to select a module name and will give the > responsibility for the name back to the place where it belongs: the > developer. > > [1] > http://stackoverflow.com/questions/22053381/lack-of-available-module-names-on-npm > > [2] > http://blog.npmjs.org/post/116936804365/solving-npms-hard-problem-naming-packages > > [3] The fact that so much of the npm ecosystem is effectively > not-namespaced is has actually created potential build time malware > injection possibilities. If I know of a package in use by a company > through log analysis, bug report analysis etc, I could potentially > go register the same name in the default repo with a very high > semver and know that it?s very likely this would be picked up over > the intended internally developed module because there?s no > namespace. 
[4] > https://en.wikipedia.org/wiki/Default_effect_(psychology) [5] > https://en.wikipedia.org/wiki/Principle_of_least_astonishment [6] > https://docs.google.com/spreadsheets/d/1TVR5uTpDYw0827AlvPRu8l95zHnFPL_g61TdPtnj > > Q5M/edit?usp=sharing [7] http://openjdk.java.net/jeps/261 #Risk > and assumptions [8] > https://www.mail-archive.com/jigsaw-dev at openjdk.java.net/msg06623.html > >
-- PGP Key: http://keys.gnupg.net/pks/lookup?op=vindex&search=0xCA3BAD2E9CCCD509 Web: http://codefx.org a blog about software development https://www.sitepoint.com/java high-quality Java/JVM content http://do-foss.de Free and Open Source Software for the City of Dortmund Twitter: https://twitter.com/nipafx

From rfscholte at apache.org Wed Jan 25 13:43:05 2017
From: rfscholte at apache.org (Robert Scholte)
Date: Wed, 25 Jan 2017 14:43:05 +0100
Subject: Advice + proposals regarding automodule naming
In-Reply-To:
References:
Message-ID:

On Wed, 25 Jan 2017 09:42:50 +0100, Nicolai Parlog wrote:

> Hi Robert, > > I read the entire ensuing thread but I have to admit that I don't > understand all the details. I was always looking for an answer to this > question: > > Why would I use the artifactId as an automatic module name?

To be precise: why use the filename as an automatic module name? And yes, in the case of Maven the artifactId is used to construct the filename. The issue is that the filename is the only thing that is guaranteed to exist for a jar. However, the statistics for Maven Central show that over 94% of all the jars contain a pom.properties, a file containing only 3 things: version, groupId and artifactId. Proposal #1 is about using this file as the first strategy to calculate the module name.

> > I always assumed that the entire project name, which in a world > dominated by Maven Central means groupId:artifactId, would have to be > the automatic module name. It would hence also have to be the JAR name.

Maven *points* to the jars in the local repository.
The maven2 repository layout follows this convention:

${groupId}*/${artifactId}/${version}/${artifactId}-${version}.${type}

When this repository layout was designed, adding the groupId to the filename felt like redundant information, and not how the Java community would expect jar files to be named. Copy+rename would set us back a year, to when it wasn't possible to refer to a jar on the modulepath and only directories were allowed.

* dots in groupId are replaced by file separators

> > Now, it looks like this would be problematic for Maven but isn't that > "just" a problem for some particular build tool's implementation? I > don't quite get how Maven's naming strategy for JARs voids the entire > approach to automatic modules. (Now I wish I wouldn't have written > "world dominated by Maven". ;) )

You actually wrote "world dominated by Maven Central" :). There's an overlap between the two; however, Maven is not the only tool using Maven Central, and not all dependencies used by Maven projects have been built by Maven. Maven could decide to implement its own artifact-to-modulename mapper, but that would destroy the ecosystem for other tools if they didn't follow. This is a universal issue and should for that reason be solved by the component we're all sharing: the JDK/JRE.

thanks,
Robert

> > so long ... Nicolai > > > > On 16.01.2017 10:37, Robert Scholte wrote: >> This is a message from Robert Scholte and Brian Fox. We both have >> been talking about this topic several weeks with other Maven >> developers and came to the conclusion that we should warn the >> jigsaw team with their current approach regarding auto modules. We >> will share our experiences, thoughts, conclusions and will suggest >> two proposals. >> >> Traditionally, the Java ecosystem has been very mature in terms of >> naming and namespacing. The reverse fqdn introduced into the java >> package was a great choice to ensure classes don't conflict.
>> Popular build tools such as Maven and nearly all those that followed built upon this key concept with the introduction of "GroupId", also using the fqdn as part of the name to ensure the coordinates were properly namespaced.
>>
>> We've seen some ecosystems diverge from this, leading to new challenges that ultimately had to be reversed. A great example can be seen in the "tragic mistake from npm creators" [1], which was to launch without a namespace concept. Eventually, NPM started running out of useful names and had to backtrack to introduce "scopes", which is really just a namespace [2]. The real problem here is that the major change in namespace was baked in after several years of momentum without it. It's taken a long time for tooling and best practice to catch up to scopes and in the interim, people have been left with a dual mode, some namespaced, some not namespaced situation that has created chaos. [3]
>>
>> The real issue at hand here as we consider behaviors in the jigsaw automodule revolves around two well studied concepts.
>>
>> The most important is the "Default effect" [4], which states that whatever the default behavior is will become the most prominent best practice. A default that uses a filename to generate a very short, un-namespaced module id effectively sets the behavior to create generic names that will eventually conflict... exactly what we've seen in npm.
>>
>> Additionally, the switching costs introduced in moving from a default un-namespaced module id to one with a unique namespace are also significant once you consider all the potential users. This is why API change is hard, and changing the module id after the fact from the default is effectively an API change.
>>
>> The second principle at hand is the "Principle of least astonishment" [5]. We want to find a default that doesn't violate what most users would consider to be the most obvious.
>> One could argue the current auto module algorithm doesn't violate this principle, but it's important to consider alternate suggestions in this light.
>>
>> First, let's explore the potential downsides if the default effect takes hold with the currently generated auto module id. In Apache Maven, the artifact id is the part of the coordinate that generates the filename. This means that com.somecompany:artifact:version will become artifact-version.jar, which would result in automodule id "artifact". Armed with this understanding, what does an analysis of the Maven ecosystem have to say about potential conflicts in the automodule id?
>>
>> If we ignore the groupid and version of all the components in the Maven Central repository, we end up with over 13,500 (7% of the total group:artifact combinations) conflicts. This does not consider conflicts across other repositories, or within customer portfolios, yet it is pretty telling. Conflicts will happen. In some cases, the number of conflicts on the same common names is well above 100. The list of conflicts as of October, 2016 can be seen here. [6]
>>
>> At this point, hopefully we've made the case for at least establishing a default module id that
>> 1. Uses namespaces to minimize id conflicts when possible
>> 2. Leverages the default effect to create a de facto best practice
>> 3. Follows the principle of least astonishment
>>
>> We have two potential proposals that solve these goals.
>>
>> Proposal 1: Leverage existing coordinates when available.
>>
>> Maven is inarguably the most popular build system for Java components, with Maven Central being the default and largest repository of Java components in the world. By default, every jar built by Maven automatically gets a simple properties file inserted into it with its unique coordinates.
>> Now, not every jar in Central was built with Maven, however 94% of them were, as we can find the pom.properties file in 1,806,023 of the 1,913,561 central components. Talk about the default effect in action!
>>
>> It's further important to recognize that given a jar with a pom.properties declaring coordinates, it means that the project itself has chosen those coordinates as their own name. In other words, this is how they refer to themselves, even if other consumers may not be using Maven directly.
>>
>> If automodule were able to peek inside a jar and generate the default id using the groupid and artifactid present in the file, this would eliminate nearly all instances of id conflict because a significant portion of the Java ecosystem is in fact built with Maven. Additionally, the fact that 1.8 million (and counting) modules would have a namespace as the default behavior means we've taken a huge step in setting the best practice of picking module ids with a namespace. Additionally, since the project itself has chosen these coordinates and uses them as their primary distribution mechanism, this follows the principle of least astonishment to consumers regardless of their chosen build system. Finally, since all of the above are true, it's unlikely the project would need to migrate to a new module id when they adopt jigsaw natively, thus avoiding an API switching cost for their users.
>>
>> Proposal 2: Drop automodules
>>
>> Right now Jigsaw tries to calculate a module name solely based on the name of the jar file, which already causes issues. Besides the fact that the module name is not guaranteed to be unique compared with its Maven coordinate, there are extra transformations which make it even less likely to be unique; e.g. dashes are replaced by dots (which are both valid artifactId characters), and in some cases the number and the characters following it are stripped off.
>> For artifacts like jboss-servlet-api_4.0_spec it makes sense, however we already see issues here where commons-lang, commons-lang2 and commons-lang3 get the same module name, even though they have different artifactIds and contain different packages. Choosing different artifactIds and packages was a very wise decision because it made it possible for these jars to live next to each other. Removing the separation that the authors chose is a very unwise decision.
>>
>> Another known example is the jsrNNN jars, which now all get jsr as the module name.
>>
>> It is highly unlikely that there is one single rule that captures all the use cases and always results in a module name we can work with.
>>
>> For that reason the other proposal is to simply drop automodules. Don't try to come up with a name for unnamed jars. It might look like the feature of automodules makes migrating easier because every dependency will get a name so you can complete your module-info for all requirements, but we expect that once Jigsaw comes up to speed the invalid module names will actually block further development due to name collisions or forced renaming by transitive modular jars.
>>
>> The advantage of this proposal is that library builders are not forced to keep the proposed module name in order to maintain backwards compatibility with the default. Instead library builders can pick a more suitable module name. The modular system doesn't allow the same package to be exported by multiple jars (and automodules export every package). Library builders can fix this in their new jars, however if end users require both jars because they were specified as requirements in different transitive jars, the project cannot be compiled. There's just no dependency-excludes like Maven has, because "requires" in the module-info really means requires.
>> Dropping automodules will prevent these kinds of issues, because a package can only be exported by a named module.
>>
>> Sure, this means that end users cannot refer to every jar in their module-info. But at least if they add a "requires" to their module-info, they can ensure that it'll always refer to the intended modular jar. With build tools like Maven the chance of missing artifacts on the classpath has already been reduced a lot. In general builds have become quite stable, so we don't expect that developers will translate all dependencies to the module-info file, especially if we warn them about the possible consequences of depending on automodules. Only referring to named modules and even a single "requires" is already a gain. There's no reason to try to speed this up and give the developer the false impression that it'll keep working when upgrading to real modular jars. Focus should be on the target, not on the path how to reach it.
>>
>> Dropping the automodules will prevent a lot of discussions about what is the correct way to select a module name and will give the responsibility for the name back to the place where it belongs: the developer.
>>
>> [1] http://stackoverflow.com/questions/22053381/lack-of-available-module-names-on-npm
>>
>> [2] http://blog.npmjs.org/post/116936804365/solving-npms-hard-problem-naming-packages
>>
>> [3] The fact that so much of the npm ecosystem is effectively not-namespaced has actually created potential build time malware injection possibilities. If I know of a package in use by a company through log analysis, bug report analysis etc, I could potentially go register the same name in the default repo with a very high semver and know that it's very likely this would be picked up over the intended internally developed module because there's no namespace.
>> [4] https://en.wikipedia.org/wiki/Default_effect_(psychology)
>> [5] https://en.wikipedia.org/wiki/Principle_of_least_astonishment
>> [6] https://docs.google.com/spreadsheets/d/1TVR5uTpDYw0827AlvPRu8l95zHnFPL_g61TdPtnjQ5M/edit?usp=sharing
>> [7] http://openjdk.java.net/jeps/261 #Risk and assumptions
>> [8] https://www.mail-archive.com/jigsaw-dev at openjdk.java.net/msg06623.html
>>

From mark.reinhold at oracle.com  Thu Jan 26 22:34:31 2017
From: mark.reinhold at oracle.com (mark.reinhold at oracle.com)
Date: Thu, 26 Jan 2017 14:34:31 -0800 (PST)
Subject: Proposal: #NonHierarchicalLayers
In-Reply-To: <9b66aa52-877a-e4b2-1458-9ff47dfb690b@redhat.com>
References: <20161207234117.1E5F1213F1@eggemoggin.niobe.net> <1de06198-930b-4b71-fe5d-1477d9cace80@redhat.com> <20161212232335.A9BFC24B27@eggemoggin.niobe.net> <9b66aa52-877a-e4b2-1458-9ff47dfb690b@redhat.com>
Message-ID: <20170126223431.B417255DD9@eggemoggin.niobe.net>

2016/12/13 6:47:53 -0800, david.lloyd at redhat.com:
> On 12/12/2016 05:23 PM, mark.reinhold at oracle.com wrote:
>> 2016/12/8 5:46:24 -0800, david.lloyd at redhat.com:
>>> ... I've added the following methods to Layer.Controller:
>>>
>>>     public Controller addPackage(Module source, String pn) { ... }
>>>     public Controller addOpens(Module source, String pn, Module target) { ... }
>>>     public Controller addOpensToAll(Module source, String pn) { ... }
>>>     public Controller addOpensToAllUnnamed(Module source, String pn) { ... }
>>>     public Controller addExports(Module source, String pn, Module target) { ... }
>>>     public Controller addExportsToAll(Module source, String pn) { ... }
>>>     public Controller addExportsToAllUnnamed(Module source, String pn) { ... }
>>>     public Controller addUses(Module source, Class service) { ... }
>>
>> Can you explain exactly why you need all these methods?
>> >> I can see why you might need the qualified `addExports` method, akin >> to the existing `addOpens` method, if you're doing some form of module >> resolution on your own that's somehow taking named layers, or whatever, >> into account. > > Yeah we're assembling the module structure in a multi-stage lazy > resolution process, thus we don't know exactly what we're opening or > exporting until after all contents and dependencies are defined (and > this can change over time). Hmm. This seems to be a fundamental difference between JBoss modules and OSGi. Once an OSGi bundle is loaded then its exports don't change, at least not until it's updated, and this is part of what enables Watson's JPMS embedding to work. >> The `add{Opens,Exports}ToAll` variants shouldn't be needed since you can >> just include unqualified `open` and `exports` directives in the module >> descriptor that you're going to build anyway. That has the additional >> benefit of making the exports apparent to the JPMS resolver so that JPMS >> modules can resolve against your modules, whereas invoking these methods >> wouldn't do that. > > The problem is that when I first build the module, I don't have all the > dependency information available, and also some of the dependencies will > include modules which are not visible to the layer (in fact right now > I'm putting each module into a separate layer), and some of the > dependencies are on non-Jigsaw entities which I also can't know initially. > >> The `addPackage` method is problematic, in part since it's quite slow >> (at least in HotSpot, where we've optimized for the common case of the >> set of packages in a module never changing). Can't you compute the set >> of packages in a module at the time you create the descriptor? 
> > Two things prevent this: firstly, we don't have the module contents > until after we've constructed the module class loader, which requires > the layer controller in order to process dependencies, so as of now, > adding contents depends on having the module established. Secondly, it > is allowed to add content to a JBoss module after it has been > established. The first might be fixable, but I can't think of a way > around the second. Me neither. For any kind of Watson-style embedding of JBoss modules in JPMS to work then there has to be a point in time at which you can say "okay, this JBoss module is in a stable state, let's spin up a JPMS module descriptor for it so that the JPMS resolver can resolve some JPMS modules against it." If the JBoss module is updated later on then you create a new JPMS module descriptor for it and re-resolve any JPMS modules that used to depend upon it. Can you, in general, identify such points in time for JBoss modules? If so, then I don't think you need the above methods. The JPMS module descriptor that you construct needn't capture all the details of the corresponding JBoss module. In fact it can't, since JBoss modules can relate to each other in ways that aren't expressible in JPMS. That's fine, though -- the module descriptor need only expose the information that the JPMS resolver needs for actual JPMS modules, mainly `opens`, `exports`, and `requires transitive`. If not, then adding the above methods won't help you anyway. They only affect the run-time state of modules, i.e., the state known to the JVM and surfaced in `java.lang.reflect.Module`. The JPMS resolver doesn't read the run-time state but, rather, the configurations computed by earlier invocations of the resolver, and ultimately those are all based on module descriptors, over in the `java.lang.module` package. 
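To make the last point concrete, here is a minimal sketch of a descriptor that exposes only what the JPMS resolver needs: `requires transitive`, `exports`, and `opens`. It is an illustration, not part of the proposal. It assumes the `ModuleDescriptor.newModule` factory of the `java.lang.module` API as eventually released in Java 9, which may differ from the draft API under discussion here; the class name, module name and package names are made up.

```java
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleDescriptor.Requires;
import java.util.Set;

// Hypothetical helper: describes a foreign (e.g. JBoss) module to JPMS
// once that module has reached a stable state.
final class StableDescriptorSketch {

    static ModuleDescriptor describe() {
        return ModuleDescriptor.newModule("org.example.app")      // made-up module name
                .requires(Set.of(Requires.Modifier.TRANSITIVE), "java.logging")
                .exports("org.example.app.api")                   // unqualified exports
                .opens("org.example.app.model")                   // unqualified opens
                .build();
    }

    public static void main(String[] args) {
        ModuleDescriptor descriptor = describe();
        System.out.println("module " + descriptor.name());
        descriptor.requires().forEach(r -> System.out.println("  requires " + r));
        descriptor.exports().forEach(e -> System.out.println("  exports " + e));
        descriptor.opens().forEach(o -> System.out.println("  opens " + o));
    }
}
```

If the foreign module changes later, the same helper would simply be invoked again to produce a fresh descriptor, and the dependent JPMS modules re-resolved against it.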
- Mark From mark.reinhold at oracle.com Thu Jan 26 22:35:31 2017 From: mark.reinhold at oracle.com (mark.reinhold at oracle.com) Date: Thu, 26 Jan 2017 14:35:31 -0800 (PST) Subject: Proposal: #VersionedDependences In-Reply-To: References: <20161209214646.D915621F70@eggemoggin.niobe.net> Message-ID: <20170126223531.B91FE55DDE@eggemoggin.niobe.net> 2016/12/9 14:10:43 -0800, david.lloyd at redhat.com: > On 12/09/2016 03:46 PM, mark.reinhold at oracle.com wrote: >> ... >> >> Proposal >> -------- >> >> ... >> >> Now that compile-time versions can be recorded in module descriptors >> there is even less need to tolerate version information in module names, >> a bad practice that we'd like to discourage at the outset. We therefore >> further propose to: >> >> - Revise the accepted proposal for #VersionsInModuleNames [3] to state >> that a module name appearing anywhere in a source-form module >> declaration must both start and end with "Java letters" [4]. > > Can we just drop this part? I really am not a fan of the > social-enforcement-in-technical-code thing, and I can already think of > one existing project off the top of my head that will suffer > collaterally from this: Fabric8. Also any other project to which the > "vanity license plate effect" would apply. This is only the second example that's been pointed out since I posted the original proposal for #VersionsInModuleNames [1], the first being `commons-lang3` [2]. > And I can think of many ways > to circumvent this rule, including, but not limited to, bracketing with > letters "v5slot", roman numerals, etc., so really all it does is add an > annoying "big brother" effect without practical benefit. Sure, you can work around it, but disallowing the obvious abuses should go a long way towards encouraging people to do the right thing. If this restriction really becomes a problem then we could consider loosening it in a later release. 
If we lift it now and this kind of abuse becomes common then we'll have no way to go back. > Anyway I disagree pretty strongly that versions (or more specifically, > version segments) in module names are that really that strong of an > anti-pattern. Sure having whole version numbers in is a pretty fragile > technique, but it's very useful to have (for example) multiple major > versions of a library on a large distribution to ease migration, > especially when you're talking hundreds or thousands of modules. In the context of JPMS, I respectfully disagree. We long ago chose explicitly not to solve the version-selection problem in this module system, leaving it to build tools and container applications. If we allow digits at the end of module names then we just invite confusion. "I wrote `requires foo4` but I have `foo5.jar` on my module path, why doesn't that work?" - Mark [1] http://mail.openjdk.java.net/pipermail/jpms-spec-experts/2016-September/000393.html [2] http://mail.openjdk.java.net/pipermail/jigsaw-dev/2016-September/009366.html From mark.reinhold at oracle.com Thu Jan 26 22:36:31 2017 From: mark.reinhold at oracle.com (mark.reinhold at oracle.com) Date: Thu, 26 Jan 2017 14:36:31 -0800 (PST) Subject: Proposal: #VersionedDependences In-Reply-To: <1717092100.1687029.1481822451766.JavaMail.zimbra@u-pem.fr> References: <20161209214646.D915621F70@eggemoggin.niobe.net> <1717092100.1687029.1481822451766.JavaMail.zimbra@u-pem.fr> Message-ID: <20170126223631.BE89955DE7@eggemoggin.niobe.net> 2016/12/15 9:20:51 -0800, forax at univ-mlv.fr: > I do not like this proposal for several (good) reasons. (Well, of course they're good reasons!) > The issue asks to be able to store a version strings or a constraints. > It can be interpreted as two different things. > - the configuration used by example by Maven which uses constraints > that will be resolved, > - the other is the effectively resolved versions which is what this > proposal do. 
> In my opinion, storing the former info maybe more interesting for a > language (the actual configuration) than storing the later. Storing version constraints might be more interesting, but we can't do that in JPMS since JPMS (intentionally) does not have a concept of version constraints. > We already agree that we support annotations (and obviously classfile > attributes) so any languages are free to store this kind of information > in it's own annotation/attribute, which is more flexible. Well sure, but the question here is what, if anything, we should do for Java itself. > If we still want to store the version of each requires, i think it's > better to store it in a side attribute (an array of versions should be > enough) than in the Module attribute, it's more aligned with the way we > encode ModuleVersion currently. The proposal exposes recorded version strings in the standard API, via a new method in the `ModuleDescriptor.Requires` class: Optional compiledVersion(); These strings are no less standard than any of the other information exposed in the standard `ModuleDescriptor` API, so recording them in the main `Module` attribute seems perfectly appropriate. > The flag --module-version in javac is useless: > - it's an optional feature that few will use, so runtimes can not use > it reliably, so it's as useful as -parameters, I don't think we can reliably predict, today, how many people will or will not use this feature. In the abstract it is compelling, especially to those of us with experience debugging large systems built from many components, so at this point I'm inclined to keep it. > - you often need to compile several modules together, if a module A > requires a module B and the module B uses a service from module A, but > --module-version can only specify one version for all the modules > compiled together. The expectation is that if you're compiling a set of modules together then they're all related, and hence likely all have the same version string. 
The workaround is to invoke the compiler more than once. If this really becomes a problem then a compiler could accept a more complex flag that specifies a map of module names to version strings, but that seems like overkill at this stage.

- Mark

From mark.reinhold at oracle.com  Thu Jan 26 22:37:31 2017
From: mark.reinhold at oracle.com (mark.reinhold at oracle.com)
Date: Thu, 26 Jan 2017 14:37:31 -0800 (PST)
Subject: Proposal: #ModuleNameCharacters (revised)
In-Reply-To: <1619498537.509143.1481616761921.JavaMail.zimbra@u-pem.fr>
References: <20161209214546.D23A721F6C@eggemoggin.niobe.net> <16e76b92-b6c1-6ac0-0e18-14f888dcf35c@redhat.com> <6c9bbb7f-133c-0600-d13a-1154098532de@redhat.com> <610BAFF3-B5AB-467D-A4F6-F51050274344@univ-mlv.fr> <20161212154730.907442096@eggemoggin.niobe.net> <1619498537.509143.1481616761921.JavaMail.zimbra@u-pem.fr>
Message-ID: <20170126223731.C4D8855DEC@eggemoggin.niobe.net>

2016/12/13 0:12:42 -0800, Rémi Forax:
> I think part of the problem is that currently the builder enforces some rules and it should not, the fact that a ModuleDescriptor is well formed or not should be checked by the configuration and not by the builder itself.
>
> ...
>
> Actions:
> - the builder should only build a descriptor and not check if it is well formed or not
> - when resolving the configuration, additional checks should be done to verify that a module descriptor is well formed
> - add a way to specify additional metadata to the module descriptor.

These changes would make the `ModuleDescriptor.Builder` API much less usable for the vast majority of developers in order to make life a little easier for the very, very few developers who need to violate the constraints that it imposes. I think that's the wrong tradeoff, so I'm not going to make these changes.
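For readers who want to see the kind of constraint at issue, the following sketch probes whether the builder accepts a given module name. It assumes the `ModuleDescriptor.newModule` factory of the API as eventually released in Java 9 (the draft API discussed in this thread may differ), and the class and module names are invented for illustration.

```java
import java.lang.module.ModuleDescriptor;

// Hypothetical probe: asks the builder whether it accepts a module name.
final class BuilderNameCheckSketch {

    // Returns true if the builder accepts the name, false if it rejects it
    // with an IllegalArgumentException.
    static boolean builderAccepts(String moduleName) {
        try {
            ModuleDescriptor.newModule(moduleName).build();
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] candidates = {"org.example.app", "org.example-app", "org.example app"};
        for (String name : candidates) {
            System.out.println("'" + name + "' accepted: " + builderAccepts(name));
        }
    }
}
```

A name made of dot-separated Java identifiers is accepted, while names containing a hyphen or a space are rejected at build time rather than later, at resolution time.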
- Mark

From mark.reinhold at oracle.com  Thu Jan 26 22:38:31 2017
From: mark.reinhold at oracle.com (mark.reinhold at oracle.com)
Date: Thu, 26 Jan 2017 14:38:31 -0800 (PST)
Subject: Proposal: #ModuleNameCharacters (revised)
In-Reply-To: <1185843677.6305.1483314190501.JavaMail.zimbra@u-pem.fr>
References: <20161209214546.D23A721F6C@eggemoggin.niobe.net> <1185843677.6305.1483314190501.JavaMail.zimbra@u-pem.fr>
Message-ID: <20170126223831.CC9DF55DF1@eggemoggin.niobe.net>

2017/1/1 15:43:10 -0800, Rémi Forax:
> Re-reading this thread after a message sent privately by Ess Kay asking why spaces are supported, i think we should disallow 0x20 (space) too, having a module not found because a module name has a trailing space will not be fun.

No, that wouldn't be fun, but such module names will never be emitted by a Java compiler, hence I don't think this type of error will be so common as to justify banning spaces in module names in all `module-info.class` files.

- Mark

From ali.ebrahimi1781 at gmail.com  Fri Jan 27 16:52:01 2017
From: ali.ebrahimi1781 at gmail.com (Ali Ebrahimi)
Date: Fri, 27 Jan 2017 20:22:01 +0330
Subject: Proposal: #VersionedDependences
In-Reply-To: <20170126223631.BE89955DE7@eggemoggin.niobe.net>
References: <20161209214646.D915621F70@eggemoggin.niobe.net> <1717092100.1687029.1481822451766.JavaMail.zimbra@u-pem.fr> <20170126223631.BE89955DE7@eggemoggin.niobe.net>
Message-ID:

Hi,

On Fri, Jan 27, 2017 at 2:06 AM, wrote:

> 2016/12/15 9:20:51 -0800, forax at univ-mlv.fr:
> > I do not like this proposal for several (good) reasons.
>
> (Well, of course they're good reasons!)
>
> > The issue asks to be able to store a version strings or a constraints.
> > It can be interpreted as two different things.
> > - the configuration used by example by Maven which uses constraints that will be resolved,
> > - the other is the effectively resolved versions which is what this proposal do.
> > In my opinion, storing the former info maybe more interesting for a language (the actual configuration) than storing the later.
>
> Storing version constraints might be more interesting, but we can't do that in JPMS since JPMS (intentionally) does not have a concept of version constraints.
>
> > We already agree that we support annotations (and obviously classfile attributes) so any languages are free to store this kind of information in it's own annotation/at
> > - you often need to compile several modules together, if a module A requires a module B and the module B uses a service from module A, but --module-version can only specify one version for all the modules compiled together.
>
> The expectation is that if you're compiling a set of modules together then they're all related, and hence likely all have the same version string. The workaround is to invoke the compiler more than once. If this really becomes a problem then a compiler could accept a more complex flag that specifies a map of module names to version strings, but that seems like overkill at this stage.
>

Why don't we add source code support for a version string in the module declaration?

module foo at 1.0 {
    ....
}

module bar at 2.0 {
    ....
}

This way, when reviewing code, we know what the version of a module is. Also, we don't have the problem Remi mentioned with multi-module code bases.
--
Best Regards,
Ali Ebrahimi

From forax at univ-mlv.fr  Fri Jan 27 17:53:23 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 27 Jan 2017 18:53:23 +0100 (CET)
Subject: Proposal: #VersionedDependences
In-Reply-To: <20170126223631.BE89955DE7@eggemoggin.niobe.net>
References: <20161209214646.D915621F70@eggemoggin.niobe.net> <1717092100.1687029.1481822451766.JavaMail.zimbra@u-pem.fr> <20170126223631.BE89955DE7@eggemoggin.niobe.net>
Message-ID: <1525805814.2065614.1485539603846.JavaMail.zimbra@u-pem.fr>

Hi Mark,

----- Original message -----
> From: "mark reinhold"
> To: "Remi Forax"
> Cc: jpms-spec-experts at openjdk.java.net
> Sent: Thursday, January 26, 2017 23:36:31
> Subject: Re: Proposal: #VersionedDependences

> 2016/12/15 9:20:51 -0800, forax at univ-mlv.fr:
>> I do not like this proposal for several (good) reasons.
>
> (Well, of course they're good reasons!)
>
>> The issue asks to be able to store a version strings or a constraints.
>> It can be interpreted as two different things.
>> - the configuration used by example by Maven which uses constraints that will be resolved,
>> - the other is the effectively resolved versions which is what this proposal do.
>> In my opinion, storing the former info maybe more interesting for a language (the actual configuration) than storing the later.
>
> Storing version constraints might be more interesting, but we can't do that in JPMS since JPMS (intentionally) does not have a concept of version constraints.

Yes, but given that a version is a string, you should be able to store anything in it, if the value is inserted not by javac but by jar.

>
>> We already agree that we support annotations (and obviously classfile attributes) so any languages are free to store this kind of information in it's own annotation/attribute, which is more flexible.
>
> Well sure, but the question here is what, if anything, we should do for Java itself.
Java already uses non-standard attributes for things specific to Java ...

>
>> If we still want to store the version of each requires, i think it's better to store it in a side attribute (an array of versions should be enough) than in the Module attribute, it's more aligned with the way we encode ModuleVersion currently.
>
> The proposal exposes recorded version strings in the standard API, via a new method in the `ModuleDescriptor.Requires` class:
>
>     Optional compiledVersion();
>
> These strings are no less standard than any of the other information exposed in the standard `ModuleDescriptor` API, so recording them in the main `Module` attribute seems perfectly appropriate.

It's something we have swept under the rug, but the format of the version (of the module or of a requires) should not be Java specific. I see no problem with having the ModuleBuilder enforce a specific format, but a ModuleDescriptor should return an Optional (it can also return an Optional in an overloaded method, but it should be possible to get the string version). The version format depends on the module system; it should be enforced by the code that creates the Layer.

>
>> The flag --module-version in javac is useless:
>> - it's an optional feature that few will use, so runtimes can not use it reliably, so it's as useful as -parameters,
>
> I don't think we can reliably predict, today, how many people will or will not use this feature. In the abstract it is compelling, especially to those of us with experience debugging large systems built from many components, so at this point I'm inclined to keep it.

It's compelling for those debugging a large system and not having an artifact repository ... being able to get the sources (+ configuration) for something you have shipped is all you need.
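As an illustration of the accessor discussed above, here is a sketch that records a compile-time version on a `requires` and reads it back both as a parsed version and as a raw string. It assumes the `java.lang.module` API as eventually released in Java 9 (`ModuleDescriptor.newModule`, `Requires.compiledVersion()` and `Requires.rawCompiledVersion()`), which may not match the draft under discussion in this thread; the class and module names are made up.

```java
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleDescriptor.Requires;
import java.lang.module.ModuleDescriptor.Version;
import java.util.Set;

// Hypothetical example: a dependence that carries the version it was compiled against.
final class CompiledVersionSketch {

    static Requires requiresWithVersion() {
        ModuleDescriptor descriptor = ModuleDescriptor.newModule("org.example.app")
                // no modifiers, made-up dependency name, compile-time version 2.1
                .requires(Set.of(), "org.example.lib", Version.parse("2.1"))
                .build();
        return descriptor.requires().stream()
                .filter(r -> r.name().equals("org.example.lib"))
                .findFirst()
                .orElseThrow(IllegalStateException::new);
    }

    public static void main(String[] args) {
        Requires requires = requiresWithVersion();
        // Parsed form, present only when the recorded string is a parseable version.
        System.out.println("parsed: " + requires.compiledVersion());
        // Raw string form, as recorded.
        System.out.println("raw:    " + requires.rawCompiledVersion());
    }
}
```

Having both accessors means a tool that uses its own version scheme can still read the recorded string even when it does not follow the format that `Version.parse` expects.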
>
>> - you often need to compile several modules together, if a module A requires a module B and the module B uses a service from module A, but --module-version can only specify one version for all the modules compiled together.
>
> The expectation is that if you're compiling a set of modules together then they're all related, and hence likely all have the same version string. The workaround is to invoke the compiler more than once. If this really becomes a problem then a compiler could accept a more complex flag that specifies a map of module names to version strings, but that seems like overkill at this stage.

Compiling several times will not play well with incremental compilation: if you change a module B required by another module A that does not change, having to re-compile module A, which has not changed, just because you have to update the version of the requires directive in A is a very bad property of a build process.

I was pretty happy with a world without a version on requires, specifying the version using the jar command, which makes a lot of sense because that's when you bundle everything together (code + resources), when you create the module.

>
> - Mark

Rémi

From forax at univ-mlv.fr  Fri Jan 27 18:04:42 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 27 Jan 2017 19:04:42 +0100 (CET)
Subject: Proposal: #ModuleNameCharacters (revised)
In-Reply-To: <20170126223831.CC9DF55DF1@eggemoggin.niobe.net>
References: <20161209214546.D23A721F6C@eggemoggin.niobe.net> <1185843677.6305.1483314190501.JavaMail.zimbra@u-pem.fr> <20170126223831.CC9DF55DF1@eggemoggin.niobe.net>
Message-ID: <393669787.2066967.1485540282879.JavaMail.zimbra@u-pem.fr>

Ok, I agree with both of you.
Rémi

----- Original message -----
> From: "mark reinhold"
> To: "Remi Forax"
> Cc: jpms-spec-experts at openjdk.java.net
> Sent: Thursday, January 26, 2017 23:38:31
> Subject: Re: Proposal: #ModuleNameCharacters (revised)

> 2017/1/1 15:43:10 -0800, Rémi Forax:
>> Re-reading this thread after a message sent privately by Ess Kay asking why spaces are supported, i think we should disallow 0x20 (space) too, having a module not found because a module name has a trailing space will not be fun.
>
> No, that wouldn't be fun, but such module names will never be emitted by a Java compiler, hence I don't think this type of error will be so common as to justify banning spaces in module names in all `module-info.class` files.
>
> - Mark

From forax at univ-mlv.fr  Fri Jan 27 18:19:09 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 27 Jan 2017 19:19:09 +0100 (CET)
Subject: Proposal: #VersionedDependences
In-Reply-To:
References: <20161209214646.D915621F70@eggemoggin.niobe.net> <1717092100.1687029.1481822451766.JavaMail.zimbra@u-pem.fr> <20170126223631.BE89955DE7@eggemoggin.niobe.net>
Message-ID: <1695159091.2069646.1485541149445.JavaMail.zimbra@u-pem.fr>

Hi Ali,

> From: "Ali Ebrahimi"
> To: "jpms-spec-observers" , "mark reinhold"
> Cc: "Remi Forax"
> Sent: Friday, January 27, 2017 17:52:01
> Subject: Re: Proposal: #VersionedDependences

> Hi,

> On Fri, Jan 27, 2017 at 2:06 AM, < mark.reinhold at oracle.com > wrote:

>> 2016/12/15 9:20:51 -0800, forax at univ-mlv.fr :
>> > I do not like this proposal for several (good) reasons.

>> (Well, of course they're good reasons!)

>> > The issue asks to be able to store a version strings or a constraints.
>> > It can be interpreted as two different things.
>> > - the configuration used by example by Maven which uses constraints that will be resolved,
>> > - the other is the effectively resolved versions which is what this proposal do.
>> > In my opinion, storing the former info maybe more interesting for a
>> > language (the actual configuration) than storing the later.
>> Storing version constraints might be more interesting, but we can't
>> do that in JPMS since JPMS (intentionally) does not have a concept of
>> version constraints.
>> > We already agree that we support annotations (and obviously classfile
>> > attributes) so any languages are free to store this kind of information
>> > in it's own annotation/at
>> > - you often need to compile several modules together, if a module A
>> > requires a module B and the module B uses a service from module A, but
>> > --module-version can only specify one version for all the modules
>> > compiled together.
>> The expectation is that if you're compiling a set of modules together
>> then they're all related, and hence likely all have the same version
>> string. The workaround is to invoke the compiler more than once. If
>> this really becomes a problem then a compiler could accept a more complex
>> flag that specifies a map of module names to version strings, but that
>> seems like overkill at this stage.
> Why we don't add source code support for version string in module declaration?
> module foo@1.0 {
> ....
> }
> module bar@2.0 {
> ....
> }
> This way when code reviewing we know what is version of module.
> Also, we don't have Remi's mentioned problem of multi-module code bases.

When you do things like a security patch, for example, you may update several module versions but want to commit only the change corresponding to the security fix, i.e. not have the security fix hidden in the middle of a lot of useless module-info updates.
In my opinion, it's close to inserting the source-control version inside the code; it's an anti-pattern.
> --
> Best Regards,
> Ali Ebrahimi

regards,
Rémi

From ali.ebrahimi1781 at gmail.com Fri Jan 27 19:08:29 2017
From: ali.ebrahimi1781 at gmail.com (Ali Ebrahimi)
Date: Fri, 27 Jan 2017 22:38:29 +0330
Subject: Proposal: #VersionedDependences
In-Reply-To: <1695159091.2069646.1485541149445.JavaMail.zimbra@u-pem.fr>
References: <20161209214646.D915621F70@eggemoggin.niobe.net> <1717092100.1687029.1481822451766.JavaMail.zimbra@u-pem.fr> <20170126223631.BE89955DE7@eggemoggin.niobe.net> <1695159091.2069646.1485541149445.JavaMail.zimbra@u-pem.fr>
Message-ID:

Hi,

On Fri, Jan 27, 2017 at 9:49 PM, wrote:

> Hi Ali,
> ------------------------------
>
> Why we don't add source code support for version string in module
> declaration?
> module foo@1.0 {
> ....
> }
>
> module bar@2.0 {
> ....
> }
>
> This way when code reviewing we know what is version of module.
> Also, we don't have Remi's mentioned problem of multi-module code bases.
>
>
> when you do things like security patch by example, you may update several
> module versions but want only to commit the change corresponding to the
> security fix,
> i.e. not having the security fix hidden in the middle of a lot of
> module-info useless updates.
> In my opinion, it's close to inserting the source control version inside a
> code, it's an anti-pattern.
>

This is something that already occurs in modular code bases before Java 9: in Maven-based code bases, version strings are recorded in the pom.xml file.
Anyway, command line options can override source code values.
-- Best Regards, Ali Ebrahimi From mark.reinhold at oracle.com Fri Jan 27 22:24:21 2017 From: mark.reinhold at oracle.com (mark.reinhold at oracle.com) Date: Fri, 27 Jan 2017 14:24:21 -0800 Subject: Proposal: #VersionedDependences In-Reply-To: <1525805814.2065614.1485539603846.JavaMail.zimbra@u-pem.fr> References: <20161209214646.D915621F70@eggemoggin.niobe.net> <1717092100.1687029.1481822451766.JavaMail.zimbra@u-pem.fr> <20170126223631.BE89955DE7@eggemoggin.niobe.net> <1525805814.2065614.1485539603846.JavaMail.zimbra@u-pem.fr> Message-ID: <20170127142421.887524637@eggemoggin.niobe.net> 2017/1/27 9:53:23 -0800, forax at univ-mlv.fr: > 2017/1/26 14:36:31 -0800, mark.reinhold at oracle.com: >> 2016/12/15 9:20:51 -0800, forax at univ-mlv.fr: >>> ... >>> >>> The issue asks to be able to store a version strings or a constraints. >>> It can be interpreted as two different things. >>> - the configuration used by example by Maven which uses constraints >>> that will be resolved, >>> - the other is the effectively resolved versions which is what this >>> proposal do. >>> In my opinion, storing the former info maybe more interesting for a >>> language (the actual configuration) than storing the later. >> >> Storing version constraints might be more interesting, but we can't >> do that in JPMS since JPMS (intentionally) does not have a concept of >> version constraints. > > yes, but given that a version is a string, you should be able to store > anything in it, if the value is not inserted by javac but by jar. A version is not just a string, and we should not encourage people to use versions to encode information that's not a version. >> ... >> >> These strings are no less standard than any of the other information >> exposed in the standard `ModuleDescriptor` API, so recording them in >> the main `Module` attribute seems perfectly appropriate. > > It's something we have swept under the rug, but the version (of the > module or of requires) format should not be Java specific. 
It should be as general as possible but it should be well-specified. In that sense it will be Java-specific. > I see no problem to have the ModuleBuilder to enforce a specific > format but a ModuleDescriptor should return an Optional (it > can also retruns an Optional in an overloaded method but it > should be possible to get the string version). > > Version format depends on the module system, it should be enforced by > the code that creates the Layer. Yes, the version format depends on the module system, and that module system is JPMS. We're not designing multiple different module systems, nor a low-level framework upon which multiple different module systems can be implemented. The JPMS version format attempts to encompass a wide variety of existing version schemes, but it does not (and can not) encompass them all. >>> The flag --module-version in javac is useless: >>> - it's an optional feature that few will use, so runtimes can not use >>> it reliably, so it's as useful as -parameters, >> >> I don't think we can reliably predict, today, how many people will or >> will not use this feature. In the abstract it is compelling, especially >> to those of us with experience debugging large systems built from many >> components, so at this point I'm inclined to keep it. > > it's compelling for those debugging a large system and not having an > artifact repository ... being able to get the sources (+ > configuration) for something you have shipped is all what you need. ... if you have an artifact repository. >>> - you often need to compile several modules together, if a module A >>> requires a module B and the module B uses a service from module A, but >>> --module-version can only specify one version for all the modules >>> compiled together. >> >> The expectation is that if you're compiling a set of modules together >> then they're all related, and hence likely all have the same version >> string. The workaround is to invoke the compiler more than once. 
If >> this really becomes a problem then a compiler could accept a more complex >> flag that specifies a map of module names to version strings, but that >> seems like overkill at this stage. > > Compiling several times will not play well with incremental > compilation, if you change a module B required by another module A > that do not change, having to re-compile module A that have not > changed just because you have to update the version of the requires > directive in A is a very bad property of a build process. The mention of a `--module-version` compiler option in the proposal is only a suggestion; it will not be a normative part of the JPMS specification. If complex multi-module, multi-version, incremental compilation scenarios become a real problem then compiler implementors are free to do something fancier. I don't think we need to spend any more time on this point. - Mark From mark.reinhold at oracle.com Sat Jan 28 01:45:47 2017 From: mark.reinhold at oracle.com (mark.reinhold at oracle.com) Date: Fri, 27 Jan 2017 17:45:47 -0800 Subject: Proposal: #VersionedDependences In-Reply-To: <20170127142421.887524637@eggemoggin.niobe.net> References: <20161209214646.D915621F70@eggemoggin.niobe.net> <1717092100.1687029.1481822451766.JavaMail.zimbra@u-pem.fr> <20170126223631.BE89955DE7@eggemoggin.niobe.net> <1525805814.2065614.1485539603846.JavaMail.zimbra@u-pem.fr> <20170127142421.887524637@eggemoggin.niobe.net> Message-ID: <20170127174547.771927105@eggemoggin.niobe.net> 2017/1/27 14:24:21 -0800, mark.reinhold at oracle.com: > 2017/1/27 9:53:23 -0800, forax at univ-mlv.fr: >> ... >> >> I see no problem to have the ModuleBuilder to enforce a specific >> format but a ModuleDescriptor should return an Optional (it >> can also retruns an Optional in an overloaded method but it >> should be possible to get the string version). >> >> Version format depends on the module system, it should be enforced by >> the code that creates the Layer. 
>
> Yes, the version format depends on the module system, and that module
> system is JPMS. We're not designing multiple different module systems,
> nor a low-level framework upon which multiple different module systems
> can be implemented. The JPMS version format attempts to encompass a
> wide variety of existing version schemes, but it does not (and can not)
> encompass them all.

Thinking on this further, perhaps you're right.

It's true that we're designing just one module system here, not many,
but for the sake of tradition and (perhaps needless) generality we've
already accepted that binary module descriptors can define and refer
to modules whose names would not be accepted in a source-form module
declaration. The `ModuleDescriptor.Builder` API is (more or less)
aligned with the source language, but you can use the various `read`
methods in `ModuleDescriptor` to instantiate instances of that class
that you couldn't construct with the `Builder`.

Allowing non-source-code names to surface in the API is easy, since
they're still just strings. Versions are trickier, since when they're
actual JPMS versions we'd like to represent them as instances of the
`Version` class, but if they're not JPMS versions then we should still
somehow surface them as strings.

I suppose the best we can do here is, wherever a version is exposed,
define a pair of methods. One will return the raw version string; the
other will return a `Version` object if the raw string can be parsed
as such, and an empty `Optional` otherwise.

So instead of the single `compiledVersion` method proposed for the
`Requires` class we'd have two,

    Optional<String> compiledVersionString();
    Optional<Version> compiledVersion();

and similarly for the `ModuleDescriptor::version()` method itself.

Make sense?
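
The pair of accessors described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `CompiledVersionView` is hypothetical and not part of the proposal, the method names are the ones suggested in this message, and `Version` is assumed to be `java.lang.module.ModuleDescriptor.Version`, whose `parse` method throws `IllegalArgumentException` for strings it cannot parse.

```java
import java.lang.module.ModuleDescriptor.Version;
import java.util.Optional;

// Hypothetical holder, NOT part of the JPMS API. It only illustrates the
// "raw string plus optional parsed Version" pairing discussed above.
final class CompiledVersionView {
    // The version string as recorded in the class file; null if none was recorded.
    private final String raw;

    CompiledVersionView(String raw) {
        this.raw = raw;
    }

    // Returns the recorded string unchanged, whether or not it is a JPMS version.
    Optional<String> compiledVersionString() {
        return Optional.ofNullable(raw);
    }

    // Returns a Version only if a string was recorded and it parses as one.
    Optional<Version> compiledVersion() {
        if (raw == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(Version.parse(raw));
        } catch (IllegalArgumentException e) {
            // Not a JPMS version; the raw string is still available to callers.
            return Optional.empty();
        }
    }
}
```

With this shape a caller that only wants to display the version uses `compiledVersionString()`, while a caller that wants to compare versions uses `compiledVersion()` and has to handle the empty case explicitly.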
- Mark From forax at univ-mlv.fr Sat Jan 28 12:36:06 2017 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Sat, 28 Jan 2017 13:36:06 +0100 (CET) Subject: Proposal: #VersionedDependences In-Reply-To: <20170127142421.887524637@eggemoggin.niobe.net> References: <20161209214646.D915621F70@eggemoggin.niobe.net> <1717092100.1687029.1481822451766.JavaMail.zimbra@u-pem.fr> <20170126223631.BE89955DE7@eggemoggin.niobe.net> <1525805814.2065614.1485539603846.JavaMail.zimbra@u-pem.fr> <20170127142421.887524637@eggemoggin.niobe.net> Message-ID: <1776765183.2122922.1485606966254.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "mark reinhold" > ?: forax at univ-mlv.fr > Cc: jpms-spec-experts at openjdk.java.net > Envoy?: Vendredi 27 Janvier 2017 23:24:21 > Objet: Re: Proposal: #VersionedDependences > 2017/1/27 9:53:23 -0800, forax at univ-mlv.fr: >> 2017/1/26 14:36:31 -0800, mark.reinhold at oracle.com: >>> 2016/12/15 9:20:51 -0800, forax at univ-mlv.fr: >>>> ... >>>> >>>> The issue asks to be able to store a version strings or a constraints. >>>> It can be interpreted as two different things. >>>> - the configuration used by example by Maven which uses constraints >>>> that will be resolved, >>>> - the other is the effectively resolved versions which is what this >>>> proposal do. >>>> In my opinion, storing the former info maybe more interesting for a >>>> language (the actual configuration) than storing the later. >>> >>> Storing version constraints might be more interesting, but we can't >>> do that in JPMS since JPMS (intentionally) does not have a concept of >>> version constraints. >> >> yes, but given that a version is a string, you should be able to store >> anything in it, if the value is not inserted by javac but by jar. > > A version is not just a string, and we should not encourage people to > use versions to encode information that's not a version. > >>> ... 
>>> >>> These strings are no less standard than any of the other information >>> exposed in the standard `ModuleDescriptor` API, so recording them in >>> the main `Module` attribute seems perfectly appropriate. >> >> It's something we have swept under the rug, but the version (of the >> module or of requires) format should not be Java specific. > > It should be as general as possible but it should be well-specified. > In that sense it will be Java-specific. > >> I see no problem to have the ModuleBuilder to enforce a specific >> format but a ModuleDescriptor should return an Optional (it >> can also retruns an Optional in an overloaded method but it >> should be possible to get the string version). >> >> Version format depends on the module system, it should be enforced by >> the code that creates the Layer. > > Yes, the version format depends on the module system, and that module > system is JPMS. We're not designing multiple different module systems, > nor a low-level framework upon which multiple different module systems > can be implemented. The JPMS version format attempts to encompass a > wide variety of existing version schemes, but it does not (and can not) > encompass them all. > >>>> The flag --module-version in javac is useless: >>>> - it's an optional feature that few will use, so runtimes can not use >>>> it reliably, so it's as useful as -parameters, >>> >>> I don't think we can reliably predict, today, how many people will or >>> will not use this feature. In the abstract it is compelling, especially >>> to those of us with experience debugging large systems built from many >>> components, so at this point I'm inclined to keep it. >> >> it's compelling for those debugging a large system and not having an >> artifact repository ... being able to get the sources (+ >> configuration) for something you have shipped is all what you need. > > ... if you have an artifact repository. 
> >>>> - you often need to compile several modules together, if a module A >>>> requires a module B and the module B uses a service from module A, but >>>> --module-version can only specify one version for all the modules >>>> compiled together. >>> >>> The expectation is that if you're compiling a set of modules together >>> then they're all related, and hence likely all have the same version >>> string. The workaround is to invoke the compiler more than once. If >>> this really becomes a problem then a compiler could accept a more complex >>> flag that specifies a map of module names to version strings, but that >>> seems like overkill at this stage. >> >> Compiling several times will not play well with incremental >> compilation, if you change a module B required by another module A >> that do not change, having to re-compile module A that have not >> changed just because you have to update the version of the requires >> directive in A is a very bad property of a build process. > > The mention of a `--module-version` compiler option in the proposal > is only a suggestion; it will not be a normative part of the JPMS > specification. If complex multi-module, multi-version, incremental > compilation scenarios become a real problem then compiler implementors > are free to do something fancier. I don't think we need to spend any > more time on this point. Ok ! so for the record, in my opinion, versioned requires should have been encoded in a separated non JPMS attribute, but it's too late to change the classfile format and we have to move on. 
>
> - Mark

Rémi

From forax at univ-mlv.fr Sat Jan 28 12:40:01 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sat, 28 Jan 2017 13:40:01 +0100 (CET)
Subject: Proposal: #VersionedDependences
In-Reply-To: <20170127174547.771927105@eggemoggin.niobe.net>
References: <20161209214646.D915621F70@eggemoggin.niobe.net> <1717092100.1687029.1481822451766.JavaMail.zimbra@u-pem.fr> <20170126223631.BE89955DE7@eggemoggin.niobe.net> <1525805814.2065614.1485539603846.JavaMail.zimbra@u-pem.fr> <20170127142421.887524637@eggemoggin.niobe.net> <20170127174547.771927105@eggemoggin.niobe.net>
Message-ID: <1441073057.2123030.1485607201470.JavaMail.zimbra@u-pem.fr>

----- Original message -----
> From: "mark reinhold"
> To: forax at univ-mlv.fr
> Cc: jpms-spec-experts at openjdk.java.net
> Sent: Saturday, 28 January 2017 02:45:47
> Subject: Re: Proposal: #VersionedDependences

> 2017/1/27 14:24:21 -0800, mark.reinhold at oracle.com:
>> 2017/1/27 9:53:23 -0800, forax at univ-mlv.fr:
>>> ...
>>>
>>> I see no problem to have the ModuleBuilder to enforce a specific
>>> format but a ModuleDescriptor should return an Optional<String> (it
>>> can also return an Optional<Version> in an overloaded method but it
>>> should be possible to get the string version).
>>>
>>> Version format depends on the module system, it should be enforced by
>>> the code that creates the Layer.
>>
>> Yes, the version format depends on the module system, and that module
>> system is JPMS. We're not designing multiple different module systems,
>> nor a low-level framework upon which multiple different module systems
>> can be implemented. The JPMS version format attempts to encompass a
>> wide variety of existing version schemes, but it does not (and can not)
>> encompass them all.
>
> Thinking on this further, perhaps you're right.
>
> It's true that we're designing just one module system here, not many,
> but for the sake of tradition and (perhaps needless) generality we've
> already accepted that binary module descriptors can define and refer
> to modules whose names would not be accepted in a source-form module
> declaration. The `ModuleDescriptor.Builder` API is (more or less)
> aligned with the source language, but you can use the various `read`
> methods in `ModuleDescriptor` to instantiate instances of that class
> that you couldn't construct with the `Builder`.
>
> Allowing non-source-code names to surface in the API is easy, since
> they're still just strings. Versions are trickier, since when they're
> actual JPMS versions we'd like to represent them as instances of the
> `Version` class, but if they're not JPMS versions then we should still
> somehow surface them as strings.
>
> I suppose the best we can do here is, wherever a version is exposed,
> define a pair of methods. One will return the raw version string; the
> other will return a `Version` object if the raw string can be parsed
> as such, and an empty `Optional` otherwise.
>
> So instead of the single `compiledVersion` method proposed for the
> `Requires` class we'd have two,
>
>     Optional<String> compiledVersionString();
>     Optional<Version> compiledVersion();
>
> and similarly for the `ModuleDescriptor::version()` method itself.
>
> Make sense?

Yes! For the API, if I can nitpick, I prefer:

    Optional<String> rawCompiledVersion()
    Optional<Version> compiledVersion()

>
> - Mark

Rémi

From david.lloyd at redhat.com Mon Jan 30 14:26:20 2017
From: david.lloyd at redhat.com (David M.
Lloyd) Date: Mon, 30 Jan 2017 08:26:20 -0600 Subject: Proposal: #VersionedDependences In-Reply-To: <20170126223531.B91FE55DDE@eggemoggin.niobe.net> References: <20161209214646.D915621F70@eggemoggin.niobe.net> <20170126223531.B91FE55DDE@eggemoggin.niobe.net> Message-ID: On 01/26/2017 04:35 PM, mark.reinhold at oracle.com wrote: > 2016/12/9 14:10:43 -0800, david.lloyd at redhat.com: >> On 12/09/2016 03:46 PM, mark.reinhold at oracle.com wrote: >>> ... >>> >>> Proposal >>> -------- >>> >>> ... >>> >>> Now that compile-time versions can be recorded in module descriptors >>> there is even less need to tolerate version information in module names, >>> a bad practice that we'd like to discourage at the outset. We therefore >>> further propose to: >>> >>> - Revise the accepted proposal for #VersionsInModuleNames [3] to state >>> that a module name appearing anywhere in a source-form module >>> declaration must both start and end with "Java letters" [4]. >> >> Can we just drop this part? I really am not a fan of the >> social-enforcement-in-technical-code thing, and I can already think of >> one existing project off the top of my head that will suffer >> collaterally from this: Fabric8. Also any other project to which the >> "vanity license plate effect" would apply. > > This is only the second example that's been pointed out since I posted > the original proposal for #VersionsInModuleNames [1], the first being > `commons-lang3` [2]. The difference, from my perspective, is that Fabric8 is an important Red Hat project. Also to be subsequently annoyed, from 5 minutes of searching just on GitHub: "CS###", "Excercise###", "Lab###", "Chapter###"; fx2048 (i.e. 
number based games); Simple8583 (ISO-8583 framework); projects relating to the (apparently several) devices out there whose name ends in a digit or several digits; things referencing JSR numbers like jackson-datatype-jsr310; things ending in a year (relating to events like conferences and that sort of thing); things relating to versions of a certification or specification (like our jta-1_1 API projects and similar), etc. >> And I can think of many ways >> to circumvent this rule, including, but not limited to, bracketing with >> letters "v5slot", roman numerals, etc., so really all it does is add an >> annoying "big brother" effect without practical benefit. > > Sure, you can work around it, but disallowing the obvious abuses should > go a long way towards encouraging people to do the right thing. > > If this restriction really becomes a problem then we could consider > loosening it in a later release. If we lift it now and this kind of > abuse becomes common then we'll have no way to go back. If this "abuse" is common, then... so what? I think the error here is assuming there is a globally "right" thing to do, when it's already clear that there are valid and useful cases where this is not true, and I'm telling you now from a place of experience that having multiple versions of a thing is not by itself any kind of anti-pattern. Like anything in software, it can be abused, but taking away an obvious feature in order to ensure that users behave in one certain way has always been a regrettable mistake in my experience. Especially when the primary problem o I think that users are going to be far more annoyed by the restriction than benefited by it. >> Anyway I disagree pretty strongly that versions (or more specifically, >> version segments) in module names are that really that strong of an >> anti-pattern. 
Sure having whole version numbers in is a pretty fragile >> technique, but it's very useful to have (for example) multiple major >> versions of a library on a large distribution to ease migration, >> especially when you're talking hundreds or thousands of modules. > > In the context of JPMS, I respectfully disagree. We long ago chose > explicitly not to solve the version-selection problem in this module > system, leaving it to build tools and container applications. Yes however what this policy implies is not "we will not solve this problem" but "we will not only not solve this problem, we will make many reasonable solutions impossible". People are still going to do this; their solutions are just going to be weirder and the resultant problems harder to solve. The basis for any agreement on our part with this requirement was that, while we don't have explicit support for "multiple versions" (which has been repeatedly shown to be a very hard-to-define expression, by the way), we will do nothing to prevent it either. > If we > allow digits at the end of module names then we just invite confusion. > "I wrote `requires foo4` but I have `foo5.jar` on my module path, why > doesn't that work?" The obvious answer is because foo4 is not the same as foo5, and experience tells me that users historically have been able to figure this out without any problems. > - Mark > > > [1] http://mail.openjdk.java.net/pipermail/jpms-spec-experts/2016-September/000393.html > [2] http://mail.openjdk.java.net/pipermail/jigsaw-dev/2016-September/009366.html > -- - DML From david.lloyd at redhat.com Mon Jan 30 15:13:10 2017 From: david.lloyd at redhat.com (David M. 
Lloyd) Date: Mon, 30 Jan 2017 09:13:10 -0600 Subject: Proposal: #VersionedDependences In-Reply-To: <20170127142421.887524637@eggemoggin.niobe.net> References: <20161209214646.D915621F70@eggemoggin.niobe.net> <1717092100.1687029.1481822451766.JavaMail.zimbra@u-pem.fr> <20170126223631.BE89955DE7@eggemoggin.niobe.net> <1525805814.2065614.1485539603846.JavaMail.zimbra@u-pem.fr> <20170127142421.887524637@eggemoggin.niobe.net> Message-ID: <228212c6-e4ee-4da6-507f-f31524ef0d88@redhat.com> On 01/27/2017 04:24 PM, mark.reinhold at oracle.com wrote: > 2017/1/27 9:53:23 -0800, forax at univ-mlv.fr: >> 2017/1/26 14:36:31 -0800, mark.reinhold at oracle.com: >>> 2016/12/15 9:20:51 -0800, forax at univ-mlv.fr: >>>> ... >>>> >>>> The issue asks to be able to store a version strings or a constraints. >>>> It can be interpreted as two different things. >>>> - the configuration used by example by Maven which uses constraints >>>> that will be resolved, >>>> - the other is the effectively resolved versions which is what this >>>> proposal do. >>>> In my opinion, storing the former info maybe more interesting for a >>>> language (the actual configuration) than storing the later. >>> >>> Storing version constraints might be more interesting, but we can't >>> do that in JPMS since JPMS (intentionally) does not have a concept of >>> version constraints. >> >> yes, but given that a version is a string, you should be able to store >> anything in it, if the value is not inserted by javac but by jar. > > A version is not just a string, and we should not encourage people to > use versions to encode information that's not a version. This is a fair point, however, see below. >>> ... >>> >>> These strings are no less standard than any of the other information >>> exposed in the standard `ModuleDescriptor` API, so recording them in >>> the main `Module` attribute seems perfectly appropriate. 
>> >> It's something we have swept under the rug, but the version (of the >> module or of requires) format should not be Java specific. > > It should be as general as possible but it should be well-specified. > In that sense it will be Java-specific. > >> I see no problem to have the ModuleBuilder to enforce a specific >> format but a ModuleDescriptor should return an Optional (it >> can also retruns an Optional in an overloaded method but it >> should be possible to get the string version). >> >> Version format depends on the module system, it should be enforced by >> the code that creates the Layer. > > Yes, the version format depends on the module system, and that module > system is JPMS. We're not designing multiple different module systems, > nor a low-level framework upon which multiple different module systems > can be implemented. But unfortunately we are, and we must. You cannot hide behind "we're only implementing one module system" while at the same time creating the JVM diagnostic and security features which *should* be applicable to any module system but *can* only be exploited by the one. And it is certainly irresponsible to implement a top-to-bottom module system like this, in such a way as to ensure that it exclusively has the ability to exploit such features, on the assumption that somehow it is either sufficient or can be made sufficient for all use cases, while at the same time either disregarding or narrowly cherry-picking requirements and experience from existing frameworks, particularly when several such frameworks exist and are used in the wild. 
Indeed in hindsight it appears that it would have been far more logical to have *started* with a low-level framework to support additional diagnostics and security for user class loader implementations, modularizing just the JDK at first, and to have built it up over subsequent JDK releases to include a Java user API for modules, a deployment format, and compiler support once the bugs were worked out (and it would certainly have been far less likely to have delayed Java 9 this long in such a case). But we are where we are, and the only reasonable compromise at this point is to provide the new deployment format while at the same time creating hooks to support others, including those for which naming or versioning schemes differ (perhaps dramatically) from the proposed deployment format. From the perspective of the characteristics of the modular deployment format: if Red Hat signs off on this specification, it is not going to be because the deployment format is good, or correct, or covers all use cases, but rather because there are ways to mitigate the problems in the case that it is not. > The JPMS version format attempts to encompass a > wide variety of existing version schemes, but it does not (and can not) > encompass them all. Absolutely. But it's the Layer provider which *can* do so, and should be able to do so. This can easily be mitigated by putting version validation (and collation, if such a thing is to be allowed) policy on the Layer implementation. Users will not manually encode garbage into the version string if the resultant module is not loadable; in this way, such a Layer-oriented policy is identical to the current policy. Speaking in terms of implementation - the current List implementation is not very good for a bunch of reasons. 
A simple String (guarded by the Layer's validation policy of course) would work far more efficiently, and if collation is to be supported, a Comparator implementation matching the validation rules that simply operates on the code points of the string is not only possible but reasonably easy to do. I implemented just such a simple DFA-style parser for jboss-modules which works for a combination parser and comparator and the whole thing clocks in at just around 100 lines for the parser methods plus a few small support methods. The whole thing, from comments to copyright to serialization to an additional API that allows iterating the version segments, is under 600 LOC. The memory savings of such an approach should be fairly substantial, not to mention eliminating some generics and other weirdness. -- - DML From david.lloyd at redhat.com Mon Jan 30 15:13:31 2017 From: david.lloyd at redhat.com (David M. Lloyd) Date: Mon, 30 Jan 2017 09:13:31 -0600 Subject: Proposal: #NonHierarchicalLayers In-Reply-To: <20170126223431.B417255DD9@eggemoggin.niobe.net> References: <20161207234117.1E5F1213F1@eggemoggin.niobe.net> <1de06198-930b-4b71-fe5d-1477d9cace80@redhat.com> <20161212232335.A9BFC24B27@eggemoggin.niobe.net> <9b66aa52-877a-e4b2-1458-9ff47dfb690b@redhat.com> <20170126223431.B417255DD9@eggemoggin.niobe.net> Message-ID: <7454db4e-8eeb-49a8-f7b6-d4821ddb9170@redhat.com> On 01/26/2017 04:34 PM, mark.reinhold at oracle.com wrote: > 2016/12/13 6:47:53 -0800, david.lloyd at redhat.com: >> On 12/12/2016 05:23 PM, mark.reinhold at oracle.com wrote: >>> 2016/12/8 5:46:24 -0800, david.lloyd at redhat.com: >>>> ... I've added the following methods to Layer.Controller: >>>> >>>> public Controller addPackage(Module source, String pn) { ... } >>>> public Controller addOpens(Module source, String pn, Module target) { ... } >>>> public Controller addOpensToAll(Module source, String pn) { ... } >>>> public Controller addOpensToAllUnnamed(Module source, String pn) { ... 
} >>>> public Controller addExports(Module source, String pn, Module target) { >>>> ... } >>>> public Controller addExportsToAll(Module source, String pn) { ... } >>>> public Controller addExportsToAllUnnamed(Module source, String pn) { ... } >>>> public Controller addUses(Module source, Class<?> service) { ... } >>> >>> Can you explain exactly why you need all these methods? >>> >>> I can see why you might need the qualified `addExports` method, akin >>> to the existing `addOpens` method, if you're doing some form of module >>> resolution on your own that's somehow taking named layers, or whatever, >>> into account. >> >> Yeah we're assembling the module structure in a multi-stage lazy >> resolution process, thus we don't know exactly what we're opening or >> exporting until after all contents and dependencies are defined (and >> this can change over time). > > Hmm. This seems to be a fundamental difference between JBoss modules and > OSGi. Once an OSGi bundle is loaded then its exports don't change, at > least not until it's updated, and this is part of what enables Watson's > JPMS embedding to work. OSGi does support dynamic attachment of fragments. The current prototype cannot do this, but on Jan. 4 Thomas expressed that being able to add packages would enable this part of the specification. Re-linking everything is (according to this email) an alternative that comes at the cost of not supporting this feature. >>> The `add{Opens,Exports}ToAll` variants shouldn't be needed since you can >>> just include unqualified `opens` and `exports` directives in the module >>> descriptor that you're going to build anyway. That has the additional >>> benefit of making the exports apparent to the JPMS resolver so that JPMS >>> modules can resolve against your modules, whereas invoking these methods >>> wouldn't do that.
>> >> The problem is that when I first build the module, I don't have all the >> dependency information available, and also some of the dependencies will >> include modules which are not visible to the layer (in fact right now >> I'm putting each module into a separate layer), and some of the >> dependencies are on non-Jigsaw entities which I also can't know initially. >> >>> The `addPackage` method is problematic, in part since it's quite slow >>> (at least in HotSpot, where we've optimized for the common case of the >>> set of packages in a module never changing). Can't you compute the set >>> of packages in a module at the time you create the descriptor? >> >> Two things prevent this: firstly, we don't have the module contents >> until after we've constructed the module class loader, which requires >> the layer controller in order to process dependencies, so as of now, >> adding contents depends on having the module established. Secondly, it >> is allowed to add content to a JBoss module after it has been >> established. The first might be fixable, but I can't think of a way >> around the second. > > Me neither. > > For any kind of Watson-style embedding of JBoss modules in JPMS to work > then there has to be a point in time at which you can say "okay, this > JBoss module is in a stable state, let's spin up a JPMS module descriptor > for it so that the JPMS resolver can resolve some JPMS modules against > it." If the JBoss module is updated later on then you create a new JPMS > module descriptor for it and re-resolve any JPMS modules that used to > depend upon it. In order to do this we have to shut down all existing deployments in the container that have any kind of dependency relationship, and restart them. This is not really acceptable as it defeats the purpose of the mechanism. > Can you, in general, identify such points in time for JBoss modules? Any time the user deploys additional or replacement content is a point where a module may need to be updated. 
> If so, then I don't think you need the above methods. The JPMS module > descriptor that you construct needn't capture all the details of the > corresponding JBoss module. In fact it can't, since JBoss modules can > relate to each other in ways that aren't expressible in JPMS. That's > fine, though -- the module descriptor need only expose the information > that the JPMS resolver needs for actual JPMS modules, mainly `opens`, > `exports`, and `requires transitive`. > If not, then adding the above methods won't help you anyway. They only > affect the run-time state of modules, i.e., the state known to the JVM > and surfaced in `java.lang.reflect.Module`. The JPMS resolver doesn't > read the run-time state but, rather, the configurations computed by > earlier invocations of the resolver, and ultimately those are all based > on module descriptors, over in the `java.lang.module` package. Currently the prototype code does not use the JPMS resolver. From my understanding, the JVM tables mediate access control, but the actual linkage between classes at run time happens via java.lang.ClassLoader#findClass(java.lang.String, java.lang.String). As long as this method resolves > > - Mark > -- - DML From david.lloyd at redhat.com Mon Jan 30 15:14:57 2017 From: david.lloyd at redhat.com (David M. 
Lloyd) Date: Mon, 30 Jan 2017 09:14:57 -0600 Subject: Proposal: #VersionedDependences In-Reply-To: <20170127174547.771927105@eggemoggin.niobe.net> References: <20161209214646.D915621F70@eggemoggin.niobe.net> <1717092100.1687029.1481822451766.JavaMail.zimbra@u-pem.fr> <20170126223631.BE89955DE7@eggemoggin.niobe.net> <1525805814.2065614.1485539603846.JavaMail.zimbra@u-pem.fr> <20170127142421.887524637@eggemoggin.niobe.net> <20170127174547.771927105@eggemoggin.niobe.net> Message-ID: <2f24d850-81b8-a442-5ef8-9f5872e2e479@redhat.com> On 01/27/2017 07:45 PM, mark.reinhold at oracle.com wrote: > 2017/1/27 14:24:21 -0800, mark.reinhold at oracle.com: >> 2017/1/27 9:53:23 -0800, forax at univ-mlv.fr: >>> ... >>> >>> I see no problem with having the ModuleBuilder enforce a specific >>> format, but a ModuleDescriptor should return an Optional<String> (it >>> can also return an Optional<Version> in an overloaded method, but it >>> should be possible to get the string version). >>> >>> Version format depends on the module system, it should be enforced by >>> the code that creates the Layer. >> >> Yes, the version format depends on the module system, and that module >> system is JPMS. We're not designing multiple different module systems, >> nor a low-level framework upon which multiple different module systems >> can be implemented. The JPMS version format attempts to encompass a >> wide variety of existing version schemes, but it does not (and can not) >> encompass them all. > > Thinking on this further, perhaps you're right. > > It's true that we're designing just one module system here, not many, > but for the sake of tradition and (perhaps needless) generality we've > already accepted that binary module descriptors can define and refer > to modules whose names would not be accepted in a source-form module > declaration.
The `ModuleDescriptor.Builder` API is (more or less) > aligned with the source language, but you can use the various `read` > methods in `ModuleDescriptor` to instantiate instances of that class > that you couldn't construct with the `Builder`. > > Allowing non-source-code names to surface in the API is easy, since > they're still just strings. Versions are trickier, since when they're > actual JPMS versions we'd like to represent them as instances of the > `Version` class, but if they're not JPMS versions then we should still > somehow surface them as strings. > > I suppose the best we can do here is, wherever a version is exposed, > define a pair of methods. One will return the raw version string; the > other will return a `Version` object if the raw string can be parsed > as such, and an empty `Optional` otherwise. > > So instead of the single `compiledVersion` method proposed for the > `Requires` class we'd have two, > > Optional<String> compiledVersionString(); > Optional<Version> compiledVersion(); > > and similarly for the `ModuleDescriptor::version()` method itself. > > Make sense? Better than nothing... I'll take it. -- - DML From rfscholte at apache.org Mon Jan 30 17:41:34 2017 From: rfscholte at apache.org (Robert Scholte) Date: Mon, 30 Jan 2017 18:41:34 +0100 Subject: Proposal: #VersionedDependences In-Reply-To: References: <20161209214646.D915621F70@eggemoggin.niobe.net> <20170126223531.B91FE55DDE@eggemoggin.niobe.net> Message-ID: On Mon, 30 Jan 2017 15:26:20 +0100, David M. Lloyd wrote: > On 01/26/2017 04:35 PM, mark.reinhold at oracle.com wrote: >> 2016/12/9 14:10:43 -0800, david.lloyd at redhat.com: >>> On 12/09/2016 03:46 PM, mark.reinhold at oracle.com wrote: >>>> ... >>>> >>>> Proposal >>>> -------- >>>> >>>> ... >>>> >>>> Now that compile-time versions can be recorded in module descriptors >>>> there is even less need to tolerate version information in module >>>> names, >>>> a bad practice that we'd like to discourage at the outset.
We >>>> therefore >>>> further propose to: >>>> >>>> - Revise the accepted proposal for #VersionsInModuleNames [3] to >>>> state >>>> that a module name appearing anywhere in a source-form module >>>> declaration must both start and end with "Java letters" [4]. >>> >>> Can we just drop this part? I really am not a fan of the >>> social-enforcement-in-technical-code thing, and I can already think of >>> one existing project off the top of my head that will suffer >>> collaterally from this: Fabric8. Also any other project to which the >>> "vanity license plate effect" would apply. >> >> This is only the second example that's been pointed out since I posted >> the original proposal for #VersionsInModuleNames [1], the first being >> `commons-lang3` [2]. > > The difference, from my perspective, is that Fabric8 is an important Red > Hat project. > > Also to be subsequently annoyed, from 5 minutes of searching just on > GitHub: "CS###", "Excercise###", "Lab###", "Chapter###"; fx2048 (i.e. > number based games); Simple8583 (ISO-8583 framework); projects relating > to the (apparently several) devices out there whose name ends in a digit > or several digits; things referencing JSR numbers like > jackson-datatype-jsr310; things ending in a year (relating to events > like conferences and that sort of thing); things relating to versions of > a certification or specification (like our jta-1_1 API projects and > similar), etc. > As far as I know there has never been a restriction or a specification regarding the use of numbers in the filename. And even if there were such a specification, it would no longer apply to the current world of jars (if desired, I could try to get these statistics from Central). Assuming that trailing numbers are always a version, and that the base name followed by those numbers always belongs to the same base artifact, is simply not correct.
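To make the proposed rule concrete, here is a minimal sketch of the "must both start and end with Java letters" check, applied to a few of the names mentioned above (written with dots in place of hyphens, since hyphens are not legal in source-form module names). It assumes that a "Java letter" is a character for which `Character.isJavaIdentifierStart` returns true, as in the Java Language Specification; the class and method names are hypothetical and not part of any JDK API or of the proposal itself.

```java
public final class ModuleNameCheck {

    private ModuleNameCheck() {}

    // Hypothetical helper: true if the first and last code points of the
    // name are "Java letters" (Character.isJavaIdentifierStart).
    // It does not validate the rest of the name.
    public static boolean startsAndEndsWithJavaLetter(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        int first = name.codePointAt(0);
        int last = name.codePointBefore(name.length());
        return Character.isJavaIdentifierStart(first)
                && Character.isJavaIdentifierStart(last);
    }

    public static void main(String[] args) {
        // Names taken from the discussion, plus one that ends in a letter.
        String[] names = {
            "fabric8", "commons.lang3", "jackson.datatype.jsr310", "foo.bar"
        };
        for (String name : names) {
            System.out.println(name + " -> " + startsAndEndsWithJavaLetter(name));
        }
    }
}
```

Under this reading every name that ends in a digit is rejected, whether or not the digits are a version, which is the collateral effect being objected to here.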
>>> And I can think of many >>> ways >>> to circumvent this rule, including, but not limited to, bracketing with >>> letters "v5slot", roman numerals, etc., so really all it does is add an >>> annoying "big brother" effect without practical benefit. >> >> Sure, you can work around it, but disallowing the obvious abuses should >> go a long way towards encouraging people to do the right thing. >> >> If this restriction really becomes a problem then we could consider >> loosening it in a later release. If we lift it now and this kind of >> abuse becomes common then we'll have no way to go back. > > If this "abuse" is common, then... so what? I think the error here is > assuming there is a globally "right" thing to do, when it's already > clear that there are valid and useful cases where this is not true, and > I'm telling you now from a place of experience that having multiple > versions of a thing is not by itself any kind of anti-pattern. Like > anything in software, it can be abused, but taking away an obvious > feature in order to ensure that users behave in one certain way has > always been a regrettable mistake in my experience. Especially when the > primary problem o > > I think that users are going to be far more annoyed by the restriction > than benefited by it. > >>> Anyway I disagree pretty strongly that versions (or more specifically, >>> version segments) in module names are really that strong of an >>> anti-pattern. Sure, having whole version numbers in is a pretty fragile >>> technique, but it's very useful to have (for example) multiple major >>> versions of a library on a large distribution to ease migration, >>> especially when you're talking hundreds or thousands of modules. >> >> In the context of JPMS, I respectfully disagree. We long ago chose >> explicitly not to solve the version-selection problem in this module >> system, leaving it to build tools and container applications.
> > Yes, however, what this policy implies is not "we will not solve this > problem" but "we will not only not solve this problem, we will make many > reasonable solutions impossible". People are still going to do this; > their solutions are just going to be weirder and the resultant problems > harder to solve. The basis for any agreement on our part with this > requirement was that, while we don't have explicit support for "multiple > versions" (which has been repeatedly shown to be a very hard-to-define > expression, by the way), we will do nothing to prevent it either. > >> If we >> allow digits at the end of module names then we just invite confusion. >> "I wrote `requires foo4` but I have `foo5.jar` on my module path, why >> doesn't that work?" > > The obvious answer is because foo4 is not the same as foo5, and > experience tells me that users historically have been able to figure > this out without any problems. > This is what happened at Devoxx BE, when somebody from the audience suddenly had trouble compiling his Java 9 project with Maven, although it used to work before. The root cause was actually the upgrade of Java 9, which changed this versioned-dependences behavior. He had a line like 'requires jsomething2;'. When I told him he should remove the '2' he confirmed that it worked; however, his next, valid question was "but how can I tell the difference between the old jsomething and this jsomething2?" Well, that's not possible anymore. Robert >> - Mark >> >> >> [1] >> http://mail.openjdk.java.net/pipermail/jpms-spec-experts/2016-September/000393.html >> [2] >> http://mail.openjdk.java.net/pipermail/jigsaw-dev/2016-September/009366.html >>