Can automatic modules be made to work?

Thu Apr 27 18:50:19 UTC 2017

The recurring question is: how can I, as a *library builder*, participate
in adopting Jigsaw?

The first thing you need to ensure is that there are no split package  
issues.
The next steps can be conflicting:
- I want to name my module, so my customers can refer to it
- I do not want to refer to auto modules, because their names are  
unreliable.
- I do not want to add command line flags to change module access behavior.
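
To illustrate why automatic module names are unreliable, here is a
minimal sketch that asks the JDK (9 or later) which name it derives for a
jar without a module descriptor. The class name AutoNameDemo and the file
name google-guava.jar are made up for the example; the jars are empty
placeholders, so only the file name influences the result.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.module.ModuleFinder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

final class AutoNameDemo {

    /** Writes an empty jar with the given file name and returns the automatic module name derived for it. */
    static String derivedName(String jarFileName) {
        try {
            Path dir = Files.createTempDirectory("automodule-demo");
            Path jar = dir.resolve(jarFileName);
            try {
                Manifest manifest = new Manifest();
                manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
                try (JarOutputStream jos = new JarOutputStream(Files.newOutputStream(jar), manifest)) {
                    // No entries: the jar contains only the manifest and no module-info.class.
                }
                // A jar without a module descriptor is treated as an automatic module.
                return ModuleFinder.of(jar).findAll().iterator().next().descriptor().name();
            } finally {
                Files.deleteIfExists(jar);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(derivedName("guava-21.0.jar"));   // prints "guava"
        System.out.println(derivedName("google-guava.jar")); // prints "google.guava"
    }
}
```

The same content under a different file name yields a different module
name, so a "requires" clause written against one name breaks as soon as
somebody repackages or renames the artifact.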

The only place where you can specify the module name is the module
descriptor. (The proposal to provide a name via a MANIFEST attribute has
been rejected. Having only one location to specify it is reasonable.)
Adding a module descriptor means you have to specify every requirement,
which would probably mean referring to automodules.
Because we don't want to do that, it seems we're blocked.
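
As a sketch of that dilemma, the snippet below builds the descriptor a
library would need, using the java.lang.module API instead of a
module-info.java file so that it can be run directly. The module name
com.example.lib is hypothetical, and "guava" stands for a name derived
from a jar file name.

```java
import java.lang.module.ModuleDescriptor;

final class DescriptorDemo {

    /**
     * Programmatic equivalent of this (hypothetical) module-info.java:
     *
     *   module com.example.lib {
     *       requires guava;            // name derived from the jar file name
     *       exports com.example.lib;
     *   }
     */
    static ModuleDescriptor describe() {
        return ModuleDescriptor.newModule("com.example.lib")
                .requires("guava")
                .exports("com.example.lib")
                .build();
    }

    public static void main(String[] args) {
        ModuleDescriptor descriptor = describe();
        System.out.println("module " + descriptor.name());
        // Every dependency has to appear here by name, including automatic modules.
        descriptor.requires().forEach(r -> System.out.println("  requires " + r.name()));
    }
}
```

The name of the library itself is under the author's control, but the
name in the "requires" clause is not.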

There is an option which makes it possible for a module to read from the
classpath, i.e. --add-reads <module>=ALL-UNNAMED. This way you don't have
to specify those automodule requirements and the effect will be the same.
However, using this command line option means you need to specify it
both at compile time and at runtime. You cannot expect your customers to
do this by hand. And since this kind of information is lost after
compilation, no tool will be able to add it again automatically at runtime.
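
A small sketch of that duplication: the helper below assembles a javac
and a java command line that both carry the same --add-reads value. The
module name, main class and directory layout (src, out, libs) are
assumptions made for the example.

```java
import java.util.List;

final class AddReadsArgs {

    /** Value of the --add-reads option that lets the module read the classpath. */
    static String addReads(String moduleName) {
        return moduleName + "=ALL-UNNAMED";
    }

    /** Compile-time command line (hypothetical source layout). */
    static List<String> compileArgs(String moduleName) {
        return List.of("javac",
                "--add-reads", addReads(moduleName),
                "--class-path", "libs/*",
                "-d", "out/" + moduleName,
                "src/module-info.java", "src/com/example/lib/Main.java");
    }

    /** Runtime command line: the same flag has to be repeated by whoever launches the JVM. */
    static List<String> launchArgs(String moduleName, String mainClass) {
        return List.of("java",
                "--add-reads", addReads(moduleName),
                "--module-path", "out",
                "--class-path", "libs/*",
                "--module", moduleName + "/" + mainClass);
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", compileArgs("com.example.lib")));
        System.out.println(String.join(" ", launchArgs("com.example.lib", "com.example.lib.Main")));
    }
}
```

Nothing in the compiled module records the flag, so the second command
line has to be written by hand by every consumer.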

If only this kind of information could be stored in the module  
descriptor...

This made me think of the concept of soft and strict modules. Assuming
'strict' is the preferred default, 'soft' or any equivalent alternative
would be a new keyword which has the same effect as --add-reads
<module>=ALL-UNNAMED, but its information is available at both compile
time and runtime.
With soft modules you only need to require a subset of the modules you
depend on; everything else is read from the classpath. This should ease
the migration a lot.

Is it bad that the module gets access to all classes on the classpath? I
don't think so. This was already happening when using *only* the
classpath. Now you get access to the required (not all) modules on the
module path and everything on the classpath, which is already much less
than ever before. With a soft module you control the pace of migration,
you won't be blocked by the pace of others, and you can really participate
without the possible consequences of referring to automodules.

Does this mean we need to remove automodules? Even though I still think
they can do quite some damage to the ecosystem, I know they are too
popular with some to be removed. So yes, we could keep automodules. In
fact, you could say that (strict) modules should never refer to
automodules, but soft modules can. This still matches the modular setup
of Java 9.
Soft modules give you the option to 1. refer to automodules and 2. omit
requirements.

So can automatic modules be made to work? Yes, by not requiring anyone
to refer to them.

Robert

On Wed, 26 Apr 2017 23:19:42 +0200, Stephen Colebourne  
<scolebourne at joda.org> wrote:

> On 26 April 2017 at 17:27,  <mark.reinhold at oracle.com> wrote:
>> I think I need to reconsider my previous conclusion that explicit  
>> modules
>> that depend upon automatic modules should never be published for broad
>> use [2].
>> ...
>> The only remaining objection seems to be the aesthetic one, i.e., the
>> fact that the name of an automatic module is derived from the artifact
>> that defines it rather than from some intrinsic property of its content
>> or an explicit declaration by its author.  I understand and share this
>> concern.  I completely agree that modules are not artifacts, as I've
>> written before [3], but I don't see a better alternative [4].
>
> OK, so in this thread, I'll outline what changes could potentially
> allow distributed modules based on automatic modules to work. This is
> not an endorsement of the concept, but a pragmatic take on what could
> be done now (other than delay JDK9 or remove the modulepath from
> JDK9).
>
> Some basic assertions:
> 1) Modules != Artifacts [1]
> 2) Module names should thus be aligned with code, not artifacts [2]
> 3) Automatic modules currently derive their name from the artifact,
> which is almost always wrong wrt #2
> 4) A module author is forced to choose the artifact name initially,
> and change it to the module name when released
> 5) Since multiple artifact names can represent the same module [1],
> there are guaranteed to be problems in large graphs
> 6) There is no way out when module hell hits
>
> To succeed with distributing modules depending on automatic modules,
> it seems to me that we need:
> 1) a naming strategy that is reliable for the person making the guess
> 2) an approach for JPMS to link the guessed name to an artifact
>
> I'd argue that the naming strategy can be relatively simple - the
> highest package name [2]. Feedback has been pretty much universal in
> agreeing to super-package reverse-DNS so far, thus the chances of the
> guess being right are definitely increased. While not perfect, it
> might just about be good enough.
>
> The second point, what does JPMS do, has yielded various options and
> much discussion [3]. If we accept the notion that we are using
> super-package reverse-DNS module names, then we can limit the options
> on the table to those that produce names of that type. This implies
> that the name of an automatic module is derived from the packages in
> the jar file.
>
> Since not every jar file has a single super-package, we need a four
> step strategy:
>
> 1) Use Module-Name in MANIFEST.MF if present. This allows low-tech
> projects to override the JPMS strategy. It would be particularly
> useful for old branches of active codebases, such as commons-lang v2
> or commons-collections v3 where they want to actively publish the
> module name rather than leave it implied.
>
> 2) For each non-modular jar on the modulepath, consider each
> super-package to be a separate module name. Since automatic modules
> are open and can see each other, it does not matter if two or more
> automatic modules are produced from the same jar file. This handles
> most jar files on Maven Central, including odd ones like Colt where
> there are multiple super-packages [4].
>
> 3) If a module name dependency is still not found, further examine the
> non-modular jar files. Consider each and every package name to be a
> potential module name (yes, every single package). This handles Colt
> where `cern.colt` would be the most sensible module name.
>
> 4) If a module name dependency is still not found, further examine the
> non-modular jar files. If a non-modular jar file has two or more
> packages with the same stem but no code at the level of that shared
> stem, treat the stem as a potential module name, provided it does not
> clash with any other module name. For example, a jar file that has
> `com.google.common.io` and `com.google.common.base` would have a
> shared stem of `com.google.common`. This handles Guava [5]. E.g., in
> Colt, `cern.jet` would be a module name by these rules.
>
> In other words, a non-modular jar does not have one module name, it
> has a number of possible module names, checked in order, for the
> purpose of matching the missing module names in the module graph.
>
> As always, the success of the approach to automatic modules will
> depend on the ability of module authors to guess the name correctly,
> but hopefully they will do OK, particularly if there are some
> published rules, or a website suggesting the best option for a given
> jar file.
>
>
> As I started with, the hoops necessary to get automatic modules to
> work indicate to me that they are not the right solution to the
> gradual migration problem. But if they are to stay, this is my take on
> what is needed.
>
> Stephen
>
>
> [1]  
> http://blog.joda.org/2017/04/java-se-9-jpms-modules-are-not-artifacts.html
> [2] http://blog.joda.org/2017/04/java-se-9-jpms-module-naming.html
> [3]  
> http://mail.openjdk.java.net/pipermail/jpms-spec-experts/2017-April/000666.html
> [4] https://dst.lbl.gov/ACSSoftware/colt/api/index.html
> [5] http://google.github.io/guava/releases/21.0/api/docs/
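
The four-step lookup proposed in the quoted message can be sketched
roughly as follows. This is one reading of the proposal, not an existing
JPMS feature: the class and method names are made up, a "super-package"
is taken to be a package with no ancestor package in the same jar, a
"stem" is taken to be the immediate parent package, and the clash check
against other module names in step 4 is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class CandidateModuleNames {

    /**
     * Candidate module names for one non-modular jar, in the order they would be checked.
     *
     * @param manifestModuleName value of a Module-Name manifest attribute, or null if absent
     * @param packages           names of the packages that contain code in the jar
     */
    static List<String> candidates(String manifestModuleName, Set<String> packages) {
        // Step 1: an explicit Module-Name overrides everything else.
        if (manifestModuleName != null) {
            return List.of(manifestModuleName);
        }
        Set<String> sorted = new TreeSet<>(packages);
        Set<String> names = new LinkedHashSet<>();

        // Step 2: every super-package, i.e. every package without an ancestor in this jar.
        for (String pkg : sorted) {
            if (!hasAncestorIn(pkg, packages)) {
                names.add(pkg);
            }
        }
        // Step 3: every single package.
        names.addAll(sorted);

        // Step 4: stems shared by two or more packages, with no code at the stem itself.
        Map<String, Integer> childCount = new TreeMap<>();
        for (String pkg : sorted) {
            int dot = pkg.lastIndexOf('.');
            if (dot > 0) {
                childCount.merge(pkg.substring(0, dot), 1, Integer::sum);
            }
        }
        childCount.forEach((stem, count) -> {
            if (count >= 2 && !packages.contains(stem)) {
                names.add(stem);
            }
        });
        return new ArrayList<>(names);
    }

    /** True if any parent package of pkg is itself a package of the jar. */
    private static boolean hasAncestorIn(String pkg, Set<String> packages) {
        for (int dot = pkg.lastIndexOf('.'); dot > 0; dot = pkg.lastIndexOf('.', dot - 1)) {
            if (packages.contains(pkg.substring(0, dot))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The two packages used as the example in step 4 of the quoted message.
        System.out.println(candidates(null, Set.of("com.google.common.io", "com.google.common.base")));
        // prints [com.google.common.base, com.google.common.io, com.google.common]
    }
}
```

A resolver would walk this list for each jar until one entry matches the
missing module name in the graph.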