Why package deps work (Was: Re: Converting plain JARs to Java modules)
Neil Bartlett
njbartlett at gmail.com
Wed Nov 16 15:03:03 PST 2011
Hi Brian,
On Wed, Nov 16, 2011 at 9:16 PM, Brian Pontarelli <brian at pontarelli.com> wrote:
> [snip]
> Let's use an example. Let's say you write a module to provide a user management system. It uses JPA only for persistence and does not use any hibernate or Eclipse link specific class, which BTW is highly unlikely. You write your module and then unit test it.
How is this unlikely? On the contrary, it's highly unlikely that it
*could* use any Hibernate or EclipseLink specific class. First, the
module would be compiled with visibility only of the abstract API,
i.e. javax.persistence, and then at runtime we employ the module
system to ensure that it cannot see any specific implementation
classes, even via reflection.
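To make the visibility point concrete, here is a minimal sketch (the provider class name is hypothetical, and it assumes no Hibernate JAR is on the class path) showing that code compiled against only the abstract API cannot reach a provider's internals, even reflectively:

```java
public class VisibilityDemo {

    // Returns true if the named class is visible to this class's loader.
    static boolean isVisible(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A provider-internal class (hypothetical name) is not resolvable...
        System.out.println(isVisible("org.hibernate.internal.SessionImpl"));
        // ...while types the module was compiled against are.
        System.out.println(isVisible("java.util.ArrayList"));
    }
}
```

A module system enforces the same separation at a coarser grain: the provider's classes are simply never wired into the consumer's loader.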
In fact if we expect our modules to be reusable -- so they can be used
not only in the project we are working on right now but in subsequent
projects over the next several years -- it's very likely that the
module will be used with multiple JPA providers. That's not to say we
switch JPA providers willy nilly, or without running a QA process over
the set of modules that we intend to put into production. We also don't
claim that we never find and fix bugs in our modules when used against
alternative providers.
> Although it would be really neat if you could ship your module to the world and not really care what implementation of JPA is used, this is unlikely. In fact, whatever JPA implementation you unit tested against is probably the only one that you can definitely say your module will run with. Therefore, you would really need to unit test your module against all variations of JPA and that again is unlikely.
The developer of the module should NOT care what implementation of JPA
they use, so long as it is an implementation that complies with the
contract. The deployer or assembler of an application is responsible
for finding and providing a compliant implementation. Testing of
course is done against a specific implementation, and does not
guarantee that other implementations can be used; they would have to
be tested by whoever wanted to use them. Nevertheless the work
required to fix issues arising from differences in interpretation of
contracts is much less than the porting effort required if the
original module developer had coded directly against an
implementation.
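This division of responsibility is visible in standard JPA configuration: the provider is named in the deployment descriptor, not in the module's code. A minimal sketch (unit name hypothetical; provider class as commonly documented for EclipseLink):

```xml
<persistence xmlns="http://java.sun.com/xml/ns/persistence" version="2.0">
  <persistence-unit name="user-management">
    <!-- Chosen by the deployer/assembler; the module's code never names it. -->
    <provider>org.eclipse.persistence.jpa.PersistenceProvider</provider>
  </persistence-unit>
</persistence>
```

Swapping providers is then a deployment-time edit plus a QA cycle, not a code change.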
It's important to stress that none of this is theoretical. Exactly
this scenario involving JPA, along with many other similar scenarios,
can be done and is done in real-world practice.
> In other cases, it might work though. For example SLF4J is probably a safe bet to code against the API and then be able to drop in any of the implementations.
This seems to contradict your previous statements. What is special
about SLF4J? If anything I would expect the JPA example to be "safer",
since the JPA API was designed very cautiously by a JCP Expert Group,
whereas SLF4J came from a single author in the open source world (I
intend no slight against the SLF4J author; I simply have no
information on the quality and stability of its API).
> I guess my point is that it isn't always as simple as you are implying.
I agree it is not always simple, but it is possible.
Regards
Neil
>
>
>>
>>> Again, are you talking about build time or runtime? Also, are you suggesting that Java determine your dependencies automatically including the version? Or is this merely to help developers find missing dependencies and remove unused dependencies?
>> If you compile class com.bar.Foo that refers to com.bar.Bar then the Foo.class and Bar.class file contain all their type dependencies and by implication their package dependencies. The version of the package can then for example be found from package-info.java.
>>
>> This technique is used extensively in bnd, an open source library used in Eclipse, IDEA, ant, maven, SBT, and others. It reads the class files in a JAR and generates the OSGi dependency information automatically.
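The mechanics Peter describes can be sketched without bnd: every class file's constant pool lists the classes it references, and the package dependencies fall out directly. A rough, simplified reader (stdlib only; it handles the constant-pool tags through InvokeDynamic and ignores versioning, which a real tool like bnd layers on top):

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Set;
import java.util.TreeSet;

public class PackageDeps {

    /** Scans one class file's constant pool and returns the packages of every
        class it references -- the raw data a tool like bnd aggregates
        across a whole JAR to generate import declarations. */
    public static Set<String> referencedPackages(InputStream in) throws IOException {
        DataInputStream d = new DataInputStream(in);
        if (d.readInt() != 0xCAFEBABE) throw new IOException("not a class file");
        d.readUnsignedShort(); // minor version
        d.readUnsignedShort(); // major version
        int count = d.readUnsignedShort();
        String[] utf8 = new String[count];
        int[] classEntries = new int[count];
        int nClasses = 0;
        for (int i = 1; i < count; i++) {
            int tag = d.readUnsignedByte();
            switch (tag) {
                case 1:  utf8[i] = d.readUTF(); break;                       // Utf8
                case 7:  classEntries[nClasses++] = d.readUnsignedShort(); break; // Class
                case 8: case 16: case 19: case 20:
                         d.readUnsignedShort(); break;                       // String, MethodType, Module, Package
                case 15: d.readUnsignedByte(); d.readUnsignedShort(); break; // MethodHandle
                case 3: case 4:
                         d.readInt(); break;                                 // Integer, Float
                case 5: case 6:
                         d.readLong(); i++; break;                           // Long, Double take two slots
                case 9: case 10: case 11: case 12: case 17: case 18:
                         d.readUnsignedShort(); d.readUnsignedShort(); break; // member refs, NameAndType, (Invoke)Dynamic
                default: throw new IOException("unknown constant pool tag " + tag);
            }
        }
        Set<String> packages = new TreeSet<String>();
        for (int j = 0; j < nClasses; j++) {
            String internalName = utf8[classEntries[j]];
            if (internalName.startsWith("[")) continue; // skip array types
            int slash = internalName.lastIndexOf('/');
            if (slash > 0) packages.add(internalName.substring(0, slash).replace('/', '.'));
        }
        return packages;
    }

    public static void main(String[] args) throws IOException {
        // Demonstration: analyze this very class's own bytecode.
        try (InputStream in = PackageDeps.class.getResourceAsStream("PackageDeps.class")) {
            System.out.println(referencedPackages(in));
        }
    }
}
```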
>
> This is true unless there is reflection, which is common. JEE is based entirely on reflecting the top level entry points (servlets, filters, etc). There are many cases of configuration files defining implementations and transitive dependencies. This makes using a bytecode analyzer only a partial solution.
>
> -bp
>
>
>
>
>>
>> Hope this clarifies. Kind regards,
>>
>> Peter Kriens
>>
>>
>>
>>>
>>> -bp
>>>
>>>
>>>
>>>
>>>
>>>
>>>>
>>>> Kind regards,
>>>>
>>>> Peter Kriens
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 15 nov. 2011, at 06:14, David M. Lloyd wrote:
>>>>
>>>>> On 11/14/2011 08:33 PM, Neil Bartlett wrote:
>>>>>> Hi David,
>>>>>>
>>>>>> More than happy to talk about reality, though Peter already was doing
>>>>>> so. Nothing he said was theoretical, it came directly from his
>>>>>> experiences and the experiences of developers who have been using OSGi
>>>>>> for around 12 years.
>>>>>
>>>>>> Before digging into the points in your message, I need to address the
>>>>>> subject line "Why package deps won't work". Package dependencies will
>>>>>> work and they do work, as is proven every day by the large number of
>>>>>> applications running on OSGi. To claim otherwise would be wilfully
>>>>>> ignorant, so I will assume you are being hyperbolic and really meant
>>>>>> to assert that module deps are just better than package deps.
>>>>>> Therefore I intend to argue on that basis and claim that package deps
>>>>>> are better than module deps; note that I don't claim that module
>>>>>> dependencies "won't work" because JBoss Modules is a proof point that
>>>>>> they do.
>>>>>
>>>>> There was an implied "... for Java SE 8 modules" after "won't work" (this is why I said "won't" rather than "don't") which apparently wasn't implied enough.
>>>>>
>>>>> Of course people use OSGi, or we wouldn't be arguing, but that alone isn't enough to make OSGi's dependency system preferable for the SE case (after all, far more people use plain old class paths). I believe that JBoss Modules is a very accurate representation of how an SE module system could function, right down to executing JARs on the command line with dependencies and modularizing certain JDK APIs. This is not a proof point, but strong evidence that it is an effective solution for the actual problem on the table, and that similar architectures are likely to succeed for the same problem domain. As to why package deps are not an example of an effective solution for this problem, I intend to illustrate in greater detail.
>>>>>
>>>>> Further responses and elaboration below.
>>>>>
>>>>>> On Mon, Nov 14, 2011 at 5:05 PM, David M. Lloyd<david.lloyd at redhat.com> wrote:
>>>>>>> The explanation is quite simple, really - each point can be pretty much
>>>>>>> wiped out by a heavy dose of reality.
>>>>>>>
>>>>>>> 1. "M2P leverages the Java type system unlike m2m that must introduce new
>>>>>>> namespaces outside the Java type system." - this is just fluffy
>>>>>>> buzzwordology. Considering packages part of the Java type system is a
>>>>>>> pretty liberal interpretation of the term "type system" as they're basically
>>>>>>> just arbitrary name spaces. That said, there's nothing particularly "new"
>>>>>>> about namespacing modules. JARs have names. Projects have names. Maven
>>>>>>> uses its own namespaces for artifacts. Even the primitive JDK extension
>>>>>>> mechanism uses symbolic names.
>>>>>>
>>>>>> From Wikipedia (http://en.wikipedia.org/wiki/Type_system): "A type
>>>>>> system associates a type with each computed value ... the aim is to
>>>>>> prevent operations expecting a certain kind of value being used with
>>>>>> values for which that operation does not make sense."
>>>>>>
>>>>>> The following will not compile:
>>>>>>
>>>>>> import java.awt.List;
>>>>>> // ...
>>>>>> List list = new ArrayList();
>>>>>> list.iterator(); // etc
>>>>>>
>>>>>> whereas the following will:
>>>>>>
>>>>>> import java.util.List;
>>>>>> // ...
>>>>>> List list = new ArrayList();
>>>>>> list.iterator(); // etc
>>>>>>
>>>>>> Package names are part of the type system because using an incorrect
>>>>>> package name in Java source can result in type errors during
>>>>>> compilation, and because the type and available operations associated
>>>>>> with each value varies with the package name used.
>>>>>>
>>>>>> In contrast, it is impossible to obtain a type error at compilation by
>>>>>> incorrectly naming a JAR, project, Maven artefact etc, because none of
>>>>>> those things are ever referenced in Java source code.
>>>>>
>>>>> I stand by my statement. A package is part of the name of the type; it is not a type, nor a meta-type, nor anything like a type in and of itself. This makes them part of the language, sure, but not part of the type system, in my opinion. Thus, "a stretch". I don't feel any particular need to have the world agree with me on this though, other than as I state below...
>>>>>
>>>>>>> Even a simple convention of the parent-most package name for the name of a
>>>>>>> module is very simple to grasp and is in fact the solution we've used quite
>>>>>>> effectively thus far. Thus implying that module names are some new alien
>>>>>>> concept is really not valid.
>>>>>>
>>>>>> The claim is not that they are a new alien concept. The claim is that
>>>>>> they are not part of the Java language and type system, and as a
>>>>>> result of this (along with other problems) are less useful for
>>>>>> depending upon than packages.
>>>>>
>>>>> I don't see how not being part of the Java language makes module identifiers "less useful". I think that this argument artificially elevates the significance of the relationship between packages and the actual type system, and then uses that elevation as a proof that a system based on other namespace principles is deficient, without offering any real examples which actually relate to the type system.
>>>>>
>>>>>>> 2. "M2P can be used to break the transitive dependency chain, m2m suffers from
>>>>>>> excessive coupling." - How about some facts to back this up? I've found
>>>>>>> "m2m" coupling to be just right. In JBoss Modules, we do not export
>>>>>>> transitive dependencies by default. This results in a very simple and clean
>>>>>>> dependency graph between modules. Using package dependencies results in a
>>>>>>> *far more complex* dependency graph, because you need edges for every
>>>>>>> package even though *most of the time you want the whole module anyway*.
>>>>>>> Again, real world here.
>>>>>>
>>>>>> Peter is definitely also talking about the real world and so am I. In
>>>>>> an m2m dependency model you cannot avoid transitive dependencies,
>>>>>> whether you expose them to consumers or not.
>>>>>
>>>>> In *any* model you cannot avoid transitive dependencies. If module A exports module B as part of its API, then you get A and B, and that's life. If you only want part of B, then you filter it down. But with most existing libraries out there (and they easily number in the thousands), they're intended to be used whole, so this doesn't really happen too much in practice.
>>>>>
>>>>>> An importer of a module must be assumed to depend on the whole functionality of that module
>>>>>> whether or not that is actually the case, and therefore all of the
>>>>>> transitive dependencies must be present at runtime. In an m2p world we
>>>>>> have the opportunity to split modules and break apart dependency
>>>>>> graphs, because the unit of coupling is more granular.
>>>>>
>>>>> You could break things down by actual classes if granularity is the goal. But it isn't; it is so much less important than, say, performance, conceptual simplicity, usability, efficiency, memory footprint, etc. The fact is that the vast majority of modules you'll find published at, say Maven central, are designed to be used completely, not partially, and can and are used as such without issue.
>>>>>
>>>>> People just don't usually design modules to be imported by package, except perhaps as an afterthought concession to OSGi, and they don't use them that way at runtime either. It is so seldom a real problem in the SE world (or EE for that matter) once you step away from flat class paths. The simple fact is that most users think of their dependencies in the same terms that their IDEs and their build scripts do: by module.
>>>>>
>>>>> This is truly a case of creating a lot of complexity and extra work in the common case to avoid problems which are not a common case. I think it is much better to optimize for the common case, where people know exactly what dependencies they want (in terms of artifacts), and they just want to declare them and be done without (a) hunting down package lists or (b) running extra tooling.
>>>>>
>>>>>>> If we're just going to throw dogma around, I'll put it the other way: m2p is
>>>>>>> a design error by the OSGi spec designers which has since been embraced as a
>>>>>>> religion.
>>>>>>
>>>>>> Pejorative accusations about "dogma" or "religion" have no place in a
>>>>>> technical discussion. They are also untrue. All of the practising OSGi
>>>>>> developers I know have arrived at their support for package
>>>>>> dependencies as a result of real world experience, not because of some
>>>>>> willingness to bow down to the Almighty CPEG. I don't know of any
>>>>>> practising OSGi developer who has used both Require-Bundle (m2m) and
>>>>>> Import-Package (m2p) and actually prefers the former.
>>>>>
>>>>> Sure, but Require-Bundle is *not* the same thing as using module-to-module dependencies in a non-OSGi system. All OSGi life revolves around the resolver. In the SE world, you cannot have a resolver without introducing a lot of machinery which will definitely negatively impact boot performance in the case that the resolver runs at module load time, or add the complexity of centralized package index management in the case that the resolver would run at install time (as in, install into a module library/repository, not install in the OSGi sense).
>>>>>
>>>>> If in a typical installation a user has to do more than simply delete a file to remove a module, or drop a JAR into a directory to add a module, I believe we've run quite far off-track usability-wise. And if a user has to sit through a lengthy resolution process at runtime (even if only the first time) then we're off-track performance-wise, especially if such resolution must expand to include modules in the module repository which are not even loaded.
>>>>>
>>>>>> I do know several who started as strong supporters of Require-Bundle
>>>>>> and switched to being supporters of Import-Package, not because of
>>>>>> Peter's wrathful pontificating but because they encountered the
>>>>>> specific problems that he described and found that they were fixed by
>>>>>> using Import-Package, and indeed that everything worked so much more
>>>>>> cleanly that way. I'm in this camp myself, and so was Jeff McAffer
>>>>>> before he went over to the Dark Side (Microsoft).
>>>>>
>>>>> There are other, simpler ways to fix package conflict issues when they arise. Like simply excluding the offending package from the import.
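For what it's worth, in JBoss Modules this kind of exclusion is expressed as an import filter on the dependency. A sketch loosely following its module.xml descriptor (module names hypothetical; schema version from the 1.x line):

```xml
<module xmlns="urn:jboss:module:1.1" name="com.example.app">
  <dependencies>
    <module name="org.conflicting.lib">
      <imports>
        <!-- Hide just the offending package; keep the rest of the module. -->
        <exclude path="org/conflicting/lib/internal"/>
      </imports>
    </module>
  </dependencies>
</module>
```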
>>>>>
>>>>> And as I've said before, I do not believe that Require-Bundle in the context of OSGi bundles is comparable to loading modules by name in the context of a module system such as Jigsaw or JBoss Modules and the types of modules that could be loaded thereby (the aforementioned thousands of existing modules).
>>>>>
>>>>>>> It offers no significant benefit, other than a couple of edge
>>>>>>> cases which are frankly just as well handled by m2m simply by adding package
>>>>>>> filters. Which, by the way, I haven't seen the need to do yet in our 200+
>>>>>>> module environment, but which we do have the capability to do.
>>>>>>
>>>>>> Substitution, module refactoring and transitive decoupling are hardly
>>>>>> edge cases.
>>>>>
>>>>> Substitution isn't solely addressed by package dependencies. Nor is module refactoring. And additional transitive decoupling isn't something people actually want - as I said, the common case is that users want to be able to use any class from a module once they import that module. In the real world, it is very rare to only want to import part of a module! If you need this routinely, you've done something very unorthodox. Thus I stand by my statement.
>>>>>
>>>>>> However I can see how these issues might not yet have come
>>>>>> to the fore in a module system designed for a single product with a
>>>>>> small number of modules, and where that product has not yet been
>>>>>> through the mill of multiple version evolutions.
>>>>>
>>>>> Cute. I assure you that we have been around through a long, long history of a huge number of Java users doing everything under the sun to their class paths and class loading structures. We also have experience with operating system distribution and managing module distribution across a wide range of other programming languages, not just Java. Our module system has already stood up to reasonably large deployments (200+ modules) with excellent memory footprint and performance numbers and we've been processing usability feedback, including feedback from those with OSGi experience, which has been mostly positive.
>>>>>
>>>>> Furthermore when I was at JavaOne giving booth talks about JBoss Modules and AS7, I found that users expressed quite a bit of frustration at the OSGi model and were quite receptive to a more orthodox m2m system, for what that's worth.
>>>>>
>>>>>>> M2P is a solution just itching for a problem.
>>>>>>
>>>>>> This is also not a very useful statement, I assure you that the
>>>>>> problem came before the solution. Better to show why you think the
>>>>>> problem is invalid or should have been solved differently.
>>>>>
>>>>> There is no way you can convince me that this solution was invented with general modularity in mind. Back when OSGi was first formed, it was designed for the embedded market. The world of Java today is completely different. People just don't bundle their libraries by package, as was possibly expected in the early days of Java. If they did we wouldn't be having this discussion because packages and modules would already be one and the same.
>>>>>
>>>>> OSGi evolved to where it is today. It was not designed from the ground up with actual requirements which pertained to the problem at hand which is modularity of the SE platform and applications which run on it. So yeah I'm going to say, solution before problem. Taking a solution and trying to apply it retroactively to a different problem like this has never failed to bite me in the ass, personally. But I can't speak to everyone's experience.
>>>>>
>>>>>>> But you're going to have a
>>>>>>> tough time convincing me that users *want* to have to use special tooling
>>>>>>> because they want to depend on a module which has too many packages to list
>>>>>>> out by hand. And before you cry "wildcards", be sure to consider that some
>>>>>>> modules use package names which are subordinate to other modules' packages,
>>>>>>> which is a perfectly normal and allowable scenario. Using wildcards for
>>>>>>> package matching could cause amazing levels of havoc.
>>>>>>
>>>>>> OSGi never uses wildcards at runtime, and tools such as bnd do not
>>>>>> need wildcards in order to express package-level dependencies. They
>>>>>> extract the set of packages that were actually used by the code, all
>>>>>> of which are available in the class files. This is possible because
>>>>>> packages are part of the type system (see first point above).
>>>>>>
>>>>>> So it's not that I don't want to list dependencies by hand, rather I
>>>>>> only want to do it once. I am already forced to do it in the import
>>>>>> statements of my Java sources. If I had to repeat that dependency
>>>>>> information -- whether in m2p or m2m form -- then I would run the risk
>>>>>> of it getting out of step with the real dependencies.
>>>>>
>>>>> One of the critical logical gaps here which m2p ignores at a fundamental level is that *packages are not unique*. In the real world, packages are repeated across more than one module all the time, and not just due to versioning. There is *no way* you could take every JAR at Maven Central and plug it in to an m2p system. You'd have to break apart and repackage every single one, at least, and at worst you'd have to rewrite an awful lot of them under different package names to account for this restriction. You'd need to enforce package uniqueness across your whole library, for all time. I don't think this is a sane option. You're adding significant overhead to management, installation, and execution, for what? So you can have the privilege of needing a special tool to extract your dependencies for you?
>>>>>
>>>>>>> Using package dependencies means you either must have a master package index
>>>>>>> for linking, or you need a resolver which has to have analyzed every module
>>>>>>> you ever plan to load. Otherwise, O(1) loading of modules is impossible,
>>>>>>> which is absolutely 100% a deal-breaker for JDK modules which must be
>>>>>>> incrementally installable. And it forbids having packages with the same
>>>>>>> name in more than one JAR without bringing run-time versioning into the
>>>>>>> fold, which is a terrible, terrible can of worms.
>>>>>>
>>>>>> Could you please explain why O(1) is the only acceptable complexity
>>>>>> for installing modules.
>>>>>
>>>>> Because this is Java SE we're talking about. You're going to have a potentially huge number of modules installed in your system. Even if you get around the versioning issues somehow and fully externalize the index of all packages, you still have to load or traverse the centralized index for every module load.
>>>>>
>>>>> I say that not only is Θ(1) the only acceptable complexity for installing modules but also loading them at run time. Linking modules at run time should be no worse than O(n) complexity for the number of dependencies the module has, including transitive dependencies. Loading classes and resources from modules should be Θ(k) where k is the number of modules which contain a package or directory which matches the class or resource being loaded (almost always one, in my experience).
>>>>>
>>>>> Loading a module should normally be limited to a one-time O(1) disk access without searching or loading any other files or modules than the module artifact itself (and possibly its descriptor if they are held externally by a particular module repository implementation, which is useful but not necessary). Linking a module should be limited in disk access to loading only the modules it imports via direct dependency, or by the relatively rare partial or full re-export of a transitive dependency by a direct dependency.
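The loading discipline described above can be sketched in a few lines: when a module's location is a pure function of its name, lookup touches no index and no other module. The repository layout here is hypothetical, loosely modeled on a name-keyed directory tree:

```java
import java.io.File;

public class ModuleRepo {

    private final File root;

    public ModuleRepo(File root) {
        this.root = root;
    }

    /** O(1) lookup: the artifact's path is computed from the module name
        alone, so no central index is consulted and no other module's
        metadata is read or locked. */
    public File locate(String moduleName) {
        String dir = moduleName.replace('.', File.separatorChar);
        return new File(root, dir + File.separator + "main" + File.separator + "module.jar");
    }
}
```

Adding or removing a module is then just adding or deleting one file; package-based resolution, by contrast, needs some structure that maps packages to modules before any wiring can happen.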
>>>>>
>>>>> In particular, the expectation of accessing a central index has some potentially serious implications, facing either possible file lock contention by multiple threads or memory overhead of loading in the complete index in advance. It should be expected that modules are loaded concurrently, and such concurrency should not be hindered any more than necessary. Other possible implementations (such as file- or directory-per-package) have their own drawbacks as well or violate what I consider to be core requirements.
>>>>>
>>>>> If there are implementations of package-based resolution which don't involve either a centralized index of one or many files or a resolver which must rove all installed modules in advance, I'd like to hear about it.
>>>>>
>>>>>> OSGi does indeed support incremental install
>>>>>> and while I accept it is probably not O(1) for each module, it would
>>>>>> likely be no more than O(N), though I haven't done the maths yet to
>>>>>> prove this. Bear in mind that in practice, for small N, the constant
>>>>>> factors can result in O(1) being more expensive than O(N). I have seen
>>>>>> OSGi used with many thousands of modules, so unless you have some data
>>>>>> and a use-case showing package-level resolution as unacceptably slow,
>>>>>> your concern just sounds like premature optimisation.
>>>>>
>>>>> Identifying algorithmic complexity during design phase is *never* premature optimization AFAICT. If you don't understand the complexity of the algorithms you're about to implement, and how they are expected to be applied to the problem at hand, then you're not ready for implementation yet.
>>>>>
>>>>> However, postulating that O(n) is okay because it can sometimes be faster than O(1) without measuring it for the specific problem in question is definitely premature de-optimization. :-)
>>>>>
>>>>>> There are two reasons to have packages with the same name in more
>>>>>> than one JAR. The first is a situation called split packages, and it
>>>>>> is highly undesirable because it causes the runtime model of the
>>>>>> package to diverge from the compile-time model, and therefore things
>>>>>> like package-private types and members stop working correctly. For
>>>>>> this reason, OSGi's m2p imports support depending only upon a single
>>>>>> exporter of a particular package, i.e. we do not aggregate all exports
>>>>>> of that package.
>>>>>>
>>>>>> Unfortunately split packages are sometimes unavoidable in legacy code
>>>>>> that cannot be refactored, e.g. the JDK. To support such scenarios
>>>>>> OSGi has Require-Bundle, i.e. m2m. This does not negate the problems
>>>>>> associated with m2m, it is simply a trade-off that we face with poorly
>>>>>> factored legacy code.
>>>>>>
>>>>>> The second reason for multiple packages with the same name is when you
>>>>>> explicitly want to install multiple versions of a library/API and have
>>>>>> them all available within the same runtime. I wouldn't call this a
>>>>>> "can of worms" exactly because it can be done without too much
>>>>>> trouble, though for the sake of a simple life I personally avoid this
>>>>>> situation unless it's necessary.
>>>>>
>>>>> Yes, and it is actually fairly common in practice to want two versions of something in the sense of the two versions being wholly different implementations (think apache commons logging versus jcl-over-slf4j for a trivial example which crops up a lot). There are reasons to use one or the other. Multiplicity of versions (in this respect) is something the module system has to handle gracefully and simply.
>>>>>
>>>>>>> Finally it should be perfectly clear to anyone who has read the original
>>>>>>> requirements document that nothing in this module system should prevent OSGi
>>>>>>> from functioning as it is, so there is absolutely no reason to assume that
>>>>>>> any OSGi implementation is so threatened - especially if m2p linking is as
>>>>>>> superior as has been expressed. Our module system (which is conceptually
>>>>>>> similar to Jigsaw in many regards) in fact does support our OSGi
>>>>>>> implementation quite effectively without itself implementing OSGi's
>>>>>>> package-to-package resolution (which like I said throws O(1) out the
>>>>>>> window).
>>>>>>
>>>>>> I agree that Jigsaw's existence doesn't threaten OSGi's, so long as
>>>>>> Java 8 doesn't actually break OSGi (and if it did so, it would
>>>>>> likewise break many other applications and could not be considered
>>>>>> backwards compatible with Java 7). The two can interoperate through
>>>>>> m2m-type dependencies. Tim Ellison started Project Penrose for the
>>>>>> purpose of investigating, testing and deepening this collaboration.
>>>>>>
>>>>>> Neverthless, the point that I believe Glyn was making is the
>>>>>> following. We accept that m2m dependencies are probably required for
>>>>>> the JDK, which implies a module system like Jigsaw or
>>>>>> OSGi/Require-Bundle rather than OSGi/Import-Package. However is it
>>>>>> intended to be used for application modularisation as well? This is of
>>>>>> course a question for the Jigsaw team rather than you, David.
>>>>>
>>>>> The question of whether Java SE 8 modules are intended to be used for application modularization is a question for the EG, not the Jigsaw team. The job of the Jigsaw team is really to implement a prototype which meets the requirements set forth by the EG, which may become the reference implementation at a future point.
>>>>>
>>>>> The question of whether the greater Java community will embrace the SE module system for applications is to be answered by the community only. I personally believe that to develop a module system which is not intended to be usable by the average standalone application is foolhardy, a great waste of effort, and is doomed to mediocrity, given the success we have had with such a system.
>>>>>
>>>>> However until there is an EG, these are all just as much questions for me as for anybody, and I can and will answer to the best of my knowledge and belief.
>>>>>
>>>>>> As a result of experience in developing and evolving large real-world
>>>>>> applications using a module system that supports BOTH m2m and m2p
>>>>>> dependencies, I believe it would be very unfortunate if a module
>>>>>> system that supports ONLY m2m were to become widely used in the
>>>>>> application space... not because OSGi can't handle the competition,
>>>>>> but because those applications will be fragile and hard to evolve.
>>>>>
>>>>> I don't believe this to be the case. I think that the status quo doesn't result in particularly fragile applications, and such can be easy to evolve or hard depending on the quality of the application components and frameworks involved. I think that m2m dependencies enhance the status quo such that applications are somewhat less fragile (owing chiefly to the simple measure of preventing transitive dependencies from being exported by default), and quite easy to evolve as well (even large and complex frameworks rarely have more than a few dependencies, unless they themselves are overly fragmented (e.g. CXF as a nightmare example which yet still works fine under an m2m system)).
>>>>>
>>>>>> My question for you David is as follows. I understand that you prefer
>>>>>> module dependencies, but do you believe that package dependencies have
>>>>>> no value whatsoever and therefore should not be available to
>>>>>> application developers in the Java 8 module system? If so, why did Red
>>>>>> Hat create an OSGi implementation?
>>>>>
>>>>> I personally believe that they have some value, but that value is limited to interoperability with OSGi. I do not believe that it is a preferable model for most users or standalone applications, nor for larger applications such as our application server. I think that the requirement for extra tooling and the increased run-time complexity which is exposed to users nullifies the benefits. If someone really wants this variety of modularity, they should simply use OSGi.
>>>>>
>>>>> Red Hat created an OSGi implementation for many reasons, but the only significant one to me is that there are people who want to use their OSGi applications with our application server. I don't believe any other reason is even necessary. It's the same reason we strive to support any spec, from EJB 1.0 to Java EE 6 and beyond. Someone wants the functionality, so we deliver it to the best of our ability.
>>>>>
>>>>>> Kind regards
>>>>>> Neil
>>>>>>
>>>>>>>
>>>>>>> On 11/14/2011 01:49 AM, Glyn Normington wrote:
>>>>>>>>
>>>>>>>> I look forward to David's elaboration of why he thinks "using packages as
>>>>>>>> a dependency unit is a terrible idea" to balance Peter's clear explanation
>>>>>>>> of the benefits of m2p.
>>>>>>>>
>>>>>>>> Meanwhile, it's worth noting that, according to the requirements document,
>>>>>>>> Jigsaw is aimed at platform modularisation and the platform being
>>>>>>>> modularised has some non-optimal division of types across packages (see the
>>>>>>>> package subsets requirement) which favour m2m dependencies. (Note that
>>>>>>>> Apache Harmony was developed with modularity in mind and was able to exploit
>>>>>>>> m2p, so platform modularisation per se needn't be limited to m2m.)
>>>>>>>>
>>>>>>>> So if Jigsaw excludes m2p, it will then be applicable to certain kinds of
>>>>>>>> legacy code modularisation and less applicable to new module development and
>>>>>>>> modularisation of existing code whose division into packages suits m2p. IIRC
>>>>>>>> this was the original positioning of Jigsaw: for use primarily within the
>>>>>>>> OpenJDK codebase and only exposed for application use because it was too
>>>>>>>> inconvenient to hide it.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Glyn
>>>>>>>>
>>>>>>>> On 12 Nov 2011, at 11:59, Peter Kriens wrote:
>>>>>>>>
>>>>>>>>> Neither my wrath, nor the fact that I rarely if ever get angry, is
>>>>>>>>> relevant in this discussion ... This is a technical argument that is
>>>>>>>>> solvable by technical people who share the same goals. I prefer package
>>>>>>>>> dependencies because they address the excessive type coupling problem in
>>>>>>>>> object-oriented systems, not because they're part of OSGi. Let me argue my
>>>>>>>>> case.
>>>>>>>>>
>>>>>>>>> Module-to-package dependencies (m2p) are preferable over module-to-module
>>>>>>>>> dependencies (m2m) for many reasons but these are the most important
>>>>>>>>> reasons:
>>>>>>>>>
>>>>>>>>> M2P leverages the Java type system, unlike m2m, which must introduce new
>>>>>>>>> namespaces outside the Java type system.
>>>>>>>>> M2P can be used to break the transitive dependency chain; m2m suffers from
>>>>>>>>> excessive coupling.
>>>>>>>>>
>>>>>>>>> Since the first bullet's benefit should be clear, I argue only the more
>>>>>>>>> complex second bullet.
>>>>>>>>>
>>>>>>>>> A module is in many regards like a class. A class encapsulates members,
>>>>>>>>> depends on other members/classes, and makes a few members accessible outside
>>>>>>>>> the class. A module has a similar structure, but with types/packages as its
>>>>>>>>> members.
>>>>>>>>>
>>>>>>>>> After the initial success of Object-Oriented Programming (OO), it quickly
>>>>>>>>> became clear that reuse did not take place at the expected scale due to
>>>>>>>>> excessive type coupling. The problem was that a class aggregated many
>>>>>>>>> dependencies to simplify its implementation, but these dependencies were
>>>>>>>>> unrelated to the contract it implemented. Since class dependencies are
>>>>>>>>> transitive, most applications disappointingly became an almost fully
>>>>>>>>> connected graph.
>>>>>>>>>
>>>>>>>>> Java's great innovation was the interface, because it broke both the
>>>>>>>>> transitivity and aggregation of dependencies. A class could now express its
>>>>>>>>> dependency (use or implement) on a contract (the interface) and was
>>>>>>>>> therefore fully type-decoupled from the opposite side.
>>>>>>>>>
>>>>>>>>> An interface can act as a contract because it names the signatures of a
>>>>>>>>> set of methods so that the compiler can verify both the client and the
>>>>>>>>> implementer.
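[As an editorial illustration of the argument above, the decoupling an interface provides can be sketched in a few lines of Java; the type names here are invented, not taken from the thread:]

```java
// A contract: clients compile against this interface only.
interface Store {
    int save(String record); // returns the record count after saving
}

// One implementation; clients never need to reference this type by name.
class MemoryStore implements Store {
    private int count = 0;
    public int save(String record) {
        return ++count;
    }
}

// The client is fully type-decoupled from MemoryStore: it depends on the
// contract, so swapping in another Store implementation does not ripple
// through its dependency graph.
class Client {
    private final Store store;
    Client(Store store) { this.store = store; }
    int saveTwo() {
        store.save("first");
        return store.save("second"); // count is now 2
    }
}

public class ContractDemo {
    public static void main(String[] args) {
        Client client = new Client(new MemoryStore());
        System.out.println(client.saveTwo()); // prints 2
    }
}
```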
>>>>>>>>>
>>>>>>>>> Since a module has a very similar structure to a class, it suffers from
>>>>>>>>> exactly the same transitive aggregation of dependencies. This is not a
>>>>>>>>> theory; look at the experiences with Maven
>>>>>>>>> (http://www.sonatype.com/people/2011/04/how-not-to-download-the-internet/).
>>>>>>>>> Again, this is not because Maven is bad or developers are stupid; it is the
>>>>>>>>> same underlying force that finally resulted in the Java interface.
>>>>>>>>>
>>>>>>>>> The module-level parallel to the class's interface is a named set of
>>>>>>>>> interfaces. This concept already exists in Java: the package. Looking at
>>>>>>>>> almost all JSRs, it is clear that our industry already uses packages as
>>>>>>>>> "interfaces" to provider implementations.
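[As an editorial illustration, in OSGi manifest terms the two dependency styles under discussion look like this; the bundle name `org.hibernate.core` is purely illustrative:]

```
Import-Package: javax.persistence;version="[1.0,2.0)"
Require-Bundle: org.hibernate.core
```

The first header (m2p) binds the consumer only to the javax.persistence contract, so any compliant provider can satisfy it; the second (m2m) names one concrete provider module and thereby couples the consumer to it, transitively pulling in that module's own dependencies.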
>>>>>>>>>
>>>>>>>>> Therefore, just as a class should not depend on other implementation
>>>>>>>>> types, a module should preferably not depend on other modules. A module
>>>>>>>>> should instead depend on contracts. Since modules will be used to provide
>>>>>>>>> components from different sources, managed with different life cycles, the
>>>>>>>>> excessive type coupling caused by m2m is even more damaging than it is
>>>>>>>>> class-to-class (c2c). Proper use of m2p creates significantly less type
>>>>>>>>> coupled systems than m2m; the benefits should be obvious.
>>>>>>>>>
>>>>>>>>> Since there are use cases for m2m (non-type-safe languages, for example), I
>>>>>>>>> do believe that Jigsaw should still support m2m. However, it would be
>>>>>>>>> greatly beneficial to our industry if we could take advantage of the lessons
>>>>>>>>> learned with the Java interface and realize how surprisingly important the
>>>>>>>>> Java package actually is in our ecosystem.
>>>>>>>>>
>>>>>>>>> Kind regards,
>>>>>>>>>
>>>>>>>>> Peter Kriens
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 9 nov. 2011, at 15:04, David M. Lloyd wrote:
>>>>>>>>>
>>>>>>>>>> I'll just state now that using packages as a dependency unit is a
>>>>>>>>>> terrible idea, and not some architectural revelation. That way, Peter's
>>>>>>>>>> wrath will be largely directed at me. :-)
>>>>>>>>>>
>>>>>>>>>> On 11/09/2011 08:02 AM, Peter Kriens wrote:
>>>>>>>>>>>
>>>>>>>>>>> I agree that tools are needed, but we must be careful not to expect
>>>>>>>>>>> tools to stopgap an architectural issue. I think it is important to first do
>>>>>>>>>>> good architectural design leveraging existing tools (e.g. the Java type
>>>>>>>>>>> system) before you try to add new tools. It is such a pity (but all too
>>>>>>>>>>> common) that a design allows for classes of errors that would be impossible
>>>>>>>>>>> with a slightly different design.
>>>>>>>>>>>
>>>>>>>>>>> Kind regards,
>>>>>>>>>>>
>>>>>>>>>>> Peter Kriens
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 9 nov. 2011, at 14:49, Alan Bateman wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On 09/11/2011 13:04, Peter Kriens wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> The issue is that Maven's problems are not caused by Maven being bad
>>>>>>>>>>>>> or by POM authors being stupid. The reason is that the module-to-module
>>>>>>>>>>>>> dependency architecture in Maven (and Jigsaw) is error prone ...
>>>>>>>>>>>>
>>>>>>>>>>>> This thread started out with someone asking about adding module
>>>>>>>>>>>> declarations to existing JAR files, and in that context, I agree it can be
>>>>>>>>>>>> error prone without good tools. I think things should be a lot better when
>>>>>>>>>>>> modules are compiled.
>>>>>>>>>>>>
>>>>>>>>>>>> -Alan.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> - DML
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> - DML
>>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> - DML
>>>>
>>>
>>
>
>
More information about the jigsaw-dev
mailing list