NP-Completeness Re: How to drain your laptop battery using Jigsaw
Peter Kriens
peter.kriens at aqute.biz
Fri Dec 23 00:35:01 PST 2011
This issue has been discussed many times inside the OSGi. I just like to point out some experiences.
First, Richard S. Hall's resolver in Felix has shown that even large systems resolve in a very short time and using very few electrons on the way. And he resolves with a completely generic require/capabilities model including the so called uses constraints necessary to support side-by-side versioning. If google maps can find a route between Montpellier and Stockholm in less than a second then I fail to see why we could not find a route in the much smaller space of software dependencies.
Second, it is necessary to make a distinction between deployment time and build time. During building flexibility and allowing me to specify as little as possible for rapid turnaround. However, after an artifact moves to testing and production it becomes necessary to pin down the configuration of the test artifact to ensure repeatability. For example, one compiles against API only JARs that are properly (package) versioned so that my code remains clean and cannot accidentally pickup unwanted dependencies. If I then test my program I need to combine it with a concrete implementation. When the application goes to production I need to ensure that what was tested is actually moved in production, even further narrowing it down. This model can largely be automated with OSGi.
Third, semantic versioning allows us to specify when we break backward compatibility and that is good. However, what the semver people do not seem to get is that API based programming wreaks havoc with this model. With an API you have a consumer of an API and a provider of that API (i.e. JPA, with EclipseLink as provider and me as consumer). Both depend on JPA but virtually any change in the JPA packages breaks EclipseLink while allowing me to happily use a newer version of the JPA API. In discussions I find that few people get the implications of APIs based programming and semantic versioning.
Kind regards,
Peter Kriens
On 22 dec. 2011, at 20:50, Jaroslav Tulach wrote:
>> ## 21. 12. 2011 23:23:44 ##<
>> My colleague Jarda Tulach had investigated module systems supporting
>> version ranges, and wrote up a formal proof [1] that picking one version
>> of each of several modules in a repository such that all dependencies are
>> satisfied is in general NP-complete. I decided to verify that Jigsaw
>> suffers from this problem.
>>
>> The attached program, run against a recent Jigsaw build (8c6429c279e7),
>> starts consuming many seconds of CPU time per round (effectively hanging)
>> somewhere around 9 variables (depending on the random seed), mostly
>> running in Resolver.resolve according to a profiler. The randomized nature
>> of the test and the clause count is designed to stress the system, of
>> course.
>
> First of all, congrats Jesse! This is a perfect real world demonstration of my
> theoretical work. I guess for many people it is more important to hear about
> "draining your batteries", than about facing an NP-complete problem.
>
> Anyway, I am glad you brought this problem to the jigsaw-dev mailing list. It
> is in my opinion interesting topic and if there is a way to avoid pitfalls of
> range dependencies, I'll be glad to help with that.
>
>> Jarda please step in if I misrepresented any of these options.
>
> I've been thinking about these issues for last decade while working on
> NetBeans and its module system (helping you, Jesse). The NP-complete problem
> exists only when we allow implicit re-export. That means:
>
> - using a library internally, without exporting it (and under the assumption
> that multiple versions of a library can co-exist, if not re-exported[4]) is
> safe.
>
> That is why the NP-complete problem is caused by re-export and actually only
> re-export that is done on implicit base. I think I have a proof[2] that shows
> that if an explicit "requires" of all transitive dependencies of a library are
> enumerated (at compile time), it is polynomial problem to resolve them during
> runtime.
>
> Actually this is my favorite solution. I am not sure if it works with range
> dependencies, but clearly it works with semantic versioning[5] (different
> major versiong means incompatible version) as my proof shows[2].
>
> Now few comments to other solutions outlined by Jesse:
>> 1. Drop the requirement that module dependencies may include version ranges
>> or otherwise permit incompatible changes to be marked, so that a
>> dependency is always on some version or later. This seems unsatisfactory.
>
> There is clearly a need for an incompatible version of a module. On the other
> hand, why not rename the name of the module when producing an incompatible
> version. So there would be:
>
> org.w3c.dom (in few revisions like 1.0, 1.1, 1.2, etc.)
>
> and then an incompatible change as
>
> org.w3c.dom2 (in few revisions)
>
> and later also
>
> org.w3c.dom3 (with many revisions).
>
> I am not saying I like changing the name of a module with every incompatible
> version, but it is not as unsatisfactory is it might seem. The biggest problem
> in this issue is to specify a dependency on either org.w3c.dom2 or
> org.w3c.dom3, if that is ever required.
>
>> 2. Drop the requirement that the module system pick a version of each
>> module to load such that all dependencies are satisfied. For example: just
>> try to load the newest available version of each module, and if any
>> dependencies are in fact unsatisfied, report an error and ask the user to
>> override one or more module versions manually.
>
> This is what we do in NetBeans and it works for a single packaged application.
> However I don't think this will work for Java module system where a single
> repository may contain modules for multiple applications. If you accept #2, it
> may happen that after installing new application (and its modules), some
> previously installed application can no longer be executed. That is not
> something that would boost the creation of Java module ecosystem.
>
>> 3. Rewrite Resolver using something like SAT4J, or even leave it as it is,
>> and just document that pathological cases can result in unbounded
>> execution time.
>
> E.g. use brute force. Well, the rationalistic approach commonly worshiped on
> my side of the Ocean is completely against approach like this. If we knew that
> range dependencies with implicit re-export are the problem, we would not allow
> them and think about better solution. Less powerful but computable[3] in
> polynomial time.
>
> However I can imagine a pragmatic design (commonly used on the other side of
> the Ocean) is able to promise something unrealistic and then empower brute
> force tools like SAT4J (as Equinox does) to deal with "pathological" cases.
>
>> 4. Require that a new release of any module which uses an incompatible
>> version of some dependency (even if used only internally) is itself marked
>> incompatible. This seems unsatisfactory since it breaks encapsulation - I
>> should not be forced to label my new widely used graphics toolkit 2.0
>> rather than 1.1 merely because I started to use a refactored version of
>> some math library after 1.0 was released.
>
> Remember, this is only about re-exports! If I re-export (by returning it from
> a signature of my method) from my graphics toolkit java.awt.Rectangle and
> somebody removes it from JDK, there is no way around: just to produce an
> incompatible version of my graphical toolkit.
>
>> Also it is unclear how you would
>> enforce this constraint so long as Jigsaw assigns no semantics to version
>> numbers except that they are totally ordered.
>
> Jesse recently pointed out in our other conversation that there is a semantic
> versioning[5] manifesto. I believe this is much better option than having no
> semantic and fixing that by allowing range dependencies.
>
>> 5. Permit multiple versions of a module to be loaded so long as class
>> spaces remain consistent
>
> This is almost a must. Unless two versions of a class get re-exported for the
> same loader, this should be allowed (see [4]).
>
> Summary. Those are the options me and Jesse see. However the primary question
> is whether Jigsaw wants to avoid NP-completeness or just play on the pragmatic
> feelings and pretend the NP-complete problem is just a corner case. I would be
> willing to invest my knowledge into eliminating the NP-completeness. I believe
> there are reasonable ways to avoid it. And last but not least: if there should
> be a module system that avoids NP-complete problems when resolving
> dependencies, it would be an ultimate reason which rules out OSGi.
>
> -jt
>
>> [1] http://wiki.apidesign.org/wiki/LibraryReExportIsNPComplete
>
> [2] The proof is here
> http://wiki.apidesign.org/index.php?title=LibraryWithoutImplicitExportIsPolynomial&useskin=monobook
> btw. it is better to switch to "Black/White" background when reading math
> statements on my website.
>
> [3] http://wiki.apidesign.org/wiki/RangeDependencies
>
> [4] Having two modules with same namespace is OK, if the module classes are
> called, but what if the module exports a service? Is not that going to cause
> problems?
>
> [5] http://semver.org/
More information about the jigsaw-dev
mailing list