NP-Completeness Re: How to drain your laptop battery using Jigsaw

Thu Dec 22 11:50:13 PST 2011

>## 21. 12. 2011 23:23:44 ##<
> My colleague Jarda Tulach had investigated module systems supporting
> version ranges, and wrote up a formal proof [1] that picking one version
> of each of several modules in a repository such that all dependencies are
> satisfied is in general NP-complete. I decided to verify that Jigsaw
> suffers from this problem.
> 
> The attached program, run against a recent Jigsaw build (8c6429c279e7),
> starts consuming many seconds of CPU time per round (effectively hanging)
> somewhere around 9 variables (depending on the random seed), mostly
> running in Resolver.resolve according to a profiler. The randomized nature
> of the test and the clause count is designed to stress the system, of
> course.

First of all, congrats Jesse! This is a perfect real world demonstration of my 
theoretical work. I guess for many people it is more important to hear about 
"draining your batteries", than about facing an NP-complete problem.

Anyway, I am glad you brought this problem to the jigsaw-dev mailing list. It 
is in my opinion interesting topic and if there is a way to avoid pitfalls of 
range dependencies, I'll be glad to help with that.

> Jarda please step in if I misrepresented any of these options.

I've been thinking about these issues for last decade while working on 
NetBeans and its module system (helping you, Jesse). The NP-complete problem 
exists only when we allow implicit re-export. That means:

- using a library internally, without exporting it (and under the assumption 
that multiple versions of a library can co-exist, if not re-exported[4]) is 
safe. 

That is why the NP-complete problem is caused by re-export and actually only 
re-export that is done on implicit base. I think I have a proof[2] that shows 
that if an explicit "requires" of all transitive dependencies of a library are 
enumerated (at compile time), it is polynomial problem to resolve them during 
runtime.

Actually this is my favorite solution. I am not sure if it works with range 
dependencies, but clearly it works with semantic versioning[5] (different 
major versiong means incompatible version) as my proof shows[2].

Now few comments to other solutions outlined by Jesse:
> 1. Drop the requirement that module dependencies may include version ranges
> or otherwise permit incompatible changes to be marked, so that a
> dependency is always on some version or later. This seems unsatisfactory.

There is clearly a need for an incompatible version of a module. On the other 
hand, why not rename the name of the module when producing an incompatible 
version. So there would be:

org.w3c.dom (in few revisions like 1.0, 1.1, 1.2, etc.)

and then an incompatible change as

org.w3c.dom2 (in few revisions)

and later also

org.w3c.dom3 (with many revisions).

I am not saying I like changing the name of a module with every incompatible 
version, but it is not as unsatisfactory is it might seem. The biggest problem 
in this issue is to specify a dependency on either org.w3c.dom2 or 
org.w3c.dom3, if that is ever required.

> 2. Drop the requirement that the module system pick a version of each
> module to load such that all dependencies are satisfied. For example: just
> try to load the newest available version of each module, and if any
> dependencies are in fact unsatisfied, report an error and ask the user to
> override one or more module versions manually.

This is what we do in NetBeans and it works for a single packaged application. 
However I don't think this will work for Java module system where a single 
repository may contain modules for multiple applications. If you accept #2, it 
may happen that after installing new application (and its modules), some 
previously installed application can no longer be executed. That is not 
something that would boost the creation of Java module ecosystem.

> 3. Rewrite Resolver using something like SAT4J, or even leave it as it is,
> and just document that pathological cases can result in unbounded
> execution time.

E.g. use brute force. Well, the rationalistic approach commonly worshiped on 
my side of the Ocean is completely against approach like this. If we knew that 
range dependencies with implicit re-export are the problem, we would not allow 
them and think about better solution. Less powerful but computable[3] in 
polynomial time.

However I can imagine a pragmatic design (commonly used on the other side of 
the Ocean) is able to promise something unrealistic and then empower brute 
force tools like SAT4J (as Equinox does) to deal with "pathological" cases.

> 4. Require that a new release of any module which uses an incompatible
> version of some dependency (even if used only internally) is itself marked
> incompatible. This seems unsatisfactory since it breaks encapsulation - I
> should not be forced to label my new widely used graphics toolkit 2.0
> rather than 1.1 merely because I started to use a refactored version of
> some math library after 1.0 was released. 

Remember, this is only about re-exports! If I re-export (by returning it from 
a signature of my method) from my graphics toolkit java.awt.Rectangle and 
somebody removes it from JDK, there is no way around: just to produce an 
incompatible version of my graphical toolkit.

> Also it is unclear how you would
> enforce this constraint so long as Jigsaw assigns no semantics to version
> numbers except that they are totally ordered.

Jesse recently pointed out in our other conversation that there is a semantic 
versioning[5] manifesto. I believe this is much better option than having no 
semantic and fixing that by allowing range dependencies.

> 5. Permit multiple versions of a module to be loaded so long as class
> spaces remain consistent 

This is almost a must. Unless two versions of a class get re-exported for the 
same loader, this should be allowed (see [4]).

Summary. Those are the options me and Jesse see. However the primary question 
is whether Jigsaw wants to avoid NP-completeness or just play on the pragmatic 
feelings and pretend the NP-complete problem is just a corner case. I would be 
willing to invest my knowledge into eliminating the NP-completeness. I believe 
there are reasonable ways to avoid it. And last but not least: if there should 
be a module system that avoids NP-complete problems when resolving 
dependencies, it would be an ultimate reason which rules out OSGi.

-jt

> [1] http://wiki.apidesign.org/wiki/LibraryReExportIsNPComplete

[2] The proof is here 
http://wiki.apidesign.org/index.php?title=LibraryWithoutImplicitExportIsPolynomial&useskin=monobook
btw. it is better to switch to "Black/White" background when reading math 
statements on my website.

[3] http://wiki.apidesign.org/wiki/RangeDependencies

[4] Having two modules with same namespace is OK, if the module classes are 
called, but what if the module exports a service? Is not that going to cause 
problems?

[5] http://semver.org/