Project Jigsaw goals and requirements

Tom Marble tmarble at info9.net
Mon May 30 17:34:08 PDT 2011


On 05/25/2011 10:55 AM, mark.reinhold at oracle.com wrote:
> I've posted the latest draft of the requirements document for wider
> review [4].  Comments are most welcome.
Thanks Mark!

Many of us, as you point out, are eagerly anticipating the
new Java Module System.

I have a couple of brief comments and then want to dive
into version syntax: an area in need of serious attention.

Exploded Modules
----------------

The ability to develop, build and debug modular Java
on the filesystem (vs. being "installed") is really
essential. Alan has proposed a webrev [0] for this which
looks quite sensible, yet there hasn't been any discussion
of it... Is this the candidate approach to "Exploded Modules"?

Performance
-----------

Given my personal experience in Java performance this topic
is near and dear to my heart. In general performance analysis
can be tricky: teasing out the impact of filesystem or network
delays, isolating the "shadow contribution" [1] of a given change
through statistics, etc. Java performance analysis is further
complicated by the elaborate optimizations in HotSpot.

I think it would be beneficial for the community to
collaborate on performance analysis tools (e.g. harnesses,
load drivers, statistical calculations, visualizations,
and of course benchmarks). In any case the tools we use
to analyze performance really want to be open source so
that anyone can repeat and analyze the results.

Have the performance tools used to qualify Jigsaw release
criteria been identified? There are a fair number of
important components already in open source and it's possible
that commercial contributors, such as Oracle, may be able
to open source currently internal tools. Performance tools
are an important topic all on their own.

Substitution / Necessity of 'permits'?
--------------------------------------

In a somewhat earlier document on metadata semantics [2] there
is a discussion of 'permits'.  This would seem to limit the
potential dependents of module 'M' to only module 'L'.
It seems a little odd to limit the potential uses
of a module API.  Most of the 17 cases of 'permits' in Jigsaw
now seem to serve the role of accessing features through
generic APIs (e.g. jdk.logging) instead of via specific
implementations (sun.logging).

While this may be the most elegant way to control API
access, it probably does not have an analog in the other
kinds of metadata used by potential consumers of Jigsaw.  For example, a
Debian package cannot limit its dependents ('Reverse-Depends'
in Debian argot) in this way.

However, Debian does have the notion of virtual packages
('Provides' [3]). We could imagine, for example, a sun.logging
package which Provides the jdk.logging API.

Could those who have participated in the discussion of
Substitution comment on how it might be implemented
(via permits, Provides, other)?

Dependency Resolution
---------------------

As David Bosschaert mentions in his blog [4] "The biggest
stumbling block for migrating an existing system to OSGi today
is often the modularization of a system that wasn't created in
a modular way in the first place".  As this problem is as yet
unsolved, Java applications are notorious for distributing
bundled jars, static libraries and other 3rd party artifacts.

And we know that the algorithms used to resolve dependencies
vary greatly between platforms. We are lucky with GNU/Linux
to have rich, native package management systems that other
operating systems lack. In this context it makes sense that
tools like OSGi and Maven have worked to address this
cross-platform need.

The end result, however, is that in meeting the goal
"Java modules as native packages" we may have cases where the
Jigsaw resolution system offers different solution(s) from
the native package manager or dependency tool.

While I feel we have an even more pressing concern over version
syntax (which I discuss below), allow me to point to Mancoosi,
a research project that might help us toward the goal of
true cross-platform dependency resolution. [5][6]

Version Syntax
--------------

We are working in Debian to understand how best to package
Jigsaw while remaining compliant with Debian Policy. [7][8]
There are a number of policy violations in the current
approach to creating *.deb packages [9]. Given support
for Exploded Modules (see above) we can work through
several of these issues... One showstopper is that the
current version syntax permits version strings such as
"7_ea", which contain the underscore character that is
illegal in Debian versions.

Here are some references to the various styles of
version syntax in use:

- Debian [10]
  {epoch:}upstream_version{-debian_revision}
    epoch:= unsigned integer (small, defaults to 0)
    upstream_version :=  digit then digit or alphanumerics or . + - : ~
      NOTE: ':' is permitted only if there is an epoch (bad assumption)
      NOTE: '-' is permitted only if there is a debian revision (probably ok)
      alphanumeric := [A-Za-z0-9]
    debian_revision := alphanumerics and +.~ (typically a small unsigned
      integer starting at 1 for each upstream_version)
  Comparison rules/tool [11]
    dpkg --compare-versions "$a" "$op" "$b"
      where $op is one of < << <= = >= >> >

- Red Hat (RPM) [12][13]
  {epoch:}version{-release}
    epoch    (optional) number, with assumed default of 0 if not supplied
    version  (required) can contain any character except '-'
    release  (optional) can contain any character except '-'
    NOTE: it is not clear if this pseudo grammar is authoritative
    for Red Hat Packages. Pointers welcome!

- OSGi
  Seems to be {major{.minor{.tiny{.tag}}}}
  Where major minor and tiny are integers and
  tag is alphanumeric or '-' or '_'

- Maven [14][15]
  alphanumeric or '-' or '.'
  Maven has a fairly elaborate syntax which doesn't lend itself
  to a compact grammar.  It is worth noting that Maven states
  that their system is incompatible with OSGi.
  NOTE: dependency resolution depends on path length [16]

- Current Jigsaw Version Syntax [17]
  Version:
    ModuleIdentifier {VersionTokenizer ModuleIdentifier}
  ModuleIdentifier:
    JavaLetterOrDigit
    ModuleIdentifier JavaLetterOrDigit
  VersionTokenizer:
    . or + or - or : or ~

  NOTE that an earlier document [2] said that "The Java language assigns
  no meaning to the version of a module."
  NOTE: The "module jdk.base @ 7-ea" today results in a proposed
  Debian version of "7_ea".

So one can see the challenge of determining a module version syntax
that is likely to be compatible with the consumers of Jigsaw.
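
To make the "7_ea" problem concrete, here is a minimal Java sketch that
checks a candidate string against two of the grammars summarized above.
The class and method names are hypothetical (nothing here is part of
Jigsaw or dpkg), JavaLetterOrDigit is taken to mean a character accepted
by Character.isJavaIdentifierPart, and the Debian check covers only the
upstream_version character set, not the epoch or debian_revision rules.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Jigsaw or dpkg.
class VersionSyntaxCheck {
    // Debian upstream_version as summarized above: a digit, then
    // alphanumerics or any of . + - : ~
    private static final Pattern DEBIAN_UPSTREAM =
        Pattern.compile("[0-9][A-Za-z0-9.+\\-:~]*");

    // VersionTokenizer characters from the Jigsaw grammar above.
    private static final String TOKENIZERS = ".+-:~";

    static boolean isDebianUpstream(String s) {
        return DEBIAN_UPSTREAM.matcher(s).matches();
    }

    // Version: ModuleIdentifier {VersionTokenizer ModuleIdentifier}
    // Assumption: JavaLetterOrDigit == Character.isJavaIdentifierPart.
    // Works on char values, so supplementary characters are not handled.
    static boolean isJigsawVersion(String s) {
        boolean needIdentifierChar = true; // an identifier must start here
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isJavaIdentifierPart(c)) {
                needIdentifierChar = false;
            } else if (!needIdentifierChar && TOKENIZERS.indexOf(c) >= 0) {
                needIdentifierChar = true; // tokenizer must be followed by an identifier
            } else {
                return false;
            }
        }
        return !needIdentifierChar;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"7-ea", "7_ea", "1.0~rc1"}) {
            System.out.println(v + " jigsaw=" + isJigsawVersion(v)
                + " debian=" + isDebianUpstream(v));
        }
    }
}
```

Under these assumptions "7_ea" is accepted by the Jigsaw check (the
underscore counts as a JavaLetterOrDigit) but rejected by the Debian
check, while "7-ea" passes both.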

One approach is the common denominator (intersection of all constraints):
  {major{.minor{.tiny{.alphatag}}}}
where alphatag is ONLY alphanumeric (no '-' or '_'). If the interpretation
of '-' for RPM is like that for Debian then '-' *may* be okay. That is,
in Debian if there is only one '-' it delimits the debian_revision.
However, as Jigsaw is not a Debian native package we are guaranteed
to always have a '-debian_revision', thus the upstream_version
could contain '-'.
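
As a rough illustration, the common denominator form can be written as
a single regular expression. This is only a sketch: the class name is
hypothetical, and it assumes that "major" is required even though the
outermost braces in the notation above would make it optional.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the "common denominator" version syntax.
class CommonVersionSyntax {
    // major{.minor{.tiny{.alphatag}}} with integer major/minor/tiny
    // and a purely alphanumeric alphatag (no '-' or '_').
    // Assumption: major is mandatory.
    private static final Pattern COMMON = Pattern.compile(
        "[0-9]+(\\.[0-9]+(\\.[0-9]+(\\.[A-Za-z0-9]+)?)?)?");

    static boolean isCommon(String s) {
        return COMMON.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String v : new String[] {"7", "7.0", "7.0.0.ea", "7-ea", "7_ea"}) {
            System.out.println(v + " -> " + isCommon(v));
        }
    }
}
```

Note that under this reading a tag is only allowed in the fourth
position, so today's "7-ea" would have to be spelled "7.0.0.ea".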

And while Java beautifully supports Unicode, having ModuleIdentifiers
which use only [A-Za-z] or digits may be desirable.

A simpler version syntax is probably better because it will
potentially simplify the dependency resolution algorithms (or, more
precisely, reduce the difficulty in interpreting and implementing
the version comparison operators correctly).
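
To show how little code a comparison operator needs when the syntax is
this restricted, here is a hedged Java sketch. The class is
hypothetical, missing numeric fields are assumed to be 0, and the
ordering of tags (plain string comparison, with an absent tag sorting
first) is an arbitrary choice made only for illustration; a real
proposal would have to decide whether "7.0.0.ea" precedes or follows
"7.0.0".

```java
import java.util.regex.Pattern;

// Hypothetical sketch: ordering for major{.minor{.tiny{.alphatag}}}.
class SimpleVersion implements Comparable<SimpleVersion> {
    private static final Pattern SYNTAX = Pattern.compile(
        "[0-9]+(\\.[0-9]+(\\.[0-9]+(\\.[A-Za-z0-9]+)?)?)?");

    private final int[] nums = new int[3]; // major, minor, tiny; missing = 0
    private final String tag;              // "" when absent

    SimpleVersion(String s) {
        if (!SYNTAX.matcher(s).matches()) {
            throw new IllegalArgumentException("bad version: " + s);
        }
        String[] parts = s.split("\\.");
        for (int i = 0; i < parts.length && i < 3; i++) {
            nums[i] = Integer.parseInt(parts[i]);
        }
        tag = parts.length == 4 ? parts[3] : "";
    }

    @Override
    public int compareTo(SimpleVersion other) {
        // Numeric fields are compared as integers, so 2.10 > 2.9.
        for (int i = 0; i < 3; i++) {
            int c = Integer.compare(nums[i], other.nums[i]);
            if (c != 0) {
                return c;
            }
        }
        // Assumption: tags compare as plain strings, absent tag first.
        return tag.compareTo(other.tag);
    }

    public static void main(String[] args) {
        SimpleVersion a = new SimpleVersion("2.10");
        SimpleVersion b = new SimpleVersion("2.9");
        System.out.println("2.10 vs 2.9: " + Integer.signum(a.compareTo(b)));
    }
}
```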

I agree with David Lloyd [18] that the wording and interpretation of
"Fidelity across all phases" is very important.  I also agree that
"[...] dependency version ranges with a closed upper end cannot be
a fixed part of the module's internal metadata".  We really don't
want to have to say "requires public foo @ [2.0,99999);" as a surrogate
for "versions 2.0 or later".
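
The weakness of a surrogate upper bound can be sketched in a few lines
of Java. Everything here is hypothetical (the class, the method names,
and the restriction to dotted integers); it models only the idea of a
range whose upper end may be absent, not any actual Jigsaw syntax.

```java
// Hypothetical sketch: a version range with an optional upper bound.
class VersionRange {
    private final int[] lower; // inclusive
    private final int[] upper; // exclusive, or null for "no upper bound"

    private VersionRange(int[] lower, int[] upper) {
        this.lower = lower;
        this.upper = upper;
    }

    static VersionRange atLeast(String lower) {
        return new VersionRange(parse(lower), null);
    }

    static VersionRange between(String lower, String upper) {
        return new VersionRange(parse(lower), parse(upper));
    }

    // Assumption: versions are dotted non-negative integers only.
    private static int[] parse(String s) {
        String[] parts = s.split("\\.");
        int[] out = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    // Compares field by field, treating missing fields as 0.
    private static int compare(int[] a, int[] b) {
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = i < a.length ? a[i] : 0;
            int y = i < b.length ? b[i] : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    boolean contains(String version) {
        int[] v = parse(version);
        return compare(v, lower) >= 0
            && (upper == null || compare(v, upper) < 0);
    }

    public static void main(String[] args) {
        System.out.println(atLeast("2.0").contains("100000.0"));
        System.out.println(between("2.0", "99999").contains("100000.0"));
    }
}
```

With an explicit "no upper bound" the range keeps matching whatever
version comes later, whereas the "[2.0,99999)" surrogate silently
excludes anything at or beyond 99999.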

However, I disagree that we want to restrict build dependencies to
specific versions (and not version ranges). This will cause build
maintainers endless toil to manage packages when no underlying
change provokes the rebuild maintenance. I think that, for assuring
reproducibility, fixed build dependencies are a poor substitute for
functional testing once the project has been built.

In any case getting to consensus on version syntax is my
highest priority. (Next time I'll try to be less verbose :) )

Respectfully,

--Tom

[0] http://mail.openjdk.java.net/pipermail/jigsaw-dev/2011-May/001297.html
[1] https://secure.wikimedia.org/wikipedia/en/wiki/Shadow_price
[2] http://openjdk.java.net/projects/jigsaw/doc/language.html
[3] http://www.debian.org/doc/debian-policy/ch-relationships.html#s-virtual
[4] http://osgithoughts.blogspot.com/2011/05/java-se-8-modularity-requirements.html
[5] http://www.mancoosi.org/
[6] http://www.mancoosi.org/edos/formalization/
[7] http://www.debian.org/doc/debian-policy/
[8] http://wiki.debian.org/SummerOfCode2011/Jigsaw
[9] $JIGSAW/jdk/make/common/BuildNativePackages.gmk
[10] http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Version
[11] http://www.debian.org/doc/debian-policy/ch-relationships.html#s-depsyntax
[12] https://docs.fedoraproject.org/en-US/Fedora_Draft_Documentation/0.1/html/RPM_Guide/ch-advanced-packaging.html#id1988433
[13] http://www.rpm.org/wiki/PackagerDocs/Dependencies#RequiringPackages
[14] http://docs.codehaus.org/display/MAVEN/Versioning
[15] https://svn.apache.org/viewvc/maven/maven-3/trunk/maven-artifact/src/main/java/org/apache/maven/artifact/versioning/ComparableVersion.java?view=markup
[16] https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html
[17] http://openjdk.java.net/projects/jigsaw/doc/topics/grammar.html
[18] http://in.relation.to/Bloggers/ModulesInJDK8AndProjectJigsawRequirementsReviewPart1


