Another reminder: Updated JSR-277 specification (04/19/2007)

Tue May 22 14:42:17 PDT 2007

Richard S. Hall wrote:
> Bryan Atsatt wrote:
>> Hey Stanley,
>>
>> More comments inline.
>>
>> // Bryan
>>
>> Stanley M. Ho wrote:
>>> Hi Bryan,
>>>
>>> Thanks for the inputs. See my comments inline.
>>>
>>> Bryan Atsatt wrote:
>>>> Stanley,
>>>>
>>>> Here are my comments on both the spec and the apis.
>>>>
>>>> // Bryan
>>>>
>>>> SPEC
>>>> ====
>>>>
>>>> 1.5
>>>>
>>>> You are assuming that module-name==superpackage-name, right? Otherwise,
>>>> if we add import statements to 294, then these would have to specify
>>>> module names, thus creating a circular dependency.
>>>
>>> Right. I will clarify it in the next revision of the spec.
>>>
>>>> 2.1
>>>>
>>>> "is properly understood as a mathematical abstraction"
>>>>
>>>> This phrase seems unnecessary, and comes across as a bit highbrow to
>>>> me.
>>>>
>>>> "A module archive is inherently sealed, and it does not reference any
>>>> resource externally"
>>>>
>>>> The term "sealed" here could be confused with the manifest attribute.
>>>> Perhaps a better term is "self-contained". And the second phrase is a
>>>> bit confusing also, since imports do reference external "resources".
>>>> Perhaps if you said "may not directly reference external classes or
>>>> resources"?
>>>
>>> I will look for better words to describe it.
>>>
>>>> 2.2
>>>>
>>>> Should state that module name must == superpackage name.
>>>>
>>>> 2.4
>>>>
>>>> The members definition should be a list of *packages*, not classes.
>>>> That
>>>> is all the 294 proposal supports (I actually can't see it at the moment
>>>> since my HD died, but this is my recollection!). I think it is
>>>> reasonable to assume it will stay this way.
>>>
>>> The member declared in the source file is indeed a list of packages.
>>> That said, based on my understanding of 294, the compiler would expand
>>> the member packages to a list of classes in the artifact. I will clarify
>>> it in the spec.
>>>
>>>> Version may be optional in a superpackage, but it is mandatory for a
>>>> module, right? This should be clarified.
>>>
>>> "mandatory" would mean something that we can check and enforce. Since
>>> version is declared using annotation in a superpackage, I don't think we
>>> can enforce it at build time. That said, we can still enforce it at
>>> deployment time if this is what you meant.
>>
>> Yes, the 277 spec should describe it as mandatory, and we should enforce
>> it at runtime. If it becomes possible to enforce it a compile time, we
>> would do so.
>>
>>>
>>>> The example members list should be package names only.
>>>>
>>>> 2.7.3
>>>>
>>>> I like this addition.
>>>
>>> I'm glad to know.
>>>
>>>> 2.7.4
>>>>
>>>> There may well be cases in which a module wants to re-export a
>>>> subset of
>>>> imported classes/resources. We should consider supporting this case.
>>>
>>> Could you describe the actual use cases for this?
>>
>> Package a.b contains and exports classes C and D.
>> Package x.y imports a.b, but only wants to re-export C.
>
> I am not sure that this case is such a good idea. By allowing it,
> wouldn't consumers of package x.y then potentially be able to see a D
> from some other provider...this seems like a recipe for subtle bugs. You
> mention the OSGi model below, but such subsetting of packages really
> breaks the OSGi model of atomic packages.

Right, I was thinking that we would enforce this during resolution,
but... It is certainly simpler to simply support re-export of entire
packages.

It is a little weird for a.b to re-export D if C doesn't refer to D, and
no classes in a.b refer to D, but this is perhaps too subtle.

>
>>>> Refactoring in this way results in a perhaps unexpected runtime/memory
>>>> overhead. A pure wrapper module that re-exports must have its own class
>>>> loader, and, even though it won't define any classes, the VM *cache*
>>>> will be updated as that loader is used.
>>
>> I think we need to think hard about this issue. The OSGi model of import
>> by *package name* decouples the importer from any explicit binding to a
>> bundle/module name. Refactoring under that model is *much* cleaner, and
>> far more natural. As is the usage model. After all, Foo.java import
>> statements contain package/class names, *not* module names. Programmers
>> think in terms of classes and packages.
>>
>> Peter makes this point pretty strongly, and I have to say I agree
>> wholeheartedly:
>>
>> http://www.aqute.biz/Blog/2006-04-29
>
> Me too.
>
> -> richard
>
>>>> 4.1
>>>>
>>>> Since not everyone uses the Sun terms (FCS, GA), we might want to also
>>>> define qualifier here as "beta", or "pre-release", or
>>>> "release-candidate". And a "release" version must not have a qualifier.
>>>
>>> I will clarify it.
>>>
>>>> 5.2.2
>>>>
>>>>
>>>> We should also make it clear that resources will also often be
>>>> contained
>>>> in "legacy" jars, for which no ClassesDirectoryPath is required.
>>>
>>> There is related to an open issue on how we want to support "legacy"
>>> jars in the implementation. I will open up a new thread for the
>>> discussion on this topic.
>>>
>>>> 5.6
>>>>
>>>> Might be a good idea to explicitly state that the Class-Path attribute
>>>> is ignored.
>>>
>>> Right.
>>>
>>>> #3 This would appear to rule out OSGi's support of split packages. This
>>>> spec should enable each ModuleSystem to control this behavior.
>>>
>>> The JAM file is the distribution format for the module system defined by
>>> 277, and I don't think the treatment of JAR manifest in this section
>>> applies to OSGi.
>>
>> Sure, for the manifest of a .jam file. But the statement:
>>
>> "All packages defined in a module definition are inherently sealed, and
>> this entry is ignored by the module system."
>>
>> is pretty broad, and would seem to indicate that it applies to the
>> definitions produced by any module system.
>>
>>
>>>
>>>> 6.1
>>>>
>>>> Third paragraph: Remove the leading "Besides, "
>>>>
>>>> 6.2.2
>>>>
>>>> Repository shutdown must define the semantics for any module instances
>>>> from that repository.
>>>>
>>>> I still strongly believe we need a module shutdown method that
>>>> optionally disables the loader.
>>>>
>>>> And this should also be used during repository shutdown. See my
>>>> comments
>>>> below for section E.5.
>>>
>>> This is on my radar, and is briefly mentioned in E.4 and E.5. We will
>>> need to discuss what kind of lifecycle support we want to provide, and I
>>> will open a new thread for the discussion soon.
>>>
>>>> 6.2.3/6.3
>>>>
>>>> For EE and other similar systems, it may be useful to have different
>>>> VisibilityPolicy instances per Repository. We may want to have a
>>>> getter/setter here, with the default implementation of get returning
>>>> the
>>>> default policy.
>>>
>>> Could you describe the use cases you had in mind for this?
>>
>> It creates a nice way to create wrapper repository instances that
>> provide a customized view...
>>
>> EE systems are required to isolate applications from each other. And
>> each may have very different external dependencies. If each repository
>> instance can have its own VisibilityPolicy, then a wrapper Repository
>> can be constructed for each application, using a different policy.
>>
>> Same goes for ImportOverridePolicy.
>>
>>>
>>>> 6.2.6
>>>>
>>>> Local repository instances will "know" when any of these events occurs
>>>> (via their API). Remote ones will not, so reload() is a reasonable
>>>> method.
>>>>
>>>> Should reload be *required* before any repository returns new/modified
>>>> instances?
>>>
>>> My expectation is that most repository implementations would want to
>>> optimize their performance by caching most things in memory and not poll
>>> the underlying storages unless they have to, so I think "reload" should
>>> be required. That said, if you have use cases that show otherwise, I
>>> would be interested to know about them.
>>
>> I'm just considering the simple case of deploying a new module to a
>> LocalRepository. I think it will be a bit surprising if some special act
>> is required to make it available. For example, this would fail:
>>
>>   localRepos.install(aModuleNamedSue);
>>   ModuleDefinition sue = localRepos.find("sue");
>>
>> And it is pretty awkward to insert a call to reload() here. Given that
>> the primary client of Repository is ModuleSystem, I guess the
>> preparation logic can call reload() and re-try IFF a suitable definition
>> cannot be found.
>>
>>>
>>>> 6.5.1
>>>>
>>>> It isn't clear from this that each repository instance sharing an
>>>> interchange dir must have a unique "expansion" directory.
>>>>
>>>> While some variant subclass could be created that uses appropriate
>>>> locking to share the expansion dir, the provided LocalRepository does
>>>> not.
>>>
>>> I think this will be up to each repository implementation to decide. No
>>> matter what the implementation strategy is, it should not affect the
>>> repository's semantic.
>>
>> Yes, but LocalRepository is a specific implementation, and one that
>> requires an expansion directory for each instance, right? We should make
>> that clear.
>>
>>>
>>>> 6.5.3
>>>>
>>>> #5 Should be "shut down" not "shutted down".
>>>>
>>>> 7.1.1/7.1.2
>>>>
>>>> Should say that a system property may be used to control the type
>>>> returned by ModuleSystem.getDefault().
>>>
>>> This method would be used to return the default module system which has
>>> ties with the underlying JRE implementation. As we discussed, this is
>>> not replaceable, so I don't think providing a system property for
>>> override makes sense. ;)
>>
>> Thinking about this a bit more, the idea of a "default" module system is
>> a bit odd. Certainly there will be one module system used by the JRE
>> itself, and this is what you were thinking about here. And clearly
>> LocalRepository and URLRepository are hard-wired to this same module
>> system, so it makes some sense to be able to say in the javadoc for
>> these classes that they use the "default" module (which needs to be
>> added, btw). Do you have any other uses in mind? Perhaps we should be
>> more explicit:
>>
>>    public static ModuleSystem getJREModuleSystem();
>>
>>
>> Regardless, what I was really thinking about before is ModuleSystem
>> initialization.
>>
>> ModuleDefinition has a getModuleSystem() method, but how is it
>> implemented? Our model so far appears to assume that ModuleSystem
>> instances will be shared, which is certainly reasonable, but... how? One
>> simple model is that it is module system dependent, e.g. each module
>> system implementation provides a singleton or equivalent. But this just
>> pushes the problem up a layer, since the runtime type of
>> ModuleDefinition will vary based on module system.
>>
>> I see that you added a ctor arg/getModuleSystem() method to Repository
>> as well. In this model, the type problem is pushed up one more layer. It
>> is the Repository implementation that must know the ModuleDefinition
>> type, and therefore the ModuleSystem type.
>>
>> But we need to specify exactly how the initial repositories are
>> configured/initialized, using the correct types, so that the JRE can
>> instantiate them. (I took a cut at this in the class loading doc, if you
>> recall.) This should be added to Appendix E--even if some of the details
>> are JVM vendor specific, the requirements must at least be defined.
>>
>>
>> But... didn't we decide long ago that a repository should not be
>> restricted to a single module system? With a multi-repository search
>> model, this is less of an issue, but why impose this limit? Composite
>> implementations might have a hard time implementing getModuleSystem().
>>
>> Perhaps we should consider a model in which we:
>>
>> 1. Require ModuleSystem implementations to have a globally unique name.
>> (There aren't likely to be hundreds of them, so this shouldn't be much
>> of a hardship!)
>>
>> 2. Have a persistent configuration mechanism for ModuleSystems, like the
>> one for the initial Repository instances. In addition to the runtime
>> types for ModuleSystem subclasses, the instances themselves will likely
>> need configuration, just like Repositories (logging, security,
>> import/visibility policies, etc.)
>>
>> 3. During startup, the JRE initializes the registered ModuleSystems,
>> then the registered Repositories.
>>
>> 4. Add to ModuleSystem:
>>
>>    public abstract String getName();
>>
>>    public static ModuleSystem getModuleSystem(String name);
>>    public static List<ModuleSystem> getModuleSystems();
>>
>> Now repository implementations are not involved in *instantiating*
>> ModuleSystems; they look them up by name.
>>
>> (and having a centralized lookup mechanism may enable other interesting
>> behaviors)
>>
>>>
>>>> I think we need more flexible mechanisms to control this. At the very
>>>> least, there should be a way to define it in the local JRE in a
>>>> persistent fashion.
>>>>
>>>> But we also may want a way for the launcher to choose the default
>>>> module
>>>> system based on some delegation model. Perhaps the available systems
>>>> can
>>>> be specified in a config file, each in a standard 277 module. And the
>>>> launcher's search algorithm could iterate across the ModuleSystem
>>>> instances and ask each to try loading the main module.
>>>>
>>>> Obviously the specifics need to be worked out, but you get the idea.
>>>
>>> This will depend on the implementation logic in the launcher in the JDK,
>>> but the launcher is out of scope in the 277 spec. On the other hand, if
>>> the launcher requires APIs from 277 in order to do its work, then we
>>> will need to consider the requirements.
>>>
>>>> 7.3.1
>>>>
>>>> #4, bullet 2: loadClass() must not mask any thrown Error or
>>>> RuntimeException by converting it to a ClassNotFoundException! This
>>>> would make debugging the problem much more difficult, and would be a
>>>> significant change in class loader behavior.
>>>
>>> I will make the logic more clear.
>>>
>>>> 7.3.2.2
>>>>
>>>> The search order for resources MUST be the same as that for classes. I
>>>> can support this statement if you like.
>>>
>>> How to actually support the notion of exported resources is another open
>>> issue, and I will open a new thread on this soon.
>>>
>>>> We should also say something about the protocol for resource URLs.
>>
>> Shouldn't we? Like whether there will be a new protocol for them, or not.
>>
>>>>
>>>> E.5
>>>>
>>>> Again, module shutdown can only be performed within the scope of some
>>>> larger construct, such as an EE application/container.
>>>>
>>>> Such an environment can and does have an explicit lifecycle model,
>>>> including threading.
>>>>
>>>> Consider what can (and will!) happen after a module uninstall, if that
>>>> module is left "alive"...
>>>>
>>>> If the uninstall removes the deployed module from the storage, then
>>>> subsequent subsequent attempts to load classes/resources from that
>>>> module can fail (depending on what is cached by the underlying
>>>> ZipFile/file system/OS implementation).
>>>>
>>>> And it doesn't need to be a direct call to loadClass(). I have often
>>>> seen this kind of failure happen with lazy dependency loading.
>>>>
>>>> So, the uninstall either must defer physically removing the deployed
>>>> module until all instances are gone, or it must schedule them for
>>>> removal on next startup, or whatever... crazy complexity.
>>>>
>>>> Can the storage of a shutdown repository be safely deleted? Not if any
>>>> process is still using it.
>>>>
>>>> For all of these reasons, I strongly believe we need a module shutdown
>>>> method. And yes, its use must be well documented!
>>>
>>> If I understand your requirements properly, I think what you are really
>>> asking is a classloader disable/shutdown method, and such a method would
>>> be called when a module is released (or shutdown) from the module system
>>> in the context of EE servers.
>>>
>>>> API
>>>> ---
>>>
>>> Since the APIs are still evolving significantly, I would like to defer
>>> most of that discussions until the next revision of the spec and APIs
>>> are available for the EG to review. That said, I will take your comments
>>> into account when the APIs are refactored.
>>>
>>> - Stanley
>>>
>>>
>>>> ModuleDefinition
>>>>
>>>> * The getModuleDefinitionContent() method must return a list of
>>>> instances, since multiple jars/directories may be present. Might also
>>>> make sense to shorten the name to getContents() since the return type
>>>> makes it pretty clear what you are getting.
>>>>
>>>>
>>>> ModuleDefinitionContent
>>>>
>>>> * The semantics of the open() method are fuzzy: who calls it, and when?
>>>> Can it be safely called multiple times? What happens when the other
>>>> methods are called and open() hasn't occurred?
>>>>
>>>> * Perhaps the open should be automatic on use of any accessor method if
>>>> needed?
>>>>
>>>> * There should be an isOpen() method.
>>>>
>>>> * I think the name should be shortened to ModuleContent. It is not
>>>> actually part of the definition.
>>>>
>>>>
>>>> JamsModuleDefinition
>>>>
>>>> * Should be singular: JamModuleDefinition
>>>>
>>>>
>>>> ImportPolicy
>>>>
>>>> * Should have a static getDefault() method.
>>>>
>>>>
>>>> Version/VersionRange/VersionContstraint
>>>>
>>>> (See proposal in earlier email)
>>>>
>>>>
>>>> PlatformBinding
>>>>
>>>> * Why not just call this "Platform"? It is no different than Version,
>>>> just another attribute that can be used to filter during resolution.
>>>>
>>>>
>>>> ModuleArchiveInfo
>>>>
>>>> * Need a way to get this from either a Module or a ModuleDefinition.
>>>>
>>>>
>>>> Repository
>>>>
>>>> * Implementations of findModuleDefinition() will want to discover if
>>>> the
>>>> query contains a module name (equality test only!), so that the name
>>>> can
>>>> be used as a primary key. Either we should add a method to Query to
>>>> test
>>>> for this, or we should pass a name argument to this method. In the
>>>> latter case, a null name must be allowed.
>>>
>