Module isolation

Mon Jun 18 14:09:13 PDT 2007

Stanley M. Ho wrote:
> Hello Bryan,
>
> Bryan Atsatt wrote:
>> ...
>> But this approach doesn't really work very well. What happens when,
>> moments after releasing a module, another application is deployed that
>> needs the same module? It gets a different instance. And that could
>> easily lead to ClassCastExceptions and/or LinkageErrors.
>>
>> How is it possible to know when it is safe to "free up resources"?
>
> The use case I have is that sometimes we might want to make a module
> temporarily unavailable (e.g. turning off a plugin in the IDE), without
> shutting down the repository (possibly with hundreds/thousands of other
> modules) or uninstalling the module. In this case, the container will
> trigger not only the release of the existing module instance (so it will
> have a chance to be GCed eventually), but it will also make the module
> definition invisible (through visibility policy) from other modules.
> Without the second part, it has the potential problem you described.

This use case seems to presume that the IDE can/will ensure that there
is only *one* consumer of the module: itself. What if a different plugin
has a dependency on it, and has already been resolved? Is the intention
that the IDE will be wired into the module system deeply enough to
manage this correctly?

Again, this is why we need either a real, general, isolation model, or
an access control model:

Containers, of any kind, must be able to explicitly control access to
"private" modules.

An application server, for example, will need to keep one application
from accessing the modules of another. And it must *know* that there is
no sharing, so that the application lifecycles can remain independent.

So we need some notion of a context. A purely private repository
instance is one (probably good) possibility. Another is the wrapper
Repository approach, but this requires definition copies (and management
of sharing content, lifecycle, etc).

I started this thread with another, explicit type (ModuleContext), but
it isn't obvious how to use such a beast correctly.

An access control model would also work, where all parties can share
Repository instances, but we somehow discriminate between *callers* to
return different results.

We may even want to support some mixture of private + access control.

Regardless of what approach we take, the releaseModule() idea is too
simplistic. Having originally created the detach() method, thinking
along similar lines as you are with the plugin case, I do understand the
idea; I just no longer think it is sufficient :^).

The only "safe" time to release a module is when there are *zero*
importers, and, even then, you must hide/release atomically, ensuring
that no new imports spring up during the operation.
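
To make the "zero importers, hide/release atomically" condition concrete,
here is a rough sketch. ImportTracker and all of its method names are made
up for illustration and are not part of the draft API; module names are
plain strings only to keep it short. The point is that the importer check
and the hide happen under one lock, so no new import can slip in between.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the draft API.
class ImportTracker {
    private final Map<String, Integer> importers = new HashMap<>();
    private final Set<String> hidden = new HashSet<>();

    // Records one more importer, unless the module has already been hidden.
    synchronized boolean tryImport(String module) {
        if (hidden.contains(module)) {
            return false;
        }
        importers.merge(module, 1, Integer::sum);
        return true;
    }

    // Drops one importer; the entry disappears when the count reaches zero.
    synchronized void dropImport(String module) {
        Integer count = importers.get(module);
        if (count == null) {
            return;
        }
        if (count > 1) {
            importers.put(module, count - 1);
        } else {
            importers.remove(module);
        }
    }

    // Hides the module only if nobody imports it. The check and the hide
    // share one lock, so tryImport() cannot succeed in between.
    synchronized boolean tryRelease(String module) {
        if (importers.containsKey(module)) {
            return false;
        }
        hidden.add(module);
        return true;
    }
}
```

With this shape, a release attempt simply fails while any importer exists,
and once it succeeds every later tryImport() is refused.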

>
> Note that I don't think this is a common thing many developers want to
> do. In fact, I think we should discourage most developers from calling
> releaseModule() because of the potential consequences. On the other
> hand, we shouldn't preclude this use case either. If you have better
> suggestion to address this use case, I would like to hear it.
>
>> Well, I agree in theory. But... I am struggling to understand how we
>> provide an isolation model. If:
>>
>> - There is a 1:1 relation for ModuleDefinition<->Module instance, and
>> - Isolation requires separate Module instances, then
>> - Isolation requires separate ModuleDefinition instances
>>
>> If this is our isolation model, then how does a ModuleSystem instance
>> support this? Clearly, it would need to keep a mapping from each
>> ModuleDefinition to its Module instance. How is this simpler, or better?
>>
>> In your current model (just as in my detach() model), there is still a
>> 1:1 relation *at any given moment*, at least from the perspective of the
>> definition.
>>
>> Are you thinking that the ModuleSystem would have to keep track of
>> released modules?
>
> The ModuleSystem instance would have a <ModuleDefinition, Module>
> mapping, and this should be a very simple thing to support and maintain.
> Also, the ModuleSystem needs to maintain this information anyway to
> avoid instantiating multiple Module instances for a given
> ModuleDefinition, or instantiating a Module instance if a
> ModuleDefinition has been uninstalled or belongs to a repository which
> has been shut down. And yes, the ModuleSystem would have another map to
> keep track of all the ModuleDefinitions that are no longer
> usable/instantiated. Having the information centralized in one place has
> other benefits too, e.g. if we want to find out what the outstanding
> module instances from module definitions in all repositories are, the
> ModuleSystem can provide the answer easily.

Sure. (And the ModuleSystem could not use strong references to hold
released Module instances, else it would prevent GC.)
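
As a rough sketch of what I mean by not using strong references: the
class below keeps live instances strongly and released ones only through
java.lang.ref.WeakReference. ReleasedModuleLog and its methods are
hypothetical names, and the type parameters D and M merely stand in for
ModuleDefinition and Module.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical bookkeeping class, not part of the draft API.
class ReleasedModuleLog<D, M> {
    private final Map<D, M> live = new HashMap<>();
    private final Map<D, WeakReference<M>> released = new HashMap<>();

    synchronized void register(D definition, M module) {
        live.put(definition, module);
    }

    // Moves the instance from the strong map to the weak one.
    synchronized void release(D definition) {
        M module = live.remove(definition);
        if (module != null) {
            released.put(definition, new WeakReference<>(module));
        }
    }

    synchronized M liveModule(D definition) {
        return live.get(definition);
    }

    // May return null once the released instance has been collected.
    synchronized M releasedModule(D definition) {
        WeakReference<M> ref = released.get(definition);
        return ref == null ? null : ref.get();
    }
}
```

Note that the released map still holds the definition keys strongly, so a
real implementation would also need to prune stale entries.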

> My view is that ModuleSystem needs to keep track of various runtime
> information related to ModuleDefinition and Module anyway, so I don't
> see clear benefit in moving part of that information into other classes.

Other than Module instances, what other "runtime information" is there
to keep track of? Caches of exported packages?

Understand here that I am *not* focused on the idea of caching Module
instances on ModuleDefinition; I do understand the desire to avoid
polluting a stateless type.

I am trying to figure out if we *could* do so, as a means of pinning
down precisely what our model is.

For example, if we were to eliminate the releaseModule() method (in
favor of some more complete mechanism), then there really is always a
1:1 for Module:ModuleDefinition, and the model is simple and obvious.
(And therefore a field cache *could* be used).

>> But this approach means that the wrapper must be tightly coupled to the
>> wrappee. While I can imagine this holding true in some cases, it
>> certainly doesn't seem like the common case.
>>
>> Why should an applet container, for example, be required to know the
>> *type* of ModuleDefinition subclasses in the repository?
>>
>> Providing some sort of copy() operation on the base class eliminates
>> this kind of coupling.
>
> Perhaps I still don't fully understand how you will want to create a new
> repository for isolation, and why having the knowledge of the
> ModuleDefinition subclasses is not acceptable in this context.
>
> As I previously hinted, having some sort of copy() operation in the base
> class is not feasible because many module definitions do not make
> sense to be cloned. Cloning also implies that the underlying lifetime of
> the ModuleDefinition (and the ModuleDefinitionContent) from two
> different repositories could be arbitrarily tied, and I don't think it
> makes sense unless the repositories have the same owner.

I completely agree that cloning raises concurrency and lifecycle issues.
But these don't go away just because you know the actual type of a
ModuleDefinition!

I am not in any way wedded to the copy idea. I am just trying to find
*some* solution that enables private Module instances; the copy idea was
actually yours :^).

> In the case of an applet container, my expectation is that the container
> will construct some kind of AppletURLRepository for each codebase with
> some ModuleDefinition subclass, and each ModuleDefinition is
> instantiated with a custom ModuleDefinitionContent implementation with
> the applet cache as the backing store. In other words, the applet
> container or the AppletURLRepository already has knowledge about the
> ModuleDefinition subclass. If the applet container needs a new
> repository for isolation, then it would construct another
> AppletURLRepository, and this new AppletURLRepository could construct
> each ModuleDefinition using the existing ModuleDefinitionContent instance.

So, in effect, we have 100% *private repository* instances.

I've been thinking that we need an intermediate somewhere between a
shared/public repo instance and an entirely private one, but... that now
strikes me as too fuzzy, and I can't see a real use case :^)

So an application server would have to create, say, a private
LocalRepository instance to hold the modules of a single application.
And it would have to ensure that no other application could get its
grubby paws on that repository instance.

Ok. That works for me. And it eliminates the need for cloning AND for
releaseModule():

1. Any given Repository instance is either 100% shared or 100% private,
with *no* in-between.

2. The lifecycle of a shared repository instance is that of the process.

3. The lifecycle of a private repository instance is entirely up to the
creator of that instance.

4. The lifecycle of a ModuleDefinition/Module is at most that of the
enclosing Repository instance, and at least is bounded by
install/uninstall (no finer granularity).

This seems like a clean, simple model: private Modules via private
Repositories.

And a private repository can work for the Applet, IDE plugin or EE
application cases just fine.
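
The four rules above could be sketched roughly as follows.
SketchRepository is an invented class, not the draft Repository API, and
the "owner" token is only my assumption about how a creator might be
identified. A repository is either shared or private, only the creator
can end a private one, and definitions stop being available when the
enclosing repository ends.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not the draft Repository API.
class SketchRepository {
    private final Object owner;              // null means shared (rule 1)
    private final Set<String> installed = new HashSet<>();
    private boolean open = true;

    private SketchRepository(Object owner) {
        this.owner = owner;
    }

    static SketchRepository shared() {
        return new SketchRepository(null);
    }

    static SketchRepository privateTo(Object owner) {
        if (owner == null) {
            throw new IllegalArgumentException("a private repository needs an owner");
        }
        return new SketchRepository(owner);
    }

    synchronized void install(String definition) {
        requireOpen();
        installed.add(definition);
    }

    synchronized void uninstall(String definition) {
        requireOpen();
        installed.remove(definition);
    }

    // Rule 4: a definition lives no longer than its repository.
    synchronized boolean isAvailable(String definition) {
        return open && installed.contains(definition);
    }

    // Rules 2 and 3: shared repositories live as long as the process;
    // private ones end only when their creator says so.
    synchronized void shutdown(Object caller) {
        if (owner == null) {
            throw new IllegalStateException("shared repository cannot be shut down");
        }
        if (caller != owner) {
            throw new SecurityException("only the creator may shut down this repository");
        }
        open = false;
        installed.clear();
    }

    private void requireOpen() {
        if (!open) {
            throw new IllegalStateException("repository is shut down");
        }
    }
}
```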

It does leave open the issue of dependencies *within* a private
repository. The simple model would be to treat the entire repository as
atomic, with any change requiring a new Repository instance. This is
probably too simplistic, however.

In an EE app, web-modules are supposed to be isolated from each other
and from other parts of the app (ejb, connectors, etc.). So this
requires either a further partitioning of the app into multiple
repositories, or some form of access control.

Further, it is possible to re-start or re-deploy/re-start only a single
web-module, *without* restarting the rest of the app. The re-start case
could use releaseModule(), though a "real" stop method would be
preferable. But this is a very special case in which the specs
essentially dictate the possible dependencies between modules. In the
general case where the dependencies are not dictated, releaseModule() is
problematic.

The re-deploy case would use uninstall/install (but I would still like a
stop method!).

If you *really* want the releaseModule() functionality, I would suggest
that we introduce a PrivateRepository type, and support release *only*
on that type.
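
A minimal sketch of that suggestion, with invented names throughout
(BaseRepository and PrivateRepositorySketch are not draft types, and
plain Object stands in for Module): release exists only on the private
subtype, so code holding a general repository reference cannot call it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical base type: lookup only, no release.
class BaseRepository {
    protected final Map<String, Object> instances = new HashMap<>();

    // One instance per definition name, created on first use.
    synchronized Object getModule(String definition) {
        return instances.computeIfAbsent(definition, k -> new Object());
    }
}

// Hypothetical private type: the only place release is offered.
final class PrivateRepositorySketch extends BaseRepository {
    // Returns true if an instance was actually dropped.
    synchronized boolean releaseModule(String definition) {
        return instances.remove(definition) != null;
    }
}
```

After a release, the next getModule() call for the same definition yields
a fresh instance, which is only safe because the repository is private.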

// Bryan

>
> - Stanley
>