[External] : Re: Inconsistency with service loading by layer or by class loader

Tue Dec 17 14:21:27 UTC 2024

On Mon, Dec 16, 2024 at 4:40 PM Ron Pressler <ron.pressler at oracle.com>
wrote:

> > I'm asking for the OpenJDK team to instead consider a different design
> for service loading in the Java platform itself, in the absence of an
> alternative solution which can use what exists today. Unfortunately, the
> constraints around the JPMS are such that I do not believe there is another
> design that will generally work for us.
>
> I understand what solution you’re proposing, but as with any feature
> design, the severity of the problem and its root causes need to be
> understood before settling on a specific solution.
>
> The issue you’re facing arises when making a particular design choice: a
> container that loads every module in a separate layer and with multiple
> parents. We need to ask ourselves, how common is this design, why do people
> pick it (yes, you explained why you picked it, but we’ll need to study
> alternatives), and are we expecting the number of such use cases to grow or
> shrink over time?
>

Of course, such container environments are rare. The
`ModuleLayer.Controller` API itself exists at all because of what could
easily be described as a very small/narrow use case. So I would not expect
a large number of users of the API.

> > I do believe that there are possible workarounds which belie the conceit
> that the encapsulation of the module system is as strong as you present it;
> the existence of `ModuleLayer.Controller` itself shows that it is nowhere
> near absolute. This level of encapsulation is much more superficial than,
> say, member access control, which is to say that it can be worked around in
> various ways without "breaking the rules" (for example, by having a class
> loader define generated helper classes in every package of a module, I can
> gain access to their original Lookup objects, breaking *any* form of modular
> encapsulation without breaking any of the rules of the platform
> specification). If you would recoil at such a thing, consider again the
> difference between "intent" and "specification".
>
> I would be interested to hear more about this, because our assumption is
> that the encapsulation — once integrity by default is done, i.e. there is
> no hidden use of Unsafe or JNI etc. — is, in fact, absolute (modulo bugs),
> to the point that finally the runtime will be able to actually trust Java’s
> invariants. Both the security of the platform and the correctness of the
> compilers depend on it, so if you think we’re wrong — this is important.
> You mention ModuleLayer.Controller, but I don’t see how it undermines
> anything (in particular notice how the method that enables a capability
> with potentially global impact — native access — is caller-sensitive). As
> an example, how would a library use it to mutate a string (something that was
> trivial before strong encapsulation and the JNI restriction)?
>

You're confusing the idea of modular encapsulation with the idea of
platform integrity. I'm talking specifically about the encapsulation of
modules that are loaded by a custom ClassLoader. If the user defines their
own version of a string class which purports to be immutable, and that
class is defined within my ClassLoader, then yes, I can mutate it if I want
to. There are many ways to do this.

When I load a module in a class loader, I can basically do whatever I want
with it, if I'm willing to do some tedious (and possibly
performance-degrading) work. All I'm proposing is to make the work of doing
*certain* things less tedious and less performance-degrading.

For example, `Module` has an `addReads` method. I could, with a custom
ClassLoader that loads some module, define an extra class within the module
which lets me call that method from my ClassLoader. This work is tedious
and has a performance impact. Thankfully, `ModuleLayer.Controller` lets me
bypass having to do this, recognizing that I would already have the ability
to bypass encapsulation in this way.

However, to give a counter-example, `Module` has `addUses` as well. But if
I want to call `addUses` on behalf of a module I've defined, there is no
corresponding `ModuleLayer.Controller` method so I do have to define the
extra class, and that's a bit silly. This is an example of an easy
enhancement that would not affect the integrity of the platform, would not
significantly increase maintenance burden (since the logic would be nearly
identical to its sibling methods), and would be easy to achieve.
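
To make the asymmetry concrete, the sketch below (hypothetical class and method names) shows that `Module.addUses` is caller-sensitive: invoked from outside a named module it throws `IllegalCallerException`. It then lists whichever public methods `ModuleLayer.Controller` offers on the running JDK, so the reader can see for themselves whether an `addUses` counterpart is present. It does not attempt the helper-class workaround itself.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

final class AddUsesSketch {

    // Tries to call addUses on java.base from code that lives outside java.base.
    static boolean rejectedFromOutside() {
        Module javaBase = Object.class.getModule();
        try {
            javaBase.addUses(Runnable.class); // Runnable is just a placeholder service type
            return false;
        } catch (IllegalCallerException e) {
            return true; // only code inside the module itself may call this
        }
    }

    // Names of the public methods declared by ModuleLayer.Controller on this JDK.
    static Set<String> controllerMethodNames() {
        return Arrays.stream(ModuleLayer.Controller.class.getDeclaredMethods())
                .filter(m -> java.lang.reflect.Modifier.isPublic(m.getModifiers()))
                .map(m -> m.getName())
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        System.out.println("addUses rejected from outside: " + rejectedFromOutside());
        System.out.println("Controller methods: " + controllerMethodNames());
    }
}
```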

Likewise, for service loading I can rewrite every class that is defined in
the class loader to redirect `ServiceLoader.load()` calls through an API I
specify, which in turn generates a synthetic module and layer which
contains generated classes that can load services from whatever I choose,
all without breaking a single encapsulation rule. While this is useful to
illustrate a point, it would be much better if I could (for example) call
`addProvides` at run time to link a module to its available service
providers, or otherwise intercept the service loading process.
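
For readers less familiar with the two lookup styles this thread is about, here is a small sketch of `ServiceLoader.load(ModuleLayer, Class)` next to `ServiceLoader.load(Class, ClassLoader)`. The class name is hypothetical, `java.util.spi.ToolProvider` is used only because it is a service type that ships with the JDK, and which providers are found depends on the runtime image.

```java
import java.util.List;
import java.util.ServiceLoader;
import java.util.spi.ToolProvider;
import java.util.stream.Collectors;

final class ServiceLookupSketch {

    // Lookup by layer: providers in the given layer and its ancestors.
    static List<String> byLayer(ModuleLayer layer) {
        return ServiceLoader.load(layer, ToolProvider.class).stream()
                .map(p -> p.type().getName()) // provider type, without instantiating it
                .sorted()
                .collect(Collectors.toList());
    }

    // Lookup by class loader: providers visible through the loader and its delegation chain.
    static List<String> byLoader(ClassLoader loader) {
        return ServiceLoader.load(ToolProvider.class, loader).stream()
                .map(p -> p.type().getName())
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("by layer:  " + byLayer(ModuleLayer.boot()));
        System.out.println("by loader: " + byLoader(ClassLoader.getSystemClassLoader()));
    }
}
```

Neither entry point gives the layer's creator a hook to add providers after the fact, which is the gap the paragraph above is pointing at.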

> > I can tell you that several design choices of the JPMS had nothing to do
> with encapsulation principles and everything to do with the philosophy of
> those who created the constraints. There's nothing inherent in the system
> that makes it necessary to prevent circularity, *or* to eagerly load all
> modules in a layer, *or* to eagerly resolve services.
>
> True, but strong encapsulation is not the only benefit this feature
> provides and not every aspect of it serves strong encapsulation.
>

Of course.

> > there are real implications to these choices that mean that many kinds
> of Java application containers cannot be satisfactorily migrated to use
> JPMS modules. I hope you recognize that I am trying to change that, but
> that doing so may require some form of compromise from the platform itself.
>
> I presume you’re trying to use the feature because you want to enjoy the
> benefits it offers now and in the future.

From our perspective, we started off with an open system, and the system
has been made less and less open over time. Strictly speaking, the only
"feature" here is fancy stack traces. Everything else arises out of adding
restrictions. I'm not saying this to complain about modularity, but to
remind you that we are presently only weakly incentivized to use modules at
all, so dangling intangible "benefits" before me does not entice me much.
We get more "features" by not using it. I'm essentially trying to be a team
player here, but only out of a sense of cooperation.

> Obviously, we want all our features to be as easy to use to maximise the
> value users get from them, but we also need to balance the cost of
> additional complexity and the utility it adds to the ecosystem as a whole.
> So we can consider that, and you can consider if there really is no
> acceptable design for a container other than loading every module in its
> own layer and constructing a complex layer graph. Maybe it turns out that
> *both* of these things are true, i.e. it would be a good idea to change
> ServiceLoader and also your design.
>

Sure.

> > Yes. Quarkus currently does not use JPMS modules for the user application
> at build time or in any deployment mode, both of which instead rely on an
> arrangement of specialized class loaders, with all classes ending up in
> flat unnamed modules. My current experiments revolve around trying to
> create more modular-oriented and encapsulated packaging options.
>
> So you want Quarkus to also load one module per layer?
>

I'm not sure yet. But I think so.

> To be more constructive, one idea that was floated in this discussion is to
> use a small number of layers. Perhaps one for the container and one for
> each user application.
>
> You claimed that doing so would be inefficient (and that that is the
> reason, or one reason, why you opted for the layer-per-module design).
> Thing is that we do need to support many modules per layer efficiently, and
> since there’s currently much work focused on improving startup time,
> knowing if there’s a problem there would be very useful.
>
> So if there’s a problem there, we’ll need to address it anyway, and if we
> do, then wouldn’t it also address yours? This way there is no addition of
> spec complexity, and many more use cases can benefit.
>

Inefficiency is just one reason that I believe that a small number of
large, flat layers won't work for us, but we can focus on it. I don't
really see any good way around it though. It's tied directly to the problem
of having to eagerly resolve all modules in a graph, which as I said
before, I believe is a very clear design error. I'll explain once again.

Imagine a layer of 1000 modules, packaged in JAR files. To load a single
class from the layer, the layer must be defined. To do this, the graph must
be resolved, and to accomplish that, the descriptors of every module must
be created. To get the descriptors, each of the 1000 JAR files has to be
opened and the bytes of the descriptors must be read from each one. Because
of eager resolution, the load time of a module layer will always scale in
linear proportion to the potential number of modules in the layer, no
matter how optimized each step in the process is made to be. The only
possible optimization strategies involve paring down the root set -
something which requires ahead-of-time analysis which again would not be
necessary if modules were loaded and linked lazily, like classes are.
Loading modules on demand would solve this performance issue fairly
decisively; it would also not forbid ahead-of-time assembly of a module
graph if that is what the user wants.
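
The scaling argument can be sketched with a synthetic `ModuleFinder` that counts how many descriptors it has to produce before `Configuration.resolve` returns. Everything here is hypothetical: the `demo.mN` module names, the class name, and the counter, which stands in for opening a JAR and parsing its `module-info.class`. When every module is reachable from the root, all descriptors are needed up front; when the root set is pared down to one independent module, only one is.

```java
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReader;
import java.lang.module.ModuleReference;
import java.net.URI;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

final class EagerResolutionSketch {

    // Parses N out of "demo.mN"; returns -1 for any other name (e.g. java.base).
    static int indexOf(String name) {
        String prefix = "demo.m";
        if (!name.startsWith(prefix)) return -1;
        try {
            return Integer.parseInt(name.substring(prefix.length()));
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    static ModuleReference reference(ModuleDescriptor d) {
        return new ModuleReference(d, URI.create("module:/" + d.name())) {
            @Override
            public ModuleReader open() {
                throw new UnsupportedOperationException("no content in this sketch");
            }
        };
    }

    // Resolves root "demo.m0" out of n synthetic modules (n >= 1) and reports
    // how many descriptors had to be built before resolution completed.
    static int descriptorsBuilt(int n, boolean chained) {
        AtomicInteger built = new AtomicInteger();
        Map<String, ModuleReference> cache = new HashMap<>();
        ModuleFinder finder = new ModuleFinder() {
            @Override
            public Optional<ModuleReference> find(String name) {
                int i = indexOf(name);
                if (i < 0 || i >= n) return Optional.empty();
                return Optional.of(cache.computeIfAbsent(name, k -> {
                    built.incrementAndGet(); // stand-in for reading one JAR's descriptor
                    ModuleDescriptor.Builder b = ModuleDescriptor.newModule(k);
                    if (chained && i + 1 < n) b.requires("demo.m" + (i + 1));
                    return reference(b.build());
                }));
            }

            @Override
            public Set<ModuleReference> findAll() {
                Set<ModuleReference> all = new HashSet<>();
                for (int j = 0; j < n; j++) find("demo.m" + j).ifPresent(all::add);
                return all;
            }
        };
        ModuleLayer.boot().configuration()
                .resolve(finder, ModuleFinder.of(), Set.of("demo.m0"));
        return built.get();
    }

    public static void main(String[] args) {
        System.out.println("chained:     " + descriptorsBuilt(100, true));
        System.out.println("independent: " + descriptorsBuilt(100, false));
    }
}
```

This only counts descriptor lookups during resolution; it says nothing about the cost of defining the layer afterwards, and it is not a measurement of any real container.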

However, based on the history of the JPMS, I don't see this ever changing
(unfortunately) but you did ask.
--
- DML • he/him