<div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Sat, Dec 14, 2024 at 1:57 AM Alan Bateman <<a href="mailto:alan.bateman@oracle.com">alan.bateman@oracle.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><u></u>


  <div>

    On 14/12/2024 00:02, David Lloyd wrote:<br>

    <blockquote type="cite">

      
      <div dir="ltr">

        <div dir="ltr">

          <div style="font-family:arial,helvetica,sans-serif">:<br>

          </div>

        </div>

        <div class="gmail_quote">

          <div><br>

          </div>

          <div style="font-family:arial,helvetica,sans-serif">The problem

            with having one (or a few) broad layer for all named modules

            is twofold: first, that every module that *might* be needed

            for an application must be found and loaded before *any*

            module is able to be loaded. This works in some simpler

            packaging scenarios but is too startup-heavy in cases with

            very large numbers of modules. The "--add-modules" switch on

            the Java launcher is a direct result of forbidding late

            binding of modules. From a usability perspective, this is

            already far from ideal, and if you start talking about

            hundreds or thousands of modules, it becomes completely

            unworkable. The second problem is that it makes it very

            difficult to support any kind of dynamicity (for example

            adding additional plugins/service implementations at run

            time) since an outsized amount of static analysis must be

            done to categorize the layers, whereas lazily loading layers

            solves the problem easily and elegantly.</div>

        </div>

        <br>

      </div>

    </blockquote>

    If the real issue here is "too startup-heavy" then that might be

    something to focus on.<br>

    <br>

    The --add-modules command line option serves many cases where

    additional modules may be needed. The intention with `--add-modules

    ALL-DEFAULT` was to help container like applications that in turn

    load other applications at run-time. The container can of course

    create a module layer before creating layers for applications and

    the only real challenge there is the JDK "platform modules", cue the

    requirement for "dynamic augmentation of platform modules". We only

    got so far on this topic, but enough for needs such as allow the JMX

    agent or a Java agent be loaded into a running VM when the modules

    required to support that are not in the boot layer.<br>

    <br>

    Module layers works well for plugins and services and are of course

    created on-demand. I don't think I understand what you mean by

    "outsized amount of static analysis must be done to categorize the

    layers", is this a reference to your exploration into multi-parent

    configurations and one-module-per-layer?<br></div></blockquote><div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">The reason to lazily load and link modules is basically analogous to the reason that we lazily load and link classes. In fact there is very little difference. While requiring all classes to be preloaded may have some benefits in certain circumstances (take GraalVM for example), it also significantly restricts flexibility in a few important ways, and of course has a heavy performance cost, and so is not generally considered to be a good strategy for application runtimes or even the Java launcher itself.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">For modules, we face the same issue. I can, for example, create a single application layer with a thousand modules in it. Before the application can start, every JAR has to be opened and every module has to be loaded, which entails parsing or dynamically creating descriptors (generally a combination of these things), each with dozens of support objects for things like dependencies, exports, and services, and then their internal graphs have to be resolved and wired and checked for consistency. So there is a significant performance cost there. But, that is not the only problem. A user application is often a combination of many modules in addition to our own basic modules, plus a significant amount of generated code. The resultant module graph might even have internal consistency problems (for example, having multiple versions of a module, or cyclical dependencies) that can't realistically be resolved socially because many of these modules are going to be from third parties that we may or may not have any control over, and might even be dependencies brought in by the user which we know nothing about. 
These graphs could be the result of very complex resolution of Maven artifacts, for example.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Now I could be clever and break this application up into different layers that I can lazily resolve as a unit. As designed, layers are strictly hierarchical (i.e. non-cyclical). So I need a way to analyze my thousand JARs and find islands of interdependence that can work as a layer (which is to say, that do not need to be encapsulated from "lower" layers or can work as independent layers), for example by analyzing the service graphs of every module. This is not an easy problem. Sometimes an application is logically layered, but usually it is not. Most often, the layers themselves end up being quite large anyway once all the interdependencies are worked out.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Layering (or more generally, arranging program modules into a DAG) has always been, in my opinion, a poor way to model a complex program; it creates an arbitrary constraint that never quite pays off in terms of performance, usability, or maintainability (one symptom of the flaws in this design is that services found via multiple routes to a single parent are reported multiple times; another is the parent-first/child-first dichotomy which has subtle implications). We moved from layered class loading to parentless, arbitrary class loader graphs for the JBoss application server about 15 years ago for this reason (one class loader per (pre-JDK9) module), and this move essentially saw the end of "JAR hell" for us and brought many benefits (such as a massive reduction in maintenance costs, better startup time, and much stronger encapsulation) besides. 
The improvement was so dramatic, in fact, that it is now my belief that the parent delegation model was essentially a design error. So from my view, the question isn't "why can you not conform to the design of the module system" but rather "why doesn't the module system design conform to the requirements of those who seek to use it". The only way to bring our desired model to the JPMS appears to be by using a layer per module (and a corresponding class loader per module). This way, we don't have to do any analysis of the module graph beyond a single module, and only on demand, which makes things much simpler and faster for us at startup and as the application runs. The application can start up quickly regardless of whether there are only a handful of modules or many thousands of them. Many inconsistencies that don't affect the integrity of the application can be handled gracefully.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">The service loader problem is the first, and so far only, insurmountable roadblock for me in pursuing this model, because with separate layers per module, no module can find services from any other module under any circumstances (unless they use a different service loading API that we provide, or use some other provided API to find the layers to search, which as I said before isn't likely to ever happen).</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">One workaround I was playing with was defining a special kind of "reverse dependency" which causes the layers of all modules containing implementations of a service to be parents of the module layers that require it. 
Obviously this is fragile in the face of cycles though, and service requirements may commonly be cyclical, so it's not really a viable solution for us.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Another workaround is to put anything which provides a service in an unnamed module. This actually works fairly well, because then my class loader implementation can reroute services just like we did in the old days. In this case though, the service provider won't be able to use the `provider` method mechanism for service loading. But, any library which expects to function correctly as an unnamed module (i.e. most of them) generally cannot take advantage of this mechanism anyway in most circumstances. We also lose some encapsulation features, and finding the name of a class's module is a bit trickier. We also lose some nice stack trace stuff, but I work around this by naming the class loader with the module name and version, but wrapped in brackets `[` `]` so that we know at a glance that the module is unnamed.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">My current experiment is to instrument all loaded code and intercept all attempts to load services, and mediate these requests through our own API. I had a problem with the classfile API that held me up last week but I expect that I'll have some results from this next week. 
That said, it's not a realistic long-term solution, because instrumenting classes at run time with the classfile API has too high a cost.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">What I would really like to see is a way for us to gain control over service loading when using custom layers or class loaders. One idea would be for class loaders to gain an API which is used by the service loader to find all services for a given service type, e.g. `protected List<String> findServiceProviders(Class<?> serviceType)`. The default implementation might find all `META-INF/services/*` resources and read the descriptors of the modules in the layer, for example. But custom run-time systems would be able to find services based on other kinds of dependencies that are not known to the JPMS.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div></div><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr">- DML • he/him<br></div></div></div>
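<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">To make the class loader hook idea above a bit more concrete, here is a minimal sketch of what a default implementation might look like. Everything in it is an assumption for illustration: `java.lang.ClassLoader` has no `findServiceProviders` method today, and `ServiceHookSketch`, `HookedLoader`, and `parseProviderLine` are hypothetical names. It covers only the `META-INF/services` half of the idea (not module descriptors), and nothing in the JDK's `ServiceLoader` would actually call it.</div>

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

final class ServiceHookSketch {

    /** Strips a '#' comment and surrounding whitespace from one provider-configuration line. */
    static String parseProviderLine(String line) {
        int hash = line.indexOf('#');
        return (hash >= 0 ? line.substring(0, hash) : line).trim();
    }

    /** A loader carrying the HYPOTHETICAL hook; ClassLoader declares no such method today. */
    static class HookedLoader extends ClassLoader {
        HookedLoader(ClassLoader parent) {
            // Bracketed name, following the naming convention for unnamed modules described above.
            super("[sketch]", parent);
        }

        /** Returns provider class names listed under META-INF/services for the given service type. */
        protected List<String> findServiceProviders(Class<?> serviceType) {
            List<String> names = new ArrayList<>();
            try {
                Enumeration<URL> urls = getResources("META-INF/services/" + serviceType.getName());
                while (urls.hasMoreElements()) {
                    try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                            urls.nextElement().openStream(), StandardCharsets.UTF_8))) {
                        String line;
                        while ((line = reader.readLine()) != null) {
                            String name = parseProviderLine(line);
                            // Skip blank/comment-only lines and names already seen.
                            if (!name.isEmpty() && !names.contains(name)) {
                                names.add(name);
                            }
                        }
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return names;
        }
    }

    public static void main(String[] args) {
        HookedLoader loader = new HookedLoader(ServiceHookSketch.class.getClassLoader());
        // Prints whatever provider names are visible through this loader for the service type.
        System.out.println(loader.findServiceProviders(Runnable.class));
    }
}
```

<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">A custom run time would override the hook to consult its own dependency graph instead of (or in addition to) these resources; returning names rather than instances keeps instantiation under the service loader's control.</div>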