Preload attribute
John Rose
john.r.rose at oracle.com
Sat Jun 10 01:38:48 UTC 2023
On 9 Jun 2023, at 12:41, Dan Heidinga wrote:
> On Thu, Jun 8, 2023 at 4:51 PM John Rose <john.r.rose at oracle.com>
> wrote:
>
>> On 8 Jun 2023, at 9:52, Dan Heidinga wrote:
>>
>> On Thu, Jun 8, 2023 at 12:44 PM John Rose <john.r.rose at oracle.com>
>> wrote:
>>
>> On 8 Jun 2023, at 9:01, Dan Heidinga wrote:
>>
>> If we decouple the list of preloadable classes from the classfile,
>> how would non-jdk classes be handled?
>>
>> What if instead of ditching the attribute, or treating it like an
>> optimization, we firmed up the contract and treated it as a
>> guarantee…
>>
>> If we go down this route, let’s consider putting the control
>> information
>> into a module file (only) for starters. (Maybe class file later if
>> needed.) There would be fewer states to document and test, since (by
>> definition) class files could not get out of sync.
>>
>> A module would document, in one place, which types it would
>> “prefer” to
>> preload in order to optimize its APIs (internal or external).
>>
>> This might lead to more class loading than intended. The current
>> approach
>> has each classfile register the list of classes it wants preloaded to
>> get
>> the best linkage which means we only have to load those classes if we
>> link
>> the original class. There's a natural trigger for the preload and a
>> limited set of classes to load.
>>
>> There’s a spectrum of tradeoffs here: We could put preload
>> attributes on
>> every method and field, to get the maximum amount of fine-grained
>> lazy
>> (pre-)loading, or put them in a global file per JVM instance. The
>> more
>> fine-grained, the harder it will be to write compliance testing, I
>> think.
>>
>
> Agreed. There's a sweet spot between expressiveness and overheads
> (testing, metadata, etc). Classfiles have historically been the place
> where the JVM tracks this kind of information as that fits well with
> separate compilation and avoids the "external metadata" problems of
> e.g., GraalVM's extra-linguistic configuration files.
>
> When compiling the current class, javac already requires directly
> referenced classes to be findable and thus has the info required to
> write a
> preload attribute. Does javac necessarily have the same info when
> compiling the module-info classfile? Maybe when finding the
> non-exported
> packages for the module javac (or jlink? or jmod?) could also find the
> value classes that need preloading?
That is what I am assuming. The module file would be edited by those
guys. Or (maybe better) a plain flat textual list is put somewhere the
JVM can find it.
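
To make the "plain flat textual list" option concrete, here is a minimal sketch of what reading such a list might look like. The file format (one binary class name per line, `#` for comments), the class name `FlatWatchListParser`, and the example class names are all assumptions for illustration; nothing here is a proposed or existing JVM format.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: parses an assumed one-name-per-line watch list.
final class FlatWatchListParser {

    // Returns the distinct names in file order, skipping blanks and '#' comments.
    static Set<String> parse(String text) {
        Set<String> names = new LinkedHashSet<>();
        for (String line : text.split("\\R")) {
            String name = line.strip();
            if (name.isEmpty() || name.startsWith("#")) {
                continue; // not a class name, ignore
            }
            names.add(name);
        }
        return names;
    }

    public static void main(String[] args) {
        // Made-up class names, including a duplicate entry.
        String text = "# value classes\ngeom.Point\n\n  money.Amount  \ngeom.Point\n";
        System.out.println(parse(text));
    }
}
```

Duplicates collapse into one entry, which fits the idea that the list is only a set of names to watch for.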
>
> Moving it into a separate pass like this doesn't feel like quite the
> right
> fit though as it excludes the classpath and complicates the other
> tools
> processing of the modules.
I think it’s better than that. When we are assembling a program
(jlink or a Leyden condenser), the responsibility of publicizing value
classes (for Preload) surely belongs to the declaration, not
collectively on all the uses.
So every module (jmod or whatever) that declares 1 or more value classes
(if they are exported, at least) should list them on a publicized watch
list.
There is no need to replicate these watch lists across all potential API
clients of a value class. There are reasons *not* to do this, since the
clients have only partial, provisional information about the values.
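
A rough sketch of the difference between declaration-side and client-side lists, assuming a link-time tool that can see every module. The class `DeclarationSideWatchLists` and all module and class names are hypothetical, chosen only to show that declaring modules publish each name once while client-side lists repeat it per user.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model; not an existing jlink or JVM API.
final class DeclarationSideWatchLists {

    // Each declaring module publishes the value classes it declares;
    // the assembled watch list is the union of those lists.
    static Set<String> assemble(Map<String, List<String>> declaredByModule) {
        Set<String> watchList = new TreeSet<>();
        for (List<String> declared : declaredByModule.values()) {
            watchList.addAll(declared);
        }
        return watchList;
    }

    // For contrast: the total number of entries when every client class
    // repeats the names of the value classes it happens to use.
    static int replicatedEntries(Map<String, List<String>> usedByClient) {
        int total = 0;
        for (List<String> used : usedByClient.values()) {
            total += used.size();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, List<String>> declared = Map.of(
                "m.geom", List.of("geom.Point", "geom.Rect"),
                "m.money", List.of("money.Amount"));
        Map<String, List<String>> used = Map.of(
                "app.A", List.of("geom.Point", "money.Amount"),
                "app.B", List.of("geom.Point"),
                "app.C", List.of("geom.Point", "geom.Rect"));
        System.out.println("declaration-side entries: " + assemble(declared).size());
        System.out.println("client-side entries: " + replicatedEntries(used));
    }
}
```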
>
>> Moving to a single per-module list loses the natural trigger and may
>> pre-load more classes than the application will use. If Module A has
>> classes {A, B, C} and each one preloads 5 separate classes, with a
>> per-module list that's forcing the loading of 15 additional classes
>> (plus
>> supers, etc). With a per-class list, we only preload the classes on a
>> per-use basis. More of a pay for what you use model.
>>
>> Is there a natural trigger or way to limit the preloads to what I
>> might
>> use
>> with the per-module file?
>>
>> That’s a very good question. I think what Preload *really is* is a
>> list
>> of “names that may require special handling before using in
>> APIs”. They
>> don’t need to be loaded when the preload attribute is parsed; they
>> are
>> simply put in a “watch list” to trigger additional loading *when
>> necessary*. (This is already true.) So I think if we move the preload
>> list to (say) the module level (if not a global file), then the JVM
>> will
>> have its watch list. (And, in fewer chunks than if we put all the
>> stuff all
>> the time redundantly in all class files that might need them: That
>> requires
>> frequent repetition.) The JVM can use its watch list as it does
>> today, with
>> watch lists populated separately for each class file.
>>
> I initially thought a global list would lead to issues if two
> different
> classloaders defined classes of the same name but since this is a "go
> and
> look" signal, early loading based on name should be fine even in that
> case
> as each loader that mentions the name would be asked to
> load
> their version of the named class. So I think a per-JVM list would be
> OK
> from that perspective (though I still don't like it).
Agreed.
>
>
>> To emphasize: A watch list does not require loading. It means, “if
>> you see
>> this name at a point where you could use extra class info, then I
>> encourage
>> you to load sooner rather than later”. The only reason it is “a
>> thing” at
>> all is that the default behavior (of loading either as late as
>> possible, or
>> as part of a CDS-like thingy) should be changed only on an explicit
>> signal.
>>
> While true for what the JVM needs, this is hard behaviour to explain
> to
> users and challenging for compliance test writers (or maybe not if we
> continue to treat preload as an optimization).
I’m trying to reduce this to a pure optimization. In that case,
“watch lists” are just helpers, which are allowed to fail, and
allowed to be garbage.
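
As an illustration of a watch list that is a pure optimization, the sketch below models the lookup in ordinary Java. `WatchListSketch` and `loadEarlyIfWatched` are hypothetical names, not JVM internals; the only real API used is `Class.forName(String, boolean, ClassLoader)`. A name on the list loads nothing by itself, an unlisted name keeps the default lazy behaviour, and a garbage entry simply fails quietly.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch; WatchListSketch is not a JVM or JDK API.
final class WatchListSketch {
    private final Set<String> watchedNames;

    WatchListSketch(Set<String> watchedNames) {
        this.watchedNames = Set.copyOf(watchedNames);
    }

    // Called at a point where extra class info could help. The name is
    // resolved by whichever loader mentions it, without initializing the class.
    Optional<Class<?>> loadEarlyIfWatched(String binaryName, ClassLoader loader) {
        if (!watchedNames.contains(binaryName)) {
            return Optional.empty(); // not watched: stay lazy
        }
        try {
            return Optional.of(Class.forName(binaryName, false, loader));
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty(); // garbage entry: the hint is allowed to fail
        }
    }

    public static void main(String[] args) {
        WatchListSketch watchList =
                new WatchListSketch(Set.of("java.lang.String", "no.such.Garbage"));
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        System.out.println(watchList.loadEarlyIfWatched("java.lang.String", loader).isPresent());
        System.out.println(watchList.loadEarlyIfWatched("java.util.List", loader).isPresent());
        System.out.println(watchList.loadEarlyIfWatched("no.such.Garbage", loader).isPresent());
    }
}
```

Because failures only fall back to the default behaviour, a stale or wrong list changes when classes load, not whether the program is correct.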
> Is this where we want to
> spend our complexity budget?
(No, hence it should be an optimization.)
> Part of why I'm circling back to treating
> preload as a per-classfile attribute that forms a requirement on the
> VM
> rather than as an optimization is that the model becomes clearer for
> users,
> developers and testers.
I think it’s still going to be murky. Why is putting the watch list
on the API clients better than putting it on (or near) the value class
definitions?
>
>
>> And, hey, maybe CDS is all the primitive we need here: Just run
>> -Xdump
>> with all of your class path loaded. Et voila, no Preload at all.
>>
> Users may find this behaviour surprising - I ran with a CDS archive
> and my
> JVM loaded classes earlier than it would have otherwise?
CDS has the effect of making class loading happen in a more timely fashion, and
(under Leyden) will almost certainly trigger reordering of loading as
well. So promulgating a “watch list” has goals which align with
CDS.
I’m starting to think that the right “lever” to pull for
optimizing value-based APIs is to put the value classes in a CDS
archive. That is a de facto watch list. The jlink guy should just make
a table of all value classes. That’s the best form of Preload I can
imagine, frankly.