Preload attribute

Sat Jun 10 01:38:48 UTC 2023

On 9 Jun 2023, at 12:41, Dan Heidinga wrote:

> On Thu, Jun 8, 2023 at 4:51 PM John Rose <john.r.rose at oracle.com> 
> wrote:
>
>> On 8 Jun 2023, at 9:52, Dan Heidinga wrote:
>>
>> On Thu, Jun 8, 2023 at 12:44 PM John Rose <john.r.rose at oracle.com> 
>> wrote:
>>
>> On 8 Jun 2023, at 9:01, Dan Heidinga wrote:
>>
>> If we decouple the list of preloadable classes from the classfile, how
>> would non-jdk classes be handled?
>>
>> What if instead of ditching the attribute, or treating it like an
>> optimization, we firmed up the contract and treated it as a guarantee…
>>
>> If we go down this route, let’s consider putting the control 
>> information
>> into a module file (only) for starters. (Maybe class file later if
>> needed.) There would be fewer states to document and test, since (by
>> definition) class files could not get out of sync.
>>
>> A module would document, in one place, which types it would 
>> “prefer” to
>> preload in order to optimize its APIs (internal or external).
>>
>> This might lead to more class loading than intended. The current 
>> approach
>> has each classfile register the list of classes it wants preloaded to 
>> get
>> the best linkage which means we only have to load those classes if we 
>> link
>> the original class. There's a natural trigger for the preload and a
>> limited set of classes to load.
>>
>> There’s a spectrum of tradeoffs here: We could put preload 
>> attributes on
>> every method and field, to get the maximum amount of fine-grained 
>> lazy
>> (pre-)loading, or put them in a global file per JVM instance. The 
>> more
>> fine-grained, the harder it will be to write compliance testing, I 
>> think.
>>
>
> Agreed.  There's a sweet spot between expressiveness and overheads
> (testing, metadata, etc).  Classfiles have historically been the place
> where the JVM tracks this kind of information as that fits well with
> separate compilation and avoids the "external metadata" problems of,
> e.g., GraalVM's extra-linguistic configuration files.
>
> When compiling the current class, javac already requires directly
> referenced classes to be findable and thus has the info required to 
> write a
> preload attribute.  Does javac necessarily have the same info when
> compiling the module-info classfile?  Maybe when finding the 
> non-exported
> packages for the module javac (or jlink? or jmod?) could also find the
> value classes that need preloading?

That is what I am assuming.  The module file would be edited by those 
guys.  Or (maybe better) a plain flat textual list is put somewhere the 
JVM can find it.

>
> Moving it into a separate pass like this doesn't feel like quite the 
> right
> fit though as it excludes the classpath and complicates the other 
> tools
> processing of the modules.

I think it’s better than that.  When we are assembling a program 
(jlink or a Leyden condenser), the responsibility of publicizing value 
classes (for Preload) surely belongs to the declaration, not 
collectively on all the uses.

So every module (jmod or whatever) that declares 1 or more value classes 
(if they are exported, at least) should list them on a publicized watch 
list.

There is no need to replicate these watch lists across all potential API 
clients of a value class.  There are reasons *not* to do this, since the 
clients have only partial, provisional information about the values.
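
As a rough illustration of the difference between declaration-side and client-side watch lists, here is a small Java model. Everything in it is hypothetical: the class name WatchListPlacement, the module name "geometry", and the example class names are made up for the sketch, and nothing here corresponds to an actual JVM or javac data structure. It only counts how many entries get recorded under each placement.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model comparing where watch-list entries are recorded. */
final class WatchListPlacement {

    /** Entries stored when every client class file repeats the names it uses. */
    static int entriesOnClients(Map<String, Set<String>> valueClassesUsedByClient) {
        int total = 0;
        for (Set<String> used : valueClassesUsedByClient.values()) {
            total += used.size();
        }
        return total;
    }

    /** Watch list formed when each declaring module publishes its own value classes once. */
    static Set<String> publishedByDeclarations(Map<String, Set<String>> valueClassesDeclaredByModule) {
        Set<String> watchList = new TreeSet<>();
        for (Set<String> declared : valueClassesDeclaredByModule.values()) {
            watchList.addAll(declared);
        }
        return watchList;
    }

    public static void main(String[] args) {
        // Assumed example data: one module declaring two value classes, three clients using them.
        Map<String, Set<String>> declared = Map.of(
            "geometry", Set.of("geometry.Point", "geometry.Rect"));
        Map<String, Set<String>> used = Map.of(
            "app.Canvas", Set.of("geometry.Point", "geometry.Rect"),
            "app.Mouse", Set.of("geometry.Point"),
            "app.Layout", Set.of("geometry.Rect"));

        System.out.println("client-side entries: " + entriesOnClients(used));
        System.out.println("declaration-side entries: " + publishedByDeclarations(declared).size());
    }
}
```

With the assumed data, the client-side count grows with the number of API clients, while the declaration-side list stays at one entry per declared value class.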

>
>> Moving to a single per-module list loses the natural trigger and may
>> pre-load more classes than the application will use. If Module A has
>> classes {A, B, C} and each one preloads 5 separate classes, with a
>> per-module list that's forcing the loading of 15 additional classes 
>> (plus
>> supers, etc). With a per-class list, we only preload the classes on a
>> per-use basis. More of a pay for what you use model.
>>
>> Is there a natural trigger or way to limit the preloads to what I might
>> use with the per-module file?
>>
>> That’s a very good question. I think what Preload *really is* is a 
>> list
>> of “names that may require special handling before using in 
>> APIs”. They
>> don’t need to be loaded when the preload attribute is parsed; they 
>> are
>> simply put in a “watch list” to trigger additional loading *when
>> necessary*. (This is already true.) So I think if we move the preload
>> list to (say) the module level (if not a global file), then the JVM 
>> will
>> have its watch list. (And, in fewer chunks than if we put all the 
>> stuff all
>> the time redundantly in all class files that might need them: That 
>> requires
>> frequent repetition.) The JVM can use its watch list as it does 
>> today, with
>> watch lists populated separately for each class file.
>>
> I initially thought a global list would lead to issues if two 
> different
> classloaders defined classes of the same name but since this is a "go 
> and
> look" signal, early loading based on name should be fine even in that 
> case
> as each loader that mentions the name would be asked to load
> their version of the named class.  So I think a per-JVM list would be 
> OK
> from that perspective (though I still don't like it).

Agreed.

>
>
>> To emphasize: A watch list does not require loading. It means, “if 
>> you see
>> this name at a point where you could use extra class info, then I 
>> encourage
>> you to load sooner rather than later”. The only reason it is “a 
>> thing” at
>> all is that the default behavior (of loading either as late as 
>> possible, or
>> as part of a CDS-like thingy) should be changed only on an explicit 
>> signal.
>>
> While true for what the JVM needs, this is hard behaviour to explain 
> to
> users and challenging for compliance test writers (or maybe not if we
> continue to treat preload as an optimization).

I’m trying to reduce this to a pure optimization.  In that case, 
“watch lists” are just helpers, which are allowed to fail, and 
allowed to be garbage.
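
To make the "pure optimization" reading concrete, here is a minimal Java sketch of a watch list that is only a hint. The class WatchList and the method maybeLoadEarly are hypothetical names invented for this illustration; the real logic would live inside the JVM, not in library code. The sketch assumes that registering a name loads nothing, that the list is consulted only at a point where extra class info would help, and that a bad or unloadable entry is silently ignored.

```java
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch: a watch list treated as a pure hint, never a requirement. */
final class WatchList {
    private final Set<String> names;

    /** Registering names performs no class loading at all. */
    WatchList(Set<String> names) {
        this.names = Set.copyOf(names);
    }

    /**
     * Consulted at a point where extra class info could help.
     * Returns the class if the name is watched and loading succeeds;
     * otherwise returns empty and the caller keeps its default lazy behavior.
     */
    Optional<Class<?>> maybeLoadEarly(String name, ClassLoader loader) {
        if (!names.contains(name)) {
            return Optional.empty(); // not watched: no early load
        }
        try {
            // initialize=false: load the class without running static initializers
            return Optional.of(Class.forName(name, false, loader));
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty(); // garbage entry or failed load: ignore the hint
        }
    }

    public static void main(String[] args) {
        WatchList watchList = new WatchList(Set.of("java.lang.String", "no.such.ValueClass"));
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        System.out.println(watchList.maybeLoadEarly("java.lang.String", loader).isPresent());
        System.out.println(watchList.maybeLoadEarly("no.such.ValueClass", loader).isPresent());
        System.out.println(watchList.maybeLoadEarly("java.lang.Integer", loader).isPresent());
    }
}
```

Because the loader is a parameter, the same name can be looked up through whichever loader mentions it, and a failed lookup changes nothing observable beyond the missed optimization.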

> Is this where we want to
> spend our complexity budget?

(No, hence it should be an optimization.)

> Part of why I'm circling back to treating
> preload as a per-classfile attribute that forms a requirement on the 
> VM
> rather than as an optimization is that the model becomes clearer for 
> users,
> developers and testers.

I think it’s still going to be murky.  Why is putting the watch list 
on the API clients better than putting it on (or near) the value class 
definitions?

>
>
>> And, hey, maybe CDS is all the primitive we need here: Just run 
>> -Xdump
>> with all of your class path loaded. Et voila, no Preload at all.
>>
>  Users may find this behaviour surprising - I ran with a CDS archive 
> and my
> JVM loaded classes earlier than it would have otherwise?

CDS has the effect of making class loading happen in a more timely fashion, and 
(under Leyden) will almost certainly trigger reordering of loading as 
well.  So promulgating a “watch list” has goals which align with 
CDS.

I’m starting to think that the right “lever” to pull for 
optimizing value-based APIs is to put the value classes in a CDS 
archive.  That is a de facto watch list.  The jlink guy should just make 
a table of all value classes.  That’s the best form of Preload I can 
imagine, frankly.
