RFR(L) 8244778 Archive full module graph in CDS

Ioi Lam ioi.lam at oracle.com
Wed Aug 12 22:06:35 UTC 2020


Hi Lois,

Thanks for the comments. I have an updated webrev

http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v02/
http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v02.delta/

Here are the general notes on the changes. Replies to your questions are 
in-line below.

(1) Integrated updates in the Java code from Alan Bateman. Here are Alan's
     notes:

     The archive of the boot layer is as before, except that archiving is
     skipped if there are split packages or incubator modules. Incubating
     modules aren't resolved by default so we shouldn't have them in the
     boot layer by default anyway.

     I've dropped the module finders from the boot layer archive. I've
     left the IllegalAccessLogger.Builder in the acrhive for now (even
     though it is not the boot layer). We should be able to remove that
     once the JEP to disallow illegal access by default is in.

     Related is that I don't like the archive for the module graph
     (ArchivedModuleGraph, pre-dates this RFE) including the set of
     packages to export/open for illegal access as they aren't part
     of the module graph. I've left it for now but we can also remove
     that once we disallow illegal access by default (as those sets
     will be empty).

     The archive of built-in class loaders is now in one object
     jdk.internal.loader.ArchivedClassLoaders which I think will be a
     bit more maintainable. I've also drop the ucp field from the
     AppClassLoader as the changes to BuiltinClassLoader means is no
     longer needs to duplicated.

     There's one remaining issue in ModuleBootstrap.boot where it has fix
     an app class loader value (ModuleLayer.CLV). Ideally the initialization
     of the built-in class loaders would do this but we are kinda forced
     to separate the archiving of the built-in class loaders from the boot
     layer. I might look at this again some time.


(2) Moved code from classLoaderData.cpp -> classLoaderSharedData.cpp

(3) Reverted unnecessary changes in JavaClasses::compute_offset

(4) Minor clean up to use QuickSort::sort() instead of qsort()

(5) Moved the C-side of module initialization for platform/system
     loaders to inside java.lang.System::initPhase2(), so this happens
     at the same time as without full module archiving.

(6) Moved the use of Module_lock to so all modules in a class loader
     are restored atomically. See ArchivedClassLoaderData::restore()

     This fixed a bug where test/jdk/com/sun/jdi/ModulesTest.java
     would fail as it sees a partially restored set of modules.



On 8/7/20 12:06 PM, Lois Foltan wrote:
> Hi Ioi,
>
> Overall looks promising.  I have some review comments below, but not a 
> complete review.  I concentrated on the JVM side primarily how the 
> archived module graph is read in, how the ModuleEntry and PackageEntry 
> tables are created from the archive for the 3 builtin loaders and 
> potential timing issues during start up.
>
> On 7/22/2020 3:36 PM, Ioi Lam wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8244778
>> http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v01/ 
>>
>>
>> Please review this patch that stores the full module graph in the CDS
>> archive heap. This reduces the initialization time of the basic JVM by
>> about 22%:
>>
>> $ perf stat -r 100 bin/java -version
>> before: 98,219,329 instructions 0.03971 secs elapsed (+- 0.44%)
>> after:  55,835,815 instructions 0.03109 secs elapsed (+- 0.65%)
>>
>> [1] Start with ModuleBootstrap.java. The current implementation is
>>     quite restrictive: the archived module graph is used only when no
>>     module options are specified.
>>
>>     See ModuleBootstrap.mayUseArchivedBootLayer().
>>
>>     We can probably support options such as main module and module path
>>     in a future RFE.
>
> Yes, I noticed the restrictions.  The JBS issue discusses the 
> possibility of using the dumped module graph in the event that the 
> value of the options are the same.  I think for this implementation to 
> be viable and have a positive impact you should consider comparing at 
> least the options --add-modules, --add-exports and --add-reads.

I agree. I want to keep the changes minimal in this RFE, and will add 
more support for other use cases in follow-on RFEs. Instead of requiring 
these options to be unspecified, we can relax the restriction such that 
these options must be the same between archive dump time and run time.

>
>
>>
>> [2] In the current JDK implementation, there is no single object
>>     that represents "the module graph". Most of the information
>>     is stored in the archive bootLayer object, but a few additional
>>     restoration operations need to be performed:
>>
>>     + See ModuleBootstrap.getArchivedBootLayer()
>>     + Some static fields need to be archived/restored in
>>       Module.java, BuiltinClassLoader.java, ClassLoaders.java
>>       and BootLoader.java
>>
>> [3] I ran into a complication with two loader instances of
>>     PlatformClassLoader and AppClassLoader. They are stored in
>>     multiple tables inside the module graph (e.g.,
>>     BuiltinClassLoader$LoadedModule) so I cannot easily recreate
>>     them at runtime.
>>
>>     However, these two loaders contain information specific to the
>>     dump time VM lifecycle (such as the classes that were loaded
>>     during CDS dumping) that need to be scrubbed. I couldn't find an
>>     elegant way of doing this, so I added a private 
>> "resetArchivedStates"
>>     method to a few classes. They are called inside
>>     HeapShared::reset_archived_object_states().
>>
>> [4] Related native data structures (PackageEntry and ModuleEntry)
>>     are also archived. Start with classLoaderData.cpp
>>
>> Passes mach5 tiers 1-4. I will test with additional tiers.
>>
>> Thanks
>> - Ioi
>
> classfile/classLoader.cpp
> - line #1644 pulling the javabase_moduleEntry() check outside of the 
> Module_lock maybe problematic if after you check it another
>   thread initializes in the time between the check and the obtaining 
> of the Module_lock on line #1645.

Fixed.

>
> classfile/classLoader.hpp
> - somewhere in ArchivedClassLoaderData there should be an assert to 
> make sure that the CLD, whose package entries and module entries are 
> being archived is not a "_has_class_mirror_holder" CLD for a weakly 
> defined hidden class.  Those dedicated CLDs should never have package 
> entries or module entries.
>

I added ArchivedClassLoaderData::assert_valid()

> classfile/moduleEntry.cpp
> - line #400, typo "conver" --> "convert"
> - line #500, maybe sort if n is greater than 1 instead of 0 (same 
> comment for packageEntry.cpp line #270)
>
Fixed

> classfile/systemDictionary.cpp
> - It looks like the PackageEntry and ModuleEntry tables for the system 
> and platform class loaders are  added within 
> SystemDictionary::compute_java_loaders() which is called from 
> Thread::create_vm() but after the call in Thread::create_vm() to 
> call_initPhase2 which is when Universe::_module_initialized is set to 
> true.  Did I read this correctly?  If yes, then I think you have to be 
> sure that Universe::_module_initialized is not set until the module 
> graph for the 3 builtin loaders is loaded from the archive.
>

I moved the code to be called by ModuleBootstrap::boot() so it should 
now happen inside call_initPhase2.

> memory/filemap.cpp
> - line #237 typo "modue" --> "module"
>

Fixed

> - Since the CDS support for capturing the module graph does not 
> archive unnamed modules, I assume 
> Modules::set_bootloader_unnamed_module() is still called.  Is this 
> true or does the unnamed module for the boot loader get archived?
The unnamed module for the boot loader is not archived.

Modules::set_bootloader_unnamed_module()  wasn't called in my last 
webrev. Thanks for catching it.

I added a call to BootLoader.getUnnamedModule() in 
ModuleBootstrap::boot()  to trigger <clinit> of BootLoader, which will 
call into the VM for Modules::set_bootloader_unnamed_module().



> Clarification/timing questions:
>

Here's an overall problem I am facing:

The module graph is archived after the module system has fully started 
up. This means that for the boot loader, we only know the full set of 
modules/packages, but we don't know which part is the subset that was 
initialized during early VM bootstraping (e.g., when 
ModuleEntryTable::javabase_defined() == false).

So the behavior is a bit different during the early bootstrap phase (all 
the way before we reach ModuleBootstrap::boot()): 
ClassLoaderData::the_null_class_loader_data()->modules() and 
ClassLoaderData::the_null_class_loader_data()->packages() are already 
populated before a single class is loaded.

This difference doesn't seem to make a practical difference. E.g., none 
of our code seems to assume that "before any classes in the java.util 
package is loaded, 
ClassLoaderData::the_null_class_loader_data()->packages() must not 
contain an entry for java.util".

I think we have two choices when the archived module graph is used:

[1] We require that the state of the module system must, at every step 
of VM initialization, be identical to that of a VM that doesn't use an 
archived module graph.

[2] We make sure that the VM/JDK bootstrap code can tolerate the 
existence of module/packages that are added earlier than a VM that 
doesn't use an archived module graph.

I tried doing a version of [1] and found that to be too difficult. [2] 
seems much simpler and is the approach I am using now.

> oops/instanceKlass.cpp
> - line #2545, comment "TO DO  -- point it to the archived PackageEntry 
> (JDK-8249262)"
> are you thinking that since the module graph is read in ahead of time 
> that it can be set when an InstanceKlass is created?  There is a point 
> before java.base is defined that InstanceKlass::set_package takes into 
> account that could be a timing issue.
>
>

I think it will work as if another class in the same package has already 
been defined.

> - There are some checks in modules.cpp that are valuable to have 
> during bootstrapping.  For example, packages that are encountered 
> during bootstrapping before java.base is defined are put into the 
> java.base module but then verified after java.base is defined via 
> verify_javabase_packages.  So at what point in the bootstrapping 
> process does the boot loader's archived module's become known? Could 
> classes have been defined to the VM before this point?  Then their 
> packages will have to be verified to be sure that they are indeed 
> packages within java.base.  Consider looking at other checks that 
> occur within classfile/modules.cpp as well.
>
As mentioned above, calling verify_javabase_packages() at run time will 
fail, as we have loaded all packages for the boot loader, not just those 
for java.base.

Well, verify_javabase_packages() was called once and it succeeded, but 
that was during CDS dump time. So we know at least we have verified this 
once :-)

Thanks
- Ioi

> I may have more review comments as I continue to look at this!
>
> Thanks,
> Lois
>
>
>
>
>



More information about the hotspot-runtime-dev mailing list