RFR JDK-8232222: Set state to 'linked' when an archived class is restored at runtime

Jiangli Zhou jianglizhou at google.com
Wed May 20 00:22:41 UTC 2020


Hi Ioi,

On Mon, May 18, 2020 at 10:15 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>
> Hi Jiangli,
>
> Thank you for posting the design proposal. As you probably know, we have
started a discussion for Project Leyden [1] to improve start-up performance
of the JDK. It looks like you proposal will touch upon many of the topics
that are also of interest in Leyden. I would encourage you to participate
in the discussion so we can find the best way of moving forward.
>

Yes, I've responded to Mark's call for discussion for Project Leyden
yesterday (see
https://mail.openjdk.java.net/pipermail/discuss/2020-May/005464.html).

> Part of your proposal is a generally available heap snapshot mechanism. I
think it will be really useful in speeding up start-up. I want to
understand how it works. Given a very simple program:
>
>     @Preserve class A {
>         static final int A = B.B + 1;
>     }
>     @Preserve class B {
>         static final int B = A.A + 1;
>     }
>     class App {
>         public static void main(String args[]) {
>             if (args[0].equals("A")) {
>                 System.out.println("A.A = " + A.A);
>                 System.out.println("B.B = " + B.B);
>             } else {
>                 System.out.println("B.B = " + B.B);
>                 System.out.println("A.A = " + A.A);
>             }
>         }
>     }
>
>     $ java -cp . App A
>     A.A = 2
>     B.B = 1
>     $ java -cp . App B
>     B.B = 2
>     A.A = 1
>
> As you can see, the values of A.A and B.B depends on the order of
execution. However, the heap snapshot can save only one possible set of
values.
>
> So let's assume we have saved {A.A=2, B.B=1}. If you run the program like:
>
>     $ java -cp . App B
>     B.B = 1
>     A.A = 2
>
> The output is not valid according to the current Java Language Spec.
>
> I think there are ways to solve this. For example, we could evolve the
JLS to introduce a new concept of "linking" an Java app. The can allow the
programmer to specify a set of classes to be loaded/linked/initialized
ahead-of-time. I think it will be best to discuss such JLS changes in
Leyden. It is important not just for class loading, but also for the AOT
compiler.
>

Perhaps finding the answers to the following questions may help?


   - How often does the above pattern (similar to a circular dependency) in
   the test code occur in practices (in JDK code and application code)?
   - Is that intentional or unintentional? Does the behavior difference
   cause any unwanted side-effects from developer & user point of view?
   - What's the timeline for the JLS changes?
   - If the pattern is not common in practices, is it possible to solve the
   problem incrementally by addressing common cases first and deliver
   performance improvements, while addressing the needed JLS changes in
   parallel (then add support for this type of use cases)? An opt-in solution
   would make that possible.

Glad to see Mark's initiation of project Leyden. Looking forward to
participate. Let's discuss more as part of the project.

Thanks!
Jiangli

> Thanks
> - Ioi
>
> [1] https://mail.openjdk.java.net/pipermail/discuss/2020-April/005429.html
>
>
>
> On 5/15/20 7:35 PM, Jiangli Zhou wrote:
>
> I'd like to resume the discussion on this topic (sorry for the long
delay) and seek early feedback on the Java Class Pre-resolution and
Pre-initialization Design Proposal. The proposal enhances the existing
Caching Java Heap Objects approach and provides more general support to
pre-initialize classes and preserve static field values for both JDK
classes and application classes (loaded by system class loader). Using
annotation to tag a class or field for pre-initialization was suggested by
others (probably including Ioi, David ..., if my memory serves me
correctly) during the Caching Java Heap Objects work, so that part of the
credit goes to them. I adopted that suggestion in the proposal after
comparing it to other alternatives, please see details in the doc.
> I have a class pre-initialization/preservation prototype now that's built
on top of the JDK-8232222 change with JDK 11. Quick experiment that applied
the pre-initialization to ~100 JDK classes used during VM startup time
showed about ~2ms improvement on my linux x86 machine.
>
> From what I learned so far, developers do see benefits of class
pre-initialization and use it in practices. Any work in this area,
including making it easier to achieve class pre-initialization, is welcomed
by Java developers.
> Feedback on the proposal and steps/process to moving forward to
contributing to OpenJDK is appreciated.
> Best regards,
> Jiangli
>
>
> On Wed, Jan 8, 2020 at 3:10 PM Jiangli Zhou <jianglizhou at google.com>
wrote:
>>
>> Hi David,
>>
>> Thanks a lot for the inputs from JVMS compliance perspective!
>>
>> On Tue, Jan 7, 2020 at 10:02 PM David Holmes <david.holmes at oracle.com>
wrote:
>> >
>> > Hi Jiangli,
>> >
>> > On 21/12/2019 11:50 am, Jiangli Zhou wrote:
>> > > Hi David,
>> > >
>> > > On Thu, Dec 19, 2019 at 7:43 PM David Holmes <david.holmes at oracle.com>
wrote:
>> > >>
>> > >> Hi Jiangli,
>> > >>
>> > >> On 20/12/2019 1:01 pm, Jiangli Zhou wrote:
>> > >>> Hi Coleen,
>> > >>>
>> > >>> Thank you for your feedback! I was planning to ping you and get
your
>> > >>> opinion. Happy to see your comment and involvement!
>> > >>>
>> > >>> On Thu, Dec 19, 2019 at 2:26 PM <coleen.phillimore at oracle.com>
wrote:
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>> On 12/16/19 10:30 PM, Jiangli Zhou wrote:
>> > >>>>> Since this change involves security related issues, Ioi, Claes
and I
>> > >>>>> had off-mailing list discussion and initial code review
(thanks!).
>> > >>>>> Thanks Ioi for investigating and providing test cases in security
>> > >>>>> related area. As the security related parts are settled, I'd
like to
>> > >>>>> resume the discussion and review in the open,  and hope to move
this
>> > >>>>> forward.
>> > >>>>>
>> > >>>>> My initial change (webrev.00) set all archived classes loaded by
>> > >>>>> builtin loaders in 'linked' state during restoration in
>> > >>>>> InstanceKlass::restore_unshareable_info(). Ioi demonstrated a
>> > >>>>> potential issue for classes loaded by the AppClassLoader (thanks
>> > >>>>> again). More investigation and work are needed to handle the
general
>> > >>>>> case for AppClassLoader. Following are two proposals provide the
>> > >>>>> support with narrower scopes with no potential security issue.
Option
>> > >>>>> (1) is an approach that I implemented in one of my earlier
patches.
>> > >>>>>
>> > >>>>> Option (1)
>> > >>>>> ========
>> > >>>>> Set an archived class in 'linked' state during restoration in
>> > >>>>> InstanceKlass::restore_unshareable_info() for the following
cases:
>> > >>>>>
>> > >>>>> Case 1): For an archived class loaded by the NULL class loader by
>> > >>>>> default (when verification is not required for classes loaded by
the
>> > >>>>> NULL class loader, which is the default case).
>> > >>>>> Case 2): For an archived class loaded by the PlatformClassLoader
and
>> > >>>>> AppClassLoader when verification is disabled at runtime. This
use case
>> > >>>>> (disabling class verification) appears to be important in some
cases
>> > >>>>> for real world applications where verification can be safely
disabled.
>> > >>>>>
>> > >>>>> http://cr.openjdk.java.net/~jiangli/8232222/webrev.01/
>> > >>>>
>> > >>>> Hi Jiangli,
>> > >>>>
>> > >>>> We've deprecated -verify:none and I don't want to have the JVM
add any
>> > >>>> feature or an optimization in particular that uses this.  I don't
think
>> > >>>> we should invite these bugs.
>> > >>>
>> > >>> Your feedback sound reasonable for options (1).
>> > >>>
>> > >>> Not completely related to this discussion, I have a feedback/input
>> > >>> about -Xverify:none removal based on my observation of the flag
>> > >>> usages. -Xverify:none is useful for reducing performance overhead
>> > >>> caused by verification and users do enable it when it is safe and
>> > >>> performance is important. In some cases, there can be ~10%
performance
>> > >>> difference. For -Xverify:none removal, it would be important to
have a
>> > >>> good strategy on how and when to do so, at least after dynamic
>> > >>> archiving is completely mature, and ready for its prime time and
broad
>> > >>> adoptions.
>> > >>>
>> > >>>>
>> > >>>>>
>> > >>>>> Option (2)
>> > >>>>> ========
>> > >>>>> Set an archived class loaded by the NULL loader in 'linked' state
>> > >>>>> during runtime restoration in
>> > >>>>> InstanceKlass::restore_unshareable_info() by default. Option (2)
only
>> > >>>>> supports the case (1) described in option (1).
>> > >>>>>
>> > >>>>> http://cr.openjdk.java.net/~jiangli/8232222/webrev.02/
>> > >>>>>
>> > >>>>> Both of the above proposals provide the performance benefit
described
>> > >>>>> in the original review request.
>> > >>>>>
>> > >>>>> In addition to the performance benefit, my other motivation for
this
>> > >>>>> change is to provide a foundation for potential optimization
with more
>> > >>>>> general class pre-initialization (currently only a few special
cases
>> > >>>>> are supported in the JDK classes) on top of the heap object graph
>> > >>>>> archiving mechanism (, and constant pool field and method
references
>> > >>>>> pre-resolution) in the future. Apologize for not clearly explain
that
>> > >>>>> in the original review request, although the RFE was linked to
the
>> > >>>>> umbrella RFE for pre-initialization/pre-resolution. I've
clarified it
>> > >>>>> in the RFE as well.
>> > >>>>
>> > >>>> I've read the RFE for this and the umbrella RFE, but I don't
understand
>> > >>>> the design or how this would accomplish it.  Could you have the
>> > >>>> initialized version of the class in the archive instead and use
it when
>> > >>>> we call klass->initialize() rather than calling into the <clinit>
function?
>> > >>>
>> > >>> It is a fine idea. If there is no JVMS compliance issue, we can
>> > >>> explore in this direction since it could be slightly more efficient
>> > >>> (not observable). We still need to post JVMTI events for loading,
>> > >>> preparing, etc at runtime. Can we involve David H. for the Spec
>> > >>> related discussions?
>> > >>
>> > >> I'm always lurking :)
>> > >>
>> > >> I'm not clear what exactly is being proposed at this stage, but I
assume
>> > >> this is a discussion for a future RFE in the New Year.
>> > >
>> > > It's related to Coleen's proposal of 'preserving'  an archived class
>> > > in 'initialized' state  at CDS dump time for cases that are suitable.
>> > > Currently all archived classes are reset back to 'allocated' state at
>> > > dump time. If a class' static fields have no dependency on runtime
>> > > context and can be safely pre-initialized at dump time, we can
>> > > potentially leave the class in 'initialized' state. At runtime, we
>> > > still need to do all the necessary operations for restoration but
>> > > without modifying the class state itself. It would be good to have a
>> > > spec related discussion upfront so no work is wasted. My feeling is
it
>> > > doesn't violate the JVMS, but it's better to confirm it before the
>> > > actual implementation.
>> >
>> > The spec is very strict on when initialization must occur, there is no
>> > option to be stricter or lazier. But if the initialized fields have
>> > idempotent values no matter when the initialization occurs then the
fact
>> > they already had that value won't be visible to any use of the class.
>> > However ...
>> >
>> > > We also need to make sure that we don't expose a class with an
>> > > 'inconsistent' state to a JVMTI agent before it's fully restored.
>> > > There was a JVMTI/CDS bug:
>> > > https://bugs.openjdk.java.net/browse/JDK-8210926. A JVMTI agent may
>> > > observe a class (from the ClassLoaderData::_klasses list) before the
>> > > mirror is restored. So we would need to do some reordering in that
>> > > area if an archived class is preserved in 'initialized' state.
>> >
>> > ... you would need to be careful about not showing the class as
>> > initialized to tools, prior to when actual initialization should occur
>> > as per the spec.
>>
>> Agreed, this approach could introduce potential complications for
>> JVMTI agents. It would need to be very carefully ordered to make sure
>> tools don't observe a class that's not completely linked (or
>> initialized) but with linked (or initialized) state. Keeping the class
>> state in 'allocated' in the archive effectively avoids the
>> complication.
>>
>> >
>> > One possibility may be to preserve the fields in the archive but
>> > maintain the class state as "linked" until initialization is mandated,
>> > at which point you can just flip the state to "initialized" without
>> > needing to execute the <clinit> code.
>>
>> Since currently we still have some restoration work for the linking
>> part, would it be better to maintain the class state as 'allocated'
>> until runtime restoration? Restoration can flip the state to 'linked',
>> which avoids the complications for JVMTI agent.
>>
>> For the 'initialized' state, flipping it at the natural initialization
>> time of a class adds some complications and a small overhead to
>> field/method constant pool reference pre-resolving. I'll give some
>> more thoughts to that.
>>
>> Thanks again!
>>
>> Best,
>> Jiangli
>>
>> >
>> > Cheers,
>> > David
>> > -----
>> >
>> > > With that, some of the changes in my current webrev.02
>> > > (http://cr.openjdk.java.net/~jiangli/8232222/webrev.02/) may still be
>> > > needed, but without the change for the '_init_state'.
>> > >
>> > > We can have more discussions after the holidays.
>> > >
>> > > Best,
>> > > Jiangli
>> > >
>> > >
>> > >>
>> > >> Cheers,
>> > >> David
>> > >>
>> > >>>>
>> > >>>> The JEP process is being streamlined more, so maybe this work
requires
>> > >>>> one.  I only just dropped into this thread but I'd love to learn
more
>> > >>>> about it, but after the holidays.  I have some concerns about
this code
>> > >>>> change.  It seems innocuous enough but it's limited to the
>> > >>>> bootclassloader and I don't know how this will help.
>> > >>>
>> > >>> Ioi and I also briefly discussed about JEP process. Maybe we could
>> > >>> break the work into a series of JEPs if necessary, so it still
enables
>> > >>> incremental developments? Each self-contained block of work could
be
>> > >>> mapped to an individual JEP. That would help us validate the
benefit
>> > >>> of each sub-work (of pre-resolution and pre-initialization) in real
>> > >>> world production environment more efficiently, and give the access
to
>> > >>> these new improvements to OpenJDK users in the community more
quickly.
>> > >>> It would also help me streamline the private patches without
dealing
>> > >>> with a lot of code conflicts and reduce the cost/overhead of
>> > >>> maintaining the patches.
>> > >>>
>> > >>> I'd love to discuss more with you after the holidays. I will also
need
>> > >>> to do some more researches (about the process) on my side.
>> > >>>
>> > >>> Best regards,
>> > >>> Jiangli
>> > >>>
>> > >>>>
>> > >>>> Thanks,
>> > >>>> Coleen
>> > >>>>>
>> > >>>>> Following are more details on pre-initialization for interested
>> > >>>>> readers. I'm still investigating this area and flushing out the
>> > >>>>> details. So the best approach is to do those incrementally. The
>> > >>>>> information that I provide below may change. It is probably
better to
>> > >>>>> have a general pre-initialization and pre-resolving discussion
that's
>> > >>>>> separate from this review request. And we can move the current
change
>> > >>>>> forward once it's reviewed and accepted.
>> > >>>>>
>> > >>>>> One immediate improvement that the current change would enable is
>> > >>>>> simplifying the existing static field pre-initialization applied
to
>> > >>>>> some of the JDK classes. When an archived class can be set in the
>> > >>>>> 'linked' state during runtime restoration, it makes it possible
to
>> > >>>>> place the class in 'initialized' state at restoration. As
loading,
>> > >>>>> verifying, linking/preparing and initializing orders are
mandated by
>> > >>>>> JVMS, a class can only be placed in 'initialized' after it is
already
>> > >>>>> in 'linked' state. Also, we want to avoid a partially initialized
>> > >>>>> class state. Currently when static fields for a given JDK class
is
>> > >>>>> pre-initialized at CDS dump time, we need to store the preserved
field
>> > >>>>> values separately from the class mirror object (j.l.Class
instance).
>> > >>>>> At runtime, during running the class' static initializer
(<clinit>),
>> > >>>>> the VM finds and stores the values back to the corresponding
static
>> > >>>>> fields in the mirror.  Following is an example where we
pre-initialize
>> > >>>>> java.lang.module.Configuration. In the code below,
>> > >>>>> VM.initializeFromArchive() is called to retrieve the preserved
value
>> > >>>>> and store back to the mirror. A check is performed to verify if
the
>> > >>>>> value is restored from archive, otherwise we do runtime
>> > >>>>> initialization. The complication is due to the separate storage
for
>> > >>>>> the preserved values (stored outside the mirror).
>> > >>>>>
>> > >>>>>        static {
>> > >>>>>            // Initialize EMPTY_CONFIGURATION from the archive.
>> > >>>>>            VM.initializeFromArchive(Configuration.class);
>> > >>>>>
>> > >>>>>            // Create a new empty Configuration if there is no
archived version.
>> > >>>>>            if (EMPTY_CONFIGURATION == null) {
>> > >>>>>                EMPTY_CONFIGURATION = new Configuration();
>> > >>>>>            }
>> > >>>>>        }
>> > >>>>>
>> > >>>>> With the proposed change in the current review request, it would
>> > >>>>> enable us to set Configuration in 'initialized' state during
runtime
>> > >>>>> restoration, and therefore allow us to preserve the static field
value
>> > >>>>> as part of the mirror without violating the mandated ordering
>> > >>>>> requirement. With that, we can eliminate the
>> > >>>>> VM.initializeFromArchive() call and the if check in the Java
code for
>> > >>>>> some of the use cases. Once Configuration is fully 'restored' at
>> > >>>>> runtime, there is no need to execute the <clinit> at all. It
would
>> > >>>>> allow us to apply pre-initialization to more JDK classes if
desired,
>> > >>>>> without complicating the Java code.
>> > >>>>>
>> > >>>>> When we can properly set an archived class to 'initialized'
state at
>> > >>>>> restore time, it then will enable us to pre-resolve related
constant
>> > >>>>> pool method and field references.
>> > >>>>>
>> > >>>>> Ioi had some questions for pre-initialization and pre-resolution
which
>> > >>>>> I'm including below with my answers inlined. As I'm still
>> > >>>>> investigating this area, so most likely they are not final.
>> > >>>>>
>> > >>>>> - Are you going to do the same thing for the initialized state
-- all
>> > >>>>> classes loaded by boot loader will be automatically initialized?
>> > >>>>>
>> > >>>>> By default, an archived class loaded by the boot loader will
only be
>> > >>>>> placed in 'linked' state at restoration time when it is
suitable. For
>> > >>>>> 'value' object that cannot be placed in 'linked' state at
restore time
>> > >>>>> when archiving supported for that is added in the future, it can
be
>> > >>>>> handled differently and the related class can remain in 'alloced'
>> > >>>>> state during restoration.
>> > >>>>>
>> > >>>>> An archived class will only be set to 'initialized' during
runtime
>> > >>>>> restoration if the class can be in fully 'initialized' state
with all
>> > >>>>> static fields pre-initialized and preserved. An archived class
can be
>> > >>>>> 'flagged' for that case at dump time. Specific implementation
details
>> > >>>>> need further investigation.
>> > >>>>>
>> > >>>>> - That certainly doesn't seem feasible, so you will probably
need to
>> > >>>>> put some restrictions. So why aren't you putting in those
restrictions
>> > >>>>> now? Do you know what these restrictions are?
>> > >>>>>
>> > >>>>> The restriction only applies when we add support for more general
>> > >>>>> 'pre-initialization' and 'value' object archiving. Currently,
for the
>> > >>>>> restore-time 'linked' state implementation in the webrevs, the
>> > >>>>> restriction does not apply.
>> > >>>>>
>> > >>>>> - Will they be so restrictive that this will not be practical?
>> > >>>>>
>> > >>>>> As pre-initialization cannot be applied to all cases, selective
and
>> > >>>>> targeted pre-initialization (following current CDS static field
>> > >>>>> pre-initializing model) is a more flexible approach that
provides both
>> > >>>>> performance benefits and JVMS/language spec compliant.
>> > >>>>>
>> > >>>>> Comments are welcome and appreciated!
>> > >>>>>
>> > >>>>> Best regards,
>> > >>>>> Jiangli
>> > >>>>>
>> > >>>>>
>> > >>>>>
>> > >>>>>
>> > >>>>> On Thu, Nov 28, 2019 at 9:33 AM Jiangli Zhou <
jianglizhou at google.com> wrote:
>> > >>>>>> Hi,
>> > >>>>>>
>> > >>>>>> Please review the following optimization related to archived
classes'
>> > >>>>>> (from builtin loaders) runtime restoration and linking. It is
not
>> > >>>>>> intended for OpenJDK 14. After reviewers review the change,
I'll wait
>> > >>>>>> for 14 fork before pushing.
>> > >>>>>>
>> > >>>>>> webrev: http://cr.openjdk.java.net/~jiangli/8232222/webrev.00/
>> > >>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8232222
>> > >>>>>>
>> > >>>>>> Motivation and details of the change, which are duplicated in
the RFE
>> > >>>>>> =====================================================
>> > >>>>>>
>> > >>>>>> When linking a class, InstanceKlass::link_class_impl() first
links all
>> > >>>>>> super classes and super interfaces of the current class. For the
>> > >>>>>> current class, it then verifies and rewrites the bytecode, links
>> > >>>>>> methods, initializes the itable and vtable, and sets the
current class
>> > >>>>>> to 'linked' state.
>> > >>>>>>
>> > >>>>>> When loading an archived class at runtime,
>> > >>>>>> SystemDictionary::load_shared_class makes sure the super types
(all
>> > >>>>>> super classes and super interfaces) in the class hierarchy are
loaded
>> > >>>>>> first. If not, the archived class is not used. The archived
class is
>> > >>>>>> restored when 'loading' from the archive. At the end of the
>> > >>>>>> restoration, all methods are linked. As bytecode verification
and
>> > >>>>>> rewriting are done at CDS dump time, the runtime does not redo
the
>> > >>>>>> operations for an archived class.
>> > >>>>>>
>> > >>>>>> If we make sure the itable and vtable are properly initialized
(not
>> > >>>>>> needed for classes loaded by the NULL class loader) and
>> > >>>>>> SystemDictionaryShared::check_verification_constraints is
performed
>> > >>>>>> for an archived class during restoration, then the archived
class
>> > >>>>>> (from builtin loaders) is effectively in 'linked' state.
>> > >>>>>>
>> > >>>>>> For all archived classes loaded by the builtin loaders, we can
safely
>> > >>>>>> set to 'linked' state at the end of restoration. As a result,
we can
>> > >>>>>> save the work for iterating the super types in
>> > >>>>>> InstanceKlass::link_class_impl() at runtime.
>> > >>>>>>
>> > >>>>>> Performance results
>> > >>>>>> ================
>> > >>>>>>
>> > >>>>>> With both JDK 11 and the latest jdk/jdk, the proposed change
saves
>> > >>>>>> ~1.5M instruction execution when running HelloWorld with the
default
>> > >>>>>> CDS. Please see raw data in the RFE. For applications using more
>> > >>>>>> archived classes at runtime, larger saving should be
experienced.
>> > >>>>>>
>> > >>>>>> Testing
>> > >>>>>> ======
>> > >>>>>> Tested with all jtreg cds/* tests, which include all appcds
tests.
>> > >>>>>> Submit repo testing passed.
>> > >>>>>>
>> > >>>>>> The change has also gone through internal testing with very
large
>> > >>>>>> number of tests (all with default CDS enabled) for more than a
month.
>> > >>>>>>
>> > >>>>>> Best,
>> > >>>>>> Jiangli
>> > >>>>
>
>


More information about the hotspot-runtime-dev mailing list