RFR(S) 8218751 Do not store original classfiles inside the CDS archive
Jiangli Zhou
jianglizhou at google.com
Wed Feb 20 21:32:50 UTC 2019
On Tue, Feb 19, 2019 at 8:31 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>
>
> On 2/19/19 5:32 PM, Jiangli Zhou wrote:
>
> Hi Ioi,
>
> This looks good to me. Thanks for adding the command-line option. The
> option name sounds good to me.
>
> Please see two minor comments/questions below:
>
> - src/hotspot/share/classfile/klassFactory.cpp
>
> The following new code could be eliminated if you want, because the name
> of the path at the path_index should be the same as the one from 'cfs' when
> cfs is not NULL.
>
> 94 if (cfs != NULL) { 95 pathname = cfs->source(); 96 } else if (path_index < 0) {
>
>
> The purpose of doing this is to avoid the
> (mod_entry->location()->as_C_string()) call below.
>
I figured that's the intention.
>
> - src/hotspot/share/memory/filemap.cpp
>
> 507 if (_classpath_entries_for_jvmti != NULL) { 508 os::free(_classpath_entries_for_jvmti); 509 }
>
> When _classpath_entries_for_jvmti would not be NULL when we get here? How
> do we handle two archives when base archive + dynamic archive are involved?
> Or, this is already based on the case when the base archive is validated
> before the top layer dynamic archive?
>
>
> This is forward-compatible code -- when the top layer archive is loaded,
> this code will be called again, and we may need a longer
> _classpath_entries_for_jvmti.
>
Yes.
There are two small issues that I missed to point out earlier. You only
need to malloc _classpath_entries_for jvmti if the 'od' size is 0. Also, we
need to free the table properly.
BTW, to clarify one point you raised below when OD is not used:
>
> "Runtime footprint increase (causes more memory fragmentation even
> the memory is freed after use)"
>
> This is not true. The decoded classfile data is allocated using
> ResourceMark, which uses scoped allocation. There's no fragmentation after
> you free the memory.
>
Thanks for pointing that out and correcting me.
Thanks,
Jiangli
>
> Thanks
> - Ioi
>
> Thanks,
> Jiangli
>
> On Mon, Feb 18, 2019 at 10:58 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>
>> Hi Jiangli,
>>
>> I think you're right that apps that use CFLH but rarely redefine classes
>> might see a slight speed up with the OD space. I don't know how prevalent
>> such uses cases are, but I won't object to making the OD space optional.
>>
>> So, I added a new flag to enable the OD space. Instead of the name you
>> suggested, I used EnableOptionalDataSharedSpace so that it's related to
>> other XXXSharedSpace flags. I also added a sanity test case, and fixed a Misplaced
>> ResourceMark in klassFactory.cpp
>>
>>
>> http://cr.openjdk.java.net/~iklam/jdk13/8218751-dont-store-classfiles-in-cds.v02/
>>
>> http://cr.openjdk.java.net/~iklam/jdk13/8218751-dont-store-classfiles-in-cds.v02-delta/
>>
>> Please let me know what you think.
>>
>> Thanks
>> - Ioi
>>
>> On 2/18/19 8:12 PM, Jiangli Zhou wrote:
>>
>>
>> On Mon, Feb 18, 2019 at 11:33 AM Ioi Lam <ioi.lam at oracle.com> wrote:
>>
>>>
>>>
>>> On 2/15/19 7:11 PM, Jiangli Zhou wrote:
>>>
>>> On Fri, Feb 15, 2019 at 6:56 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>>>
>>>> On 2/15/19 4:54 PM, Jiangli Zhou wrote:
>>>>
>>>> Hi,
>>>>
>>>> To answer your use case question, one of the case reported last year in
>>>> OpenJDK was JRebel (please go back to the hotspot-dev mail list 2018
>>>> Oct. archive). That is an existing example as I've tried to point out in my
>>>> earlier email.
>>>>
>>>> First of all, if you read the email carefully, CDS was not even in
>>>> their usual test matrix. Also, if you're modifying classes on the fly,
>>>> you're slowing down start-up big time. A few ms spent in decoding the
>>>> classfile data will be completely drown out by the overhead of the
>>>> classfile parsing and patching code in the JVMTI agent.
>>>>
>>>> So no, JRebel will not start up a few % faster if CDS stores the
>>>> classfile data. It will most likely be not measurable.
>>>>
>>>
>>> My understanding of the case is that their users generated a CDS archive
>>> and used it together with JRebel. That's why the issue was unnoticed until
>>> their users reported. There are use cases out there in the community that
>>> we simply are not aware of, so it would not be wise to assume there is no
>>> usage.
>>>
>>> Just to confirm one thing -- you do not dispute my claim that this
>>> "optimization" has no benefits even for those that actually use CDS and
>>> CFLH together. Correct?
>>>
>>
>> That's actually the part that I have a different opinion. With the
>> proposed change in the current webrev, the performance hit is across the
>> board for loading shared classes regardless if there is a class being
>> redefined/retransformed, as long as the can_generate_all_class_hook_events
>> capability is enabled.
>>
>>
>>>
>>> We have plenty of nice-to-have harmless optimizations in HotSpot that
>>> probably weren't vigorously validated. However, this is not one of them.
>>>
>>> Here we have clear evidence of harm (50% footprint increase), with no
>>> evidence of benefit (theoretical or real-life). This code was checked into
>>> HotSpot without proper validation. That was wrong, and that's why I am
>>> taking it out now.
>>>
>>> It's easy to prove me wrong. Just supply a proper benchmark that shows
>>> benefits, and I will change my mind. That's the minimal standard for an
>>> optimization that has harmful side effects. I have created JDK-8219255 for
>>> tracking this.
>>>
>>
>> Looks like there is a deadlock in the discussion. To summarize and help
>> move this forward. Here are your reasons for dropping the 'od' space:
>>
>> - static footprint increase due to 'od' space
>> - no benefit
>> - maintenance of the code
>>
>> Regarding static footprint, the change proposed in the current webrev is
>> a short-term and temporary solution. For a long-term solution, the original
>> class files can be removed from the final image created by jlink when jlink
>> can work with CDS to produce a single runtime image.
>>
>> On the benefit side, the 'od' space provides both startup speedup and
>> runtime memory saving (when can_generate_all_class_hook_events capability
>> is enabled), which I've already pointed those out in earlier emails.
>>
>> For maintenance, it is also not an issue with the support from the
>> community (I'll for sure to continue contributing to it). What's more
>> important is to continue keeping CDS/AppCDS moving in the right direction
>> by providing more performance & memory benefits and making it easier to use.
>>
>> So I think providing a command-line option, -XX:EnableOptionalDataSpace
>> that allows user to enable/disable the 'od' space at both dump time and
>> runtime is the best choice that can work with the short-term fix proposed
>> in the current webrev.
>>
>> Thanks,
>> Jiangli
>>
>> Thanks,
>> Jiangli
>>
>>
>>> Thanks
>>> - Ioi
>>>
>>>
>>> Thanks,
>>> Jiangli
>>>
>>>
>>>> The issue with the current change is that it's only targeted for
>>>> reducing the static footprint (static footprint reduction could be achieved
>>>> with alternative approaches such as file compression). The removal of
>>>> the 'od' space and the benefits (both the startup and runtime footprint) is
>>>> not backed up by a clear requirement here and the change should not go in
>>>> as is.
>>>>
>>>> As part of the JVM development, we constantly review and remove
>>>> unnecessary code to keep the JVM healthy. That's a clear requirement and
>>>> mandate on Oracle as the major contributor to the JDK.
>>>>
>>>> In this case, we are removing a premature optimization in the original
>>>> design. The classfile data stored in the CDS archive are rarely used
>>>> (99.99+%), and when they are used, the speed optimization is irrelevant
>>>> because the user of this data (JVMTI agent) is so much slower. So everyone
>>>> is taking a footprint hit with no benefit to anyone.
>>>> Thanks
>>>>
>>>> - Ioi
>>>>
>>>>
>>>> Just to summarize the drawbacks of the current change:
>>>>
>>>> - Runtime footprint increase (causes more memory fragmentation even
>>>> the memory is freed after use)
>>>> - Startup time regression when can_generate_all_class_hook_events
>>>> capability is enabled
>>>> - May cause issue with future optimization
>>>>
>>>> Thanks,
>>>> Jiangli
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Feb 15, 2019 at 2:07 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>>>>
>>>>> Hi Jiangli,
>>>>>
>>>>> Thanks for volunteering on this. I think it's best to do that
>>>>> (optional support for storing uncompressed classfile data into the CDS
>>>>> archive) as a separate bug than the current issue.
>>>>>
>>>>> If you decide to provide a patch for that, I think it would be best to
>>>>> find actual users or use cases that would find it beneficial. Otherwise it
>>>>> will appear to be a solution looking for a problem.
>>>>>
>>>>> Thanks
>>>>>
>>>>> - Ioi
>>>>>
>>>>>
>>>>> On 2/15/19 10:35 AM, Jiangli Zhou wrote:
>>>>>
>>>>> I'm willing to continue contributing to the support and maintaining of
>>>>> CDS/AppCDS. Often startup improvements with measurable gain (>4% in this
>>>>> case) are non-trivial. The support for 'od' is not overly complex comparing
>>>>> to the performance gain. In a rosier picture if Jlink/AOT/CDS can work
>>>>> together in harmony to create a single image in the future, class files
>>>>> (and even JAR files) can be eliminated and no class file data can be loaded
>>>>> at runtime from a JAR file. We should consider all aspects and not make a
>>>>> decision lightly.
>>>>>
>>>>> For the dynamic archiving, this is not a blocking issue. I can step in
>>>>> an help with making 'od' optional if needed.
>>>>>
>>>>> Thanks,
>>>>> Jiangli
>>>>>
>>>>> On Thu, Feb 14, 2019 at 7:53 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>>>>>
>>>>>> That means we have to add a new -XX option to enable this, and need
>>>>>> to add extra test cases and maintenance.
>>>>>>
>>>>>> We are talking about only single digit % benefits in very narrow use
>>>>>> cases, and it's not clear whether performance is important in such cases
>>>>>> (or if people would bother to use this -XX flag to gain a few %). Without
>>>>>> user input, it's hard to justify keep spending time on an optimization that
>>>>>> literally no one had asked for.
>>>>>>
>>>>>> I just don't think we have time to keep maintaining this. If someone
>>>>>> is willing to step up and provide support for this, feel free.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> - Ioi
>>>>>>
>>>>>>
>>>>>> On 2/14/19 1:20 PM, Jiangli Zhou wrote:
>>>>>>
>>>>>> Based on the issues raised in OpenJDK last year, there appears to be
>>>>>> existing use cases with AppCDS + CLFH. When designing the platform, we
>>>>>> should think from the user perspective. Making 'od' optional is not complex
>>>>>> and gives the control to the Java users. It also avoids potential waste of
>>>>>> effort for removing&putting back the optimization.
>>>>>>
>>>>>> Thanks,
>>>>>> Jiangli
>>>>>>
>>>>>> On Thu, Feb 14, 2019 at 11:46 AM Ioi Lam <ioi.lam at oracle.com> wrote:
>>>>>>
>>>>>>> Such optional support increase the complexity of the VM. My
>>>>>>> preference would be to wait until there's an actual need for this.
>>>>>>>
>>>>>>> The JDK is release under 6 months cadence, so if the removal turns
>>>>>>> out to be a bad idea, we can reinstate the code in the next release. Or,
>>>>>>> someone can just make their own JDK build with a simple anti-delta of this
>>>>>>> patch.
>>>>>>>
>>>>>>> If we are too reluctant to remove anything, the JDK will eventually
>>>>>>> become a hopeless mess.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> - Ioi
>>>>>>> On 2/14/19 10:43 AM, Jiangli Zhou wrote:
>>>>>>>
>>>>>>> Sorry for the slow turnaround. Hopefully it will get better after
>>>>>>> this week. As there is no enough user data/requirement to determine which
>>>>>>> optimization direction is more important in this case, it might be
>>>>>>> reasonable to make the 'optional data' space truly optional, which can be
>>>>>>> controlled by a command-line option. If the performance is more important
>>>>>>> (in case should_post_class_file_load_hook is required for a particular use
>>>>>>> case), user can enable dumping out the 'od' space with archived class file
>>>>>>> data, otherwise the data can be loaded at runtime. What's your thought on
>>>>>>> this?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jiangli
>>>>>>>
>>>>>>> On Wed, Feb 13, 2019 at 12:03 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> On 2/13/19 11:24 AM, Jiangli Zhou wrote:
>>>>>>>>
>>>>>>>> Hi Ioi,
>>>>>>>>
>>>>>>>> Thanks for the additional information and performance data. Please
>>>>>>>> see more comments below.
>>>>>>>>
>>>>>>>> On Wed, Feb 13, 2019 at 5:33 AM Ioi Lam <ioi.lam at oracle.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Jiangli,
>>>>>>>>>
>>>>>>>>> The main reason for doing this is to reduce the size of the CDS
>>>>>>>>> file.
>>>>>>>>> For example, when archiving Eclipse IDE with JDK-8215311 (Dynamic
>>>>>>>>> Class
>>>>>>>>> Metadata Archive), the CDS archive is reduced from 100MB to about
>>>>>>>>> 67MB.
>>>>>>>>>
>>>>>>>>> We have not heard any requirement for high performance with
>>>>>>>>> CDS+ClassFileLoadHook. It doesn't seem right for everyone to take
>>>>>>>>> a 50%
>>>>>>>>> file size penalty for an optimization that no one has asked for.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Jut to clarify, it is ~30% not 50% (possibly a typo?), correct?
>>>>>>>>
>>>>>>>>
>>>>>>>> Depending on how you look at it. Going from 67MB to 100MB is an
>>>>>>>> increase of 33MB which is 50%.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Here are some perf numbers (running "java -version" with an agent
>>>>>>>>> that
>>>>>>>>> installs a CFLH that doesn't nothing). Yes, it's somewhat slower
>>>>>>>>> as
>>>>>>>>> expected, but it doesn't seem to be catastrophic. Using CDS is
>>>>>>>>> nevertheless much faster than without CDS anyway.
>>>>>>>>>
>>>>>>>>> After No CFLH 0.0476 seconds time elapsed ( +- 0.20% )
>>>>>>>>> After with CFLH 0.0513 seconds time elapsed ( +- 0.18% )
>>>>>>>>> 7.773%
>>>>>>>>> slower
>>>>>>>>>
>>>>>>>>> Before no CFLH 0.0472 seconds time elapsed ( +- 0.21% )
>>>>>>>>> Before with CFLH 0.0492 seconds time elapsed ( +- 0.18% )
>>>>>>>>> 4.237%
>>>>>>>>> slower
>>>>>>>>>
>>>>>>>>
>>>>>>>> Comparing 'After with CFLH' (0.0513 seconds) with 'Before with
>>>>>>>> CFLH' (0.0492 seconds), there is about 4.27% performance degradation
>>>>>>>> observed from your data.
>>>>>>>>
>>>>>>>> Another disadvantage of this proposed change is the increasing of
>>>>>>>> runtime memory. When the class file data is stored in the shared archive
>>>>>>>> file and mapped as RO (read-only), it can be shared among multiple JVM
>>>>>>>> processes. For the default CDS archive case, it's shared by all JVM
>>>>>>>> processes running simultaneously. With the proposed change in the webrev,
>>>>>>>> the memory for the class file data is allocated at runtime. It moves the
>>>>>>>> cost from static footprint to runtime memory footprint, which is a higher
>>>>>>>> price because it needs to be multiplied by the number of running JVM
>>>>>>>> processes.
>>>>>>>>
>>>>>>>>
>>>>>>>> With this patch, the decoded classdata is freed after the CFLH is
>>>>>>>> called, so there's no resident memory cost.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> CDS Off with CFLH 0.0869 seconds time elapsed ( +- 0.18% )
>>>>>>>>> CDS Off without CFLH 0.0852 seconds time elapsed ( +- 0.14% )
>>>>>>>>> 1.995%
>>>>>>>>> slower
>>>>>>>>>
>>>>>>>>> I am not sure if there's a performance critical case for CFLH.
>>>>>>>>
>>>>>>>>
>>>>>>>> I have the similar question. It would be good to go back and
>>>>>>>> re-investigate the original requirements for supporting CFLH use cases.
>>>>>>>>
>>>>>>>>
>>>>>>>> I think when we added CFLH support for CDS, the goal was "to not
>>>>>>>> lose all benefit of CDS when CFLH is enabled". I think we still achieve
>>>>>>>> that after this patch. There really was no requirement "to make CFLH
>>>>>>>> blazing fast at all cost".
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> - Ioi
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> I suppose
>>>>>>>>> if performance is critical you would rewrite the classes
>>>>>>>>> statically and
>>>>>>>>> rebuild your app, instead of patching its bytecodes at runtime.
>>>>>>>>> However,
>>>>>>>>> if there were indeed a performance critical case, I think it's
>>>>>>>>> better to
>>>>>>>>> change JVMTI to allow a 2-level filtering for CFLH:
>>>>>>>>>
>>>>>>>>> I believe the overwhelming use case, where performance is
>>>>>>>>> critical, the
>>>>>>>>> CFLH will just patch a small number of class files. I can't fathom
>>>>>>>>> any
>>>>>>>>> use case where someone wants to patch EVERY loaded class.
>>>>>>>>>
>>>>>>>>> The current CFLH passes both the name of the class, as well as the
>>>>>>>>> classfile data, in a single call. This forces CDS to decode the
>>>>>>>>> classfile data for every hook call. However, in most cases, the
>>>>>>>>> CFLH
>>>>>>>>> will just examine the class name, and do nothing unless the name
>>>>>>>>> matches
>>>>>>>>> a certain pattern, so we end up wasting the decoding effort.
>>>>>>>>>
>>>>>>>>> My suggested improvement is to add a new filtering call in JVMTI
>>>>>>>>> that
>>>>>>>>> passes only the name. If the CFLH wants to patch the class, it
>>>>>>>>> will then
>>>>>>>>> request the classfile data, at which point CDS will decode it from
>>>>>>>>> the
>>>>>>>>> modules file or JAR file.
>>>>>>>>>
>>>>>>>>
>>>>>>>> That probably is the right direction.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jiangli
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> - Ioi
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2/12/19 5:18 PM, Jiangli Zhou wrote:
>>>>>>>>> > Hi Ioi,
>>>>>>>>> >
>>>>>>>>> > I'd like to understand the performance impact with this change.
>>>>>>>>> Do you have
>>>>>>>>> > any performance numbers when
>>>>>>>>> JvmtiExport::should_post_class_file_load_hook()
>>>>>>>>> > is required? This is a performance vs footprint trade-off. For
>>>>>>>>> some users,
>>>>>>>>> > performance is more important than static footprint.
>>>>>>>>> >
>>>>>>>>> > Could you also please provide some background/motivation for
>>>>>>>>> this change?
>>>>>>>>> >
>>>>>>>>> > Thanks,
>>>>>>>>> > Jiangli
>>>>>>>>> >
>>>>>>>>> > On Mon, Feb 11, 2019 at 9:24 AM Ioi Lam <ioi.lam at oracle.com>
>>>>>>>>> wrote:
>>>>>>>>> >
>>>>>>>>> >>
>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk13/8218751-dont-store-classfiles-in-cds.v01/
>>>>>>>>> >> https://bugs.openjdk.java.net/browse/JDK-8218751
>>>>>>>>> >>
>>>>>>>>> >> For JVMTI ClassFileLoadHook support, the CDS archive currently
>>>>>>>>> stores
>>>>>>>>> >> the original classfile data of all archived classes.
>>>>>>>>> >>
>>>>>>>>> >> However, this consists of over 30% of the archive size. Because
>>>>>>>>> all
>>>>>>>>> >> original classfile data are already available in other files
>>>>>>>>> (such as the
>>>>>>>>> >> JDK lib/modules file, or JAR files in the classpath), we can
>>>>>>>>> simply read
>>>>>>>>> >> from these locations when needed by JVMTI.
>>>>>>>>> >>
>>>>>>>>> >> For the default CDS archive (included as part of the JDK
>>>>>>>>> distribution),
>>>>>>>>> >> the size is reduced from about 18.5MB to 12.1MB on Linux/x64.
>>>>>>>>> >>
>>>>>>>>> >> Thanks
>>>>>>>>> >>
>>>>>>>>> >> - Ioi
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>>
>>>>>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list