RFR(S) 8218751 Do not store original classfiles inside the CDS archive
Ioi Lam
ioi.lam at oracle.com
Fri Feb 15 22:07:30 UTC 2019
Hi Jiangli,
Thanks for volunteering on this. I think it's best to do that (optional
support for storing uncompressed classfile data into the CDS archive) as
a separate bug than the current issue.
If you decide to provide a patch for that, I think it would be best to
find actual users or use cases that would find it beneficial. Otherwise
it will appear to be a solution looking for a problem.
Thanks
- Ioi
On 2/15/19 10:35 AM, Jiangli Zhou wrote:
> I'm willing to continue contributing to the support and maintaining of
> CDS/AppCDS. Often startup improvements with measurable gain (>4% in
> this case) are non-trivial. The support for 'od' is not overly complex
> comparing to the performance gain. In a rosier picture if
> Jlink/AOT/CDS can work together in harmony to create a single image in
> the future, class files (and even JAR files) can be eliminated and no
> class file data can be loaded at runtime from a JAR file. We should
> consider all aspects and not make a decision lightly.
>
> For the dynamic archiving, this is not a blocking issue. I can step in
> an help with making 'od' optional if needed.
>
> Thanks,
> Jiangli
>
> On Thu, Feb 14, 2019 at 7:53 PM Ioi Lam <ioi.lam at oracle.com
> <mailto:ioi.lam at oracle.com>> wrote:
>
> That means we have to add a new -XX option to enable this, and
> need to add extra test cases and maintenance.
>
> We are talking about only single digit % benefits in very narrow
> use cases, and it's not clear whether performance is important in
> such cases (or if people would bother to use this -XX flag to gain
> a few %). Without user input, it's hard to justify keep spending
> time on an optimization that literally no one had asked for.
>
> I just don't think we have time to keep maintaining this. If
> someone is willing to step up and provide support for this, feel free.
>
> Thanks
>
> - Ioi
>
>
> On 2/14/19 1:20 PM, Jiangli Zhou wrote:
>> Based on the issues raised in OpenJDK last year, there appears to
>> be existing use cases with AppCDS + CLFH. When designing the
>> platform, we should think from the user perspective. Making 'od'
>> optional is not complex and gives the control to the Java users.
>> It also avoids potential waste of effort for removing&putting
>> back the optimization.
>>
>> Thanks,
>> Jiangli
>> On Thu, Feb 14, 2019 at 11:46 AM Ioi Lam <ioi.lam at oracle.com
>> <mailto:ioi.lam at oracle.com>> wrote:
>>
>> Such optional support increase the complexity of the VM. My
>> preference would be to wait until there's an actual need for
>> this.
>>
>> The JDK is release under 6 months cadence, so if the removal
>> turns out to be a bad idea, we can reinstate the code in the
>> next release. Or, someone can just make their own JDK build
>> with a simple anti-delta of this patch.
>>
>> If we are too reluctant to remove anything, the JDK will
>> eventually become a hopeless mess.
>>
>> Thanks
>>
>> - Ioi
>>
>> On 2/14/19 10:43 AM, Jiangli Zhou wrote:
>>> Sorry for the slow turnaround. Hopefully it will get better
>>> after this week. As there is no enough user data/requirement
>>> to determine which optimization direction is more important
>>> in this case, it might be reasonable to make the 'optional
>>> data' space truly optional, which can be controlled by a
>>> command-line option. If the performance is more important
>>> (in case should_post_class_file_load_hook is required for a
>>> particular use case), user can enable dumping out the 'od'
>>> space with archived class file data, otherwise the data can
>>> be loaded at runtime. What's your thought on this?
>>>
>>> Thanks,
>>> Jiangli
>>>
>>> On Wed, Feb 13, 2019 at 12:03 PM Ioi Lam <ioi.lam at oracle.com
>>> <mailto:ioi.lam at oracle.com>> wrote:
>>>
>>>
>>> On 2/13/19 11:24 AM, Jiangli Zhou wrote:
>>>> Hi Ioi,
>>>>
>>>> Thanks for the additional information and performance
>>>> data. Please see more comments below.
>>>>
>>>> On Wed, Feb 13, 2019 at 5:33 AM Ioi Lam
>>>> <ioi.lam at oracle.com <mailto:ioi.lam at oracle.com>> wrote:
>>>>
>>>> Hi Jiangli,
>>>>
>>>> The main reason for doing this is to reduce the
>>>> size of the CDS file.
>>>> For example, when archiving Eclipse IDE with
>>>> JDK-8215311 (Dynamic Class
>>>> Metadata Archive), the CDS archive is reduced from
>>>> 100MB to about 67MB.
>>>>
>>>> We have not heard any requirement for high
>>>> performance with
>>>> CDS+ClassFileLoadHook. It doesn't seem right for
>>>> everyone to take a 50%
>>>> file size penalty for an optimization that no one
>>>> has asked for.
>>>>
>>>>
>>>> Jut to clarify, it is ~30% not 50% (possibly a typo?),
>>>> correct?
>>>
>>>
>>> Depending on how you look at it. Going from 67MB to
>>> 100MB is an increase of 33MB which is 50%.
>>>
>>>
>>>>
>>>> Here are some perf numbers (running "java -version"
>>>> with an agent that
>>>> installs a CFLH that doesn't nothing). Yes, it's
>>>> somewhat slower as
>>>> expected, but it doesn't seem to be catastrophic.
>>>> Using CDS is
>>>> nevertheless much faster than without CDS anyway.
>>>>
>>>> After No CFLH 0.0476 seconds time elapsed
>>>> ( +- 0.20% )
>>>> After with CFLH 0.0513 seconds time elapsed
>>>> ( +- 0.18% ) 7.773%
>>>> slower
>>>>
>>>> Before no CFLH 0.0472 seconds time elapsed
>>>> ( +- 0.21% )
>>>> Before with CFLH 0.0492 seconds time elapsed
>>>> ( +- 0.18% ) 4.237%
>>>> slower
>>>>
>>>>
>>>> Comparing 'After with CFLH' (0.0513 seconds) with
>>>> 'Before with CFLH' (0.0492 seconds), there is about
>>>> 4.27% performance degradation observed from your data.
>>>>
>>>> Another disadvantage of this proposed change is the
>>>> increasing of runtime memory. When the class file data
>>>> is stored in the shared archive file and mapped as RO
>>>> (read-only), it can be shared among multiple JVM
>>>> processes. For the default CDS archive case, it's
>>>> shared by all JVM processes running simultaneously.
>>>> With the proposed change in the webrev, the memory for
>>>> the class file data is allocated at runtime. It moves
>>>> the cost from static footprint to runtime memory
>>>> footprint, which is a higher price because it needs to
>>>> be multiplied by the number of running JVM processes.
>>>
>>>
>>> With this patch, the decoded classdata is freed after
>>> the CFLH is called, so there's no resident memory cost.
>>>
>>>
>>>>
>>>>
>>>> CDS Off with CFLH 0.0869 seconds time elapsed
>>>> ( +- 0.18% )
>>>> CDS Off without CFLH 0.0852 seconds time elapsed
>>>> ( +- 0.14% ) 1.995%
>>>> slower
>>>>
>>>> I am not sure if there's a performance critical
>>>> case for CFLH.
>>>>
>>>>
>>>> I have the similar question. It would be good to go
>>>> back and re-investigate the original requirements for
>>>> supporting CFLH use cases.
>>>
>>>
>>> I think when we added CFLH support for CDS, the goal was
>>> "to not lose all benefit of CDS when CFLH is enabled". I
>>> think we still achieve that after this patch. There
>>> really was no requirement "to make CFLH blazing fast at
>>> all cost".
>>>
>>> Thanks
>>>
>>> - Ioi
>>>
>>>> I suppose
>>>> if performance is critical you would rewrite the
>>>> classes statically and
>>>> rebuild your app, instead of patching its bytecodes
>>>> at runtime. However,
>>>> if there were indeed a performance critical case, I
>>>> think it's better to
>>>> change JVMTI to allow a 2-level filtering for CFLH:
>>>>
>>>> I believe the overwhelming use case, where
>>>> performance is critical, the
>>>> CFLH will just patch a small number of class files.
>>>> I can't fathom any
>>>> use case where someone wants to patch EVERY loaded
>>>> class.
>>>>
>>>> The current CFLH passes both the name of the class,
>>>> as well as the
>>>> classfile data, in a single call. This forces CDS
>>>> to decode the
>>>> classfile data for every hook call. However, in
>>>> most cases, the CFLH
>>>> will just examine the class name, and do nothing
>>>> unless the name matches
>>>> a certain pattern, so we end up wasting the
>>>> decoding effort.
>>>>
>>>> My suggested improvement is to add a new filtering
>>>> call in JVMTI that
>>>> passes only the name. If the CFLH wants to patch
>>>> the class, it will then
>>>> request the classfile data, at which point CDS will
>>>> decode it from the
>>>> modules file or JAR file.
>>>>
>>>>
>>>> That probably is the right direction.
>>>>
>>>> Thanks,
>>>> Jiangli
>>>>
>>>>
>>>> Thanks
>>>>
>>>> - Ioi
>>>>
>>>>
>>>> On 2/12/19 5:18 PM, Jiangli Zhou wrote:
>>>> > Hi Ioi,
>>>> >
>>>> > I'd like to understand the performance impact
>>>> with this change. Do you have
>>>> > any performance numbers when
>>>> JvmtiExport::should_post_class_file_load_hook()
>>>> > is required? This is a performance vs footprint
>>>> trade-off. For some users,
>>>> > performance is more important than static footprint.
>>>> >
>>>> > Could you also please provide some
>>>> background/motivation for this change?
>>>> >
>>>> > Thanks,
>>>> > Jiangli
>>>> >
>>>> > On Mon, Feb 11, 2019 at 9:24 AM Ioi Lam
>>>> <ioi.lam at oracle.com <mailto:ioi.lam at oracle.com>> wrote:
>>>> >
>>>> >>
>>>> http://cr.openjdk.java.net/~iklam/jdk13/8218751-dont-store-classfiles-in-cds.v01/
>>>> >> https://bugs.openjdk.java.net/browse/JDK-8218751
>>>> >>
>>>> >> For JVMTI ClassFileLoadHook support, the CDS
>>>> archive currently stores
>>>> >> the original classfile data of all archived classes.
>>>> >>
>>>> >> However, this consists of over 30% of the
>>>> archive size. Because all
>>>> >> original classfile data are already available in
>>>> other files (such as the
>>>> >> JDK lib/modules file, or JAR files in the
>>>> classpath), we can simply read
>>>> >> from these locations when needed by JVMTI.
>>>> >>
>>>> >> For the default CDS archive (included as part of
>>>> the JDK distribution),
>>>> >> the size is reduced from about 18.5MB to 12.1MB
>>>> on Linux/x64.
>>>> >>
>>>> >> Thanks
>>>> >>
>>>> >> - Ioi
>>>> >>
>>>> >>
>>>>
More information about the hotspot-runtime-dev
mailing list