RFR: 8207812: Implement Dynamic CDS Archive

Karen Kinnear karen.kinnear at oracle.com
Thu Apr 25 17:35:34 UTC 2019


Jiangli,

Thank you for thinking so carefully about this. You ask great questions.

One detailed question that arose for me from your email was:
What happens today if you specify -XX:ArchiveClassesAtExit and the archive
already exists? Is it the case that we delete the existing file before opening the new one?
Just want to make sure we document that.

Here is my understanding of where we are thinking of going - folks please correct anything -
of course as we learn more this will evolve.

1. ArchiveClassesAtExit
This is potentially an intermediate step, which as you point out is only here to create an archive
at the end of an execution.
We don’t yet have enough experience with this and with a potential future continuous dumping
mode to know if this mode will still be useful to customers when more advanced modes are 
available. My personal sense is that this could be a useful model even long-term.

2. I agree that we have talked about longer-term - 
a) incremental archiving additional loaded classes
   - there are a lot of ways to do this
      - possibly creating a separate dynamic archive, archiving as you go rather than at exit
      - possibly creating another layer of dynamic archive
      - possibly updating the current dynamic archive and being the only one who can read/write it
   - I’m not sure if this is needed or the exit model is sufficient (and possibly more efficient?)
      - prototyping could help us learn this
b) possibilities of sharing an archive while it is being updated
   - I confess I am not excited about the concurrency complexities here - so not sure it is worth it
   - less useful with current models of Cloud usage
c) a single command that could create an archive if it does not exist and use it if one does
d) possibly making the default be to generate a dynamic archive if it is not available and use it if it is
   - this could be built on archive at exit model or on incremental archive
   - I think this is a reasonable model to explore - customers today expect the first execution of applications
     to be slower and the dynamic linker’s caching will make second runs faster
   - I think we need more field experience and user feedback before we go here
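
To make (c) and (d) concrete, the create-if-it-does-not-exist/use-if-it-does-exist model is essentially the branch sketched below. This is only an illustration of the proposed user model, not anything in the JDK: the `jvm_args` helper is hypothetical, and the flag names follow the current ArchiveClassesAtExit proposal.

```python
import os

def jvm_args(archive_path):
    """Pick CDS flags for one run, per the create-or-use model.

    Hypothetical helper: 'archive_path' is the application's dynamic
    archive; flag names follow the ArchiveClassesAtExit proposal.
    """
    if os.path.exists(archive_path):
        # Later runs: map the existing dynamic archive
        # (on top of the default base archive).
        return ["-XX:SharedArchiveFile=" + archive_path]
    # First run: dump a dynamic archive when the VM exits.
    return ["-XX:ArchiveClassesAtExit=" + archive_path]
```

A single built-in option (for example a future -Xshare mode) would fold this branch into the VM itself, which is what lets users change their scripts only once.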

I do appreciate your link to mmap. 
I agree we don’t handle that today. I believe we don’t need to support it unless we explicitly want
concurrent reader/writers. Am I correctly hearing what you are saying?
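
For context on the mmap concern: once a reader has mapped the archive, POSIX leaves the effect of a concurrent size change unspecified, which is why a writer should replace the file atomically rather than update it in place. A minimal sketch (plain Python mmap standing in for the VM's mapping of an archive file; the file contents here are made up):

```python
import mmap
import os
import tempfile

# Create a small file standing in for a shared archive.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"archive-header" + b"\x00" * 4096)

# Map it read-only, roughly as the VM maps an archive at startup.
with open(path, "rb") as f:
    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    header = bytes(m[:14])
    # If another process truncated or rewrote 'path' here, references
    # through 'm' to removed pages would be unspecified (typically a
    # SIGBUS).  A safe writer dumps to a temp file and os.rename()s it
    # over 'path'; existing mappings keep seeing the old file.
    m.close()

os.unlink(path)
print(header)  # b'archive-header'
```

The atomic-rename pattern in the comment is what avoids needing concurrent reader/writer support in the archive format itself.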

So I think the model is to add ArchiveClassesAtExit now, and to reserve the more flexible
command-line argument for when we move to more automation.

I don’t think we know yet whether that would use a dump-at-exit or an incremental model.
I would rather we save the automation model until we know more about where we are going, rather
than use it for this step.

At that point, we would try to find a command-line argument that has a create-if-it-does-not-exist/use-if-it-does-exist model
(perhaps -Xshare:dynamic or -Xshare:reallyauto …).

My translation here is that we are all aiming in the same direction, the discussion is really about
how to phase this.

Fair?

I share a sadness that we are not already there.

However I am actually quite excited about where we are - many thanks to you and Calvin and Ioi for years of design and implementation!

thanks,
Karen

> On Apr 25, 2019, at 12:40 PM, Jiangli Zhou <jianglizhou at google.com> wrote:
> 
> Karen,
> 
> On Tue, Apr 23, 2019 at 11:16 AM Karen Kinnear <karen.kinnear at oracle.com> wrote:
>> 
>> Calvin,
>> 
>> I added to the CSR a comment from my favorite customer - relative to the user model for the command-line flags.
>> He likes the proposal to reduce the number of steps a customer has to perform to get startup and footprint benefits
>> from the archive.
>> 
>> The comment was that it would be very helpful if the user only needed to change their scripts once - so
>> a single command-line argument would create a dynamic archive if one did not exist, and use it if it
>> already existed.
> 
> This is a very plausible idea, and it is aligned with the
> multi-staged rollout toward the end goal of dynamic archiving, which is
> making archive generation/usage completely transparent and
> automatic (no command-line option is needed to generate and use the
> dynamic archive).
> 
> Using a single command-line option to control the creation of the
> archive (when one doesn't exist) and the use of the archive (when one
> already exists) is needed for cases when users want more control.
> 
> The following is copied from
> http://pubs.opengroup.org/onlinepubs/7908799/xsh/mmap.html. The
> behavior is unspecified if the underlying file changes after mmap is
> established. Currently, we are not handling that case. The dynamic
> archive simply follows the usage model of the existing static archive,
> with separate steps for dump time and runtime, which however doesn't
> preclude the case. In the current usage model, concurrent access is
> less common. With the single command-line option controlling creation
> and use, there might be more concurrent access, and we probably want
> to handle the issue.
> 
> "If the size of the mapped file changes after the call to mmap() as a
> result of some other operation on the mapped file, the effect of
> references to portions of the mapped region that correspond to added
> or removed portions of the file is unspecified."
> 
> The current flag, ArchiveClassesAtExit, doesn't scale and can't handle
> the single-option-controlled creation/use case. It would be a good
> idea to re-think the command-line option now and change it before the
> first integration, so it can scale. -Xshare:dump/on/auto/dynamic might
> be able to serve the purpose. 'dynamic' can be used to trigger just
> dynamic dumping for now. It can be augmented to support the use case
> that you described above.
> 
> Best regards,
> Jiangli
> 
>> 
>> Is there a way to evolve the ArchiveClassesAtExit=<dynamic archive> to have that functionality?
>> 
>> thanks,
>> Karen
>> 
>> p.s. I think it makes more sense to put performance numbers in the implementation RFE comments rather than the JEP
>> comments
>> 
>> On Apr 22, 2019, at 5:16 PM, Jiangli Zhou <jianglizhou at google.com> wrote:
>> 
>> Hi Calvin,
>> 
>> Can you please also publish the final performance numbers in the JEP 350 (or the implementation RFE) comment section?
>> 
>> Thanks,
>> Jiangli
>> 
>> On Mon, Apr 22, 2019 at 10:07 AM Calvin Cheung <calvin.cheung at oracle.com> wrote:
>>> 
>>> Hi Karen,
>>> 
>>> Thanks for your review!
>>> Please see my replies in-line below.
>>> 
>>> On 4/19/19, 9:29 AM, Karen Kinnear wrote:
>>>> Calvin,
>>>> 
>>>> Many thanks for all the work getting this ready, significantly
>>>> enhancing the testing and bug fixes.
>>>> 
>>>> I marked the CSR as reviewed-by - it looks great!
>>>> 
>>>> I reviewed this set of changes - I did not review the tests - I assume
>>>> you can get someone
>>>> else to do that.  I am grateful that Jiangli and Ioi are going to
>>>> review this also - they are much closer to
>>>> the details than I am.
>>>> 
>>>> 1. Do you have any performance numbers?
>>>> 1a. Startup: does using a combined dynamic CDS archive + base archive
>>>> give similar startup benefits
>>>> when you have the same classes in the archives?
>>> Below are some performance numbers from Eric; each number is for 50 runs:
>>> (base: using the default CDS archive,
>>>  test: using the dynamic archive;
>>>  Eric will get some numbers with a single archive, which I think is
>>> what you're looking for)
>>> 
>>> Lambda-noop:
>>> base:
>>> 0.066441427 seconds time elapsed
>>> test:
>>> 0.075428824 seconds time elapsed
>>> 
>>> Noop:
>>> base:
>>> 0.057614537 seconds time elapsed
>>> test:
>>> 0.066061557 seconds time elapsed
>>> 
>>> Netty:
>>> base:
>>> 0.827013307 seconds time elapsed
>>> test:
>>> 0.604982805 seconds time elapsed
>>> 
>>> Spring:
>>> base:
>>> 2.376707358 seconds time elapsed
>>> test:
>>> 1.927618893 seconds time elapsed
>>> 
>>> The first 2 apps only have 2 to 3 classes in the dynamic archive, so the
>>> overhead is likely due to having to open and map the dynamic archive,
>>> perform checks on the header, etc. For small apps, I think it's better to
>>> use a single archive. The Netty app has around 1400 classes in the
>>> dynamic archive; the Spring app has about 3700 classes in the dynamic
>>> archive.
>>> 
>>> I also used our LotsOfClasses test to collect some perf numbers. This is
>>> more like runtime performance, not startup performance.
>>> 
>>> With dynamic archive (100 runs each):
>>> real    2m37.191s
>>> real    2m36.003s
>>> Total loaded classes = 24254
>>> Loaded from base archive = 1186
>>> Loaded from top archive = 23042
>>> Loaded from jrt:/ (runtime module) = 26
>>> 
>>> With single archive (100 runs each):
>>> real    2m38.346s
>>> real    2m36.947s
>>> Total loaded classes = 24254
>>> Loaded from archive = 24228
>>> Loaded from jrt:/ (runtime module) = 26
>>> 
>>>> 
>>>> 1b. Do you have samples of uses of the combined dynamic CDS archive +
>>>> base archive vs. a single
>>>> static archive built for an application?
>>>>    - how do the sets of archived classes differ?
>>> Currently, the default CDS archive contains around 1187 classes. With
>>> the -XX:ArchiveClassesAtExit option, if the classes are not found in the
>>> default CDS archive, they will be archived in the dynamic archive. The
>>> above LotsOfClasses example shows some distributions between various
>>> archives.
>>>>    - one note was that the AtExit approach's exclude list adds anything
>>>> that has not yet been linked - does that make a significant difference in
>>>> the number of classes that are archived? Does that make a difference
>>>> in either startup time or in application execution time? I could see
>>>> that going either way.
>>> As the above numbers indicate, there's not much difference in terms of
>>> execution time using a dynamic vs. a single archive with a large number
>>> of classes loaded. The numbers from the Netty and Spring apps show an
>>> improvement over the default CDS archive.
>>>> 
>>>> 1c. Any sense of performance cost for first run - how much time does
>>>> it take to create an incremental archive?
>>>>    - is the time comparable to an existing dump for a single archive
>>>> for the application?
>>>>    - this is an ease-of-use feature - so we are not expecting that to
>>>> be fast
>>>>    - the point is to set expectations in our documentation
>>> I did some rough measurements with the LotsOfClasses test with around
>>> 15000 classes in the classlist.
>>> 
>>> Dynamic archive dumping (one run each):
>>> real    0m19.756s
>>> real    0m20.241s
>>> 
>>> Static archive dumping (one run each):
>>> real    0m17.725s
>>> real    0m16.993s
>>>> 
>>>> 2. Footprint
>>>> With two archives rather than one, is there a significant footprint
>>>> difference? Obviously this will vary by app and archive.
>>>> Once again, the point is to set expectations.
>>> Sizes of the archives for the LotsOfClasses test in 1a.
>>> 
>>> Single archive: 242962432
>>> Default CDS archive: 12365824
>>> Dynamic archive: 197525504
>>> 
>>>> 
>>>> 3. Runtime performance
>>>> With two sets of archived dictionaries &  symbolTables - is there any
>>>> significant performance cost to larger benchmarks, e.g. for class
>>>> loading lookup for classes that are not in the archives?  Or symbol
>>>> lookup?
>>> I used the LotsOfClasses test again. This time archiving about half of
>>> the classes which will be loaded during runtime.
>>> 
>>> Dynamic archive (10 runs each):
>>> real    0m30.214s
>>> real    0m29.633s
>>> Loaded classes = 24254
>>> Loaded from dynamic archive: 13168
>>> 
>>> Single archive (10 runs each):
>>> real    0m32.383s
>>> real    0m32.905s
>>> Loaded classes = 24254
>>> Loaded from single archive = 15063
>>>> 
>>>> 4. Platform support
>>>> Which platforms is this supported on?
>>>> Which ones did you test? For example, did you run the tests on Windows?
>>> I ran the jtreg tests via mach5 on all 4 platforms (Linux, Mac, Solaris,
>>> Windows).
>>>> 
>>>> Detailed feedback on the code: Just minor comments - I don’t need to
>>>> see an updated webrev:
>>> I'm going to look into your detailed feedback below and may reply in a
>>> separate email.
>>> 
>>> thanks,
>>> Calvin
>>>> 
>>>> 1. metaSpaceShared.hpp
>>>> line 156:
>>>> what is the hardcoded -100 for? Should that be an enum?
>>>> 
>>>> 2. jfrRecorder.cpp
>>>> So JFR recordings are disabled if DynamicDumpSharedSpaces?
>>>> why?
>>>> Is that a future rfe?
>>>> 
>>>> 3. systemDictionaryShared.cpp
>>>> Could you possibly add a comment to add_verification_constraint
>>>> for if (DynamicDumpSharedSpaces)
>>>>   return false
>>>> 
>>>> -- I think the logic is:
>>>>  because we have successfully linked any instanceKlass we archive
>>>> with DynamicDumpSharedSpaces, we have resolved all the constraint classes.
>>>> 
>>>> -- I didn't check the order - is this called before or after
>>>> excluding? If after, then would it make sense to add an assertion
>>>> here is_linked? Then if you ever change how/when linking is done, this
>>>> might catch future errors.
>>>> 
>>>> 4. systemDictionaryShared.cpp
>>>> EstimateSizeForArchive::do_entry
>>>> Is it the case that for info.is_builtin() there are no verification
>>>> constraints? So you could skip that calculation? Or did I misunderstand?
>>>> 
>>>> 5. compactHashtable.cpp
>>>> serialize/header/calculate_header_size
>>>> -- could you dynamically determine size_of header so you don't need
>>>> to hardcode a 5?
>>>> 
>>>> 6. classLoader.cpp
>>>> line 1337: //FIXME: DynamicDumpSharedSpaces and --patch-modules are
>>>> mutually exclusive.
>>>> Can you clarify for me:
>>>> My memory of the base archive is that we do not allow the following
>>>> options at dump time - and these
>>>> are the same for the dynamic archive: —limit-modules,
>>>> —upgrade-module-path, —patch-module.
>>>> 
>>>> I have forgotten:
>>>> Today with UseSharedSpaces - do we allow these flags? Is that also the
>>>> same behavior with the dynamic
>>>> archive?
>>>> 
>>>> 7. classLoaderExt.cpp
>>>> assert line 66: only used with -Xshare:dump
>>>> -> "only used at dump time"
>>>> 
>>>> 8. symbolTable.cpp
>>>> line 473: comment // used by UseSharedArchived2
>>>> — command-line arg name has changed
>>>> 
>>>> 9. filemap.cpp
>>>> Comment lines 529 ...
>>>> Is this true - that you can only support dynamic dumping with the
>>>> default CDS archive? Could you clarify what the restrictions are?
>>>> The CSR implies you can support “a specific base CDS archive"
>>>>  - so base layer can not have appended boot class path
>>>>  - and base layer can't have a module path
>>>> 
>>>> What can you specify for the dynamic dumping relative to the base archive?
>>>>  - matching class path?
>>>>  - appended class path?
>>>>  in future - could it have a module path that matched the base archive?
>>>> 
>>>> Should any of these restrictions be clarified in documentation/CSR
>>>> since they appear to be new?
>>>> 
>>>> 10. filemap.cpp
>>>> check_archive
>>>> Do some of the return false paths skip performing os::close(fd)?
>>>> 
>>>> and get_base_archive_name_from_header
>>>> Does the first return false path fail to os::free(dynamic_header)
>>>> 
>>>> lines 753-754: two FIXME comments
>>>> 
>>>> Could you delete commented out line 1087 in filemap.cpp ?
>>>> 
>>>> 11. filemap.hpp
>>>> line 214: TODO left in
>>>> 
>>>> 12. metaspace.cpp
>>>> line 1418 FIXME left in
>>>> 
>>>> 13. java.cpp
>>>> FIXME: is this the right place?
>>>> For starting the DynamicArchive::dump
>>>> 
>>>> Please check with David Holmes on that one
>>>> 
>>>> 14. dynamicArchive.hpp
>>>> line 55 (and others): MetsapceObj -> MetaspaceObj
>>>> 
>>>> 15. dynamicArchive.cpp
>>>> line 285 rel-ayout -> re-layout
>>>> 
>>>> lines 277 && 412
>>>> Do we archive array klasses in the base archive but not in the dynamic
>>>> archive?
>>>> Is that a potential RFE?
>>>> Is it possible that GatherKlassesAndSymbols::do_unique_ref could be
>>>> called with an array class?
>>>> Same question for copy_impl?
>>>> 
>>>> line 934: "no onger" -> "no longer"
>>>> 
>>>> 16. What is AllowArchivingWithJavaAgent? Is that a hook for a
>>>> potential future rfe?
>>>> Do you want to check in that code at this time? In product?
>>>> 
>>>> thanks,
>>>> Karen
>>>> 
>>>> 
>>>>> On Apr 11, 2019, at 5:18 PM, Calvin Cheung <calvin.cheung at oracle.com> wrote:
>>>>> 
>>>>> This is a follow-up on the preliminary code review sent by Jiangli in
>>>>> January[1].
>>>>> 
>>>>> Highlights of changes since then:
>>>>> 1. New vm option for dumping a dynamic archive
>>>>> (-XX:ArchiveClassesAtExit=<dynamic archive>) and enhancement to the
>>>>> existing -XX:SharedArchiveFile option. Please refer to the
>>>>> corresponding CSR[2] for details.
>>>>> 2. New way to run existing AppCDS tests in dynamic CDS archive mode.
>>>>> At the jtreg command line, the user can run many existing AppCDS
>>>>> tests in dynamic CDS archive mode by specifying the following:
>>>>>   -vmoptions:-Dtest.dynamic.cds.archive=true
>>>>> <jdk>/open/test/hotspot/jtreg:hotspot_appcds_dynamic
>>>>>  We will have a follow-up RFE to determine in which tier the above
>>>>> tests should be run.
>>>>> 3. Added more tests.
>>>>> 4. Various bug fixes to improve stability.
>>>>> 
>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8207812
>>>>> webrev:
>>>>> http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/webrev.00/
>>>>> 
>>>>> (The webrev is based on top of the following rev:
>>>>> http://hg.openjdk.java.net/jdk/jdk/rev/805584336738)
>>>>> 
>>>>> Testing:
>>>>>   - mach5 tiers 1- 3 (including the new tests)
>>>>>   - AppCDS tests in dynamic CDS archive mode on linux-x64 (a few
>>>>> tests require more investigation)
>>>>> 
>>>>> thanks,
>>>>> Calvin
>>>>> 
>>>>> [1]
>>>>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-January/032176.html
>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8221706

