RFR: 8207812: Implement Dynamic CDS Archive
Jiangli Zhou
jianglizhou at google.com
Fri Apr 26 14:36:15 UTC 2019
Karen,
Please see comments below.
On Thu, Apr 25, 2019 at 10:35 AM Karen Kinnear <karen.kinnear at oracle.com> wrote:
>
> Jiangli,
>
> Thank you for thinking so carefully about this. You ask great questions.
>
> One detailed question that arose for me from your email was:
> What happens today if you specify -XX:ArchiveClassesAtExit and the archive
> already exists? Is it the case that we delete the existing file before opening the new one?
Yes.
> Just want to make sure we document that.
>
> Here is my understanding of where we are thinking of going - folks please correct anything -
> of course as we learn more this will evolve.
>
> 1. ArchiveClassesAtExit
> This is potentially an intermediate step, which as you point out is only here to create an archive
> at the end of an execution.
> We don’t yet have enough experience with this and with a potential future continuous dumping
> mode to know if this mode will still be useful to customers when more advanced modes are
> available. My personal sense is that this could be a useful model even long-term.
>
> 2. I agree that we have talked about longer-term -
> a) incremental archiving additional loaded classes
> - there are a lot of ways to do this
> - possibly creating a separate dynamic archive, archiving as you go rather than at exit
> - possibly creating another layer of dynamic archive
> - possibly updating the current dynamic archive and being the only one who can read/write it
> - I’m not sure if this is needed or the exit model is sufficient (and possibly more efficient?)
> - prototyping could help us learn this
> b) possibilities of sharing an archive while it is being updated
> - I confess I am not excited about the concurrency complexities here - so not sure it is worth it
> - less useful with current models of Cloud usage
> c) a single command that could create an archive if it does not exist and use it if one does
> d) possibly making the default be to generate a dynamic archive if it is not available and use if it is
> - this could be built on archive at exit model or on incremental archive
> - I think this is a reasonable model to explore - customers today expect the first execution of applications
> to be slower and the dynamic linker’s caching will make second runs faster
> - I think we need more field experience and user feedback before we go here
>
> I do appreciate your link to mmap.
> I agree we don’t handle that today. I believe we don’t need to support it unless we explicitly want
> concurrent reader/writers. Am I correctly hearing what you are saying?
>
> So I think the model is to add ArchiveClassesAtExit now, and to reserve the more flexible
> command-line argument for when we move to more automation.
>
> I don’t think we know yet whether that would use a dumpatexit or an incremental model.
> I would rather we save the automation model until we know more about where we are going, rather
> than use it for this step.
>
> At that point, try to find a command-line argument that has a create-if-does-not-exist/use-if-does-exist model
> (perhaps -Xshare:dynamic or -Xshare:reallyauto …).
>
> My translation here is that we are all aiming in the same direction, the discussion is really about
> how to phase this.
>
> Fair?
Sounds reasonable.
Thanks and regards,
Jiangli
>
> I share a sadness that we are not already there.
>
> However I am actually quite excited about where we are - many thanks to you and Calvin and Ioi for years of design and implementation!
>
> thanks,
> Karen
>
> On Apr 25, 2019, at 12:40 PM, Jiangli Zhou <jianglizhou at google.com> wrote:
>
> Karen,
>
> On Tue, Apr 23, 2019 at 11:16 AM Karen Kinnear <karen.kinnear at oracle.com> wrote:
>
>
> Calvin,
>
> I added to the CSR a comment from my favorite customer - relative to the user model for the command-line flags.
> He likes the proposal to reduce the number of steps a customer has to perform to get startup and footprint benefits
> from the archive.
>
> The comment was that it would be very helpful if the user only needed to change their scripts once - so
> a single command-line argument would create a dynamic archive if one did not exist, and use it if it
> already existed.
>
>
> This is a very plausible idea and is in the same direction as the
> multi-stage rollout toward the end goal of dynamic archiving, which is
> making archive generation/usage completely transparent and
> automatic (no command-line option is needed to generate or use the
> dynamic archive).
>
> Using a single command-line option to control the creation of the
> archive (when one doesn't exist) and the use of the archive (when one
> already exists) is needed for cases when users want more control.
>
> The following is copied from
> http://pubs.opengroup.org/onlinepubs/7908799/xsh/mmap.html. The
> behavior is unspecified if the underlying file changes after mmap is
> established. Currently, we do not handle that case. The dynamic
> archive simply follows the usage model of the existing static archive,
> with separate dump-time and runtime steps, which however doesn't
> preclude the case. In the current usage model, concurrent access is
> unlikely. With creation and use controlled by a single command-line
> option, concurrent access becomes more likely and we probably want to
> handle the issue.
>
> "If the size of the mapped file changes after the call to mmap() as a
> result of some other operation on the mapped file, the effect of
> references to portions of the mapped region that correspond to added
> or removed portions of the file is unspecified."
>
> The current flag, ArchiveClassesAtExit, doesn't scale and can't handle
> the single-option-controlled creation/use case. It would be a good
> idea to rethink the command-line option now and change it before the
> first integration so that it can scale. -Xshare:dump/on/auto/dynamic
> might serve the purpose: 'dynamic' can trigger just dynamic dumping
> for now, and can later be augmented to support the use case that you
> described above.
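For reference, the two-step usage under discussion (flag spellings as proposed in the CSR; `app.jar` and `App` are placeholders):

```shell
# First run: record the classes the app loads and dump them into a
# dynamic archive (layered on the default CDS archive) at JVM exit.
java -XX:ArchiveClassesAtExit=app-top.jsa -cp app.jar App

# Later runs: map the dynamic archive; its header locates the base archive.
java -XX:SharedArchiveFile=app-top.jsa -cp app.jar App
```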
>
> Best regards,
> Jiangli
>
>
> Is there a way to evolve the ArchiveClassesAtExit=<dynamic archive> to have that functionality?
>
> thanks,
> Karen
>
> p.s. I think it makes more sense to put performance numbers in the implementation RFE comments rather than the JEP
> comments
>
> On Apr 22, 2019, at 5:16 PM, Jiangli Zhou <jianglizhou at google.com> wrote:
>
> Hi Calvin,
>
> Can you please also publish the final performance numbers in the JEP 350 (or the implementation RFE) comment section?
>
> Thanks,
> Jiangli
>
> On Mon, Apr 22, 2019 at 10:07 AM Calvin Cheung <calvin.cheung at oracle.com> wrote:
>
>
> Hi Karen,
>
> Thanks for your review!
> Please see my replies in-line below.
>
> On 4/19/19, 9:29 AM, Karen Kinnear wrote:
>
> Calvin,
>
> Many thanks for all the work getting this ready, significantly
> enhancing the testing and bug fixes.
>
> I marked the CSR as reviewed-by - it looks great!
>
> I reviewed this set of changes - I did not review the tests - I assume
> you can get someone
> else to do that. I am grateful that Jiangli and Ioi are going to
> review this also - they are much closer to
> the details than I am.
>
> 1. Do you have any performance numbers?
> 1a. Startup: does using a combined dynamic CDS archive + base archive
> give similar startup benefits
> when you have the same classes in the archives?
>
> Below are some performance numbers from Eric, each number is for 50 runs:
> (base: using the default CDS archive,
> test: using the dynamic archive,
> Eric will get some numbers with a single archive, which I think is
> what you're looking for)
>
> Lambda-noop:
> base:
> 0.066441427 seconds time elapsed
> test:
> 0.075428824 seconds time elapsed
>
> Noop:
> base:
> 0.057614537 seconds time elapsed
> test:
> 0.066061557 seconds time elapsed
>
> Netty:
> base:
> 0.827013307 seconds time elapsed
> test:
> 0.604982805 seconds time elapsed
>
> Spring:
> base:
> 2.376707358 seconds time elapsed
> test:
> 1.927618893 seconds time elapsed
>
> The first 2 apps only have 2 to 3 classes in the dynamic archive. So the
> overhead is likely due to having to open and map the dynamic archive and
> perform checks on the header, etc. For small apps, I think it's better to
> use a single archive. The Netty app has around 1400 classes in the
> dynamic archive; the Spring app has about 3700 classes in the dynamic
> archive.
>
> I also used our LotsOfClasses test to collect some perf numbers. This is
> more like runtime performance, not startup performance.
>
> With dynamic archive (100 runs each):
> real 2m37.191s
> real 2m36.003s
> Total loaded classes = 24254
> Loaded from base archive = 1186
> Loaded from top archive = 23042
> Loaded from jrt:/ (runtime module) = 26
>
> With single archive (100 runs each):
> real 2m38.346s
> real 2m36.947s
> Total loaded classes = 24254
> Loaded from archive = 24228
> Loaded from jrt:/ (runtime module) = 26
>
>
> 1b. Do you have samples of uses of the combined dynamic CDS archive +
> base archive vs. a single
> static archive built for an application?
> - how do the sets of archived classes differ?
>
> Currently, the default CDS archive contains around 1187 classes. With
> the -XX:ArchiveClassesAtExit option, if the classes are not found in the
> default CDS archive, they will be archived in the dynamic archive. The
> above LotsOfClasses example shows some distributions between various
> archives.
>
> - one note was that the AtExit approach's exclude list adds anything
> that has not yet been linked - does that make a significant difference in
> the number of classes that are archived? Does that make a difference
> in either startup time or in application execution time? I could see
> that going either way.
>
> As the above numbers indicated, there's not much difference in terms of
> execution time using a dynamic vs a single archive with a large number
> of classes loaded. The numbers from the Netty and Spring apps show an
> improvement over the default CDS archive.
>
>
> 1c. Any sense of performance cost for first run - how much time does
> it take to create an incremental archive?
> - is the time comparable to an existing dump for a single archive
> for the application?
> - this is an ease-of-use feature - so we are not expecting that to
> be fast
> - the point is to set expectations in our documentation
>
> I did some rough measurements with the LotsOfClasses test with around
> 15000 classes in the classlist.
>
> Dynamic archive dumping (one run each):
> real 0m19.756s
> real 0m20.241s
>
> Static archive dumping (one run each):
> real 0m17.725s
> real 0m16.993s
>
>
> 2. Footprint
> With two archives rather than one, is there a significant footprint
> difference? Obviously this will vary by app and archive.
> Once again, the point is to set expectations.
>
> Sizes of the archives for the LotsOfClasses test in 1a.
>
> Single archive: 242962432
> Default CDS archive: 12365824
> Dynamic archive: 197525504
>
>
> 3. Runtime performance
> With two sets of archived dictionaries & symbolTables - is there any
> significant performance cost to larger benchmarks, e.g. for class
> loading lookup for classes that are not in the archives? Or symbol
> lookup?
>
> I used the LotsOfClasses test again. This time archiving about half of
> the classes that will be loaded at runtime.
>
> Dynamic archive (10 runs each):
> real 0m30.214s
> real 0m29.633s
> Loaded classes = 24254
> Loaded from dynamic archive: 13168
>
> Single archive (10 runs each):
> real 0m32.383s
> real 0m32.905s
> Loaded classes = 24254
> Loaded from single archive = 15063
>
>
> 4. Platform support
> Which platforms is this supported on?
> Which ones did you test? For example, did you run the tests on Windows?
>
> I ran the jtreg tests via mach5 on all 4 platforms (Linux, Mac, Solaris,
> Windows).
>
>
> Detailed feedback on the code: Just minor comments - I don’t need to
> see an updated webrev:
>
> I'm going to look into your detailed feedback below and may reply in a
> separate email.
>
> thanks,
> Calvin
>
>
> 1. metaSpaceShared.hpp
> line 156:
> what is the hardcoded -100 for? Should that be an enum?
>
> 2. jfrRecorder.cpp
> So JFR recordings are disabled if DynamicDumpSharedSpaces?
> why?
> Is that a future rfe?
>
> 3. systemDictionaryShared.cpp
> Could you possibly add a comment to add_verification_constraint
> for if (DynamicDumpSharedSpaces)
> return false
>
> -- I think the logic is:
> because we have successfully linked any instanceKlass we archive
> with DynamicDumpSharedSpaces, we have resolved all the constraint classes.
>
> -- I didn't check the order - is this called before or after
> excluding? If after, then would it make sense to add an assertion
> here is_linked? Then if you ever change how/when linking is done, this
> might catch future errors.
>
> 4. systemDictionaryShared.cpp
> EstimateSizeForArchive::do_entry
> Is it the case that for info.is_builtin() there are no verification
> constraints? So you could skip that calculation? Or did I misunderstand?
>
> 5. compactHashtable.cpp
> serialize/header/calculate_header_size
> -- could you dynamically determine size_of header so you don't need
> to hardcode a 5?
>
> 6. classLoader.cpp
> line 1337: //FIXME: DynamicDumpSharedSpaces and --patch-modules are
> mutually exclusive.
> Can you clarify for me:
> My memory of the base archive is that we do not allow the following
> options at dump time - and these
> are the same for the dynamic archive: --limit-modules,
> --upgrade-module-path, --patch-module.
>
> I have forgotten:
> Today with UseSharedSpaces - do we allow these flags? Is that also the
> same behavior with the dynamic
> archive?
>
> 7. classLoaderExt.cpp
> assert line 66: only used with -Xshare:dump
> -> "only used at dump time"
>
> 8. symbolTable.cpp
> line 473: comment // used by UseSharedArchived2
> -- command-line arg name has changed
>
> 9. filemap.cpp
> Comment lines 529 ...
> Is this true - that you can only support dynamic dumping with the
> default CDS archive? Could you clarify what the restrictions are?
> The CSR implies you can support “a specific base CDS archive"
> - so base layer can not have appended boot class path
> - and base layer can't have a module path
>
> What can you specify for the dynamic dumping relative to the base archive?
> - matching class path?
> - appended class path?
> in future - could it have a module path that matched the base archive?
>
> Should any of these restrictions be clarified in documentation/CSR
> since they appear to be new?
>
> 10. filemap.cpp
> check_archive
> Do some of the return false paths skip performing os::close(fd)?
>
> and get_base_archive_name_from_header
> Does the first return false path fail to os::free(dynamic_header)
>
> lines 753-754: two FIXME comments
>
> Could you delete commented out line 1087 in filemap.cpp ?
>
> 11. filemap.hpp
> line 214: TODO left in
>
> 12. metaspace.cpp
> line 1418 FIXME left in
>
> 13. java.cpp
> FIXME: is this the right place?
> For starting the DynamicArchive::dump
>
> Please check with David Holmes on that one
>
> 14. dynamicArchive.hpp
> line 55 (and others): MetsapceObj -> MetaspaceObj
>
> 15. dynamicArchive.cpp
> line 285 rel-ayout -> re-layout
>
> lines 277 && 412
> Do we archive array klasses in the base archive but not in the dynamic
> archive?
> Is that a potential RFE?
> Is it possible that GatherKlassesAndSymbols::do_unique_ref could be
> called with an array class?
> Same question for copy_impl?
>
> line 934: "no onger" -> "no longer"
>
> 16. What is AllowArchivingWithJavaAgent? Is that a hook for a
> potential future rfe?
> Do you want to check in that code at this time? In product?
>
> thanks,
> Karen
>
>
> On Apr 11, 2019, at 5:18 PM, Calvin Cheung <calvin.cheung at oracle.com
> <mailto:calvin.cheung at oracle.com>> wrote:
>
> This is a follow-up on the preliminary code review sent by Jiangli in
> January[1].
>
> Highlights of changes since then:
> 1. New vm option for dumping a dynamic archive
> (-XX:ArchiveClassesAtExit=<dynamic archive>) and enhancement to the
> existing -XX:SharedArchiveFile option. Please refer to the
> corresponding CSR[2] for details.
> 2. New way to run existing AppCDS tests in dynamic CDS archive mode.
> At the jtreg command line, the user can run many existing AppCDS
> tests in dynamic CDS archive mode by specifying the following:
> -vmoptions:-Dtest.dynamic.cds.archive=true
> <jdk>/open/test/hotspot/jtreg:hotspot_appcds_dynamic
> We will have a follow-up RFE to determine in which tier the above
> tests should be run.
> 3. Added more tests.
> 4. Various bug fixes to improve stability.
>
> RFE: https://bugs.openjdk.java.net/browse/JDK-8207812
> webrev:
> http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/webrev.00/
> <http://cr.openjdk.java.net/%7Eccheung/8207812_dynamic_cds_archive/webrev.00/>
>
> (The webrev is based on top of the following rev:
> http://hg.openjdk.java.net/jdk/jdk/rev/805584336738)
>
> Testing:
> - mach5 tiers 1- 3 (including the new tests)
> - AppCDS tests in dynamic CDS archive mode on linux-x64 (a few
> tests require more investigation)
>
> thanks,
> Calvin
>
> [1]
> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-January/032176.html
> [2] https://bugs.openjdk.java.net/browse/JDK-8221706
>
>
More information about the hotspot-runtime-dev
mailing list