From thomas.schatzl at oracle.com Mon Jan 8 16:45:04 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 08 Jan 2018 17:45:04 +0100
Subject: EpsilonGC and throughput.
In-Reply-To: <304ef832-5e17-b53e-8b31-7c26fcb8c593@redhat.com>
References: <3d50e13e-3324-7058-3c8c-725a52207074@oracle.com> <1162631c-1ebc-f562-f096-d1ac861a42a4@redhat.com> <37071ceb-d21b-cf96-2dea-87da409b2e68@oracle.com> <1513781182.2415.29.camel@oracle.com> <07b29153-f01f-f1b5-18bc-9c38caaaf7eb@redhat.com> <1513805149.2542.112.camel@oracle.com> <304ef832-5e17-b53e-8b31-7c26fcb8c593@redhat.com>
Message-ID: <1515429904.2381.208.camel@oracle.com>

Hi Aleksey,

I apologize for my somewhat inappropriate words; they were due to some frustration. I also apologize for the long delay, which was due to the winter holidays.

Let's try to start all over with this... I will try to be constructive this time. Feel free to remind me if needed.

One purpose of the JEP is to share a problem and propose an idea (often already accompanied by a solution) to solve it. The problem and the idea are then discussed by the community, which eventually refines them along the way.

The community then evaluates that idea based on its contents, of course starting with the people trying to determine whether there is a problem, what the problem is, and whether the proposed idea will fix the problem.

For this evaluation to happen, the JEP needs to clearly state the problem, its seriousness, and the proposed idea.

It also helps if the JEP is written in a way that makes it interesting for the community to read and respond to. The less thinking a reader has to do to answer whether he is impacted or not, and whether and by how much it would simplify his own life or that of Java users in general, the more people will feel urged to get this in (or at least not be deterred).

Finally, I assume you do understand that, although there is always a certain level of duplication in the VM, a change is in general a hard(er) sell if it only solves problems that existing code already solves, or solves problems almost nobody has, or does not give enough benefit (also depending on the complexity of the change)?

So the JEP template (http://openjdk.java.net/jeps/2) provides some questions on how to structure this idea proposal and what to put into the various sections. In general this is to help you provide the relevant information to the community. While this might look onerous for a writer at first glance, it saves everyone else lots of time trying to find out what you want to solve and how.

I am going over the Motivation section in detail in the remainder of this email, with some comments at the end about the Alternatives one; these seem to be the most important here.

The JEP template states under the Motivation section:

"Motivation
----------

// Why should this work be done? What are its benefits? Who's asking
// for it? How does it compare to the competition, if any?"

Now let me try to associate these questions with the relevant parts of the existing JEP 318 (http://openjdk.java.net/jeps/318) text.

And please, before reading below: I really do not want to shoot down the proposal if you see a question mark. It should just indicate that there is a question to which I honestly do not know the answer, but which I hope you do. Similarly, if I raise some concerns about some statements, I expect you to notice that there may be something missing here, nothing else. I.e. not necessarily that I am "right" about something.

You said you already talked about it many times with other people in the field, thought it over for a long time, so hopefully these questions can be answered quickly, and in the future the JEP also contains this information for other people too.

Some may not need an answer as they only try to make you think about the seriousness of a stated problem. JEP text: "Java implementations are well known for a broad choice of highly configurable GC implementations." Potential answer to "Why should this work be done?". Or does the sentence indicate we need another GC because we already have so many, and another does not hurt? I am asking this in full seriousness, I really do not know. Or is this only an introductory sentence without meaning? JEP text: "There are four use cases where a trivial no-op GC proves useful." This seems to be a transition sentence, but is fine to make it flow better. Reading this, and given that only a list of benefits follows, I assume that these two sentences were supposed to answer the "Why should this work be done? Who's asking for it?" questions from the JEP. In the earlier email you mentioned these power users that want full control. Mention them here. Define them. Also mention other user groups that might be interested. Particularly groups the benefits list could refer to. Let's go into these benefits in more detail: JEP text: "Performance testing. Having a GC that does almost nothing is a useful tool to do differential performance analysis for other, real GCs. Having a no-op GC can help to filter out GC-induced performance artifacts." Benefit. Maybe it would be useful to list a few of these performance artifacts here ("... , e.g. barrier code, concurrent threads"). Who are the benefactors of this? Not sure about these "power users" (see M. Berger's response in this exact thread). Probably developers of new GC algorithms? An alternative could be a developer just nop'ing out the relevant GC interface section. That is somewhat cumbersome, but for how many users is this a problem? Spell that out in the appropriate Alternatives section. 
Also tell that using Epsilon GC for barrier testing may not be an ideal tool, because all other existing collectors are generational (but in the future it might apply to Shenandoah unless it goes generational too, idk), and testing generational barriers on a non-generational heap may not give a complete picture of barrier overhead. JEP text: "Functional testing. For Java code testing, a way to establish a threshold for allocated memory is useful to assert memory pressure invariants. Today, we have to pick up the allocation data from MXBeans, or even resort to parsing GC logs. Having a GC that accepts only the bounded number of allocations, and fails on heap exhaustion, simplifies testing." Benefit. For regression testing, in how many cases do you think it is sufficient (or in what circumstances) to get a fail/no-fail answer only? This seems to pass work on a failure to the dev, them needing to write another test that also prints and monitors the memory usage increases over time anyway. How much work, given that you already need to monitor memory usage is the test to fail when heap usage goes above a threshold then? "VM interface testing. For VM development purposes, having a simple GC helps to understand the absolute minimum required from the VM-GC interface to have a functional allocator. This serves as proof that the VM-GC interface is sane, which is important in lieu of JEP 304 ("Garbage Collector Interface")." Benefit. Who are the (main) benefactors for that - probably developers? For a developer, how much is that benefit if there are already 5 or 6 implementations of that interface? "Last-drop performance improvements. For ultra-latency-sensitive applications, where developers are conscious about memory allocations and know the application memory footprint exactly, or even have (almost) completely garbage-free applications. In those applications, GC cycles may be considered an implementation bug that wastes CPU cycles for no good reason." 
This is the only benefit in this list that actually mentions its target group. I assume it is those power users (not necessarily developers only?) who are ultra-latency aware. This paragraph further characterizes them as also being throughput conscious. The discussion earlier also characterized them as being very conscious about memory layout etc.; they do not want object reordering because it is inconsistent between GCs (which is a different issue, and I do not want to discuss it here).

From what I gathered so far, they want absolute control over memory management - but the real question is whether this is their real or only problem with the Java VM in achieving consistent VM behavior. There are certainly more components in the VM that introduce potentially more significant jitter (now assuming that that power user can set heap sizes accordingly to use e.g. Serial GC).

This execution consistency is maybe another goal that is even more important than last-drop performance.

It may be useful to investigate the problem of these power users in more detail, and see if we could provide a (more?) complete solution for them.

"Extremely short lived jobs are one example of this."

I do not understand the use of Epsilon in such a use case. The alternative I can see would be to restart the VM after every short lived job (something for the Alternatives section). That seems strange to me: depending on the definition of a "short lived job", and particularly if nothing survives after execution of that short lived job, a GC will be extremely fast.

Further I assume this example is about FaaS (Function-as-a-Service) and its users, and while there may be an overlap with those "power users", I would expect the "regular Java users" to be a way larger group than the power users. Even where the groups overlap, power users probably would not want to incur the associated loss of control.

"There are also cases when restarting the JVM -- letting load balancers figure out failover -- is sometimes a better recovery strategy than accepting a GC cycle." I really can't find a good example where a GC, particularly in the situation that has been described so far, also for these short-lived jobs, where a GC (on an almost empty heap) is not at least as fast as a restart. It would make for a very good paragraph explaining this use case in the alternatives section. Another problem with these two sentences to me is (and I am by no means a "FaaS power user") that I believe that waiting for the VM to crash/shut down to steer the load balancers is not a good strategy. Maybe you can give some more information about this use case? "Even for non-allocating workloads, the choice of GC means choosing the set of GC barriers that the workload has to use, even if no GC cycle actually happens. Most JDK GCs are generational, and they emit at least one reference write barrier. Avoiding this barrier brings the last bit of performance improvement." (_All_ JDK GCs are currently generational) Now, as mentioned earlier in the thread, when talking about performance improvements, it would be nice to mention the potential gains that can be made (or elsewhere, like in the alternatives section). There is already an implementation, and so you can measure this too. Please make your comparison in context: since this whole paragraph is about last-drop performance improvements for power users, a balanced comparison would probably only be a comparison that such a power user would do - i.e. not running the VM with randomly selected default options that arbitrarily penalizes your competition. In the earlier email I only directly asked for performance numbers because in order to streamline this discussion, and given that you are a well-known performance and benchmark guru (afaik you were "R"eviewer long before me joining) it seemed a logical request. 
If you can't find numbers, there is also the reference ("Barriers, Friendlier Still" or so from Blackburn et al I think) I got that is also mentioned iirc in the very good Jones GC book. "Real" newbies I would just ask to perform this test. In our discussion we found at least one more, actually unique benefit (the one about getting correct heap dumps on failure). Of course there is a limit on the length of that section and others (i.e. considering the attention span of your readership), but all questions asked by the JEP template should be answered in the corresponding section. There is some intentional overlap in the JEP, particularly in the first three sections, similar to a scientific paper so that different groups of readers need only read the sections they are interested in to see whether this change is actually affecting them (and interesting to follow). It shouldn't be as long as a scientific paper though, so if you think a section is too long, drop the less impactful benefits, and other parts of the JEP will automatically follow. Again, given your experience with the VM I assume you know alternatives as good or even better than me to make a balanced assessment here. Otherwise, keep them and please raise specific questions. As for the Alternatives section, it is the same procedure, start with answering the questions raised in the template: "Alternatives ------------ // Did you consider any alternative approaches or technologies? If so // then please describe them here and explain why they were not // chosen." I would assume that for all of these benefits we can easily come up with alternative ways of doing the same or a similar thing (I already stated a few alternatives that I think are very valid in this or previous emails; some valid ones are already in the JEP), and why we would want to particularly do it this way given the context of that benefit (e.g. the user group). If there is no alternative, add a sentence that says so in that section. 
Again, try to make this review of alternatives balanced, and in the context of the users the benefit is for.

This section should imho also include a discussion of "mostly complete alternatives", as suggested in this email thread already, e.g. adding a -XX:+DieOnFirstGC switch, and reasons for and against it.

Please understand that the JEP will be the reference to talk about, not some email or private offline discussions. Keeping that in mind I think discussions will go much smoother.

I hope I made clear now why I, unfortunately not in a very friendly way (apologies again), suggested that the current JEP text lacks the required answers to the questions stated in the JEP template, in order to (re-)start a hopefully more focused discussion.

Thanks,
  Thomas

From kim.barrett at oracle.com Mon Jan 8 19:23:59 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 8 Jan 2018 14:23:59 -0500
Subject: RFR (S): 8193063: Enabling narrowOop values for RawAccess accesses
In-Reply-To: <5A26B20E.8080701@oracle.com>
References: <5A26B20E.8080701@oracle.com>
Message-ID:

> On Dec 5, 2017, at 9:49 AM, Erik Österlund wrote:
>
> Hi,
>
> In order to replace oopDesc::load_heap_oop() but with stricter memory ordering properties, like MO_VOLATILE, some Access tweaks are required to allow RawAccess<>::oop_load() to return narrowOop values.
>
> I made the necessary changes to allow narrowOop values for all RawAccess operations that have an address (not the _at variants).
>
> While I am at it, I thought I'd clean up a few things that bug me:
>
> * The decorator verification for memory ordering specifically did not work as I originally intended, leading to harder to decipher compiler errors deeper down when using the wrong memory ordering decorators

I kind of wish that had been dealt with separately. Oh well.

> * An unnecessary include of oop.inline.hpp was removed from the G1 barrier set.
>
> This change helps solving the following bug:
> https://bugs.openjdk.java.net/browse/JDK-8129440
>
> This bug is filed under:
> https://bugs.openjdk.java.net/browse/JDK-8193063
>
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8193063/webrev.00/
>
> Thanks,
> /Erik

------------------------------------------------------------------------------
src/hotspot/share/oops/access.inline.hpp
 525 typedef RawAccessBarrier Raw;
 582 typedef RawAccessBarrier Raw;

I don't see any use of these Raw types. They look like copy-paste as
part of splitting into two specializations.

------------------------------------------------------------------------------
src/hotspot/share/oops/access.inline.hpp

Too bad we don't have C++17. I think making can_hardwire_raw constexpr
and using the new "if constexpr" syntax would have been sufficient.

------------------------------------------------------------------------------
src/hotspot/share/oops/access.inline.hpp

I was initially thinking I wanted names for these two expressions,
since they are repeated a bunch of times:

  HasDecorator::value && CanHardwireRaw::value
  HasDecorator::value && !CanHardwireRaw::value

But I think what I really want is the better function template SFINAE
syntax allowed by C++11, e.g.

  template ENABLE_IF(HasDecorator),
  DISABLE_IF(CanHardwireRaw)>
  inline static T store(...) ...

------------------------------------------------------------------------------
src/hotspot/share/oops/access.inline.hpp
 796 // Step 2: Reduce types.

This comment looks like it needs updating for the narrowOop value
support.

------------------------------------------------------------------------------
src/hotspot/share/oops/accessBackend.hpp
 176 AccessInternal::MustConvertCompressedOop::value,

Wrong indentation.

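For readers unfamiliar with the two SFINAE styles contrasted in the review comment above, the following is a minimal sketch and not HotSpot code. `HasFlag`, `CanHardwire`, `FLAG_RAW`, `FLAG_HARDWIRE`, `store_old` and `store_new` are hypothetical stand-ins for the real `HasDecorator`, `CanHardwireRaw` and `store`; `std::enable_if` is used in both halves only for brevity (a C++03 code base would need its own equivalent), and the `ENABLE_IF`/`DISABLE_IF` macros mentioned in the review would presumably wrap something like the defaulted template parameters shown in the second half.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins; the real HotSpot traits are defined differently.
typedef std::uint64_t DecoratorSet;
const DecoratorSet FLAG_RAW      = 1u << 0;
const DecoratorSet FLAG_HARDWIRE = 1u << 1;

template <DecoratorSet ds, DecoratorSet flag>
struct HasFlag : std::integral_constant<bool, (ds & flag) != 0> {};

template <DecoratorSet ds>
struct CanHardwire : HasFlag<ds, FLAG_HARDWIRE> {};

// Old-school style: the whole condition is repeated in the return type
// of every overload.
template <DecoratorSet ds>
typename std::enable_if<HasFlag<ds, FLAG_RAW>::value && CanHardwire<ds>::value, int>::type
store_old() { return 1; }  // hardwired path

template <DecoratorSet ds>
typename std::enable_if<HasFlag<ds, FLAG_RAW>::value && !CanHardwire<ds>::value, int>::type
store_old() { return 2; }  // non-hardwired path

// C++11 style: each condition becomes its own defaulted template
// parameter, which leaves the return type and argument list untouched.
template <DecoratorSet ds,
          typename std::enable_if<HasFlag<ds, FLAG_RAW>::value, int>::type = 0,
          typename std::enable_if<CanHardwire<ds>::value, int>::type = 0>
int store_new() { return 1; }  // hardwired path

template <DecoratorSet ds,
          typename std::enable_if<HasFlag<ds, FLAG_RAW>::value, int>::type = 0,
          typename std::enable_if<!CanHardwire<ds>::value, int>::type = 0>
int store_new() { return 2; }  // non-hardwired path
```

In both styles exactly one overload survives substitution for a given decorator set, so a call such as `store_new<FLAG_RAW>()` resolves at compile time without any runtime branch.
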
------------------------------------------------------------------------------ From thomas.schatzl at oracle.com Tue Jan 9 09:52:12 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 09 Jan 2018 10:52:12 +0100 Subject: [10] RFR (M/L): 8137099: G1 needs to "upgrade" GC within the safepoint if it can't allocate during that safepoint to avoid OoME In-Reply-To: <1513092283.2401.0.camel@oracle.com> References: <1512403004.2943.13.camel@oracle.com> <9419C346-98E6-4E50-829C-245B1C9B9C9B@amazon.com> <1512990447.2882.10.camel@oracle.com> <8941a386-dc2d-87ae-7e43-9f24d8513c7a@oracle.com> <1513092283.2401.0.camel@oracle.com> Message-ID: <1515491532.4041.6.camel@oracle.com> Hi all, Erik Duveblad had some offline comments: New webrevs: http://cr.openjdk.java.net/~tschatzl/8137099/webrev.2_to_3/ (diff) http://cr.openjdk.java.net/~tschatzl/8137099/webrev.3/ (full) - inverting some conditions in the clauses to read better - extract out the condition to do the maximally compacting full gc added in this change into a separate method. Erik Duveblad also noted that this change contains some slight behavioral change in when a collection is started. I.e. previously TLAB allocation by itself could not cause a GC. Since this change is already quite big, he suggested to fix this in a follow-up, and push them together. 
Thanks, Thomas From thomas.schatzl at oracle.com Tue Jan 9 09:54:23 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 09 Jan 2018 10:54:23 +0100 Subject: [10] RFR (M/L): 8137099: G1 needs to "upgrade" GC within the safepoint if it can't allocate during that safepoint to avoid OoME In-Reply-To: <1515491532.4041.6.camel@oracle.com> References: <1512403004.2943.13.camel@oracle.com> <9419C346-98E6-4E50-829C-245B1C9B9C9B@amazon.com> <1512990447.2882.10.camel@oracle.com> <8941a386-dc2d-87ae-7e43-9f24d8513c7a@oracle.com> <1513092283.2401.0.camel@oracle.com> <1515491532.4041.6.camel@oracle.com> Message-ID: <1515491663.4041.7.camel@oracle.com> Hi again, On Tue, 2018-01-09 at 10:52 +0100, Thomas Schatzl wrote: > Hi all, > > Erik Duveblad had some offline comments: > > New webrevs: > http://cr.openjdk.java.net/~tschatzl/8137099/webrev.2_to_3/ (diff) > http://cr.openjdk.java.net/~tschatzl/8137099/webrev.3/ (full) > > - inverting some conditions in the clauses to read better > > - extract out the condition to do the maximally compacting full gc > added in this change into a separate method. > > Erik Duveblad also noted that this change contains some slight > behavioral change in when a collection is started. I.e. previously > TLAB allocation by itself could not cause a GC. Since this change is > already quite big, he suggested to fix this in a follow-up, and push > them together. additionally we agreed to retarget both changes to 11 to let it bake a bit. 
Thanks, Thomas From thomas.schatzl at oracle.com Tue Jan 9 10:11:06 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 09 Jan 2018 11:11:06 +0100 Subject: [10] RFR (M/L): 8137099: G1 needs to "upgrade" GC within the safepoint if it can't allocate during that safepoint to avoid OoME In-Reply-To: <1515491663.4041.7.camel@oracle.com> References: <1512403004.2943.13.camel@oracle.com> <9419C346-98E6-4E50-829C-245B1C9B9C9B@amazon.com> <1512990447.2882.10.camel@oracle.com> <8941a386-dc2d-87ae-7e43-9f24d8513c7a@oracle.com> <1513092283.2401.0.camel@oracle.com> <1515491532.4041.6.camel@oracle.com> <1515491663.4041.7.camel@oracle.com> Message-ID: <1515492666.4041.12.camel@oracle.com> Hi again, :( On Tue, 2018-01-09 at 10:54 +0100, Thomas Schatzl wrote: > Hi again, > > On Tue, 2018-01-09 at 10:52 +0100, Thomas Schatzl wrote: > > Hi all, > > > > Erik Duveblad had some offline comments: > > > > New webrevs: > > http://cr.openjdk.java.net/~tschatzl/8137099/webrev.2_to_3/ (diff) > > http://cr.openjdk.java.net/~tschatzl/8137099/webrev.3/ (full) > > > > - inverting some conditions in the clauses to read better > > > > - extract out the condition to do the maximally compacting full gc > > added in this change into a separate method. > > > > Erik Duveblad also noted that this change contains some slight > > behavioral change in when a collection is started. I.e. previously > > TLAB allocation by itself could not cause a GC. Since this change > > is already quite big, he suggested to fix this in a follow-up, and > > push them together. Some detail got lost here: previously TLAB allocation could not result in a Full GC, it can now (that maximally compacting GC). Its side effects need to be investigated a bit more. > > additionally we agreed to retarget both changes to 11 to let it > bake a bit. 
Thanks,
  Thomas

From per.liden at oracle.com Tue Jan 9 10:18:57 2018
From: per.liden at oracle.com (Per Liden)
Date: Tue, 9 Jan 2018 11:18:57 +0100
Subject: RFR (S): 8193063: Enabling narrowOop values for RawAccess accesses
In-Reply-To: <5A26B20E.8080701@oracle.com>
References: <5A26B20E.8080701@oracle.com>
Message-ID:

Looks good!

/Per

On 2017-12-05 15:49, Erik Österlund wrote:
> Hi,
>
> In order to replace oopDesc::load_heap_oop() but with stricter memory
> ordering properties, like MO_VOLATILE, some Access tweaks are required
> to allow RawAccess<>::oop_load() to return narrowOop values.
>
> I made the necessary changes to allow narrowOop values for all RawAccess
> operations that have an address (not the _at variants).
>
> While I am at it, I thought I'd clean up a few things that bug me:
>
> * The decorator verification for memory ordering specifically did not
> work as I originally intended, leading to harder to decipher compiler
> errors deeper down when using the wrong memory ordering decorators
> * An unnecessary include of oop.inline.hpp was removed from the G1
> barrier set.
>
> This change helps solving the following bug:
> https://bugs.openjdk.java.net/browse/JDK-8129440
>
> This bug is filed under:
> https://bugs.openjdk.java.net/browse/JDK-8193063
>
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8193063/webrev.00/
>
> Thanks,
> /Erik

From erik.osterlund at oracle.com Tue Jan 9 11:09:52 2018
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 9 Jan 2018 12:09:52 +0100
Subject: RFR (S): 8193063: Enabling narrowOop values for RawAccess accesses
In-Reply-To:
References: <5A26B20E.8080701@oracle.com>
Message-ID: <5A54A300.4030800@oracle.com>

Hi Kim,

Thank you for the review.

Incremental webrev:
http://cr.openjdk.java.net/~eosterlund/8193063/webrev.00_01/

Full webrev:
http://cr.openjdk.java.net/~eosterlund/8193063/webrev.01/

On 2018-01-08 20:23, Kim Barrett wrote:
>> On Dec 5, 2017, at 9:49 AM, Erik Österlund wrote:
>>
>> Hi,
>>
>> In order to replace oopDesc::load_heap_oop() but with stricter memory ordering properties, like MO_VOLATILE, some Access tweaks are required to allow RawAccess<>::oop_load() to return narrowOop values.
>>
>> I made the necessary changes to allow narrowOop values for all RawAccess operations that have an address (not the _at variants).
>>
>> While I am at it, I thought I'd clean up a few things that bug me:
>>
>> * The decorator verification for memory ordering specifically did not work as I originally intended, leading to harder to decipher compiler errors deeper down when using the wrong memory ordering decorators
> I kind of wish that had been dealt with separately. Oh well.
>
>> * An unnecessary include of oop.inline.hpp was removed from the G1 barrier set.
>>
>> This change helps solving the following bug:
>> https://bugs.openjdk.java.net/browse/JDK-8129440
>>
>> This bug is filed under:
>> https://bugs.openjdk.java.net/browse/JDK-8193063
>>
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8193063/webrev.00/
>>
>> Thanks,
>> /Erik
> ------------------------------------------------------------------------------
> src/hotspot/share/oops/access.inline.hpp
> 525 typedef RawAccessBarrier Raw;
> 582 typedef RawAccessBarrier Raw;
>
> I don't see any use of these Raw types. They look like copy-paste as
> part of splitting into two specializations.

Fixed.

>
> ------------------------------------------------------------------------------
> src/hotspot/share/oops/access.inline.hpp
>
> Too bad we don't have C++17. I think making can_hardwire_raw constexpr
> and using the new "if constexpr" syntax would have been sufficient.

Indeed.
I stopped for a second, cried a bit about the lack of constexpr, and then did it the C++03 way. > ------------------------------------------------------------------------------ > src/hotspot/share/oops/access.inline.hpp > > I was initially thinking I wanted names for these two expressions, > since they are repeated a bunch of times: > > HasDecorator::value && CanHardwireRaw::value > HasDecorator::value && !CanHardwireRaw::value > > But I think what I really want is the better function template SFINAE > syntax allowed by C++11, e.g. > > template ENABLE_IF(HasDecorator), > DISABLE_IF(CanHardwireRaw)> > inline static T store(...) ... Yes, we are a bit stuck with old school SFINAE for some time. > ------------------------------------------------------------------------------ > src/hotspot/share/oops/access.inline.hpp > 796 // Step 2: Reduce types. > > This comment looks like it needs updating for the narrowOop value > support. Unfortunately, the comment was already accurate (and hence was inaccurate before). > ------------------------------------------------------------------------------ > src/hotspot/share/oops/accessBackend.hpp > 176 AccessInternal::MustConvertCompressedOop::value, > > Wrong indentation. > > ------------------------------------------------------------------------------ Fixed. Thanks, /Erik From erik.osterlund at oracle.com Tue Jan 9 11:10:54 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 9 Jan 2018 12:10:54 +0100 Subject: RFR (S): 8193063: Enabling narrowOop values for RawAccess accesses In-Reply-To: References: <5A26B20E.8080701@oracle.com> Message-ID: <5A54A33E.2050300@oracle.com> Hi Per, Thank you for the review. /Erik On 2018-01-09 11:18, Per Liden wrote: > Looks good! 
>
> /Per
>
> On 2017-12-05 15:49, Erik Österlund wrote:
>> Hi,
>>
>> In order to replace oopDesc::load_heap_oop() but with stricter memory
>> ordering properties, like MO_VOLATILE, some Access tweaks are required
>> to allow RawAccess<>::oop_load() to return narrowOop values.
>>
>> I made the necessary changes to allow narrowOop values for all RawAccess
>> operations that have an address (not the _at variants).
>>
>> While I am at it, I thought I'd clean up a few things that bug me:
>>
>> * The decorator verification for memory ordering specifically did not
>> work as I originally intended, leading to harder to decipher compiler
>> errors deeper down when using the wrong memory ordering decorators
>> * An unnecessary include of oop.inline.hpp was removed from the G1
>> barrier set.
>>
>> This change helps solving the following bug:
>> https://bugs.openjdk.java.net/browse/JDK-8129440
>>
>> This bug is filed under:
>> https://bugs.openjdk.java.net/browse/JDK-8193063
>>
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8193063/webrev.00/
>>
>> Thanks,
>> /Erik

From thomas.schatzl at oracle.com Tue Jan 9 12:18:49 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 09 Jan 2018 13:18:49 +0100
Subject: [10] RFR (M/L): 8137099: G1 needs to "upgrade" GC within the safepoint if it can't allocate during that safepoint to avoid OoME
In-Reply-To: <1515492666.4041.12.camel@oracle.com>
References: <1512403004.2943.13.camel@oracle.com> <9419C346-98E6-4E50-829C-245B1C9B9C9B@amazon.com> <1512990447.2882.10.camel@oracle.com> <8941a386-dc2d-87ae-7e43-9f24d8513c7a@oracle.com> <1513092283.2401.0.camel@oracle.com> <1515491532.4041.6.camel@oracle.com> <1515491663.4041.7.camel@oracle.com> <1515492666.4041.12.camel@oracle.com>
Message-ID: <1515500329.4041.20.camel@oracle.com>

Hi,

aaaand,

On Tue, 2018-01-09 at 11:11 +0100, Thomas Schatzl wrote:
> Hi again,
> [...]
> > > Erik Duveblad also noted that this change contains some slight
> > > behavioral change in when a collection is started. I.e.
> > > previously TLAB allocation by itself could not cause a GC. Since
> > > this change is already quite big, he suggested to fix this in a
> > > follow-up, and push them together.
>
> Some detail got lost here: previously TLAB allocation could not
> result in a Full GC, it can now (that maximally compacting GC).
>
> Its side effects need to be investigated a bit more.

The only side effect would be that these collections would not be accounted for in the GC overhead limit mechanism (go OOME when doing too many GCs in a row).

However, G1 never implemented the GC overhead limit mechanism. So this behavioral change is a non-issue because it has no visible impact.

I filed RFE JDK-8194821 for that.

Serial and CMS have this issue: they do not account for GCs caused by TLAB allocation at all in the GC overhead limit. Filed JDK-8194823.

Only Parallel GC seems to do the right thing by not doing any GC during TLAB allocation.

Thanks,
  Thomas

From thomas.schatzl at oracle.com Tue Jan 9 12:39:46 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 09 Jan 2018 13:39:46 +0100
Subject: RFR (XXS): 8194824: Add gc/stress/gclocker/TestGCLockerWithParallel.java to the ProblemList file
Message-ID: <1515501586.4041.24.camel@oracle.com>

Hi all,

can I get reviews for this small change that quarantines the gc/stress/gclocker/TestGCLockerWithParallel.java test?

We can't get even modifications of it to reliably pass yet, maybe due to JDK-8192647, but in the meantime we want to remove the noise from the CI system.

CR:
https://bugs.openjdk.java.net/browse/JDK-8194824
Webrev:
http://cr.openjdk.java.net/~tschatzl/8194824/webrev/

Thanks,
  Thomas

From leo.korinth at oracle.com Tue Jan 9 14:41:26 2018
From: leo.korinth at oracle.com (Leo Korinth)
Date: Tue, 9 Jan 2018 15:41:26 +0100
Subject: RFR: 8194681: G1 uses young free cset time when reporting non-young free cset times
Message-ID:

Hi,

G1 uses young free cset time when reporting non-young free cset times. This patch is fixing the typo.

Bug:
https://bugs.openjdk.java.net/browse/JDK-8194681

Webrev:
http://cr.openjdk.java.net/~lkorinth/8194681/00/

Testing:
- hs-tier1, hs-tier2

Thanks,
Leo

From thomas.schatzl at oracle.com Tue Jan 9 14:54:01 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 09 Jan 2018 15:54:01 +0100
Subject: RFR: 8194681: G1 uses young free cset time when reporting non-young free cset times
In-Reply-To:
References:
Message-ID: <1515509641.14906.1.camel@oracle.com>

Hi,

On Tue, 2018-01-09 at 15:41 +0100, Leo Korinth wrote:
> Hi,
>
> G1 uses young free cset time when reporting non-young free cset
> times.
> This patch is fixing the typo.
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8194681
>
> Webrev:
> http://cr.openjdk.java.net/~lkorinth/8194681/00/
>
> Testing:
> - hs-tier1, hs-tier2

looks good.

Thomas

From kim.barrett at oracle.com Tue Jan 9 17:14:03 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 9 Jan 2018 12:14:03 -0500
Subject: RFR (S): 8193063: Enabling narrowOop values for RawAccess accesses
In-Reply-To: <5A54A300.4030800@oracle.com>
References: <5A26B20E.8080701@oracle.com> <5A54A300.4030800@oracle.com>
Message-ID: <316635A8-424E-4F5E-A97D-2979B9D713FD@oracle.com>

> On Jan 9, 2018, at 6:09 AM, Erik Österlund wrote:
>
> Hi Kim,
>
> Thank you for the review.
>
> Incremental webrev:
> http://cr.openjdk.java.net/~eosterlund/8193063/webrev.00_01/
>
> Full webrev:
> http://cr.openjdk.java.net/~eosterlund/8193063/webrev.01/

Looks good.

From kim.barrett at oracle.com Tue Jan 9 17:15:55 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 9 Jan 2018 12:15:55 -0500 Subject: RFR: 8194681: G1 uses young free cset time when reporting non-young free cset times In-Reply-To: References: Message-ID: <0EA3254F-6097-4CCB-8943-FA2E2E97BA4C@oracle.com> > On Jan 9, 2018, at 9:41 AM, Leo Korinth wrote: > > Hi, > > G1 uses young free cset time when reporting non-young free cset times. This patch is fixing the typo. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8194681 > > Webrev: > http://cr.openjdk.java.net/~lkorinth/8194681/00/ > > Testing: > - hs-tier1, hs-tier2 > > Thanks, > Leo Looks good. From kim.barrett at oracle.com Tue Jan 9 17:17:13 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 9 Jan 2018 12:17:13 -0500 Subject: RFR (XXS): 8194824: Add gc/stress/gclocker/TestGCLockerWithParallel.java to the ProblemList file In-Reply-To: <1515501586.4041.24.camel@oracle.com> References: <1515501586.4041.24.camel@oracle.com> Message-ID: > On Jan 9, 2018, at 7:39 AM, Thomas Schatzl wrote: > > Hi all, > > can I get reviews for this small change the quarantines the > gc/stress/gclocker/TestGCLockerWithParallel.java test? > > We can't get even modifications of it to reliably pass yet, maybe due > to JDK-8192647, but in the meantime we want to remove the noise from > the CI system. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8194824 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8194824/webrev/ > > > Thanks, > Thomas Looks good. 

From erik.osterlund at oracle.com Tue Jan 9 17:41:16 2018
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 9 Jan 2018 18:41:16 +0100
Subject: RFR (S): 8193063: Enabling narrowOop values for RawAccess accesses
In-Reply-To: <316635A8-424E-4F5E-A97D-2979B9D713FD@oracle.com>
References: <5A26B20E.8080701@oracle.com> <5A54A300.4030800@oracle.com> <316635A8-424E-4F5E-A97D-2979B9D713FD@oracle.com>
Message-ID: <5A54FEBC.4050704@oracle.com>

Hi Kim,

Thank you for the review.

/Erik

On 2018-01-09 18:14, Kim Barrett wrote:
>> On Jan 9, 2018, at 6:09 AM, Erik Österlund wrote:
>>
>> Hi Kim,
>>
>> Thank you for the review.
>>
>> Incremental webrev:
>> http://cr.openjdk.java.net/~eosterlund/8193063/webrev.00_01/
>>
>> Full webrev:
>> http://cr.openjdk.java.net/~eosterlund/8193063/webrev.01/
> Looks good.
>

From erik.helin at oracle.com Wed Jan 10 06:29:09 2018
From: erik.helin at oracle.com (Erik Helin)
Date: Wed, 10 Jan 2018 07:29:09 +0100
Subject: RFR (XXS): 8194824: Add gc/stress/gclocker/TestGCLockerWithParallel.java to the ProblemList file
In-Reply-To: <1515501586.4041.24.camel@oracle.com>
References: <1515501586.4041.24.camel@oracle.com>
Message-ID: <01df746c-59fe-1eaf-5d7c-d80d1a8614e9@oracle.com>

On 01/09/2018 01:39 PM, Thomas Schatzl wrote:
> Hi all,
>
> can I get reviews for this small change the quarantines the
> gc/stress/gclocker/TestGCLockerWithParallel.java test?
>
> We can't get even modifications of it to reliably pass yet, maybe due
> to JDK-8192647, but in the meantime we want to remove the noise from
> the CI system.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8194824
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8194824/webrev/

Looks good, Reviewed.

Thanks,
Erik

> Thanks,
> Thomas
>

From stefan.johansson at oracle.com Wed Jan 10 09:04:02 2018
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Wed, 10 Jan 2018 10:04:02 +0100
Subject: [10] RFR (M/L): 8137099: G1 needs to "upgrade" GC within the safepoint if it can't allocate during that safepoint to avoid OoME
In-Reply-To: <1515491663.4041.7.camel@oracle.com>
References: <1512403004.2943.13.camel@oracle.com> <9419C346-98E6-4E50-829C-245B1C9B9C9B@amazon.com> <1512990447.2882.10.camel@oracle.com> <8941a386-dc2d-87ae-7e43-9f24d8513c7a@oracle.com> <1513092283.2401.0.camel@oracle.com> <1515491532.4041.6.camel@oracle.com> <1515491663.4041.7.camel@oracle.com>
Message-ID:

On 2018-01-09 10:54, Thomas Schatzl wrote:
> Hi again,
>
> On Tue, 2018-01-09 at 10:52 +0100, Thomas Schatzl wrote:
>> Hi all,
>>
>> Erik Duveblad had some offline comments:
>>
>> New webrevs:
>> http://cr.openjdk.java.net/~tschatzl/8137099/webrev.2_to_3/ (diff)
>> http://cr.openjdk.java.net/~tschatzl/8137099/webrev.3/ (full)
>>
>> - inverting some conditions in the clauses to read better
>>
>> - extract out the condition to do the maximally compacting full gc
>> added in this change into a separate method.
>>
>> Erik Duveblad also noted that this change contains some slight
>> behavioral change in when a collection is started. I.e. previously
>> TLAB allocation by itself could not cause a GC. Since this change is
>> already quite big, he suggested to fix this in a follow-up, and push
>> them together.
> additionally we agreed to retarget both changes to 11 to let it bake
> a bit.

The latest changes look good and I agree that doing this early in 11 is good.
Thanks, Stefan > > Thanks, > Thomas From thomas.schatzl at oracle.com Wed Jan 10 11:53:25 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 10 Jan 2018 12:53:25 +0100 Subject: RFR (XXS): 8194824: Add gc/stress/gclocker/TestGCLockerWithParallel.java to the ProblemList file In-Reply-To: <01df746c-59fe-1eaf-5d7c-d80d1a8614e9@oracle.com> References: <1515501586.4041.24.camel@oracle.com> <01df746c-59fe-1eaf-5d7c-d80d1a8614e9@oracle.com> Message-ID: <1515585205.26590.0.camel@oracle.com> Hi Kim, Erik, On Wed, 2018-01-10 at 07:29 +0100, Erik Helin wrote: > On 01/09/2018 01:39 PM, Thomas Schatzl wrote: > > Hi all, > > > > can I get reviews for this small change the quarantines the > > gc/stress/gclocker/TestGCLockerWithParallel.java test? > > [...] thanks for your reviews. Thomas > > From leo.korinth at oracle.com Wed Jan 10 13:16:47 2018 From: leo.korinth at oracle.com (Leo Korinth) Date: Wed, 10 Jan 2018 14:16:47 +0100 Subject: RFR: 8194681: G1 uses young free cset time when reporting non-young free cset times In-Reply-To: <0EA3254F-6097-4CCB-8943-FA2E2E97BA4C@oracle.com> References: <0EA3254F-6097-4CCB-8943-FA2E2E97BA4C@oracle.com> Message-ID: <1dc4533e-89bc-2eeb-9bd9-dade4f04e990@oracle.com> Thanks for the reviews Kim and Thomas! /Leo On 09/01/18 18:15, Kim Barrett wrote: >> On Jan 9, 2018, at 9:41 AM, Leo Korinth wrote: >> >> Hi, >> >> G1 uses young free cset time when reporting non-young free cset times. This patch is fixing the typo. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8194681 >> >> Webrev: >> http://cr.openjdk.java.net/~lkorinth/8194681/00/ >> >> Testing: >> - hs-tier1, hs-tier2 >> >> Thanks, >> Leo > > Looks good. 
>

From erik.osterlund at oracle.com Wed Jan 10 14:29:04 2018
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Wed, 10 Jan 2018 15:29:04 +0100
Subject: RFR(S): 8194741: Refactor oops in constant pool from CDS to use the Access API
Message-ID: <5A562330.4070803@oracle.com>

Hi,

The constant pool may install resolved_references from CDS archives. This installation might happen during concurrent marking. Therefore, a new previously known graph is mounted onto the existing Java heap object graph. Naturally, this makes SATB invariants confused and therefore requires explicit enqueuing.

This patch hooks this into the Access API. While there are different ways that this could be annotated, I chose to introduce a new decorator called IN_ARCHIVE_ROOT, which, similar to IN_CONCURRENT_ROOT, denotes a kind of special root that needs to be handled differently. In the G1 case, it results in a SATB enqueue when it is loaded. This approach made me scratch my head less than various other ideas I had.

Webrev:
http://cr.openjdk.java.net/~eosterlund/8194741/webrev.00/

Bug:
https://bugs.openjdk.java.net/browse/JDK-8194741

/Erik

From erik.helin at oracle.com Wed Jan 10 15:50:03 2018
From: erik.helin at oracle.com (Erik Helin)
Date: Wed, 10 Jan 2018 16:50:03 +0100
Subject: [10] RFR (M/L): 8137099: G1 needs to "upgrade" GC within the safepoint if it can't allocate during that safepoint to avoid OoME
In-Reply-To: <1515491532.4041.6.camel@oracle.com>
References: <1512403004.2943.13.camel@oracle.com> <9419C346-98E6-4E50-829C-245B1C9B9C9B@amazon.com> <1512990447.2882.10.camel@oracle.com> <8941a386-dc2d-87ae-7e43-9f24d8513c7a@oracle.com> <1513092283.2401.0.camel@oracle.com> <1515491532.4041.6.camel@oracle.com>
Message-ID:

Hi Thomas,

thanks for taking on this work and cleaning this up!

On 01/09/2018 10:52 AM, Thomas Schatzl wrote:
> Hi all,
>
> Erik Duveblad had some offline comments:
>
> New webrevs:
> http://cr.openjdk.java.net/~tschatzl/8137099/webrev.2_to_3/ (diff)
> http://cr.openjdk.java.net/~tschatzl/8137099/webrev.3/ (full)

this looks good to me now, pending one small comment: please rename G1CollectedHeap::no_more_regions_left_for_allocation to G1CollectedHeap::has_regions_left_for_allocation (and of course change the method). This way you can use the ! operator in the if condition, which reads a bit easier.

And thanks for putting this into 11, it is the right decision IMO.

Thanks,
Erik

> - inverting some conditions in the clauses to read better
>
> - extract out the condition to do the maximally compacting full gc
> added in this change into a separate method.
>
> Erik Duveblad also noted that this change contains some slight
> behavioral change in when a collection is started. I.e. previously TLAB
> allocation by itself could not cause a GC. Since this change is already
> quite big, he suggested to fix this in a follow-up, and push them
> together.
>
> Thanks,
> Thomas
>

From erik.osterlund at oracle.com Wed Jan 10 16:44:54 2018
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Wed, 10 Jan 2018 17:44:54 +0100
Subject: RFR (S): [TESTBUG] Test for JDK-8180048
In-Reply-To: <1511787154.2262.8.camel@oracle.com>
References: <1511787154.2262.8.camel@oracle.com>
Message-ID: <5A564306.5000901@oracle.com>

Hi Thomas,

Looks good. Note though that...

82 Asserts.assertLT(reserved, ReservedThreshold, "Reserved memory size is " + reserved + "KB which is higher than " + ReservedThreshold + "KB indicating a memory leak");

...I guess it should say "greater than or equal to" instead of "higher than", but I'm not sure if I really care all that much.

Also, the copyright years might need some tweaking.

I don't need another webrev.
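
To make the wording point concrete, here is a minimal sketch. It assumes the check fails whenever the reserved size is not strictly less than the threshold, so the failure case includes equality. The class ReservedCheck and both of its methods are hypothetical stand-ins written for this illustration; they are not taken from the webrev or from the test library.

```java
// Hypothetical stand-in for the check discussed above; not the code from the webrev.
public class ReservedCheck {

    // Builds the failure message. A strict less-than check fails when
    // reserved >= threshold, so the wording says "greater than or equal to".
    public static String failureMessage(long reservedKB, long thresholdKB) {
        return "Reserved memory size is " + reservedKB
                + "KB which is greater than or equal to " + thresholdKB
                + "KB indicating a memory leak";
    }

    // Mirrors a strict less-than assertion: throws unless reserved < threshold.
    public static void assertReservedBelow(long reservedKB, long thresholdKB) {
        if (!(reservedKB < thresholdKB)) {
            throw new RuntimeException(failureMessage(reservedKB, thresholdKB));
        }
    }

    public static void main(String[] args) {
        assertReservedBelow(100, 200);      // below the threshold: passes silently
        try {
            assertReservedBelow(200, 200);  // equal values also fail a strict check
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With equal values the message stays truthful, which "higher than" would not be.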
Thanks, /Erik On 2017-11-27 13:52, Thomas Schatzl wrote: > Hi all, > > can I have some reviews that adds a test for JDK-8180048? That test > has not been completed in time for JDK 9, but I think it is good to > have it in JDK 10. > > It tries to detect the races fixed in JDK-8180048 by parsing NMT output > before/after symbol unloading. To be sure that we exercise the buggy > code path, the test takes some time (~30s) to stress symbol unloading. > > To avoid this test clogging up valuable time in lower testing tiers, > the test has been put into the "stress" test directory so that it only > executes at higher test tiers. Further I limited it to execute only > with a release build. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8180280 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8180280/webrev > Testing: > test case passing > > Thanks, > Thomas From thomas.schatzl at oracle.com Wed Jan 10 16:47:20 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 10 Jan 2018 17:47:20 +0100 Subject: [10] RFR (M/L): 8137099: G1 needs to "upgrade" GC within the safepoint if it can't allocate during that safepoint to avoid OoME In-Reply-To: References: <1512403004.2943.13.camel@oracle.com> <9419C346-98E6-4E50-829C-245B1C9B9C9B@amazon.com> <1512990447.2882.10.camel@oracle.com> <8941a386-dc2d-87ae-7e43-9f24d8513c7a@oracle.com> <1513092283.2401.0.camel@oracle.com> <1515491532.4041.6.camel@oracle.com> Message-ID: <1515602840.2141.2.camel@oracle.com> Hi Erik, On Wed, 2018-01-10 at 16:50 +0100, Erik Helin wrote: > Hi Thomas, > > thanks for taking on this work and cleaning this up! thanks for your review. 
> > On 01/09/2018 10:52 AM, Thomas Schatzl wrote: > > Hi all, > > > > Erik Duveblad had some offline comments: > > > > New webrevs: > > http://cr.openjdk.java.net/~tschatzl/8137099/webrev.2_to_3/ (diff) > > http://cr.openjdk.java.net/~tschatzl/8137099/webrev.3/ (full) > > this looks good to me now, pending one small comment: please rename > G1CollectedHeap::no_more_regions_left_for_allocation to > G1CollectedHeap::has_regions_left_for_allocation (and of course > change the method). This way you can use the ! operator in the if > condition, which reads a bit easier. > New webrev: http://cr.openjdk.java.net/~tschatzl/8137099/webrev.3_to_4/ (diff) http://cr.openjdk.java.net/~tschatzl/8137099/webrev.4/ (full) > And thanks for putting this into 11, it is the right decision IMO. > No problem. Thanks, Thomas > Thanks, > Erik > > > - inverting some conditions in the clauses to read better > > > > - extract out the condition to do the maximally compacting full gc > > added in this change into a separate method. > > > > Erik Duveblad also noted that this change contains some slight > > behavioral change in when a collection is started. I.e. previously > > TLAB > > allocation by itself could not cause a GC. Since this change is > > already > > quite big, he suggested to fix this in a follow-up, and push them > > together. 
> > > > Thanks, > > Thomas > > From erik.helin at oracle.com Wed Jan 10 19:15:04 2018 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 10 Jan 2018 20:15:04 +0100 Subject: [10] RFR (M/L): 8137099: G1 needs to "upgrade" GC within the safepoint if it can't allocate during that safepoint to avoid OoME In-Reply-To: <1515602840.2141.2.camel@oracle.com> References: <1512403004.2943.13.camel@oracle.com> <9419C346-98E6-4E50-829C-245B1C9B9C9B@amazon.com> <1512990447.2882.10.camel@oracle.com> <8941a386-dc2d-87ae-7e43-9f24d8513c7a@oracle.com> <1513092283.2401.0.camel@oracle.com> <1515491532.4041.6.camel@oracle.com> <1515602840.2141.2.camel@oracle.com> Message-ID: <307febe3-be4c-ebb3-6d3e-bcc68a1775da@oracle.com> On 01/10/2018 05:47 PM, Thomas Schatzl wrote: > Hi Erik, > > On Wed, 2018-01-10 at 16:50 +0100, Erik Helin wrote: >> Hi Thomas, >> >> thanks for taking on this work and cleaning this up! > > thanks for your review. > >> >> On 01/09/2018 10:52 AM, Thomas Schatzl wrote: >>> Hi all, >>> >>> Erik Duveblad had some offline comments: >>> >>> New webrevs: >>> http://cr.openjdk.java.net/~tschatzl/8137099/webrev.2_to_3/ (diff) >>> http://cr.openjdk.java.net/~tschatzl/8137099/webrev.3/ (full) >> >> this looks good to me now, pending one small comment: please rename >> G1CollectedHeap::no_more_regions_left_for_allocation to >> G1CollectedHeap::has_regions_left_for_allocation (and of course >> change the method). This way you can use the ! operator in the if >> condition, which reads a bit easier. >> > > New webrev: > http://cr.openjdk.java.net/~tschatzl/8137099/webrev.3_to_4/ (diff) > http://cr.openjdk.java.net/~tschatzl/8137099/webrev.4/ (full) Looks good, Reviewed. Thanks! Erik >> And thanks for putting this into 11, it is the right decision IMO. >> > > No problem. 
>
> Thanks,
> Thomas
>
>> Thanks,
>> Erik
>>
>>> - inverting some conditions in the clauses to read better
>>>
>>> - extract out the condition to do the maximally compacting full gc
>>> added in this change into a separate method.
>>>
>>> Erik Duveblad also noted that this change contains some slight
>>> behavioral change in when a collection is started. I.e. previously
>>> TLAB
>>> allocation by itself could not cause a GC. Since this change is
>>> already
>>> quite big, he suggested to fix this in a follow-up, and push them
>>> together.
>>>
>>> Thanks,
>>> Thomas
>>>
>

From jiangli.zhou at Oracle.COM Wed Jan 10 19:54:20 2018
From: jiangli.zhou at Oracle.COM (Jiangli Zhou)
Date: Wed, 10 Jan 2018 11:54:20 -0800
Subject: RFR(S): 8194741: Refactor oops in constant pool from CDS to use the Access API
In-Reply-To: <5A562330.4070803@oracle.com>
References: <5A562330.4070803@oracle.com>
Message-ID:

Hi Erik,

Thanks a lot for the refactoring. I'm really glad that the G1 specific code for explicit enqueuing is no longer needed in shared code when an archived object becomes "in use".

Thanks,
Jiangli

> On Jan 10, 2018, at 6:29 AM, Erik Österlund wrote:
>
> Hi,
>
> The constant pool may install resolved_references from CDS archives. This installation might happen during concurrent marking. Therefore, a new previously known graph is mounted onto the existing Java heap object graph. Naturally, this makes SATB invariants confused and therefore requires explicit enqueuing.
>
> This patch hooks this into the Access API. While there are different ways that this could be annotated, I chose to introduce a new decorator called IN_ARCHIVE_ROOT, which similar to IN_CONCURRENT_ROOT denotes a kind of special root that needs to be handled differently. In the G1 case, it results in a SATB enqueue when it is loaded. This approach made me scratch my head less than various other ideas I had.
>
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8194741/webrev.00/
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8194741
>
> /Erik

From kim.barrett at oracle.com Wed Jan 10 19:57:40 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 10 Jan 2018 14:57:40 -0500
Subject: RFR (S): 8180280: [TESTBUG] Test for JDK-8180048
In-Reply-To: <1511787221.2262.9.camel@oracle.com>
References: <1511787221.2262.9.camel@oracle.com>
Message-ID: <4118F578-97ED-40A1-A048-D862577EC7F8@oracle.com>

> On Nov 27, 2017, at 7:53 AM, Thomas Schatzl wrote:
>
> Hi all,
>
> can I have some reviews that adds a test for JDK-8180048? That test
> has not been completed in time for JDK 9, but I think it is good to
> have it in JDK 10.
>
> It tries to detect the races fixed in JDK-8180048 by parsing NMT output
> before/after symbol unloading. To be sure that we exercise the buggy
> code path, the test takes some time (~30s) to stress symbol unloading.
>
> To avoid this test clogging up valuable time in lower testing tiers,
> the test has been put into the "stress" test directory so that it only
> executes at higher test tiers. Further I limited it to execute only
> with a release build.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8180280
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8180280/webrev
> Testing:
> test case passing
>
> Thanks,
> Thomas

Looks good.

From erik.osterlund at oracle.com Thu Jan 11 00:29:53 2018
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Thu, 11 Jan 2018 01:29:53 +0100
Subject: RFR(S): 8194741: Refactor oops in constant pool from CDS to use the Access API
In-Reply-To:
References: <5A562330.4070803@oracle.com>
Message-ID: <2cd067fe-4993-e22a-e343-46d40037cc5d@oracle.com>

Hi Jiangli,

You are welcome. :)

/Erik

On 2018-01-10 20:54, Jiangli Zhou wrote:
> Hi Erik,
>
> Thanks a lot for the refactoring.
I'm really glad that the G1 specific code for explicit enqueuing is no longer needed in shared code when an archived object becomes "in use".
>
> Thanks,
> Jiangli
>
>> On Jan 10, 2018, at 6:29 AM, Erik Österlund wrote:
>>
>> Hi,
>>
>> The constant pool may install resolved_references from CDS archives. This installation might happen during concurrent marking. Therefore, a new previously known graph is mounted onto the existing Java heap object graph. Naturally, this makes SATB invariants confused and therefore requires explicit enqueuing.
>>
>> This patch hooks this into the Access API. While there are different ways that this could be annotated, I chose to introduce a new decorator called IN_ARCHIVE_ROOT, which similar to IN_CONCURRENT_ROOT denotes a kind of special root that needs to be handled differently. In the G1 case, it results in a SATB enqueue when it is loaded. This approach made me scratch my head less than various other ideas I had.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8194741/webrev.00/
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8194741
>>
>> /Erik

From mikael.vidstedt at oracle.com Thu Jan 11 01:04:36 2018
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Wed, 10 Jan 2018 17:04:36 -0800
Subject: EpsilonGC and throughput.
In-Reply-To: <1515429904.2381.208.camel@oracle.com>
References: <3d50e13e-3324-7058-3c8c-725a52207074@oracle.com> <1162631c-1ebc-f562-f096-d1ac861a42a4@redhat.com> <37071ceb-d21b-cf96-2dea-87da409b2e68@oracle.com> <1513781182.2415.29.camel@oracle.com> <07b29153-f01f-f1b5-18bc-9c38caaaf7eb@redhat.com> <1513805149.2542.112.camel@oracle.com> <304ef832-5e17-b53e-8b31-7c26fcb8c593@redhat.com> <1515429904.2381.208.camel@oracle.com>
Message-ID: <18A190C9-80E8-43F2-9C40-D5C9B47A933E@oracle.com>

Thomas,

Thank you for bringing up these questions and comments.
While I think it would be great to get some additional data and use case information for this feature added to the JEP, the isolated nature of the feature along with the fact that it is experimental means that the impact of making it is relatively small. With that in mind, I suggest that we move forward with this JEP/feature, and that more information can be added if/when it's available. In line with that I will be endorsing the JEP shortly.

Cheers,
Mikael

> On Jan 8, 2018, at 8:45 AM, Thomas Schatzl wrote:
>
> Hi Aleksey,
>
> I apologize for my somewhat inappropriate words, this has been due to
> some frustration; also for the long delay that were due to the winter
> holidays.
>
> Let's try to start all over with this... I will try to be constructive
> this time. Feel free to remind me if needed.
>
> One purpose of the JEP is to share a problem and propose an idea (often
> already accompanied by a solution) to solve them. This problem and the
> idea is then discussed by the community, eventually refining it along
> the way.
>
> The community then evaluates that idea based on its contents, of course
> starting with the people trying to determine whether there is a
> problem, what the problem is, and whether the proposed idea will fix
> the problem.
>
> For this evaluation to happen, the JEP needs to clearly state the
> problem, it's seriousness, and the proposed idea.
>
> It also helps if the JEP is written in a way to make it interesting for
> the community to read it, and respond. The less thinking a reader has
> to do to answer whether he is impacted or not, and whether and by how
> much it would simplify the life of himself or in general Java users,
> the more people will feel urged to get this in (or at least not
> deterred).
> > Finally, I assume you do understand that, in general, although there is > always a certain level of duplication in the VM, but if a change only > solves the problems that existing code already solves, or solves > problems almost nobody has, or it does not give enough benefit (also > dependent on the complexity of a change), it makes it a hard(er) sell? > > So the JEP template (http://openjdk.java.net/jeps/2) provides some > questions on how to structure this idea proposal and what to put into > the various sections. > > In general this is to help you providing the relevant information to > the community. While this might be onerous for a writer at first > glance, it saves everyone else lots of time trying to find out what and > how you want to solve something. > > > I am going over the Motivation section in detail in the remainder of > this email, with some comments at the end about the Alternatives one > which seem to be the most important here. > > The JEP template states under the Motivation section: > > "Motivation > ---------- > > // Why should this work be done? What are its benefits? Who's asking > // for it? How does it compare to the competition, if any?" > > > Now let me try to associate these questions to the relevant parts of > the existing JEP 318 (http://openjdk.java.net/jeps/318) text. > > And please, before reading below, I really do not want to shoot down > the proposal if you see a question mark. It should indicate just that > there is a question where I honestly do not know the answer to, but > which I hope you do. Similarly if I raise some concerns about some > statements I expect you to notice that there may be something missing > here, nothing else. I.e. not necessarily that I am "right" about > something. 
You said you already talked about it many times with other > people in the field, thought it over for a long time, so hopefully > these questions can be answered quickly, and in the future the JEP also > contains this information for other people too. > > Some may not need an answer as they only try to make you think about > the seriousness of a stated problem. > > JEP text: "Java implementations are well known for a broad choice of > highly configurable GC implementations." > > Potential answer to "Why should this work be done?". Or does the > sentence indicate we need another GC because we already have so many, > and another does not hurt? I am asking this in full seriousness, I > really do not know. Or is this only an introductory sentence without > meaning? > > JEP text: "There are four use cases where a trivial no-op GC proves > useful." > > This seems to be a transition sentence, but is fine to make it flow > better. > > Reading this, and given that only a list of benefits follows, I assume > that these two sentences were supposed to answer the "Why should this > work be done? Who's asking for it?" questions from the JEP. > > In the earlier email you mentioned these power users that want full > control. Mention them here. Define them. Also mention other user groups > that might be interested. Particularly groups the benefits list could > refer to. > > Let's go into these benefits in more detail: > > JEP text: "Performance testing. Having a GC that does almost nothing is > a useful tool to do differential performance analysis for other, real > GCs. Having a no-op GC can help to filter out GC-induced performance > artifacts." > > Benefit. Maybe it would be useful to list a few of these performance > artifacts here ("... , e.g. barrier code, concurrent threads"). > > Who are the benefactors of this? Not sure about these "power users" > (see M. Berger's response in this exact thread). Probably developers of > new GC algorithms? 
> > An alternative could be a developer just nop'ing out the relevant GC > interface section. That is somewhat cumbersome, but for how many users > is this a problem? Spell that out in the appropriate Alternatives > section. > > Also tell that using Epsilon GC for barrier testing may not be an ideal > tool, because all other existing collectors are generational (but in > the future it might apply to Shenandoah unless it goes generational > too, idk), and testing generational barriers on a non-generational heap > may not give a complete picture of barrier overhead. > > JEP text: "Functional testing. For Java code testing, a way to > establish a threshold for allocated memory is useful to assert memory > pressure invariants. Today, we have to pick up the allocation data from > MXBeans, or even resort to parsing GC logs. Having a GC that accepts > only the bounded number of allocations, and fails on heap exhaustion, > simplifies testing." > > Benefit. For regression testing, in how many cases do you think it is > sufficient (or in what circumstances) to get a fail/no-fail answer > only? > This seems to pass work on a failure to the dev, them needing to write > another test that also prints and monitors the memory usage increases > over time anyway. > How much work, given that you already need to monitor memory usage is > the test to fail when heap usage goes above a threshold then? > > "VM interface testing. For VM development purposes, having a simple GC > helps to understand the absolute minimum required from the VM-GC > interface to have a functional allocator. This serves as proof that the > VM-GC interface is sane, which is important in lieu of JEP 304 > ("Garbage Collector Interface")." > > Benefit. Who are the (main) benefactors for that - probably developers? > For a developer, how much is that benefit if there are already 5 or 6 > implementations of that interface? > > "Last-drop performance improvements. 
For ultra-latency-sensitive > applications, where developers are conscious about memory allocations > and know the application memory footprint exactly, or even have > (almost) completely garbage-free applications. In those applications, > GC cycles may be considered an implementation bug that wastes CPU > cycles for no good reason." > > This is the only benefit in this list that actually mentions its target > group. I assume it is those power users (not necessarily developers > only?), that are ultra-latency aware. This paragraph further > characterizes them that they are also throughput conscious. > The discussion earlier also characterized them as also being very > conscious about memory layout etc, they do not want object reordering > because it is inconsistent between GCs (which is a different issue, and > I do not want to discuss it here). > > From what I gathered so far, they want absolute control over memory > management - but the real question is whether this is their real or > only problem with the Java VM to achieve consistent VM behavior. > There are certainly more components in the VM that introduce > potentially more significant jitter (now assuming that that power user > can set heap sizes accordingly to use e.g. Serial GC). > > This execution consistency is maybe another goal that is even more > important than last-drop performance. > > It may be useful to investigate the problem of these power users in > more detail, and see if we could provide a (more?) complete solution > for them. > > > "Extremely short lived jobs are one example of this." > > I do not understand the use of Epsilon in such use case. The > alternative I can see would be to restart the VM after every short > lived job (something for the Alternatives section). That seems strange > to me, depending on the definition of a "short lived job", particularly > if nothing survives after execution of that short lived job, a GC will > be extremely fast. 
> > Further I assume this example is about FaaS (Function-as-a-service) and > their users, and while there may be an overlap with those "power > users", I would expect the "regular java users" a way larger group than > the power users. There may be an overlap with those power users, power > users probably would not want to incur the associated loss of control. > > "There are also cases when restarting the JVM -- letting load balancers > figure out failover -- is sometimes a better recovery strategy than > accepting a GC cycle." > > I really can't find a good example where a GC, particularly in the > situation that has been described so far, also for these short-lived > jobs, where a GC (on an almost empty heap) is not at least as fast as a > restart. > > It would make for a very good paragraph explaining this use case in the > alternatives section. > > Another problem with these two sentences to me is (and I am by no means > a "FaaS power user") that I believe that waiting for the VM to > crash/shut down to steer the load balancers is not a good strategy. > Maybe you can give some more information about this use case? > > "Even for non-allocating workloads, the choice of GC means choosing the > set of GC barriers that the workload has to use, even if no GC cycle > actually happens. Most JDK GCs are generational, and they emit at least > one reference write barrier. Avoiding this barrier brings the last bit > of performance improvement." > > (_All_ JDK GCs are currently generational) > > Now, as mentioned earlier in the thread, when talking about performance > improvements, it would be nice to mention the potential gains that can > be made (or elsewhere, like in the alternatives section). There is > already an implementation, and so you can measure this too. 
> > Please make your comparison in context: since this whole paragraph is > about last-drop performance improvements for power users, a balanced > comparison would probably only be a comparison that such a power user > would do - i.e. not running the VM with randomly selected default > options that arbitrarily penalizes your competition. > > In the earlier email I only directly asked for performance numbers > because in order to streamline this discussion, and given that you are > a well-known performance and benchmark guru (afaik you were "R"eviewer > long before me joining) it seemed a logical request. If you can't find > numbers, there is also the reference ("Barriers, Friendlier Still" or > so from Blackburn et al I think) I got that is also mentioned iirc in > the very good Jones GC book. > "Real" newbies I would just ask to perform this test. > > > In our discussion we found at least one more, actually unique benefit > (the one about getting correct heap dumps on failure). > > > Of course there is a limit on the length of that section and others > (i.e. considering the attention span of your readership), but all > questions asked by the JEP template should be answered in the > corresponding section. There is some intentional overlap in the JEP, > particularly in the first three sections, similar to a scientific paper > so that different groups of readers need only read the sections they > are interested in to see whether this change is actually affecting them > (and interesting to follow). > > It shouldn't be as long as a scientific paper though, so if you think a > section is too long, drop the less impactful benefits, and other parts > of the JEP will automatically follow. > > Again, given your experience with the VM I assume you know alternatives > as good or even better than me to make a balanced assessment here. > > Otherwise, keep them and please raise specific questions. 
> > As for the Alternatives section, it is the same procedure, start with > answering the questions raised in the template: > > "Alternatives > ------------ > > // Did you consider any alternative approaches or technologies? If so > // then please describe them here and explain why they were not > // chosen." > > I would assume that for all of these benefits we can easily come up > with alternative ways of doing the same or a similar thing (I already > stated a few alternatives that I think are very valid in this or > previous emails; some valid ones are already in the JEP), and why we > would want to particularly do it this way given the context of that > benefit (e.g. the user group). If there is no alternative, add a > sentence that says so in that section. > > Again, try to make these alternative review balanced, and in context of > the users the benefit is for. > > This section should imho also include a discussion of "mostly complete > alternatives", as suggested in this email thread already, e.g. adding a > -XX:+DieOnFirstGC switch, and reasons for and against it. > > Please understand that the JEP will be the reference to talk about, not > some email or private offline discussions. Keeping that in mind I think > discussions will go much smoother. > > I hope I made clear now why I, unfortunately not in a very friendly way > (apologies again), suggested that the current JEP text lacks the > required answers to the questions stated in the JEP template to (re- > )start a hopefully more focused discussion. 
> > Thanks, > Thomas > From stefan.johansson at oracle.com Thu Jan 11 08:37:24 2018 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Thu, 11 Jan 2018 09:37:24 +0100 Subject: [10] RFR (M/L): 8137099: G1 needs to "upgrade" GC within the safepoint if it can't allocate during that safepoint to avoid OoME In-Reply-To: <307febe3-be4c-ebb3-6d3e-bcc68a1775da@oracle.com> References: <1512403004.2943.13.camel@oracle.com> <9419C346-98E6-4E50-829C-245B1C9B9C9B@amazon.com> <1512990447.2882.10.camel@oracle.com> <8941a386-dc2d-87ae-7e43-9f24d8513c7a@oracle.com> <1513092283.2401.0.camel@oracle.com> <1515491532.4041.6.camel@oracle.com> <1515602840.2141.2.camel@oracle.com> <307febe3-be4c-ebb3-6d3e-bcc68a1775da@oracle.com> Message-ID: <02323f1f-9b78-dc95-dadd-f2ecb0039b42@oracle.com> On 2018-01-10 20:15, Erik Helin wrote: > On 01/10/2018 05:47 PM, Thomas Schatzl wrote: >> Hi Erik, >> >> On Wed, 2018-01-10 at 16:50 +0100, Erik Helin wrote: >>> Hi Thomas, >>> >>> thanks for taking on this work and cleaning this up! >> >> thanks for your review. >> >>> >>> On 01/09/2018 10:52 AM, Thomas Schatzl wrote: >>>> Hi all, >>>> >>>> ??? Erik Duveblad had some offline comments: >>>> >>>> New webrevs: >>>> http://cr.openjdk.java.net/~tschatzl/8137099/webrev.2_to_3/ (diff) >>>> http://cr.openjdk.java.net/~tschatzl/8137099/webrev.3/ (full) >>> >>> this looks good to me now, pending one small comment: please rename >>> G1CollectedHeap::no_more_regions_left_for_allocation to >>> G1CollectedHeap::has_regions_left_for_allocation (and of course >>> change the method). This way you can use the ! operator in the if >>> condition, which reads a bit easier. >>> >> >> New webrev: >> http://cr.openjdk.java.net/~tschatzl/8137099/webrev.3_to_4/ (diff) >> http://cr.openjdk.java.net/~tschatzl/8137099/webrev.4/ (full) > > Looks good, Reviewed. > +1 Stefan > Thanks! > Erik > >>> And thanks for putting this into 11, it is the right decision IMO. >>> >> >> No problem. >> >> Thanks, >> ?? 
Thomas >> >>> Thanks, >>> Erik >>> >>>> - inverting some conditions in the clauses to read better >>>> >>>> - extract out the condition to do the maximally compacting full gc >>>> added in this change into a separate method. >>>> >>>> Erik Duveblad also noted that this change contains some slight >>>> behavioral change in when a collection is started. I.e. previously >>>> TLAB >>>> allocation by itself could not cause a GC. Since this change is >>>> already >>>> quite big, he suggested to fix this in a follow-up, and push them >>>> together. >>>> >>>> Thanks, >>>> ??? Thomas >>>> >> From thomas.schatzl at oracle.com Thu Jan 11 10:01:55 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 11 Jan 2018 11:01:55 +0100 Subject: RFR (S): [TESTBUG] Test for JDK-8180048 In-Reply-To: <5A564306.5000901@oracle.com> References: <1511787154.2262.8.camel@oracle.com> <5A564306.5000901@oracle.com> Message-ID: <1515664915.2478.2.camel@oracle.com> Hi Erik and Kim, On Wed, 2018-01-10 at 17:44 +0100, Erik ?sterlund wrote: > Hi Thomas, > > Looks good. > Note though that... > > 82 Asserts.assertLT(reserved, ReservedThreshold, > "Reserved > memory size is " + reserved + "KB which is higher than " + > ReservedThreshold + "KB indicating a memory leak"); > > ...I guess it should say "greater than or equal to" instead of > "higher than", but I'm not sure if I really care all that much. > > Also, the copyright years might need some tweaking. I will fix both before pushing. > > I don't need another webrev. Thanks a lot for your reviews. 
Thomas From thomas.schatzl at oracle.com Thu Jan 11 09:56:13 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 11 Jan 2018 10:56:13 +0100 Subject: [10] RFR (M/L): 8137099: G1 needs to "upgrade" GC within the safepoint if it can't allocate during that safepoint to avoid OoME In-Reply-To: <02323f1f-9b78-dc95-dadd-f2ecb0039b42@oracle.com> References: <1512403004.2943.13.camel@oracle.com> <9419C346-98E6-4E50-829C-245B1C9B9C9B@amazon.com> <1512990447.2882.10.camel@oracle.com> <8941a386-dc2d-87ae-7e43-9f24d8513c7a@oracle.com> <1513092283.2401.0.camel@oracle.com> <1515491532.4041.6.camel@oracle.com> <1515602840.2141.2.camel@oracle.com> <307febe3-be4c-ebb3-6d3e-bcc68a1775da@oracle.com> <02323f1f-9b78-dc95-dadd-f2ecb0039b42@oracle.com> Message-ID: <1515664573.2478.1.camel@oracle.com> Hi everybody, On Thu, 2018-01-11 at 09:37 +0100, Stefan Johansson wrote: > > On 2018-01-10 20:15, Erik Helin wrote: > > On 01/10/2018 05:47 PM, Thomas Schatzl wrote: > > > Hi Erik, > > > > > > On Wed, 2018-01-10 at 16:50 +0100, Erik Helin wrote: > > > > Hi Thomas, > > > > > > > > thanks for taking on this work and cleaning this up! > > > > > > thanks for your review. [...] > > Looks good, Reviewed. > > > > +1 > Stefan > > Thanks! > > Erik Thanks for your reviews. Thomas From per.liden at oracle.com Fri Jan 12 14:29:45 2018 From: per.liden at oracle.com (Per Liden) Date: Fri, 12 Jan 2018 15:29:45 +0100 Subject: RFR(xs): 8195000: Clean out left-overs in arguments.hpp Message-ID: <8df3540a-1bdb-8d9c-bcf1-b76097666b29@oracle.com> Hi, The refactoring done in JDK-8189171 moved some functions out from Arguments to separate classes, but the function declarations were left. These should just be removed. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8195000 Webrev: http://cr.openjdk.java.net/~pliden/8195000/webrev.0/ /Per From stefan.karlsson at oracle.com Fri Jan 12 14:42:20 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 12 Jan 2018 15:42:20 +0100 Subject: RFR(xs): 8195000: Clean out left-overs in arguments.hpp In-Reply-To: <8df3540a-1bdb-8d9c-bcf1-b76097666b29@oracle.com> References: <8df3540a-1bdb-8d9c-bcf1-b76097666b29@oracle.com> Message-ID: <7307b0ef-5ab1-4a2e-11dc-6ebb9a244f19@oracle.com> Looks good. StefanK On 2018-01-12 15:29, Per Liden wrote: > Hi, > > The refactoring done in JDK-8189171 moved some functions out from > Arguments to separate classes, but the function declarations were left. > These should just be removed. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195000 > Webrev: http://cr.openjdk.java.net/~pliden/8195000/webrev.0/ > > /Per From per.liden at oracle.com Fri Jan 12 14:44:04 2018 From: per.liden at oracle.com (Per Liden) Date: Fri, 12 Jan 2018 15:44:04 +0100 Subject: RFR(xs): 8195000: Clean out left-overs in arguments.hpp In-Reply-To: <7307b0ef-5ab1-4a2e-11dc-6ebb9a244f19@oracle.com> References: <8df3540a-1bdb-8d9c-bcf1-b76097666b29@oracle.com> <7307b0ef-5ab1-4a2e-11dc-6ebb9a244f19@oracle.com> Message-ID: <1f99ce30-d250-5bb0-1868-3951fecaa829@oracle.com> Thanks Stefan. I consider this a trivial change, so I'll push now. cheers, Per On 01/12/2018 03:42 PM, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2018-01-12 15:29, Per Liden wrote: >> Hi, >> >> The refactoring done in JDK-8189171 moved some functions out from >> Arguments to separate classes, but the function declarations were >> left. These should just be removed. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8195000 >> Webrev: http://cr.openjdk.java.net/~pliden/8195000/webrev.0/ >> >> /Per From erik.osterlund at oracle.com Tue Jan 16 09:42:06 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 16 Jan 2018 10:42:06 +0100 Subject: RFR(M): 8195103: Refactor ReduceInitialCardMarks to not assume all GCs use card marks Message-ID: <5A5DC8EE.9050806@oracle.com> Hi, The current interface between the compilers and GC regarding the ReduceInitialCardMarks optimization lives in the CollectedHeap. However, the optimization is relevant only for collectors with a card mark barrier set (CardTableModRefBS). Therefore, this interface ought to be moved into CardTableModRef so that code gets less messy when a collector does not use card marking. In the process, the CollectedHeap::pre_initialize member function was removed (as it was only used for initializing ReduceInitialCardMarks). The optimization needs to check if an object is in young or not. This question is now asked to the barrier set rather than the heap. For all collectors except G1, this has been implemented by forwarding the question to the corresponding heap (inlined member function), which is what was done before. For G1, I chose to instead look at the card value and see if it is a young card, which should give the same answer. Bug: https://bugs.openjdk.java.net/browse/JDK-8195103 Webrev: http://cr.openjdk.java.net/~eosterlund/8195103/webrev.00/ Testing: mach5 hs-tier1-5 Thanks, /Erik From erik.helin at oracle.com Wed Jan 17 15:33:00 2018 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 17 Jan 2018 16:33:00 +0100 Subject: 8195158: Concurrent System.gc() is "upgraded" to stop-the-world System.gc() Message-ID: <4d110107-aa24-9ba4-475a-6e88344568e4@oracle.com> Hi all, this patch fixes the bug 'Concurrent System.gc() is "upgraded" to stop-the-world System.gc()' [0]. 
The bug occurs when all of the following conditions are true: - the heap is full (more strictly, there are no regions available for allocations) - the Java program executes System.gc() - the flag -XX:+ExplicitGCInvokesConcurrent is set If all of the above are true, then G1 will do an initial mark YC and (since 862c41 [1]) "upgrade" the gc to a full GC. This "upgrade" should not happen since the user has explicitly set the flag -XX:+ExplicitGCInvokesConcurrent. Since the heap is full, it is likely that the next allocation will trigger a full GC, but the concurrent marking could also finish quickly and free up memory in the cleanup phase. Patch: http://cr.openjdk.java.net/~ehelin/JDK-8195158/00/index.html Bug: https://bugs.openjdk.java.net/browse/JDK-8195158 Testing: - newly added regression test - hs-tier1, hs-tier2, hs-tier3, hs-tier7 on - Linux x86-64 - Mac x86-64 - Windows x86-64 - Solaris SPARC Thanks, Erik [0]: https://bugs.openjdk.java.net/browse/JDK-8195158 [1]: http://hg.openjdk.java.net/jdk/hs/rev/862c41cf1c7f From stefan.johansson at oracle.com Wed Jan 17 15:39:08 2018 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 17 Jan 2018 16:39:08 +0100 Subject: 8195158: Concurrent System.gc() is "upgraded" to stop-the-world System.gc() In-Reply-To: <4d110107-aa24-9ba4-475a-6e88344568e4@oracle.com> References: <4d110107-aa24-9ba4-475a-6e88344568e4@oracle.com> Message-ID: <81755f4d-df3a-043d-a5a4-5195dac90ecc@oracle.com> On 2018-01-17 16:33, Erik Helin wrote: > Hi all, > > this patch fixes the bug 'Concurrent System.gc() is "upgraded" to > stop-the-world System.gc()' [0]. The bug occurs when all of the > following conditions are true: > - the heap is full (more strictly, there are no regions available for > ? allocations) > - the Java program executes System.gc() > - the flag -XX:+ExplicitGCInvokesConcurrent is set > If all of the above are true, then G1 will do an initial mark YC and > (since 862c41 [1]) "upgrade" the gc to a full GC. 
This "upgrade" > should not happen since the user has explicitly set the flag > -XX:+ExplicitGCInvokesConcurrent. Since the heap is full, it is likely > that the next allocation will trigger a full GC, but the concurrent > marking could also finish quickly and free up memory in the cleanup > phase. > > Patch: > http://cr.openjdk.java.net/~ehelin/JDK-8195158/00/index.html Looks good, ship it! StefanJ > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8195158 > > Testing: > - newly added regression test > - hs-tier1, hs-tier2, hs-tier3, hs-tier7 on > ? - Linux x86-64 > ? - Mac x86-64 > ? - Windows x86-64 > ? - Solaris SPARC > > Thanks, > Erik > > [0]: https://bugs.openjdk.java.net/browse/JDK-8195158 > [1]: http://hg.openjdk.java.net/jdk/hs/rev/862c41cf1c7f From erik.osterlund at oracle.com Wed Jan 17 15:47:01 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 17 Jan 2018 16:47:01 +0100 Subject: 8195158: Concurrent System.gc() is "upgraded" to stop-the-world System.gc() In-Reply-To: <4d110107-aa24-9ba4-475a-6e88344568e4@oracle.com> References: <4d110107-aa24-9ba4-475a-6e88344568e4@oracle.com> Message-ID: <5A5F6FF5.4090304@oracle.com> Hi Erik, Looks good. Thanks, /Erik On 2018-01-17 16:33, Erik Helin wrote: > Hi all, > > this patch fixes the bug 'Concurrent System.gc() is "upgraded" to > stop-the-world System.gc()' [0]. The bug occurs when all of the > following conditions are true: > - the heap is full (more strictly, there are no regions available for > allocations) > - the Java program executes System.gc() > - the flag -XX:+ExplicitGCInvokesConcurrent is set > If all of the above are true, then G1 will do an initial mark YC and > (since 862c41 [1]) "upgrade" the gc to a full GC. This "upgrade" > should not happen since the user has explicitly set the flag > -XX:+ExplicitGCInvokesConcurrent. 
Since the heap is full, it is likely > that the next allocation will trigger a full GC, but the concurrent > marking could also finish quickly and free up memory in the cleanup > phase. > > Patch: > http://cr.openjdk.java.net/~ehelin/JDK-8195158/00/index.html > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8195158 > > Testing: > - newly added regression test > - hs-tier1, hs-tier2, hs-tier3, hs-tier7 on > - Linux x86-64 > - Mac x86-64 > - Windows x86-64 > - Solaris SPARC > > Thanks, > Erik > > [0]: https://bugs.openjdk.java.net/browse/JDK-8195158 > [1]: http://hg.openjdk.java.net/jdk/hs/rev/862c41cf1c7f From erik.helin at oracle.com Wed Jan 17 17:07:55 2018 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 17 Jan 2018 18:07:55 +0100 Subject: 8195158: Concurrent System.gc() is "upgraded" to stop-the-world System.gc() In-Reply-To: <81755f4d-df3a-043d-a5a4-5195dac90ecc@oracle.com> References: <4d110107-aa24-9ba4-475a-6e88344568e4@oracle.com> <81755f4d-df3a-043d-a5a4-5195dac90ecc@oracle.com> Message-ID: <369bb547-7ce7-cd56-d733-69a995bc50f4@oracle.com> On 01/17/2018 04:39 PM, Stefan Johansson wrote: > > > On 2018-01-17 16:33, Erik Helin wrote: >> Hi all, >> >> this patch fixes the bug 'Concurrent System.gc() is "upgraded" to >> stop-the-world System.gc()' [0]. The bug occurs when all of the >> following conditions are true: >> - the heap is full (more strictly, there are no regions available for >> ? allocations) >> - the Java program executes System.gc() >> - the flag -XX:+ExplicitGCInvokesConcurrent is set >> If all of the above are true, then G1 will do an initial mark YC and >> (since 862c41 [1]) "upgrade" the gc to a full GC. This "upgrade" >> should not happen since the user has explicitly set the flag >> -XX:+ExplicitGCInvokesConcurrent. Since the heap is full, it is likely >> that the next allocation will trigger a full GC, but the concurrent >> marking could also finish quickly and free up memory in the cleanup >> phase. 
>> >> Patch: >> http://cr.openjdk.java.net/~ehelin/JDK-8195158/00/index.html > Looks good, ship it! Thanks Stefan for reviewing! Erik > StefanJ >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8195158 >> >> Testing: >> - newly added regression test >> - hs-tier1, hs-tier2, hs-tier3, hs-tier7 on >> ? - Linux x86-64 >> ? - Mac x86-64 >> ? - Windows x86-64 >> ? - Solaris SPARC >> >> Thanks, >> Erik >> >> [0]: https://bugs.openjdk.java.net/browse/JDK-8195158 >> [1]: http://hg.openjdk.java.net/jdk/hs/rev/862c41cf1c7f > From erik.helin at oracle.com Wed Jan 17 17:08:16 2018 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 17 Jan 2018 18:08:16 +0100 Subject: 8195158: Concurrent System.gc() is "upgraded" to stop-the-world System.gc() In-Reply-To: <5A5F6FF5.4090304@oracle.com> References: <4d110107-aa24-9ba4-475a-6e88344568e4@oracle.com> <5A5F6FF5.4090304@oracle.com> Message-ID: <2b864738-62c6-83e2-617f-859e1db91fb7@oracle.com> On 01/17/2018 04:47 PM, Erik ?sterlund wrote: > Hi Erik, > > Looks good. Thanks Erik for Reviewing! Erik > Thanks, > /Erik > > On 2018-01-17 16:33, Erik Helin wrote: >> Hi all, >> >> this patch fixes the bug 'Concurrent System.gc() is "upgraded" to >> stop-the-world System.gc()' [0]. The bug occurs when all of the >> following conditions are true: >> - the heap is full (more strictly, there are no regions available for >> ? allocations) >> - the Java program executes System.gc() >> - the flag -XX:+ExplicitGCInvokesConcurrent is set >> If all of the above are true, then G1 will do an initial mark YC and >> (since 862c41 [1]) "upgrade" the gc to a full GC. This "upgrade" >> should not happen since the user has explicitly set the flag >> -XX:+ExplicitGCInvokesConcurrent. Since the heap is full, it is likely >> that the next allocation will trigger a full GC, but the concurrent >> marking could also finish quickly and free up memory in the cleanup >> phase. 
>>
>> Patch:
>> http://cr.openjdk.java.net/~ehelin/JDK-8195158/00/index.html
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8195158
>>
>> Testing:
>> - newly added regression test
>> - hs-tier1, hs-tier2, hs-tier3, hs-tier7 on
>>   - Linux x86-64
>>   - Mac x86-64
>>   - Windows x86-64
>>   - Solaris SPARC
>>
>> Thanks,
>> Erik
>>
>> [0]: https://bugs.openjdk.java.net/browse/JDK-8195158
>> [1]: http://hg.openjdk.java.net/jdk/hs/rev/862c41cf1c7f
>

From hohensee at amazon.com Fri Jan 19 23:40:49 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Fri, 19 Jan 2018 23:40:49 +0000
Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results
Message-ID: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com>

I'd appreciate a review please.

Bug: https://bugs.openjdk.java.net/browse/JDK-8195115
Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.00/

The bug is that from the JMX point of view, G1's incremental collector
(misnamed as the "G1 Young Generation" collector) only affects G1's
survivor and eden spaces. In fact, mixed collections run by this
collector also affect the G1 old generation.

This proposed fix is to record, for each of a JMX garbage collector's
memory pools, whether that memory pool is affected by all collections
using that collector. And, for each collection, record whether or not
all the collector's memory pools are affected. After each collection,
for each memory pool, if either all the collector's memory pools were
affected or the memory pool is affected for all collections, record
CollectionUsage for that pool.

For collectors other than G1 Young Generation, all pools are recorded as
affected by all collections and every collection is recorded as
affecting all the collector's memory pools. For the G1 Young Generation
collector, the G1 Old Gen pool is recorded as not being affected by all
collections, and non-mixed collections are recorded as not affecting all
memory pools.
The result is that for non-mixed collections, CollectionUsage is
recorded after a collection only for the G1 Eden Space and G1 Survivor
Space pools, while for mixed collections CollectionUsage is recorded
for G1 Old Gen as well.

Other than the effect of the fix on G1 Old Gen
MemoryPool.CollectionUsage, the only external behavior change is that
GarbageCollectorMXBean.getMemoryPoolNames will now return 3 pool names
rather than 2.

With this fix, a collector's memory pools can be divided into two
disjoint subsets, one that participates in all collections and one that
doesn't. This is a bit more general than the minimum necessary to fix
G1, but not by much. Because I expect it to apply to other incremental
region-based collectors, I went with the more general solution. I
minimized the amount of code I had to touch by using default parameters
for GCMemoryManager::add_pool and the TraceMemoryManagerStats
constructors.

Tested by running the new jtreg test included in the webrev. I tried to
use the submit repo, but it was out of order earlier today, so I'd be
much obliged if someone could run it through mach5 and sponsor an
eventual push. I successfully ran a JDK8 version of the patch through
all the JDK8 jtreg tests as well as the JDK8 TCK.

Thanks,

Paul

From erik.helin at oracle.com Thu Jan 25 13:20:02 2018
From: erik.helin at oracle.com (Erik Helin)
Date: Thu, 25 Jan 2018 14:20:02 +0100
Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results
In-Reply-To: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com>
References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com>
Message-ID: <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com>

Hi Paul,

thanks for your interest in this area and for your patch! The
GarbageCollectorMXBean and MemoryPoolMXBean support for G1 is in need of
some updates, so thanks for working on this.
Looking at your patch, I'm not sure that this is the direction we want
to go in. I discussed this a bit with Thomas and Stefan J, and our
current line of thinking is the following:

- Memory pools (MemoryPoolMXBean):
  - "G1 Eden Regions"
  - "G1 Survivor Regions"
  - "G1 Old Regions"
  - "G1 Humongous Regions"
  - "G1 Archive Regions" (if CDS and/or AppCDS is used)

  `init` for these pools would be 0, `used` would be the total size of
  the "live" objects in the used regions of that type, `committed` the
  total size of the used regions of that type and `max` would be
  MaxHeapSize. Note that "live" here means live from the GC's point of
  view, i.e. an object might be dead in an old region but the GC will
  consider that object live until a concurrent cycle has marked through
  the heap and deemed it dead.

- Collectors (GarbageCollectorMXBean):
  - "G1 Young Collector" with the pools
    - "G1 Eden Regions"
    - "G1 Survivor Regions"
    - "G1 Humongous Regions" (due to early reclamation)
  - "G1 Mixed Collector" with the pools
    - "G1 Eden Regions"
    - "G1 Survivor Regions"
    - "G1 Old Regions"
    - "G1 Humongous Regions" (due to early reclamation)
  - "G1 Full Collector" with the pools
    - "G1 Eden Regions"
    - "G1 Survivor Regions"
    - "G1 Old Regions"
    - "G1 Humongous Regions" (can collect empty humongous regions)
  - "G1 Concurrent Cycle" with the pools
    - "G1 Old Regions" (can collect empty old regions)
    - "G1 Humongous Regions" (can collect empty humongous regions)

Note that with this design, the
GarbageCollectorMXBean::getCollectionTime() method for "G1 Concurrent
Cycle" would be the wall clock time from start of the first initial
mark to end of the last cleanup (also including the time of any
eventual young collection during the concurrent cycle). So
GarbageCollectorMXBean::getCollectionTime() would be a mix of
concurrent and STW time for the GarbageCollectorMXBean with name
"G1 Concurrent Cycle".
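
For orientation, the collector-to-pool attachment discussed above is what a
monitoring client already reads through the standard java.lang.management
API. The following is only a minimal sketch under that assumption: the class
name CollectorPoolReport and its two helper methods are made up for
illustration, and the names it prints depend on the JVM version and the
collector in use, so they will not be the proposed names listed above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any patch in this thread.
final class CollectorPoolReport {

    /** Formats one collector with the pools attached to it. */
    static String describe(String collector, String[] pools, long count, long timeMs) {
        return collector + " -> [" + String.join(", ", pools) + "]"
                + " count=" + count + " timeMs=" + timeMs;
    }

    /** One line per GarbageCollectorMXBean registered in the running JVM. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getMemoryPoolNames() lists the pools this collector manages;
            // getCollectionTime() is the accumulated elapsed time in milliseconds.
            lines.add(describe(gc.getName(), gc.getMemoryPoolNames(),
                    gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        snapshot().forEach(System.out::println);
    }
}
```

Running it with different collectors selected shows which pool names each
collector bean currently reports, which is the list any renaming would change.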
Also note that the MemoryPoolMXBean with name "G1 Archive Regions" will not be attached to any GarbageCollectorMXBean, since those regions will never be collected. What do you think about this design, would it work for your use case? If we want to go ahead with this design, then I think we might have to file a CSR. David (who is the HotSpot CSR representative), do we have to file a CSR for changing the names of MemoryPoolMXBeans and GarbageCollectorMXBeans? Thanks, Erik On 01/20/2018 12:40 AM, Hohensee, Paul wrote: > I?d appreciate a review please. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115 > > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.00/ > > The bug is that from the JMX point of view, G1?s incremental collector > (misnamed as the ?G1 Young Generation? collector) only affects G1?s > survivor and eden spaces. In fact, mixed collections run by this > collector also affect the G1 old generation. > > This proposed fix is to record, for each of a JMX garbage collector's > memory pools, whether that memory pool is affected by all collections > using that collector. And, for each collection, record whether or not > all the collector's memory pools are affected. After each collection, > for each memory pool, if either all the collector's memory pools were > affected or the memory pool is affected for all collections, record > CollectionUsage for that pool. > > For collectors other than G1 Young Generation, all pools are recorded as > affected by all collections and every collection is recorded as > affecting all the collector?s memory pools. For the G1 Young Generation > collector, the G1 Old Gen pool is recorded as not being affected by all > collections, and non-mixed collections are recorded as not affecting all > memory pools. 
The result is that for non-mixed collections,
> CollectionUsage is recorded after a collection only the G1 Eden Space
> and G1 Survivor Space pools, while for mixed collections CollectionUsage
> is recorded for G1 Old Gen as well.
>
> Other than the effect of the fix on G1 Old Gen
> MemoryPool.CollectionUsage, the only external behavior change is that
> GarbageCollectorMXBean.getMemoryPoolNames will now return 3 pool names
> rather than 2.
>
> With this fix, a collector's memory pools can be divided into two
> disjoint subsets, one that participates in all collections and one that
> doesn't. This is a bit more general than the minimum necessary to fix
> G1, but not by much. Because I expect it to apply to other incremental
> region-based collectors, I went with the more general solution. I
> minimized the amount of code I had to touch by using default parameters
> for GCMemoryManager::add_pool and the TraceMemoryManagerStats constructors.
>
> Tested by running the new jtreg test included in the webrev. I tried to
> use the submit repo, but it was out of order earlier today, so I'd be
> much obliged if someone could run it through mach5 and sponsor an
> eventual push. I successfully ran a JDK8 version of the patch through
> all the JDK8 jtreg tests as well as the JDK8 TCK.
>
> Thanks,
>
> Paul
>

From hohensee at amazon.com Thu Jan 25 21:04:26 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Thu, 25 Jan 2018 21:04:26 +0000
Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results
In-Reply-To: <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com>
References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com>
Message-ID: <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com>

Hi Erik & co, thanks for looking at this.

Would you be ok with pushing this fix (it really is a bug!) and then me
doing a followup RFE?
That way, I can backport the fix to 8u and eventually remove the patch
I've already pushed to our OpenJDK8 internal release.

Some background. We used to measure heap occupancy using
MemoryPoolMXBean.Usage and alarm on it when it got "too high". The
problem is that it's an instantaneous measure and therefore includes an
unknown amount of garbage, so you can't determine where to set an alarm,
because you don't know how much of it will be collected in the near
future. We want to detect steadily increasing memory use such as you'd
see with a leak, so we're switching to MemoryPoolMXBean.CollectionUsage,
which is the usage immediately after the last GC that affected the
memory pool. We have a synthetic metric that's defined for all
collectors as (sum of CollectionUsage.used for non-eden pools) / (sum of
CollectionUsage.max-if-defined for all non-eden pools), and alarm on
that. "max-if-defined" means zero if JMX returns -1, which is the
"undefined" value.

The JMX API spec doesn't specify what the memory pool or garbage
collector names are, but the current names are de-facto part of the API,
so if we change the existing ones, imo a CSR should be filed. To avoid
that, how about we keep the existing memory pool names the same and, in
the spirit of compatibility, make the new ones similar to the existing
ones. I.e.,

  "G1 Eden Space"
  "G1 Survivor Space"
  "G1 Old Gen"
  "G1 Humongous Space"
  "G1 Archive Space"

Alternatively, we can make the existing names aliases for the new ones,
though "Regions" seems a bit G1-specific to me and doesn't convey the
relationship to the equivalent spaces in the other collectors.
Especially if there's an archive space for the other collectors (see
below).

There's no specification requirement that memory pools be disjoint. What
do you think of defining the archive space as a subset of the old gen?
That way existing code can stay the same (it typically just iterates
over a collector's pools as above), and new code can decide exactly what
it wants to report.

Do/should the non-G1 collectors have an archive space too? If so, we
should just call it "Archive Space" and not make it G1 specific. I'd
guess for non-G1 collectors it's just the initial live prefix of the old
gen and therefore ignored during a collection. In this scheme, the
initial value of "used" for all collectors' oldgens would be the size of
the archive space. All the archive space's MemoryUsage attributes would
have the same value. Not attaching the archive space to any collector
seems correct, because a collector's memory pools are defined to be the
ones that it affects.

Currently, Usage.max and CollectionUsage.max for all G1 memory pools
other than the oldgen is -1, for "undefined". For the oldgen it's -Xmx.
This makes it easy to generate aggregate metrics for all the collectors
by just summing used values and dividing by the sum of max values as
described above. It would be nice to keep this characteristic. Otherwise
we'd have to write special-case code for G1, and change existing code to
check for which JDK we're running on.

I'm uncertain whether your definition of the memory pool usage fields is
for MemoryPool.Usage or MemoryPool.CollectionUsage. Seems like the
former, which is fine and matches the current definition, except for
max.

All the existing collector names have an implicit " Collector" suffix,
e.g., "ConcurrentMarkSweep" means "ConcurrentMarkSweep Collector". So,
I'd use

  "G1 Young"
  "G1 Mixed"
  "G1 Full"
  "G1 Concurrent Cycle"

keep the existing "G1 Young Collection" as an alias for both "G1 Young"
and "G1 Mixed", and keep the existing "G1 Old Collection" as an alias
for "G1 Full". If you accept this and the memory pool name suggestions,
then strictly speaking you don't need a CSR.

The definition of the "G1 Concurrent Cycle" elapsed time makes sense to
me.
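
The synthetic metric described above can be sketched roughly as follows
against the standard java.lang.management API. This is an illustration
under stated assumptions, not the actual monitoring code: the class name
PostGcOccupancy and its methods are hypothetical, restricting the sum to
heap pools is an assumption, recognizing eden pools by "Eden" in the pool
name is an assumption, and returning 0.0 when no pool has a defined max is
a choice made here only to avoid dividing by zero.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not the monitoring code referred to in the mail.
final class PostGcOccupancy {

    /** "max-if-defined": JMX reports -1 for an undefined max, counted as zero. */
    static long maxIfDefined(long max) {
        return max < 0 ? 0 : max;
    }

    /** Sum of used values divided by the sum of max-if-defined values. */
    static double ratio(long[] used, long[] max) {
        long usedSum = 0;
        long maxSum = 0;
        for (int i = 0; i < used.length; i++) {
            usedSum += used[i];
            maxSum += maxIfDefined(max[i]);
        }
        return maxSum == 0 ? 0.0 : (double) usedSum / maxSum;
    }

    /** Evaluates the ratio over the non-eden heap pools of the running JVM. */
    static double current() {
        long usedSum = 0;
        long maxSum = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Assumptions: heap pools only; eden pools carry "Eden" in their name.
            if (pool.getType() != MemoryType.HEAP || pool.getName().contains("Eden")) {
                continue;
            }
            // getCollectionUsage() returns null if the pool does not support it.
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc == null) {
                continue;
            }
            usedSum += afterGc.getUsed();
            maxSum += maxIfDefined(afterGc.getMax());
        }
        return maxSum == 0 ? 0.0 : (double) usedSum / maxSum;
    }

    public static void main(String[] args) {
        System.out.printf("post-GC occupancy: %.3f%n", current());
    }
}
```

With only the old gen reporting a defined max, the denominator reduces to
the old gen max, which is the property the mail asks to preserve.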
The young collection would still be reported separately, right? Paul On 1/25/18, 5:27 AM, "Erik Helin" wrote: Hi Paul, thanks for your interest in this area and for your patch! The GarbageCollectorMXBean and MemoryPoolMXBean support for G1 is in need of some updates, so thanks for working on this. Looking at your patch, I'm not sure that this is the direction we want to go in. I discussed this a bit with Thomas and Stefan J, and our current line of thinking is the following: - Memory pools (MemoryMXBean): - "G1 Eden Regions" - "G1 Survivor Regions" - "G1 Old Regions" - "G1 Humongous Regions" - "G1 Archive Regions" (if CDS and/or AppCDS is used) `init` for these pools would be 0, `used` would be total size of the "live" objects in the used regions of that type, `committed` the total size of the used regions of that that type and `max` would be MaxHeapSize. Note that "live" here means live from the GCs point of view, i.e. an object might be dead in an old region but the GC will consider that object live until a concurrent cycle has marked through the heap and deemed it dead. 

- Collectors (GarbageCollectorMXBean):
  - "G1 Young Collector" with the pools
    - "G1 Eden Regions"
    - "G1 Survivor Regions"
    - "G1 Humongous Regions" (due to early reclamation)
  - "G1 Mixed Collector" with the pools
    - "G1 Eden Regions"
    - "G1 Survivor Regions"
    - "G1 Old Regions"
    - "G1 Humongous Regions" (due to early reclamation)
  - "G1 Full Collector" with the pools
    - "G1 Eden Regions"
    - "G1 Survivor Regions"
    - "G1 Old Regions"
    - "G1 Humongous Regions" (can collect empty humongous regions)
  - "G1 Concurrent Cycle" with the pools
    - "G1 Old Regions" (can collect empty old regions)
    - "G1 Humongous Regions" (can collect empty humongous regions)

Note that with this design, the
GarbageCollectorMXBean::getCollectionTime() method for "G1 Concurrent
Cycle" would be the wall clock time from start of the first initial
mark to end of the last cleanup (also including the time of any
eventual young collection during the concurrent cycle). So
GarbageCollectorMXBean::getCollectionTime() would be a mix of
concurrent and STW time for the GarbageCollectorMXBean with name "G1
Concurrent Cycle".

Also note that the MemoryPoolMXBean with name "G1 Archive Regions" will
not be attached to any GarbageCollectorMXBean, since those regions will
never be collected.

What do you think about this design, would it work for your use case?

If we want to go ahead with this design, then I think we might have to
file a CSR. David (who is the HotSpot CSR representative), do we have
to file a CSR for changing the names of MemoryPoolMXBeans and
GarbageCollectorMXBeans?

Thanks,
Erik

On 01/20/2018 12:40 AM, Hohensee, Paul wrote:
> I'd appreciate a review please.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8195115
>
> Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.00/
>
> The bug is that from the JMX point of view, G1's incremental collector
> (misnamed as the "G1 Young Generation" collector) only affects G1's
> survivor and eden spaces.
> In fact, mixed collections run by this collector also affect the G1
> old generation.
>
> This proposed fix is to record, for each of a JMX garbage collector's
> memory pools, whether that memory pool is affected by all collections
> using that collector. And, for each collection, record whether or not
> all the collector's memory pools are affected. After each collection,
> for each memory pool, if either all the collector's memory pools were
> affected or the memory pool is affected for all collections, record
> CollectionUsage for that pool.
>
> For collectors other than G1 Young Generation, all pools are recorded
> as affected by all collections and every collection is recorded as
> affecting all the collector's memory pools. For the G1 Young
> Generation collector, the G1 Old Gen pool is recorded as not being
> affected by all collections, and non-mixed collections are recorded as
> not affecting all memory pools. The result is that for non-mixed
> collections, CollectionUsage is recorded after a collection only for
> the G1 Eden Space and G1 Survivor Space pools, while for mixed
> collections CollectionUsage is recorded for G1 Old Gen as well.
>
> Other than the effect of the fix on G1 Old Gen
> MemoryPool.CollectionUsage, the only external behavior change is that
> GarbageCollectorMXBean.getMemoryPoolNames will now return 3 pool names
> rather than 2.
>
> With this fix, a collector's memory pools can be divided into two
> disjoint subsets, one that participates in all collections and one
> that doesn't. This is a bit more general than the minimum necessary to
> fix G1, but not by much. Because I expect it to apply to other
> incremental region-based collectors, I went with the more general
> solution. I minimized the amount of code I had to touch by using
> default parameters for GCMemoryManager::add_pool and the
> TraceMemoryManagerStats constructors.
>
> Tested by running the new jtreg test included in the webrev.
> I tried to use the submit repo, but it was out of order earlier today,
> so I'd be much obliged if someone could run it through mach5 and
> sponsor an eventual push. I successfully ran a JDK8 version of the
> patch through all the JDK8 jtreg tests as well as the JDK8 TCK.
>
> Thanks,
>
> Paul

From wessam at google.com Fri Jan 26 20:36:01 2018
From: wessam at google.com (Wessam Hassanein)
Date: Fri, 26 Jan 2018 12:36:01 -0800
Subject: Expected JDK version of when CMS code is suppose to be dropped, is it JDK11?
Message-ID:

Hi All,

Per JEP 291 (http://openjdk.java.net/jeps/291) the CMS collector was
deprecated, but there was no clear date for when the CMS code is
expected to be dropped. I see code refactoring in JDK10 JEP 304
(http://openjdk.java.net/jeps/304) and I am wondering whether CMS is
planned to be dropped in JDK11, or when it is expected to be dropped?

Thanks,
Wessam Hassanein
Google GC TL

From mandy.chung at oracle.com Fri Jan 26 20:53:39 2018
From: mandy.chung at oracle.com (mandy chung)
Date: Fri, 26 Jan 2018 12:53:39 -0800
Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results
In-Reply-To: <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com>
References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com> <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com>
Message-ID: <731b851e-ddf5-1bcc-a333-1c73a0ea875d@oracle.com>

Hi Erik,

The proposal you outline below is reasonable. The API was designed to
allow any number of memory pools managed by a memory manager that can
represent different phases of a garbage collector or other resource
manager to expose various metrics. How G1 exposes these monitoring
metrics is implementation specific.

More comments inlined below.

On 1/25/18, 5:27 AM, "Erik Helin" wrote:
>
> Hi Paul,
>
> thanks for your interest in this area and for your patch!
> The GarbageCollectorMXBean and MemoryPoolMXBean support for G1 is in
> need of some updates, so thanks for working on this.
>
> Looking at your patch, I'm not sure that this is the direction we want
> to go in. I discussed this a bit with Thomas and Stefan J, and our
> current line of thinking is the following:
>
> - Memory pools (MemoryMXBean):
>   - "G1 Eden Regions"
>   - "G1 Survivor Regions"
>   - "G1 Old Regions"
>   - "G1 Humongous Regions"
>   - "G1 Archive Regions" (if CDS and/or AppCDS is used)
>

Can you describe more about G1 archive regions? Is it an immutable
region, in which no object will be added or removed?

> `init` for these pools would be 0, `used` would be the total size of
> the "live" objects in the used regions of that type, `committed` the
> total size of the used regions of that type and `max` would be
> MaxHeapSize. Note that "live" here means live from the GC's point of
> view, i.e. an object might be dead in an old region but the GC will
> consider that object live until a concurrent cycle has marked through
> the heap and deemed it dead.

As specified in the MemoryPoolMXBean spec, for
MemoryPoolMXBean::getUsage of a garbage-collected memory pool, the
amount of used memory includes the memory occupied by all objects in
the pool including both reachable and unreachable objects. OTOH,
getCollectionUsage returns the memory usage after the JVM most recently
expended effort in recycling unused objects in this memory pool. It
would depend on how a GC cycle is defined for G1 and at what phase G1
can record the "collection usage" with low overhead. The API definitely
does not require G1 to perform any GC-like action; "collection usage"
reports how much memory was in use at the most recent GC cycle, after
some memory has been reclaimed.
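
The difference between the two accessors can be seen with a minimal sketch like the one below. It is not part of the patch or of this message; the class name `PoolUsageDump` and the helper `describe` are made up for illustration. It relies only on the standard `java.lang.management` API, where `getCollectionUsage()` may return null for pools that do not support it and `getMax()` is -1 when the maximum is undefined.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, for illustration only.
class PoolUsageDump {
    // Formats a MemoryUsage; a null argument (unsupported or invalid pool) prints as "n/a".
    static String describe(MemoryUsage u) {
        if (u == null) {
            return "n/a";
        }
        return "used=" + u.getUsed()
                + " committed=" + u.getCommitted()
                + " max=" + u.getMax(); // -1 means "undefined"
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip code cache, metaspace and other non-heap pools
            }
            System.out.println(pool.getName());
            // Current occupancy: reachable and unreachable objects alike.
            System.out.println("  usage:            " + describe(pool.getUsage()));
            // Occupancy recorded after the most recent collection of this pool.
            System.out.println("  collection usage: " + describe(pool.getCollectionUsage()));
        }
    }
}
```

Running it under different collectors (for example with -XX:+UseG1GC) shows which pools each JVM exposes and which of them report a defined max.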
>
> - Collectors (GarbageCollectorMXBean):
>   - "G1 Young Collector" with the pools
>     - "G1 Eden Regions"
>     - "G1 Survivor Regions"
>     - "G1 Humongous Regions" (due to early reclamation)
>   - "G1 Mixed Collector" with the pools
>     - "G1 Eden Regions"
>     - "G1 Survivor Regions"
>     - "G1 Old Regions"
>     - "G1 Humongous Regions" (due to early reclamation)
>   - "G1 Full Collector" with the pools
>     - "G1 Eden Regions"
>     - "G1 Survivor Regions"
>     - "G1 Old Regions"
>     - "G1 Humongous Regions" (can collect empty humongous regions)
>   - "G1 Concurrent Cycle" with the pools
>     - "G1 Old Regions" (can collect empty old regions)
>     - "G1 Humongous Regions" (can collect empty humongous regions)
>
> Note that with this design, the
> GarbageCollectorMXBean::getCollectionTime() method for "G1 Concurrent
> Cycle" would be the wall clock time from start of the first initial
> mark to end of the last cleanup (also including the time of any
> eventual young collection during the concurrent cycle). So
> GarbageCollectorMXBean::getCollectionTime() would be a mix of
> concurrent and STW time for the GarbageCollectorMXBean with name "G1
> Concurrent Cycle".
>

The GarbageCollectorMXBean API was defined prior to G1. Investigating
how to represent concurrent garbage collection metrics for better
monitoring is a future enhancement. Do you have any thoughts on
monitoring G1 GC metrics besides memory usage?

> Also note that the MemoryPoolMXBean with name "G1 Archive Regions"
> will not be attached to any GarbageCollectorMXBean, since those
> regions will never be collected.
>
> What do you think about this design, would it work for your use case?
>
> If we want to go ahead with this design, then I think we might have to
> file a CSR. David (who is the HotSpot CSR representative), do we have
> to file a CSR for changing the names of MemoryPoolMXBeans and
> GarbageCollectorMXBeans?

The names are not a supported interface but I'm not surprised that
applications may depend on them.
CSR as well as a release note to document this change is reasonable.

Mandy

From mandy.chung at oracle.com Fri Jan 26 22:38:18 2018
From: mandy.chung at oracle.com (mandy chung)
Date: Fri, 26 Jan 2018 14:38:18 -0800
Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results
In-Reply-To: <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com>
References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com> <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com>
Message-ID: <727f9bcb-9404-df32-917f-70aca6e31cb6@oracle.com>

On 1/25/18 1:04 PM, Hohensee, Paul wrote:
> The JMX API spec doesn't specify what the memory pool or garbage
> collector names are, but the current names are de-facto part of the
> API, so if we change the existing ones, imo a CSR should be filed.

The names are implementation details but I can see how an application
might be impacted if it depends on them. CSR approval is not strictly
necessary, but I think filing one to document the change would be good.

Does the name change impact any application you know of? I'm trying to
understand if any improvement to the API is needed so that applications
don't need to depend on the names.

Mandy

From hohensee at amazon.com Mon Jan 29 16:27:25 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Mon, 29 Jan 2018 16:27:25 +0000
Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results
In-Reply-To: <727f9bcb-9404-df32-917f-70aca6e31cb6@oracle.com>
References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com> <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com> <727f9bcb-9404-df32-917f-70aca6e31cb6@oracle.com>
Message-ID: <4BA32239-D645-4B1A-A58E-502258C8FFB7@amazon.com>

A name change would affect Amazon's heap monitoring, and thus I expect
it would affect other users as well.

As long as there are gc-specific memory pools, we're going to need to
be able to identify them, and right now that's done via name.

All the mxbeans are identified by name, so that's a general design
principle. The only way I can think of to get rid of name dependency
would be to figure out what abstract metrics users want to monitor and
implement them for all collectors. HeapUsage (instantaneous occupancy)
is one, CollectionUsage (long-lived occupancy) is another, both of
these for the entire heap, not just particular memory pools.

That said, imo there will always be a demand for the ability to get
collector and memory pool specific details, so I don't see a way to get
around providing named entities.

Paul

From mandy.chung at oracle.com Mon Jan 29 18:35:20 2018
From: mandy.chung at oracle.com (mandy chung)
Date: Mon, 29 Jan 2018 10:35:20 -0800
Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results
In-Reply-To: <4BA32239-D645-4B1A-A58E-502258C8FFB7@amazon.com>
References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com> <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com> <727f9bcb-9404-df32-917f-70aca6e31cb6@oracle.com> <4BA32239-D645-4B1A-A58E-502258C8FFB7@amazon.com>
Message-ID:

Thanks for the reply Paul. I'm trying to understand a little more about
the specifics of the GC-specific memory pools you depend on.

On 1/29/18 8:27 AM, Hohensee, Paul wrote:
>
> A name change would affect Amazon's heap monitoring, and thus I expect
> it would affect other users as well.
>
> As long as there are gc-specific memory pools, we're going to need to
> be able to identify them, and right now that's done via name.
>

MemoryPoolMXBean::getType returns the "heap" memory type for
GC-specific memory pools. Are you using this method? Do you use the
name to infer specific characteristics of a memory pool (e.g. eden vs
old gen)?

> All the mxbeans are identified by name, so that's a general design
> principle. The only way I can think of to get rid of name dependency
> would be to figure out what abstract metrics users want to monitor and
> implement them for all collectors. HeapUsage (instantaneous occupancy)
> is one, CollectionUsage (long-lived occupancy) is another, both of
> these for the entire heap, not just particular memory pools.
>

The sum of HeapUsage and CollectionUsage of all heap memory pools was
expected to give an incorrect approximation for the entire heap usage.
Are you seeing issue/bug with the sum result?

Mandy

> That said, imo there will always be a demand for the ability to get
> collector and memory pool specific details, so I don't see a way to
> get around providing named entities.
>
> Paul

From mandy.chung at oracle.com Mon Jan 29 18:52:06 2018
From: mandy.chung at oracle.com (mandy chung)
Date: Mon, 29 Jan 2018 10:52:06 -0800
Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results
In-Reply-To:
References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com> <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com> <727f9bcb-9404-df32-917f-70aca6e31cb6@oracle.com> <4BA32239-D645-4B1A-A58E-502258C8FFB7@amazon.com>
Message-ID: <997a766f-9f4b-f192-ae87-03f4c8d2666d@oracle.com>

On 1/29/18 10:35 AM, mandy chung wrote:
> The sum of HeapUsage and CollectionUsage of all heap memory pools was
> expected to give an incorrect approximation for the entire heap
> usage. Are you seeing issue/bug with the sum result?
>

typo: s/an incorrect approximation/an approximation.

Mandy

From david.holmes at oracle.com Mon Jan 29 21:00:40 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 30 Jan 2018 07:00:40 +1000
Subject: PING: RFR: 8194249: SA: G1HeapRegionTable#getByAddress() returns incorrect HeapRegion
In-Reply-To: <1efaeb12-720c-e93e-6010-38b3687d3bf9@gmail.com>
References: <323f09cf-a036-42b7-990c-0e42ad511a8f@gmail.com> <3be47bde-46c1-3ad2-7521-5eecceea5dee@oracle.com> <7332f28e-fe93-56c2-ce44-bff374b83728@gmail.com> <2bc3972c-837f-076e-60a3-a3b019761a97@oracle.com> <122a2fbf-6c0d-6209-28b4-0b03064b65ab@gmail.com> <1efaeb12-720c-e93e-6010-38b3687d3bf9@gmail.com>
Message-ID: <03152805-7b8b-b107-cb99-0d2b53c0dabc@oracle.com>

Added in hotspot-gc-dev. Although this is in the SA it is about the SA
interaction with G1 and so likely needs someone familiar with G1 to
review it.

David

On 28/01/2018 10:41 PM, Yasumasa Suenaga wrote:
> PING: Could you review it?
>
> >>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8194249/webrev.01/
>
> This webrev has been reviewed by Jini.
> I need a Reviewer and sponsor.
>
>
> Yasumasa
>
>
> On 2018/01/22 19:53, Yasumasa Suenaga wrote:
>> Hi Jini,
>>
>> Thank you for your review!
>> I will update the copyright year in this changeset.
>>
>> I'm waiting for Reviewer and sponsor.
>>
>>
>> Yasumasa
>>
>>
>> On 2018/01/22 13:14, Jini George wrote:
>>> Hi Yasumasa,
>>>
>>> The changes look good to me. Please do update the copyright year to
>>> 2018.
>>>
>>> Thanks!
>>> Jini (Not a Reviewer).
>>>
>>>
>>>
>>> On 12/31/2017 10:03 AM, Yasumasa Suenaga wrote:
>>>> Hi David,
>>>>
>>>>
>>>>> How did you submit to mach5 ???
>>>>
>>>> I'm using Submit Repo for testing:
>>>>   https://wiki.openjdk.java.net/display/Build/Submit+Repo
>>>>
>>>>
>>>>> Anyway the failure is with:
>>>>
>>>> Thanks!
>>>> I've fixed them in new webrev:
>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8194249/webrev.01/
>>>>
>>>> This webrev has passed Mach 5 tier 1 tests in Submit Repo:
>>>> http://java.se.oracle.com:10065/mdash/jobs/mach5-one-ysuenaga-JDK-8194249-20171231-0202-8291
>>>>
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2017/12/30 10:31, David Holmes wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> Not a review ...
>>>>>
>>>>> On 29/12/2017 11:16 PM, Yasumasa Suenaga wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> G1HeapRegionTable#getByAddress() returns incorrect HeapRegion
>>>>>> which contains incorrect address. We can see it in Stack Memory
>>>>>> window on HSDB. Some oop addresses are shown as Free Region
>>>>>> (attached image).
>>>>>>
>>>>>> G1HeapRegion#getByAddress() should create HeapRegion instance from
>>>>>> the address in _biased_base array.
>>>>>>
>>>>>> I uploaded webrev. Could you review it?
>>>>>>
>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8194249/webrev.00/
>>>>>>
>>>>>> I've tested this change with test/hotspot/jtreg/serviceability/sa,
>>>>>> it works fine.
>>>>>> But I received some failure from Mach 5. I also tested this change
>>>>>> via submit repos.
>>>>>>
>>>>>> http://java.se.oracle.com:10065/mdash/jobs/mach5-one-ysuenaga-JDK-8194249-20171228-0605-8272
>>>>>>
>>>>>>
>>>>>> I cannot access this URL. Could you share the result?
>>>>>
>>>>> How did you submit to mach5 ???
>>>>>
>>>>> Anyway the failure is with:
>>>>>
>>>>> test/hotspot/jtreg/serviceability/sa/TestG1HeapRegion.java
>>>>>
>>>>> On linux and OS X:
>>>>>
>>>>>   stderr: [Exception in thread "main" java.lang.NullPointerException
>>>>>      at TestG1HeapRegion$G1HeapRegionTestClosure.doSpace(TestG1HeapRegion.java:70)
>>>>>      at jdk.hotspot.agent/sun.jvm.hotspot.gc.g1.G1CollectedHeap.heapRegionIterate(G1CollectedHeap.java:121)
>>>>>      at TestG1HeapRegion.scanHeapRegion(TestG1HeapRegion.java:81)
>>>>>      at TestG1HeapRegion.main(TestG1HeapRegion.java:129)
>>>>>
>>>>> On Solaris sparcv9:
>>>>>
>>>>>   stderr: [Exception in thread "main" java.lang.RuntimeException:
>>>>> Address of HeapRegion does not match.: expected 0x00000007afb00000
>>>>> to equal 0x00000007afc00000
>>>>>      at jdk.test.lib.Asserts.fail(Asserts.java:594)
>>>>>      at jdk.test.lib.Asserts.assertEquals(Asserts.java:205)
>>>>>      at TestG1HeapRegion$G1HeapRegionTestClosure.doSpace(TestG1HeapRegion.java:70)
>>>>>      at jdk.hotspot.agent/sun.jvm.hotspot.gc.g1.G1CollectedHeap.heapRegionIterate(G1CollectedHeap.java:121)
>>>>>      at TestG1HeapRegion.scanHeapRegion(TestG1HeapRegion.java:81)
>>>>>      at TestG1HeapRegion.main(TestG1HeapRegion.java:129)
>>>>> ]
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> Also I cannot access JPRT. So I need a sponsor.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa

From hohensee at amazon.com Mon Jan 29 21:02:36 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Mon, 29 Jan 2018 21:02:36 +0000
Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results
In-Reply-To: <997a766f-9f4b-f192-ae87-03f4c8d2666d@oracle.com>
References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com> <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com> <727f9bcb-9404-df32-917f-70aca6e31cb6@oracle.com> <4BA32239-D645-4B1A-A58E-502258C8FFB7@amazon.com> <997a766f-9f4b-f192-ae87-03f4c8d2666d@oracle.com>
Message-ID:

We don't use getType, and you guessed correctly: we use the memory pool
name as an indicator of the specific characteristics of a memory pool,
in particular eden.

What we want is an indication of long term heap occupancy. We calculate
it using CollectionUsage for non-eden heap memory pools, regardless of
collector. We don't use JMX notification, rather we periodically poll
CollectionUsage for memory pools with names that contain "Old",
"Tenured", or "Survivor". We get the memory pools from the
GarbageCollectorMXBeans (we don't care what the collector names are).
For the named memory pools, we sum CollectionUsage.used and divide by
the sum of CollectionUsage.max to get a long term heap occupancy
percentage. We don't want to include eden because it's really just an
allocation buffer and not part of the storage for long-lived objects. I
suppose we could use a negative test instead by using all memory pools
with names that don't include "Eden".

The bug is that the "G1 Old Gen" memory pool isn't being updated when
the "G1 Young Generation" collector runs a mixed collection. As far as
JMX is concerned, that collector only knows about eden and the survivor
space. The patch adds the old gen to the memory pools it knows about
and has mixed collections update the old gen's CollectionUsage.
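
The polling scheme described above could be sketched roughly as follows. This is an illustration only, not Amazon's actual monitoring code: the class `LongTermOccupancy` and its methods are hypothetical, and skipping pools whose max is undefined (-1) when summing is an assumption made here rather than something stated in the message. Only standard `java.lang.management` calls are used.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of name-based long-term occupancy polling.
class LongTermOccupancy {
    // Name test from the description: pools holding longer-lived objects.
    static boolean isLongLivedPool(String name) {
        return name.contains("Old") || name.contains("Tenured") || name.contains("Survivor");
    }

    // used / max as a fraction; NaN when no pool reported a defined max.
    static double ratio(long usedSum, long maxSum) {
        return maxSum > 0 ? (double) usedSum / maxSum : Double.NaN;
    }

    // One polling step: sum CollectionUsage over the selected pools.
    static double poll() {
        Map<String, MemoryPoolMXBean> poolsByName = new HashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            poolsByName.put(pool.getName(), pool);
        }
        // Pool names come from the collectors; a set avoids counting a pool
        // twice when several collectors manage it.
        Set<String> names = new LinkedHashSet<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            for (String name : gc.getMemoryPoolNames()) {
                names.add(name);
            }
        }
        long usedSum = 0;
        long maxSum = 0;
        for (String name : names) {
            if (!isLongLivedPool(name)) {
                continue; // eden is treated as an allocation buffer and ignored
            }
            MemoryPoolMXBean pool = poolsByName.get(name);
            MemoryUsage usage = (pool == null) ? null : pool.getCollectionUsage();
            if (usage == null) {
                continue; // pool does not support CollectionUsage
            }
            usedSum += usage.getUsed();
            if (usage.getMax() > 0) {
                maxSum += usage.getMax(); // assumption: ignore undefined (-1) max
            }
        }
        return ratio(usedSum, maxSum);
    }
}
```

Calling `LongTermOccupancy.poll()` on a timer gives the occupancy fraction; it returns NaN when no matching pool reports a defined max. Because the selection is purely name-based, renaming pools (for example "G1 Old Gen" to "G1 Old Regions") keeps this particular test working, while a rename that drops "Old", "Tenured" or "Survivor" would silently exclude the pool.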

I managed to get a submit repo run to succeed last week and it found a
problem. I've uploaded a new webrev that fixes the failure of the jtreg
test TestMemoryMXBeansAndPoolsPresence.java due to the young gen
collector being expected to know only about eden and the survivor
space.

http://cr.openjdk.java.net/~phh/8195115/webrev.hs.01/

Waiting on the submit repo to come back with a result on it.

Thanks,

Paul

From david.holmes at oracle.com Mon Jan 29 21:03:52 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 30 Jan 2018 07:03:52 +1000
Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results
In-Reply-To: <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com>
References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com>
Message-ID: <4add2e5b-cae9-7b3c-cb1c-277a8490a732@oracle.com>

On the CSR question, yes this would need a CSR just to ensure the
compatibility issues have been covered.
David

On 25/01/2018 11:20 PM, Erik Helin wrote:
> Hi Paul,
>
> thanks for your interest in this area and for your patch! The
> GarbageCollectorMXBean and MemoryPoolMXBean support for G1 is in need of
> some updates, so thanks for working on this.
>
> Looking at your patch, I'm not sure that this is the direction we want
> to go in. I discussed this a bit with Thomas and Stefan J, and our
> current line of thinking is the following:
>
> - Memory pools (MemoryPoolMXBean):
>   - "G1 Eden Regions"
>   - "G1 Survivor Regions"
>   - "G1 Old Regions"
>   - "G1 Humongous Regions"
>   - "G1 Archive Regions" (if CDS and/or AppCDS is used)
>
> `init` for these pools would be 0, `used` would be total size of the
> "live" objects in the used regions of that type, `committed` the total
> size of the used regions of that type and `max` would be
> MaxHeapSize. Note that "live" here means live from the GC's point of
> view, i.e. an object might be dead in an old region but the GC will
> consider that object live until a concurrent cycle has marked through
> the heap and deemed it dead.
>
> - Collectors (GarbageCollectorMXBean):
>   - "G1 Young Collector" with the pools
>     - "G1 Eden Regions"
>     - "G1 Survivor Regions"
>     - "G1 Humongous Regions" (due to early reclamation)
>   - "G1 Mixed Collector" with the pools
>     - "G1 Eden Regions"
>     - "G1 Survivor Regions"
>     - "G1 Old Regions"
>     - "G1 Humongous Regions" (due to early reclamation)
>   - "G1 Full Collector" with the pools
>     - "G1 Eden Regions"
>     - "G1 Survivor Regions"
>     - "G1 Old Regions"
>     - "G1 Humongous Regions" (can collect empty humongous regions)
>   - "G1 Concurrent Cycle" with the pools
>     - "G1 Old Regions" (can collect empty old regions)
>     - "G1 Humongous Regions" (can collect empty humongous regions)
>
> Note that with this design, the
> GarbageCollectorMXBean::getCollectionTime() method for "G1 Concurrent
> Cycle" would be the wall clock time from start of the first initial mark
> to end of the last cleanup (also including the time of any eventual
> young collection during the concurrent cycle). So
> GarbageCollectorMXBean::getCollectionTime() would be a mix of concurrent
> and STW time for the GarbageCollectorMXBean with name "G1 Concurrent
> Cycle".
>
> Also note that the MemoryPoolMXBean with name "G1 Archive Regions" will
> not be attached to any GarbageCollectorMXBean, since those regions will
> never be collected.
>
> What do you think about this design, would it work for your use case?
>
> If we want to go ahead with this design, then I think we might have to
> file a CSR. David (who is the HotSpot CSR representative), do we have to
> file a CSR for changing the names of MemoryPoolMXBeans and
> GarbageCollectorMXBeans?
>
> Thanks,
> Erik
>
> On 01/20/2018 12:40 AM, Hohensee, Paul wrote:
>> I'd appreciate a review please.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8195115
>>
>> Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.00/
>>
>> The bug is that from the JMX point of view, G1's incremental collector
>> (misnamed as the "G1 Young Generation" collector) only affects G1's
>> survivor and eden spaces. In fact, mixed collections run by this
>> collector also affect the G1 old generation.
>>
>> The proposed fix is to record, for each of a JMX garbage collector's
>> memory pools, whether that memory pool is affected by all collections
>> using that collector. And, for each collection, record whether or not
>> all the collector's memory pools are affected. After each collection,
>> for each memory pool, if either all the collector's memory pools were
>> affected or the memory pool is affected for all collections, record
>> CollectionUsage for that pool.
>>
>> For collectors other than G1 Young Generation, all pools are recorded
>> as affected by all collections and every collection is recorded as
>> affecting all the collector's memory pools. For the G1 Young
>> Generation collector, the G1 Old Gen pool is recorded as not being
>> affected by all collections, and non-mixed collections are recorded as
>> not affecting all memory pools. The result is that for non-mixed
>> collections, CollectionUsage is recorded after a collection only for the
>> G1 Eden Space and G1 Survivor Space pools, while for mixed collections
>> CollectionUsage is recorded for G1 Old Gen as well.
>>
>> Other than the effect of the fix on G1 Old Gen
>> MemoryPool.CollectionUsage, the only external behavior change is that
>> GarbageCollectorMXBean.getMemoryPoolNames will now return 3 pool names
>> rather than 2.
>>
>> With this fix, a collector's memory pools can be divided into two
>> disjoint subsets, one that participates in all collections and one
>> that doesn't. This is a bit more general than the minimum necessary to
>> fix G1, but not by much. Because I expect it to apply to other
>> incremental region-based collectors, I went with the more general
>> solution. I minimized the amount of code I had to touch by using
>> default parameters for GCMemoryManager::add_pool and the
>> TraceMemoryManagerStats constructors.
>>
>> Tested by running the new jtreg test included in the webrev. I tried
>> to use the submit repo, but it was out of order earlier today, so I'd
>> be much obliged if someone could run it through mach5 and sponsor an
>> eventual push. I successfully ran a JDK8 version of the patch through
>> all the JDK8 jtreg tests as well as the JDK8 TCK.
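
A minimal sketch of the recording rule described in the quoted message above, modelled in Java purely for illustration. The actual change is in HotSpot's C++ code; the class, field, and method names below are hypothetical and are not taken from the webrev. Only the pool names come from the message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the rule; not the HotSpot implementation.
final class CollectionUsageRule {

    /** A pool attached to a collector, flagged by whether every collection affects it. */
    static final class Pool {
        final String name;
        final boolean affectedByAllCollections;

        Pool(String name, boolean affectedByAllCollections) {
            this.name = name;
            this.affectedByAllCollections = affectedByAllCollections;
        }
    }

    /**
     * Returns the names of the pools whose CollectionUsage would be recorded
     * after one collection: a pool is recorded if the collection affected all
     * of the collector's pools, or if the pool is affected by every collection.
     */
    static List<String> poolsToRecord(List<Pool> pools, boolean collectionAffectsAllPools) {
        List<String> recorded = new ArrayList<>();
        for (Pool pool : pools) {
            if (collectionAffectsAllPools || pool.affectedByAllCollections) {
                recorded.add(pool.name);
            }
        }
        return recorded;
    }

    /** The three pools the message attaches to the "G1 Young Generation" collector. */
    static List<Pool> g1YoungGenerationPools() {
        return Arrays.asList(
                new Pool("G1 Eden Space", true),
                new Pool("G1 Survivor Space", true),
                new Pool("G1 Old Gen", false)); // only mixed collections touch old gen
    }

    public static void main(String[] args) {
        List<Pool> pools = g1YoungGenerationPools();
        System.out.println("non-mixed: " + poolsToRecord(pools, false));
        System.out.println("mixed:     " + poolsToRecord(pools, true));
    }
}
```

For any other collector, every pool would carry `affectedByAllCollections = true`, so the rule degenerates to "record everything", which matches the behaviour the message describes for non-G1 collectors.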
>>
>> Thanks,
>>
>> Paul
>>

From mandy.chung at oracle.com Mon Jan 29 21:40:41 2018
From: mandy.chung at oracle.com (mandy chung)
Date: Mon, 29 Jan 2018 13:40:41 -0800
Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results
In-Reply-To:
References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com> <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com> <727f9bcb-9404-df32-917f-70aca6e31cb6@oracle.com> <4BA32239-D645-4B1A-A58E-502258C8FFB7@amazon.com> <997a766f-9f4b-f192-ae87-03f4c8d2666d@oracle.com>
Message-ID: <64793ff3-4805-79f9-4e72-6e469cd511dc@oracle.com>

I created JDK-8196362 to look into whether it makes sense to provide some categorization to differentiate eden space vs the heap space for long-lived objects.

W.r.t. JDK-8195115, I have to defer to the GC team to comment on the mixed collection update. If they are okay, I have no objection to implement a short-term fix and do the proper G1 memory pools as a separate patch.

Mandy

On 1/29/18 1:02 PM, Hohensee, Paul wrote:
>
> We don't use getType, and you guessed correctly: we use the memory
> pool name as an indicator of the specific characteristics of a memory
> pool, in particular eden.
>
> What we want is an indication of long term heap occupancy. We
> calculate it using CollectionUsage for non-eden heap memory pools,
> regardless of collector. We don't use JMX notification, rather we
> periodically poll CollectionUsage for memory pools with names that
> contain "Old", "Tenured", or "Survivor". We get the memory pools from
> the GarbageCollectorMXBeans (we don't care what the collector names
> are). For the named memory pools, we sum CollectionUsage.used and
> divide by the sum of CollectionUsage.max to get a long term heap
> occupancy percentage. We don't want to include eden because it's
> really just an allocation buffer and not part of the storage for
> long-lived objects.
I suppose we could use a negative test instead by > using all memory pools with names that don?t include ?Eden?. > > The bug is that the ?G1 Old Gen? memory pool isn?t being updated when > the ?G1 Young Generation? collector runs a mixed collection. As far as > JMX is concerned, that collector only knows about eden and the > survivor space. The patch adds the old gen to the memory pools it > knows about and has mixed collections update the old gen?s > CollectionUsage. > > I managed to get a submit repo run to succeed last week and it found a > problem. I?ve uploaded a new webrev that fixes the failure of the > jtreg test TestMemoryMXBeansAndPoolsPresence.java due to the young gen > collector being expected to know only about eden and the survivor space. > > http://cr.openjdk.java.net/~phh/8195115/webrev.hs.01/ > > > Waiting on the submit repo to come back with a result on it. > > Thanks, > > Paul > > *From: *mandy chung > *Organization: *Oracle Corporation > *Date: *Monday, January 29, 2018 at 10:52 AM > *To: *"Hohensee, Paul" , Erik Helin > , David Holmes > *Cc: *"serviceability-dev at openjdk.java.net" > , > "hotspot-gc-dev at openjdk.java.net" > *Subject: *Re: RFR (S): 8195115: G1 Old Gen MemoryPool > CollectionUsage.used values don't reflect mixed GC results > > On 1/29/18 10:35 AM, mandy chung wrote: > > Thanks for the reply Paul.?? Try to understand a little more on > the specific from gc-specific memory pool you depend on. > > On 1/29/18 8:27 AM, Hohensee, Paul wrote: > > A name change would affect Amazon?s heap monitoring, and thus > I expect it would affect other users as well. > > As long as there are gc-specific memory pools, we?re going to > need to be able to identify them, and right now that?s done > via name. > > > MemoryPoolMXBean::getType returns "heap" memory type for > GC-specific memory pools.? Are you using this method?? Do you use > the name to build in specific characteristic of a memory pool > (e.g. eden vs old gen)? 
> > > > All the mxbeans are identified by name, so that?s a general > design principle. The only way I can think of to get rid of > name dependency would be to figure out what abstract metrics > users want to monitor and implement them for all collectors. > HeapUsage (instantaneous occupancy) is one, CollectionUsage > (long-lived occupancy) is another, both of these for the > entire heap, not just particular memory pools. > > > The sum of HeapUsage and CollectionUsage of all heap memory pools > was expected to give an incorrect approximation for the entire > heap usage.? Are you seeing issue/bug with the sum result? > > > typo: s/an incorrect approximation/an approximation. > > Mandy > > > Mandy > > > That said, imo there will always be a demand for the ability > to get collector and memory pool specific details, so I don?t > see a way to get around providing named entities. > > Paul > > *From: *mandy chung > > *Organization: *Oracle Corporation > *Date: *Friday, January 26, 2018 at 2:38 PM > *To: *"Hohensee, Paul" > , Erik Helin > , David > Holmes > *Cc: *"serviceability-dev at openjdk.java.net" > > > , > "hotspot-gc-dev at openjdk.java.net" > > > > *Subject: *Re: RFR (S): 8195115: G1 Old Gen MemoryPool > CollectionUsage.used values don't reflect mixed GC results > > On 1/25/18 1:04 PM, Hohensee, Paul wrote: > > > > The JMX API spec doesn?t specify what the memory pool or > garbage > collector names are, but the current names are > de-facto part of the > API, so if we change the existing ones, > imo a CSR should be filed. > > The names are implementation details but I can see how an application > > might be impacted if they depend on it.? CSR approval is not strictly > > necessary while I think filing one to document the change would be > > good. > > Does the name change impact any application you know of?? I'm trying to > > understand if any improvement to API is needed so that applications > > don't need to depend on the names. 
> > > Mandy > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kirk at kodewerk.com Mon Jan 29 22:09:21 2018 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Mon, 29 Jan 2018 23:09:21 +0100 Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results In-Reply-To: <4BA32239-D645-4B1A-A58E-502258C8FFB7@amazon.com> References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com> <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com> <727f9bcb-9404-df32-917f-70aca6e31cb6@oracle.com> <4BA32239-D645-4B1A-A58E-502258C8FFB7@amazon.com> Message-ID: <1C279B19-5B8A-4622-B6D7-B7CD36C1B0FC@kodewerk.com> > On Jan 29, 2018, at 5:27 PM, Hohensee, Paul wrote: > > A name change would affect Amazon?s heap monitoring, and thus I expect it would affect other users as well. I can name a number of tools that would be disrupted by this type of change. Additionally tooling would be complicated by the need to support both the old and new versions. The current names have been with us for years and they are well known, well documented and well understood. Given the level of disruption this change is likely to cause IMHO you?d need a very very good reason to want to make it. > > As long as there are gc-specific memory pools, we?re going to need to be able to identify them, and right now that?s done via name. All the mxbeans are identified by name, so that?s a general design principle. The only way I can think of to get rid of name dependency would be to figure out what abstract metrics users want to monitor and implement them for all collectors. HeapUsage (instantaneous occupancy) is one, CollectionUsage (long-lived occupancy) is another, both of these for the entire heap, not just particular memory pools. 
That said, imo there will always be a demand for the ability to get collector and memory pool specific details, so I don't see a way to get around providing named entities.

Agreed, tuning strategies are implementation dependent and sensitive to specific versions. One is always going to need to know this information.

Kind regards,
Kirk Pepperdine

From hohensee at amazon.com Tue Jan 30 02:07:32 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Tue, 30 Jan 2018 02:07:32 +0000
Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results
In-Reply-To: <64793ff3-4805-79f9-4e72-6e469cd511dc@oracle.com>
References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com> <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com> <727f9bcb-9404-df32-917f-70aca6e31cb6@oracle.com> <4BA32239-D645-4B1A-A58E-502258C8FFB7@amazon.com> <997a766f-9f4b-f192-ae87-03f4c8d2666d@oracle.com> <64793ff3-4805-79f9-4e72-6e469cd511dc@oracle.com>
Message-ID:

That's one reviewer who's ok with a short term patch. Anyone else? And, any reviewers for said short term patch? :)

Thanks,

Paul

From: mandy chung
Organization: Oracle Corporation
Date: Monday, January 29, 2018 at 1:41 PM
To: "Hohensee, Paul"
Cc: "serviceability-dev at openjdk.java.net", "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results

I created JDK-8196362 to look into whether it makes sense to provide some categorization to differentiate eden space vs the heap space for long-lived objects.

W.r.t. JDK-8195115, I have to defer to the GC team to comment on the mixed collection update. If they are okay, I have no objection to implement a short-term fix and do the proper G1 memory pools as a separate patch.
Mandy On 1/29/18 1:02 PM, Hohensee, Paul wrote: We don?t use getType, and you guessed correctly: we use the memory pool name as an indicator of the specific characteristics of a memory pool, in particular eden. What we want is an indication of long term heap occupancy. We calculate it using CollectionUsage for non-eden heap memory pools, regardless of collector. We don?t use JMX notification, rather we periodically poll CollectionUsage for memory pools with names that contain ?Old?, ?Tenured?, or ?Survivor?. We get the memory pools from the GarbageCollectorMXBeans (we don?t care what the collector names are). For the named memory pools, we sum CollectionUsage.used and divide by the sum of CollectionUsage.max to get a long term heap occupancy percentage. We don?t want to include eden because it?s really just an allocation buffer and not part of the storage for long-lived objects. I suppose we could use a negative test instead by using all memory pools with names that don?t include ?Eden?. The bug is that the ?G1 Old Gen? memory pool isn?t being updated when the ?G1 Young Generation? collector runs a mixed collection. As far as JMX is concerned, that collector only knows about eden and the survivor space. The patch adds the old gen to the memory pools it knows about and has mixed collections update the old gen?s CollectionUsage. I managed to get a submit repo run to succeed last week and it found a problem. I?ve uploaded a new webrev that fixes the failure of the jtreg test TestMemoryMXBeansAndPoolsPresence.java due to the young gen collector being expected to know only about eden and the survivor space. http://cr.openjdk.java.net/~phh/8195115/webrev.hs.01/ Waiting on the submit repo to come back with a result on it. 
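
The long-term occupancy calculation described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the monitoring code the message refers to: the class and method names are hypothetical, the lookup of pools via `ManagementFactory.getMemoryPoolMXBeans()` is one possible way to resolve the names returned by the collectors, and skipping pools whose `max` is undefined (-1) is a choice made here, not something the message specifies.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; names are illustrative only.
final class LongTermOccupancy {

    /** Name test from the message: pools whose names contain Old, Tenured or Survivor. */
    static boolean isLongLivedPool(String poolName) {
        return poolName.contains("Old")
                || poolName.contains("Tenured")
                || poolName.contains("Survivor");
    }

    /** Sum of used divided by sum of max; NaN when no pool reported a usable max. */
    static double ratio(long usedSum, long maxSum) {
        return maxSum > 0 ? (double) usedSum / maxSum : Double.NaN;
    }

    /** Polls CollectionUsage once for the selected pools of all collectors. */
    static double sample() {
        // Collect the pool names known to any collector; collector names are ignored.
        Set<String> collectedPools = new HashSet<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            collectedPools.addAll(Arrays.asList(gc.getMemoryPoolNames()));
        }
        long usedSum = 0;
        long maxSum = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            if (!collectedPools.contains(name) || !isLongLivedPool(name)) {
                continue;
            }
            MemoryUsage usage = pool.getCollectionUsage(); // null if the pool does not support it
            if (usage == null || usage.getMax() < 0) {     // max is -1 when undefined
                continue;                                  // assumption: skip such pools
            }
            usedSum += usage.getUsed();
            maxSum += usage.getMax();
        }
        return ratio(usedSum, maxSum);
    }

    public static void main(String[] args) {
        System.out.println("long-term occupancy ratio: " + sample());
    }
}
```

Because the selection is purely name-based, renaming "G1 Old Gen" to something like "G1 Old Regions" would still match here, while a pool named "G1 Humongous Regions" would silently drop out of the sum; that is the kind of dependency on names being discussed in this thread.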
Thanks, Paul From: mandy chung Organization: Oracle Corporation Date: Monday, January 29, 2018 at 10:52 AM To: "Hohensee, Paul" , Erik Helin , David Holmes Cc: "serviceability-dev at openjdk.java.net" , "hotspot-gc-dev at openjdk.java.net" Subject: Re: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results On 1/29/18 10:35 AM, mandy chung wrote: Thanks for the reply Paul. Try to understand a little more on the specific from gc-specific memory pool you depend on. On 1/29/18 8:27 AM, Hohensee, Paul wrote: A name change would affect Amazon?s heap monitoring, and thus I expect it would affect other users as well. As long as there are gc-specific memory pools, we?re going to need to be able to identify them, and right now that?s done via name. MemoryPoolMXBean::getType returns "heap" memory type for GC-specific memory pools. Are you using this method? Do you use the name to build in specific characteristic of a memory pool (e.g. eden vs old gen)? All the mxbeans are identified by name, so that?s a general design principle. The only way I can think of to get rid of name dependency would be to figure out what abstract metrics users want to monitor and implement them for all collectors. HeapUsage (instantaneous occupancy) is one, CollectionUsage (long-lived occupancy) is another, both of these for the entire heap, not just particular memory pools. The sum of HeapUsage and CollectionUsage of all heap memory pools was expected to give an incorrect approximation for the entire heap usage. Are you seeing issue/bug with the sum result? typo: s/an incorrect approximation/an approximation. Mandy Mandy That said, imo there will always be a demand for the ability to get collector and memory pool specific details, so I don?t see a way to get around providing named entities. 
Paul

From: mandy chung
Organization: Oracle Corporation
Date: Friday, January 26, 2018 at 2:38 PM
To: "Hohensee, Paul", Erik Helin, David Holmes
Cc: "serviceability-dev at openjdk.java.net", "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results

On 1/25/18 1:04 PM, Hohensee, Paul wrote:
> The JMX API spec doesn't specify what the memory pool or garbage
> collector names are, but the current names are de-facto part of the
> API, so if we change the existing ones, imo a CSR should be filed.

The names are implementation details but I can see how an application might be impacted if they depend on it. CSR approval is not strictly necessary while I think filing one to document the change would be good.

Does the name change impact any application you know of? I'm trying to understand if any improvement to API is needed so that applications don't need to depend on the names.

Mandy

From robbin.ehn at oracle.com Tue Jan 30 09:40:34 2018
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Tue, 30 Jan 2018 10:40:34 +0100
Subject: JDK-8171119: Low-Overhead Heap Profiling
In-Reply-To:
References:
Message-ID:

Hi JC,

On 01/30/2018 04:22 AM, JC Beyler wrote:
> - CollectedHeap still needs to call AllocTracer to see if it is to be
> sampled, I can't hide everything in it without a bigger refactor (want
> me to try?)

Yes we need a bigger refactor to do this nicely.
I suggested not doing that now, so just roll back to the previous version.

Thanks for having a look at it!

/Robbin

>
> On Mon, Jan 29, 2018 at 1:29 AM, Robbin Ehn wrote:
>> Hi JC, thanks!
>>
>> I'm happy with current state, looks good!
>>
>> Truncated:
>>
>> On 01/27/2018 05:01 AM, JC Beyler wrote:
>>>
>>> This is strange but I'm assuming it is because we are not working on
>>> the same repo?
>>>
>>> I used:
>>> hg clone http://hg.openjdk.java.net/jdk/hs jdkhs-heap
>>>
>>> I'll try a new clone on Monday and see. My new version moved hard_end
>>> back to public so it should work now.
>>
>>
>> Sorry this compile error was in closed code.
>> Now the closed part compiles, thanks.
>>
>>>
>>> Fair enough, hopefully Thomas will chime in. Are you saying that this
>>> first version could go in and we can work on a refinement? Or are you
>>> saying I should work on this now at the same time and fix it before
>>> this V1 goes in? (Just so I know :))
>>
>>
>> We may have to change this before integration, but for now keep it as is.
>>
>>> I'll look at this on Monday then!
>>
>>
>> Great!
>>
>> /Robbin
>>
>>
>>>
>>> Thanks for the reply and have a great weekend!
>>> Jc
>>>
>>>>
>>>>>
>>>>>> ####
>>>>>> Minor nit, when declaring pointer there is a little mix of having the
>>>>>> pointer adjacent by type name and data name. (Most hotspot code is by
>>>>>> type
>>>>>> name)
>>>>>> E.g.
>>>>>> heapMonitoring.cpp:711 jvmtiStackTrace *trace = ....
>>>>>> heapMonitoring.cpp:733 Method* m = vfst.method();
>>>>>> (not just this file)
>>>>>>
>>>>>
>>>>> Done!
>>>>>
>>>>>> ####
>>>>>> HeapMonitorThreadOnOffTest.java:77
>>>>>> I would make g_tmp volatile, otherwise the assignment in loop may
>>>>>> theoretically be skipped.
>>>>>>
>>>>>
>>>>> Also done!
>>>>
>>>>
>>>>
>>>> Looks good, thanks!
>>>>
>>>> /Robbin
>>>>
>>>>>
>>>>> Thanks again!
>>>>> Jc
>>>>>
>>>>
>>

From thomas.schatzl at oracle.com Tue Jan 30 10:06:14 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 30 Jan 2018 11:06:14 +0100
Subject: PING: RFR: 8194249: SA: G1HeapRegionTable#getByAddress() returns incorrect HeapRegion
In-Reply-To: <03152805-7b8b-b107-cb99-0d2b53c0dabc@oracle.com>
References: <323f09cf-a036-42b7-990c-0e42ad511a8f@gmail.com> <3be47bde-46c1-3ad2-7521-5eecceea5dee@oracle.com> <7332f28e-fe93-56c2-ce44-bff374b83728@gmail.com> <2bc3972c-837f-076e-60a3-a3b019761a97@oracle.com> <122a2fbf-6c0d-6209-28b4-0b03064b65ab@gmail.com> <1efaeb12-720c-e93e-6010-38b3687d3bf9@gmail.com> <03152805-7b8b-b107-cb99-0d2b53c0dabc@oracle.com>
Message-ID: <1517306774.2832.6.camel@oracle.com>

Hi,

On Tue, 2018-01-30 at 07:00 +1000, David Holmes wrote:
> Added in hotspot-gc-dev. Although this is in the SA it is about the
> SA interaction with G1 and so likely needs someone familiar with G1
> to review it.
>
> David
>
> On 28/01/2018 10:41 PM, Yasumasa Suenaga wrote:
> > PING: Could you review it?
> >
> >
> > http://cr.openjdk.java.net/~ysuenaga/JDK-8194249/webrev.01/
> >
> > This webrev has been reviewed by Jini.
> > I need a Reviewer and sponsor.

looks good to me - however there is another (pre-existing) bug: the shift in that code should be a logical shift, not an arithmetic shift. I.e. ">>>" instead of ">>".

I will run the patch through testing and report back in a few hours. Should be okay.

Thanks,
  Thomas

From thomas.schatzl at oracle.com Tue Jan 30 10:44:14 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 30 Jan 2018 11:44:14 +0100
Subject: RFR(M): 8195103: Refactor ReduceInitialCardMarks to not assume all GCs use card marks
In-Reply-To: <5A5DC8EE.9050806@oracle.com>
References: <5A5DC8EE.9050806@oracle.com>
Message-ID: <1517309054.2832.14.camel@oracle.com>

Hi,

On Tue, 2018-01-16 at 10:42 +0100, Erik Österlund wrote:

^^ sorry for being a bit late...
> Hi,
>
> The current interface between the compilers and GC regarding the
> ReduceInitialCardMarks optimization lives in the CollectedHeap.
> However, the optimization is relevant only for collectors with a card
> mark barrier set (CardTableModRefBS). Therefore, this interface ought
> to be moved into CardTableModRef so that code gets less messy when a
> collector does not use card marking. In the process, the
> CollectedHeap::pre_initialize member function was removed (as it was
> only used for initializing ReduceInitialCardMarks).
>
> The optimization needs to check if an object is in young or not.
> This question is now asked to the barrier set rather than the heap.
> For all collectors except G1, this has been implemented by forwarding
> the question to the corresponding heap (inlined member function),
> which is what was done before. For G1, I chose to instead look at
> the card value and see if it is a young card, which should give the
> same answer.

Marking the cards young is done concurrently to the application. So you could get false answers here. However it seems that this is benign, i.e. at most too many objects are pushed into the deferred card mark from what I can see.

However the assert in CardTableModRefBS::flush_deferred_card_mark_barrier() may complain... i.e. at the time when the object is deferred, the result of is_young() may be false, but at the time the deferred card mark is flushed, is_young() will return true.

Note that while this occurrence is not very common, it does happen.

I think this needs to be fixed. Either the mentioned assert, or the is_young() check. The region type is always good btw.

> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8195103
>
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8195103/webrev.00/
>
> Testing: mach5 hs-tier1-5

looks good to me otherwise.
Thanks,
  Thomas

From erik.osterlund at oracle.com Tue Jan 30 13:25:28 2018
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 30 Jan 2018 14:25:28 +0100
Subject: RFR(M): 8195103: Refactor ReduceInitialCardMarks to not assume all GCs use card marks
In-Reply-To: <1517309054.2832.14.camel@oracle.com>
References: <5A5DC8EE.9050806@oracle.com> <1517309054.2832.14.camel@oracle.com>
Message-ID: <5A707248.2050905@oracle.com>

Hi Thomas,

Thanks for the review. :)

On 2018-01-30 11:44, Thomas Schatzl wrote:
> Hi,
>
> On Tue, 2018-01-16 at 10:42 +0100, Erik Österlund wrote:
>
> ^^ sorry for being a bit late...
>
>> Hi,
>>
>> The current interface between the compilers and GC regarding the
>> ReduceInitialCardMarks optimization lives in the CollectedHeap.
>> However, the optimization is relevant only for collectors with a card
>> mark barrier set (CardTableModRefBS). Therefore, this interface ought
>> to be moved into CardTableModRef so that code gets less messy when a
>> collector does not use card marking. In the process, the
>> CollectedHeap::pre_initialize member function was removed (as it was
>> only used for initializing ReduceInitialCardMarks).
>>
>> The optimization needs to check if an object is in young or not.
>> This question is now asked to the barrier set rather than the heap.
>> For all collectors except G1, this has been implemented by forwarding
>> the question to the corresponding heap (inlined member function),
>> which is what was done before. For G1, I chose to instead look at
>> the card value and see if it is a young card, which should give the
>> same answer.
> Marking the cards young is done concurrently to the application. So you
> could get false answers here. However it seems that this is benign,
> i.e. at most too many objects are pushed into the deferred card mark
> from what I can see.
>
> However the assert in
> CardTableModRefBS::flush_deferred_card_mark_barrier() may complain...
> i.e. at the time when the object is deferred, the result of is_young()
> may be false, but at the time the deferred card mark is flushed,
> is_young() will return true.
>
> Note that while this occurrence is not very common, it does happen.
>
> I think this needs to be fixed. Either the mentioned assert, or the
> is_young() check. The region type is always good btw.

We discussed this off-list. There is in fact no such race. The compiler slow-path first allocates new memory (TLAB or not). Then it writes young to all of the cards. Then it contemplates whether performing a card mark is necessary for non-young objects to comply with ReduceInitialCardMarks. So by the time the is_young() question is asked, Thread::current() has written the young value, which is always observable to itself. It might be that a concurrent thread over-writes this value with a monotonic card transition to the very same young value, due to crossing the same card boundary with another allocation. In either case, the young value will always be observed by the thread that performed the allocation if and only if the object then resides in young.

>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8195103
>>
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8195103/webrev.00/
>>
>> Testing: mach5 hs-tier1-5
> looks good to me otherwise.

Thanks Thomas!
/Erik

> Thanks,
>   Thomas

From erik.helin at oracle.com Tue Jan 30 13:50:03 2018
From: erik.helin at oracle.com (Erik Helin)
Date: Tue, 30 Jan 2018 14:50:03 +0100
Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results
In-Reply-To:
References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com> <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com> <727f9bcb-9404-df32-917f-70aca6e31cb6@oracle.com> <4BA32239-D645-4B1A-A58E-502258C8FFB7@amazon.com> <997a766f-9f4b-f192-ae87-03f4c8d2666d@oracle.com> <64793ff3-4805-79f9-4e72-6e469cd511dc@oracle.com>
Message-ID: <15754d34-f3c7-6431-7abf-05214bd6b9d2@oracle.com>

On 01/30/2018 03:07 AM, Hohensee, Paul wrote:
> That's one reviewer who's ok with a short term patch. Anyone else? And,
> any reviewers for said short term patch? :)

Well, the patch is not really complete as it is. The problem is the definitions of the MemoryPoolMXBeans and GarbageCollectorMXBeans, which, as I tried to hint at in my first email, are a mess for G1. The names and implementations of these MemoryPoolMXBeans and GarbageCollectorMXBeans for G1 are very old; G1 has changed a lot since those were implemented (hence my suggestion for finally fixing this).

The issue with your patch is that the MemoryPoolMXBean named "G1 Old Gen" consists of both old and humongous regions (it will also include archive regions). Old regions can be collected by mixed, concurrent and full collections. Humongous regions can be collected by young, mixed or full collections and the concurrent cycle. Archive regions will never be collected.

Your patch will update the pool in the case of a mixed collection collecting old regions or humongous regions, but misses the following cases:
- a young collection collecting humongous regions
- a concurrent cycle collecting humongous regions
- a concurrent cycle collecting old regions

Unfortunately I could not come up with a good way to solve the above without re-designing the pools. I'm not sure about accepting your patch as is, since it might cause even more confusion for users compared to the current (already confusing) situation.

One idea we have discussed is to implement the re-design but also add a flag, -XX:+UseG1LegacyPoolsAndBeans (false by default), to allow for a smoother transition. Would that solution work for you?

Thanks,
Erik

> Thanks,
>
> Paul
>
> *From: *mandy chung
> *Organization: *Oracle Corporation
> *Date: *Monday, January 29, 2018 at 1:41 PM
> *To: *"Hohensee, Paul"
> *Cc: *"serviceability-dev at openjdk.java.net",
> "hotspot-gc-dev at openjdk.java.net"
> *Subject: *Re: RFR (S): 8195115: G1 Old Gen MemoryPool
> CollectionUsage.used values don't reflect mixed GC results
>
> I created JDK-8196362 to look into whether it makes sense to provide
> some categorization to differentiate eden space vs the heap space for
> long-lived objects.
>
> W.r.t. JDK-8195115, I have to defer to the GC team to comment on the
> mixed collection update. If they are okay, I have no objection to
> implement a short-term fix and do the proper G1 memory pools as a
> separate patch.
>
> Mandy
>
> On 1/29/18 1:02 PM, Hohensee, Paul wrote:
>
> We don't use getType, and you guessed correctly: we use the memory
> pool name as an indicator of the specific characteristics of a
> memory pool, in particular eden.
>
> What we want is an indication of long term heap occupancy. We
> calculate it using CollectionUsage for non-eden heap memory pools,
> regardless of collector.
We don?t use JMX notification, rather we > periodically poll CollectionUsage for memory pools with names that > contain ?Old?, ?Tenured?, or ?Survivor?. We get the memory pools > from the GarbageCollectorMXBeans (we don?t care what the collector > names are). For the named memory pools, we sum CollectionUsage.used > and divide by the sum of CollectionUsage.max to get a long term heap > occupancy percentage. We don?t want to include eden because it?s > really just an allocation buffer and not part of the storage for > long-lived objects. I suppose we could use a negative test instead > by using all memory pools with names that don?t include ?Eden?. > > The bug is that the ?G1 Old Gen? memory pool isn?t being updated > when the ?G1 Young Generation? collector runs a mixed collection. As > far as JMX is concerned, that collector only knows about eden and > the survivor space. The patch adds the old gen to the memory pools > it knows about and has mixed collections update the old gen?s > CollectionUsage. > > I managed to get a submit repo run to succeed last week and it found > a problem. I?ve uploaded a new webrev that fixes the failure of the > jtreg test TestMemoryMXBeansAndPoolsPresence.java due to the young > gen collector being expected to know only about eden and the > survivor space. > > http://cr.openjdk.java.net/~phh/8195115/webrev.hs.01/ > > > Waiting on the submit repo to come back with a result on it. > > Thanks, > > Paul > > *From: *mandy chung > > *Organization: *Oracle Corporation > *Date: *Monday, January 29, 2018 at 10:52 AM > *To: *"Hohensee, Paul" > , Erik Helin > , David Holmes > > *Cc: *"serviceability-dev at openjdk.java.net" > > > , > "hotspot-gc-dev at openjdk.java.net" > > > > *Subject: *Re: RFR (S): 8195115: G1 Old Gen MemoryPool > CollectionUsage.used values don't reflect mixed GC results > > On 1/29/18 10:35 AM, mandy chung wrote: > > Thanks for the reply Paul.?? 
Try to understand a little more on > the specific from gc-specific memory pool you depend on. > > On 1/29/18 8:27 AM, Hohensee, Paul wrote: > > A name change would affect Amazon's heap monitoring, and > thus I expect it would affect other users as well. > > As long as there are gc-specific memory pools, we're going > to need to be able to identify them, and right now that's > done via name. > > > MemoryPoolMXBean::getType returns "heap" memory type for > GC-specific memory pools. Are you using this method? Do you > use the name to build in specific characteristic of a memory > pool (e.g. eden vs old gen)? > > > > > All the mxbeans are identified by name, so that's a general > design principle. The only way I can think of to get rid of > name dependency would be to figure out what abstract metrics > users want to monitor and implement them for all collectors. > HeapUsage (instantaneous occupancy) is one, CollectionUsage > (long-lived occupancy) is another, both of these for the > entire heap, not just particular memory pools. > > > The sum of HeapUsage and CollectionUsage of all heap memory > pools was expected to give an incorrect approximation for the > entire heap usage. Are you seeing issue/bug with the sum result? > > > typo: s/an incorrect approximation/an approximation. > > Mandy > > > > Mandy > > > > That said, imo there will always be a demand for the ability > to get collector and memory pool specific details, so I > don't see a way to get around providing named entities.
> > Paul > > *From: *mandy chung > > *Organization: *Oracle Corporation > *Date: *Friday, January 26, 2018 at 2:38 PM > *To: *"Hohensee, Paul" > , Erik Helin > , > David Holmes > > *Cc: *"serviceability-dev at openjdk.java.net" > > > , > "hotspot-gc-dev at openjdk.java.net" > > > > *Subject: *Re: RFR (S): 8195115: G1 Old Gen MemoryPool > CollectionUsage.used values don't reflect mixed GC results > > On 1/25/18 1:04 PM, Hohensee, Paul wrote: > > > > The JMX API spec doesn't specify what the memory pool or > garbage > collector names are, but the current names are > de-facto part of the > API, so if we change the existing > ones, imo a CSR should be filed. > > The names are implementation details but I can see how an application > > might be impacted if they depend on it. CSR approval is not strictly > > necessary while I think filing one to document the change would be > > good. > > Does the name change impact any application you know of? I'm trying to > > understand if any improvement to API is needed so that applications > > don't need to depend on the names.
> > > Mandy > > > > > > > > From thomas.schatzl at oracle.com Tue Jan 30 15:51:50 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 30 Jan 2018 16:51:50 +0100 Subject: PING: RFR: 8194249: SA: G1HeapRegionTable#getByAddress() returns incorrect HeapRegion In-Reply-To: <1517306774.2832.6.camel@oracle.com> References: <323f09cf-a036-42b7-990c-0e42ad511a8f@gmail.com> <3be47bde-46c1-3ad2-7521-5eecceea5dee@oracle.com> <7332f28e-fe93-56c2-ce44-bff374b83728@gmail.com> <2bc3972c-837f-076e-60a3-a3b019761a97@oracle.com> <122a2fbf-6c0d-6209-28b4-0b03064b65ab@gmail.com> <1efaeb12-720c-e93e-6010-38b3687d3bf9@gmail.com> <03152805-7b8b-b107-cb99-0d2b53c0dabc@oracle.com> <1517306774.2832.6.camel@oracle.com> Message-ID: <1517327510.2368.27.camel@oracle.com> Hi all, On Tue, 2018-01-30 at 11:06 +0100, Thomas Schatzl wrote: > Hi, > > On Tue, 2018-01-30 at 07:00 +1000, David Holmes wrote: > > Added in hotspot-gc-dev. Although this is in the SA it is about the > > SA interaction with G1 and so likely needs someone familiar with G1 > > to review it. > > > > David > > > > On 28/01/2018 10:41 PM, Yasumasa Suenaga wrote: > > > PING: Could you review it? > > > > > > > > http://cr.openjdk.java.net/~ysuenaga/JDK-8194249/webrev.01/ > > > > > > This webrev has been reviewed by Jini. > > > I need a Reviewer and sponsor. > > looks good to me - however there is another (pre-existing) bug: the > shift in that code should be a logical shift, not an arithmetic > shift. > > I.e. ">>" instead of ">>>". > > I will run the patch through testing and report back in a few hours. > Should be okay. is good. Do you want to fix the issue with the shift operator too here, or use another CR? Thanks, Thomas From jeremymanson at google.com Tue Jan 30 18:25:27 2018 From: jeremymanson at google.com (Jeremy Manson) Date: Tue, 30 Jan 2018 10:25:27 -0800 Subject: Expected JDK version of when CMS code is suppose to be dropped, is it JDK11? 
In-Reply-To: References: Message-ID: Hi folks, Wessam's question is pretty important to our planning. As many of you know, we were unable to get the G1 collector as it shipped with JDK 8 to work well with our services. We want to try a newer branch. However, it also takes us a lot of work to make a JDK workable for internal use. We have to patch it to work with our systems, and we have to make a whole lot of tests and infrastructure functional. We'd rather not rush to get out a Java version that will stop being supported in a few months. Java 9 has already seen its last release, and we aren't really all that close to making it work for the kinds of systems that actually care about the GC they are using. We are debating whether to try to continue to make Java 9 work, move to Java 10, or to skip both and go with Java 11, which will see long term support. (If we went to 11, we'd continue to experiment with 9 and 10, but we wouldn't push it hard.) In order to switch to G1, we need a release where our services can compare a recent G1 with CMS. If the last release with both of them is Java 10, then we need to work on Java 10, in spite of its September expiration date. If the last release with both of them is Java 11, then we will have a lot more breathing room for the G1 migration, and we can target that instead. So, it would be really helpful if someone could chime in with more information about the timeline for CMS removal. Is it likely to be before the door closes for the Java 11 release? Thanks! Jeremy On Fri, Jan 26, 2018 at 12:36 PM, Wessam Hassanein wrote: > Hi All, > > Per JEP 291( http://openjdk.java.net/jeps/291) the CMS collector was > deprecated but there was no clear date of when it is expected for the CMS > code to be dropped. I see code refactoring in JDK10 JEP 304 ( > http://openjdk.java.net/jeps/304) and I am wondering whether CMS is > planned to be dropped in JDK11 or when it is expected to be dropped? 
> > Thanks, > > Wessam Hassanein > Google GC TL

From paul.su at oracle.com Wed Jan 31 00:07:54 2018 From: paul.su at oracle.com (Paul Su) Date: Tue, 30 Jan 2018 16:07:54 -0800 Subject: Expected JDK version of when CMS code is suppose to be dropped, is it JDK11? In-Reply-To: References: Message-ID: <0B0D105A-125F-4ADF-9704-13BE59F1B655@oracle.com>

Hi Jeremy, Wessam,

At this time, removal of CMS is still under consideration for JDK 11 but unlikely to happen within that timeframe. However, there were significant performance improvements to G1 in JDK 9 and 10, so it may be worthwhile to evaluate those with your use cases. As always, any feedback is welcome and appreciated.

Thanks,
Paul
From yasuenag at gmail.com Wed Jan 31 00:49:52 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 31 Jan 2018 09:49:52 +0900 Subject: PING: RFR: 8194249: SA: G1HeapRegionTable#getByAddress() returns incorrect HeapRegion In-Reply-To: <1517327510.2368.27.camel@oracle.com> References: <323f09cf-a036-42b7-990c-0e42ad511a8f@gmail.com> <3be47bde-46c1-3ad2-7521-5eecceea5dee@oracle.com> <7332f28e-fe93-56c2-ce44-bff374b83728@gmail.com> <2bc3972c-837f-076e-60a3-a3b019761a97@oracle.com> <122a2fbf-6c0d-6209-28b4-0b03064b65ab@gmail.com> <1efaeb12-720c-e93e-6010-38b3687d3bf9@gmail.com> <03152805-7b8b-b107-cb99-0d2b53c0dabc@oracle.com> <1517306774.2832.6.camel@oracle.com> <1517327510.2368.27.camel@oracle.com> Message-ID:

Hi Thomas,

>> looks good to me - however there is another (pre-existing) bug: the
>> shift in that code should be a logical shift, not an arithmetic
>> shift.
>>
>> I.e. ">>" instead of ">>>".
>>
>> I will run the patch through testing and report back in a few hours.
>> Should be okay.
>
> is good. Do you want to fix the issue with the shift operator too
> here, or use another CR?

Thanks!

If the use of ">>>" is the bug, I want to fix it in a new bug ticket. However, I do not think the use of ">>>" is a bug.

In g1BiasedArray.hpp, G1BiasedMappedArray::get_by_address() uses the ">>" operator to calculate biased_index:

  http://hg.openjdk.java.net/jdk/hs/file/ee513596f3ee/src/hotspot/share/gc/g1/g1BiasedArray.hpp#l134

idx_t is defined as size_t, so it is calculated as an unsigned value. In JLS 15.19, ">>>" is the unsigned shift. If we use ">>", the MSB might be preserved in some cases.

  https://docs.oracle.com/javase/specs/jls/se9/html/jls-15.html#jls-15.19

Thus I think this is not a bug.

Thanks,

Yasumasa

2018-01-31 0:51 GMT+09:00 Thomas Schatzl : > Hi all, > > On Tue, 2018-01-30 at 11:06 +0100, Thomas Schatzl wrote: >> Hi, >> >> On Tue, 2018-01-30 at 07:00 +1000, David Holmes wrote: >> > Added in hotspot-gc-dev.
Although this is in the SA it is about the >> > SA interaction with G1 and so likely needs someone familiar with G1 >> > to review it. >> > >> > David >> > >> > On 28/01/2018 10:41 PM, Yasumasa Suenaga wrote: >> > > PING: Could you review it? >> > > >> > > > > http://cr.openjdk.java.net/~ysuenaga/JDK-8194249/webrev.01/ >> > > >> > > This webrev has been reviewed by Jini. >> > > I need a Reviewer and sponsor. >> >> looks good to me - however there is another (pre-existing) bug: the >> shift in that code should be a logical shift, not an arithmetic >> shift. >> >> I.e. ">>" instead of ">>>". >> >> I will run the patch through testing and report back in a few hours. >> Should be okay. > > is good. Do you want to fix the issue with the shift operator too > here, or use another CR? > > Thanks, > Thomas > From jeremymanson at google.com Wed Jan 31 01:15:27 2018 From: jeremymanson at google.com (Jeremy Manson) Date: Tue, 30 Jan 2018 17:15:27 -0800 Subject: Expected JDK version of when CMS code is suppose to be dropped, is it JDK11? In-Reply-To: <0B0D105A-125F-4ADF-9704-13BE59F1B655@oracle.com> References: <0B0D105A-125F-4ADF-9704-13BE59F1B655@oracle.com> Message-ID: Thanks, Paul. We're definitely going to re-evaluate G1 with a more modern JDK, but to do it for real, we need to re-evaluate it alongside something that can run CMS in a production setting. Getting a JDK and our codebase to the point where that is possible at Google is quite a bit of work, so it would be a lot of work for us to rush to do it for 9 or 10 and then start again for 10/11. (The work for 9/10 would help us for 11, but there would be quite a bit of extra work, as well.) In short: "unlikely" is definitely good news for us. :) We'll watch this space for updates, in case that changes. Jeremy On Tue, Jan 30, 2018 at 4:07 PM, Paul Su wrote: > Hi Jeremy, Wessam, > > At this time, removal of CMS is still under consideration for JDK 11 but > unlikely to happen within that timeframe. 
However, there were significant > performance improvements to G1 in JDK 9 and 10, so it may be worthwhile to > evaluate those with your use cases. As always, any feedback is welcome and > appreciated. > > Thanks, > Paul > > > Subject: Re: Expected JDK version of when CMS code is suppose to be > dropped, is it JDK11? > Date: Tue, 30 Jan 2018 10:25:27 -0800 > From: Jeremy Manson > To: Wessam Hassanein > CC: hotspot-gc-dev at openjdk.java.net openjdk.java.net > > > > Hi folks, > > Wessam's question is pretty important to our planning. > > As many of you know, we were unable to get the G1 collector as it shipped > with JDK 8 to work well with our services. We want to try a newer branch. > > However, it also takes us a lot of work to make a JDK workable for > internal use. We have to patch it to work with our systems, and we have to > make a whole lot of tests and infrastructure functional. > > We'd rather not rush to get out a Java version that will stop being > supported in a few months. Java 9 has already seen its last release, and > we aren't really all that close to making it work for the kinds of systems > that actually care about the GC they are using. We are debating whether to > try to continue to make Java 9 work, move to Java 10, or to skip both and > go with Java 11, which will see long term support. > > (If we went to 11, we'd continue to experiment with 9 and 10, but we > wouldn't push it hard.) > > In order to switch to G1, we need a release where our services can compare > a recent G1 with CMS. If the last release with both of them is Java 10, > then we need to work on Java 10, in spite of its September expiration > date. If the last release with both of them is Java 11, then we will have > a lot more breathing room for the G1 migration, and we can target that > instead. > > So, it would be really helpful if someone could chime in with more > information about the timeline for CMS removal. 
Is it likely to be before > the door closes for the Java 11 release? > > Thanks! > > Jeremy > > On Fri, Jan 26, 2018 at 12:36 PM, Wessam Hassanein > wrote: > >> Hi All, >> >> Per JEP 291( http://openjdk.java.net/jeps/291) the CMS collector was >> deprecated but there was no clear date of when it is expected for the CMS >> code to be dropped. I see code refactoring in JDK10 JEP 304 ( >> http://openjdk.java.net/jeps/304) and I am wondering whether CMS is >> planned to be dropped in JDK11 or when it is expected to be dropped? >> >> Thanks, >> >> Wessam Hassanein >> Google GC TL >> >> > >

From hohensee at amazon.com Wed Jan 31 01:30:59 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Wed, 31 Jan 2018 01:30:59 +0000 Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results In-Reply-To: <15754d34-f3c7-6431-7abf-05214bd6b9d2@oracle.com> References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com> <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com> <727f9bcb-9404-df32-917f-70aca6e31cb6@oracle.com> <4BA32239-D645-4B1A-A58E-502258C8FFB7@amazon.com> <997a766f-9f4b-f192-ae87-03f4c8d2666d@oracle.com> <64793ff3-4805-79f9-4e72-6e469cd511dc@oracle.com> <15754d34-f3c7-6431-7abf-05214bd6b9d2@oracle.com> Message-ID:

It's true that my patch doesn't completely solve the larger problem, but it fixes the most immediately important part of it, particularly for JDK8 where current expected behavior is entrenched. If we're going to fix the larger problem, imo we should file another bug/rfe to do it. I'd be happy to fix that one too, once we have a spec.

What do you think of my suggestions? To summarize:

- Keep the existing memory pools and add humongous and archive pools.
- Make the archive pool part of the old gen, and generalize it to all collectors.
- Split humongous regions off from the old gen pool into their own pool. The old gen and humongous pools are disjoint region sets.
- Keep the existing "G1 Young Generation" and "G1 Old Generation" collectors and their associated memory pools (net of this patch). We add the humongous pool to them.
- Add "G1 Full" as an alias of the existing "G1 Old Generation" collector.
- Add the "G1 Young", "G1 Mixed" and "G1 Concurrent Cycle" collectors.
- Set the G1 old gen memory pool max size to -Xmx, the archive space max size to whatever it is, and the rest of the G1 memory pool max sizes to -1 == undefined, as now.

The resulting memory pools:

"G1 Eden Space"
"G1 Survivor Space"
"G1 Old Gen"
"G1 Humongous Space"
"Archive Space"

The resulting collectors and their memory pools:

"G1 Young Generation" (the existing young/mixed collector), "G1 Old Generation"/"G1 Full", "G1 Mixed"
- "G1 Eden Space"
- "G1 Survivor Space"
- "G1 Old Gen"
- "G1 Humongous Space"

"G1 Young"
- "G1 Eden Space"
- "G1 Survivor Space"
- "G1 Humongous Space"

"G1 Concurrent Cycle"
- "G1 Old Gen"
- "G1 Humongous Space"

I'm not religious about the names, but I like my suggestions. :)

The significant addition to my previous email, and an incompatible change, is splitting humongous regions off from the old gen pool. This means that apps that specifically monitor old gen occupancy will no longer see humongous regions. Monitoring apps that just add up info about all a collector's pools won't see a difference. I may be corrected (by Kirk, perhaps), but imo it's not as bad a compatibility issue as one might think, because the type of app that uses a lot of humongous regions isn't all that common. E.g., apps that cache data in the heap in the form of large compressed arrays, and apps with large hashmap bucket list arrays. The heaps such apps use are very often large enough to use 32mb regions, hence need really big objects to actually allocate humongous regions.
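For readers who want to try the monitoring scheme discussed in this thread, the following is a minimal sketch using the standard java.lang.management API. It is an illustration under stated assumptions, not the patch under review and not any poster's actual tooling: the class name LongTermOccupancy and its two helper methods are hypothetical, and for brevity it walks all heap MemoryPoolMXBeans directly instead of collecting them through the GarbageCollectorMXBeans. Pools whose CollectionUsage max is undefined (-1) are left out of the denominator.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class LongTermOccupancy {
    // Name fragments treated as "long-lived" pools, as described in the thread.
    static final List<String> LONG_LIVED = Arrays.asList("Old", "Tenured", "Survivor");

    // True if the pool name contains one of the long-lived name fragments.
    static boolean isLongLivedPool(String poolName) {
        for (String fragment : LONG_LIVED) {
            if (poolName.contains(fragment)) {
                return true;
            }
        }
        return false;
    }

    // Ratio of summed used bytes to summed max bytes; -1.0 if no max is defined.
    static double occupancy(long usedSum, long maxSum) {
        return maxSum > 0 ? (double) usedSum / maxSum : -1.0;
    }

    public static void main(String[] args) {
        // Show which memory pools each collector currently reports.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + " -> " + Arrays.toString(gc.getMemoryPoolNames()));
        }

        long usedSum = 0;
        long maxSum = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP || !isLongLivedPool(pool.getName())) {
                continue;
            }
            // Usage as of the most recent collection; null if the pool does not support it.
            MemoryUsage usage = pool.getCollectionUsage();
            if (usage == null) {
                continue;
            }
            usedSum += usage.getUsed();
            if (usage.getMax() > 0) { // getMax() returns -1 when undefined
                maxSum += usage.getMax();
            }
        }
        System.out.println("long-term occupancy: " + occupancy(usedSum, maxSum));
    }
}
```

Because the selection is purely name-based, a pool rename or a new pool such as the proposed "G1 Humongous Space" would silently change what this kind of monitor reports, which is the compatibility concern raised above.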
Thanks,

Paul

From wessam at google.com Wed Jan 31 01:37:45 2018 From: wessam at google.com (Wessam Hassanein) Date: Tue, 30 Jan 2018 17:37:45 -0800 Subject: Expected JDK version of when CMS code is suppose to be dropped, is it JDK11?
In-Reply-To: References: <0B0D105A-125F-4ADF-9704-13BE59F1B655@oracle.com> Message-ID:

Thanks Paul, we will definitely take both G1 JDK9 & JDK10 improvements in our analysis. We would appreciate you updating the thread when a final decision is made for CMS in JDK11.
URL: From kirk.pepperdine at gmail.com Wed Jan 31 04:49:19 2018 From: kirk.pepperdine at gmail.com (Kirk Pepperdine) Date: Wed, 31 Jan 2018 05:49:19 +0100 Subject: Expected JDK version of when CMS code is suppose to be dropped, is it JDK11? In-Reply-To: References: <0B0D105A-125F-4ADF-9704-13BE59F1B655@oracle.com> Message-ID: Hi Paul, Thank you for the update. I have seen fewer pathological cases where G1 "blows up" but the overall electron budget for G1 is still about 10-20% higher than for CMS as well as the memory budget is about 30% higher. This means there are still a number of cases where we?re forced the use of G1 but CMS is still the better choice. Kind regards, Kirk > On Jan 31, 2018, at 2:15 AM, Jeremy Manson wrote: > > Thanks, Paul. We're definitely going to re-evaluate G1 with a more modern JDK, but to do it for real, we need to re-evaluate it alongside something that can run CMS in a production setting. Getting a JDK and our codebase to the point where that is possible at Google is quite a bit of work, so it would be a lot of work for us to rush to do it for 9 or 10 and then start again for 10/11. > > (The work for 9/10 would help us for 11, but there would be quite a bit of extra work, as well.) > > In short: "unlikely" is definitely good news for us. :) We'll watch this space for updates, in case that changes. > > Jeremy > > On Tue, Jan 30, 2018 at 4:07 PM, Paul Su > wrote: > Hi Jeremy, Wessam, > > At this time, removal of CMS is still under consideration for JDK 11 but unlikely to happen within that timeframe. However, there were significant performance improvements to G1 in JDK 9 and 10, so it may be worthwhile to evaluate those with your use cases. As always, any feedback is welcome and appreciated. > > Thanks, > Paul > >> >> Subject: Re: Expected JDK version of when CMS code is suppose to be dropped, is it JDK11? 
>> Date: Tue, 30 Jan 2018 10:25:27 -0800
>> From: Jeremy Manson
>> To: Wessam Hassanein
>> CC: hotspot-gc-dev at openjdk.java.net
>>
>>
>> Hi folks,
>>
>> Wessam's question is pretty important to our planning.
>>
>> As many of you know, we were unable to get the G1 collector as it shipped with JDK 8 to work well with our services. We want to try a newer branch.
>>
>> However, it also takes us a lot of work to make a JDK workable for internal use. We have to patch it to work with our systems, and we have to make a whole lot of tests and infrastructure functional.
>>
>> We'd rather not rush to get out a Java version that will stop being supported in a few months. Java 9 has already seen its last release, and we aren't really all that close to making it work for the kinds of systems that actually care about the GC they are using. We are debating whether to try to continue to make Java 9 work, move to Java 10, or to skip both and go with Java 11, which will see long term support.
>>
>> (If we went to 11, we'd continue to experiment with 9 and 10, but we wouldn't push it hard.)
>>
>> In order to switch to G1, we need a release where our services can compare a recent G1 with CMS. If the last release with both of them is Java 10, then we need to work on Java 10, in spite of its September expiration date. If the last release with both of them is Java 11, then we will have a lot more breathing room for the G1 migration, and we can target that instead.
>>
>> So, it would be really helpful if someone could chime in with more information about the timeline for CMS removal. Is it likely to be before the door closes for the Java 11 release?
>>
>> Thanks!
>>
>> Jeremy
>>
>> On Fri, Jan 26, 2018 at 12:36 PM, Wessam Hassanein wrote:
>> Hi All,
>>
>> Per JEP 291 (http://openjdk.java.net/jeps/291) the CMS collector was deprecated but there was no clear date of when it is expected for the CMS code to be dropped.
>> I see code refactoring in JDK10 JEP 304 (http://openjdk.java.net/jeps/304) and I am wondering whether CMS is planned to be dropped in JDK11 or when it is expected to be dropped?
>>
>> Thanks,
>>
>> Wessam Hassanein
>> Google GC TL
>>
>>
>
>

From thomas.schatzl at oracle.com Wed Jan 31 08:57:12 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 31 Jan 2018 09:57:12 +0100
Subject: PING: RFR: 8194249: SA: G1HeapRegionTable#getByAddress() returns incorrect HeapRegion
In-Reply-To:
References: <323f09cf-a036-42b7-990c-0e42ad511a8f@gmail.com> <3be47bde-46c1-3ad2-7521-5eecceea5dee@oracle.com> <7332f28e-fe93-56c2-ce44-bff374b83728@gmail.com> <2bc3972c-837f-076e-60a3-a3b019761a97@oracle.com> <122a2fbf-6c0d-6209-28b4-0b03064b65ab@gmail.com> <1efaeb12-720c-e93e-6010-38b3687d3bf9@gmail.com> <03152805-7b8b-b107-cb99-0d2b53c0dabc@oracle.com> <1517306774.2832.6.camel@oracle.com> <1517327510.2368.27.camel@oracle.com>
Message-ID: <1517389032.2352.2.camel@oracle.com>

Hi,

On Wed, 2018-01-31 at 09:49 +0900, Yasumasa Suenaga wrote:
> Hi Thomas,
>
> > > looks good to me - however there is another (pre-existing) bug: the
> > > shift in that code should be a logical shift, not an arithmetic
> > > shift.
> > >
> > > I.e. ">>" instead of ">>>".
> > >
> > > I will run the patch through testing and report back in a few hours.
> > > Should be okay.
> >
> > is good. Do you want to fix the issue with the shift operator too
> > here, or use another CR?
>
> Thanks!
>
> If the use of ">>>" is the bug, I want to fix it in a new bug ticket.
> I do not think the use of ">>>" is a bug.
>
> In g1BiasedArray.hpp, G1BiasedMappedArray::get_by_address() uses the ">>"
> operator to calculate biased_index:
> http://hg.openjdk.java.net/jdk/hs/file/ee513596f3ee/src/hotspot/share/gc/g1/g1BiasedArray.hpp#l134
>
> idx_t is defined as size_t. So it is calculated as unsigned value.
> In JLS 15.19, ">>>" is for unsigned. If we use ">>", it might retain the
> MSB in some cases.
> https://docs.oracle.com/javase/specs/jls/se9/html/jls-15.html#jls-15.19
>
> Thus I think this is not a bug.

You are right, I mixed them up.

I will push the change then, with jgeorge and me as reviewers.

Thomas

From erik.helin at oracle.com Wed Jan 31 11:40:46 2018
From: erik.helin at oracle.com (Erik Helin)
Date: Wed, 31 Jan 2018 12:40:46 +0100
Subject: RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results
In-Reply-To:
References: <3A67BF1B-CE35-4759-B5B4-959C24020A45@amazon.com> <22ceea57-7d27-8350-4457-21f765cb3d0f@oracle.com> <878D6678-84CB-471E-B126-AF14B1BC8EB9@amazon.com> <727f9bcb-9404-df32-917f-70aca6e31cb6@oracle.com> <4BA32239-D645-4B1A-A58E-502258C8FFB7@amazon.com> <997a766f-9f4b-f192-ae87-03f4c8d2666d@oracle.com> <64793ff3-4805-79f9-4e72-6e469cd511dc@oracle.com> <15754d34-f3c7-6431-7abf-05214bd6b9d2@oracle.com>
Message-ID: <93417506-8e12-6ad3-690a-439e36719652@oracle.com>

On 01/31/2018 02:30 AM, Hohensee, Paul wrote:
> It's true that my patch doesn't completely solve the larger problem, but it fixes the most immediately important part of it, particularly for JDK8 where current expected behavior is entrenched.

Yes, your patch fixes part of the problem, but as I said, it can potentially lead to more confusion. I'm not sure that making this behavioral change to a public API in a JDK 8 update release is the right thing. There are likely users that rely on the memory pool "G1 Old Gen" only being updated by a full collection (even though that behavior is not correct); those users will encounter a new behavior in an update release with your patch.

The good thing is that we have very experienced engineers participating in the CSR process who have much more experience than I have in evaluating the impact of behavioral changes such as this one. Would you please file a CSR request for your patch to get their opinion?
See https://wiki.openjdk.java.net/display/csr/Main for more details about CSR.

On 01/31/2018 02:30 AM, Hohensee, Paul wrote:
> If we're going to fix the larger problem, imo we should file another bug/rfe to do it. I'd be happy to fix that one too, once we have a spec.
>
> What do you think of my suggestions? To summarize:
>
> - Keep the existing memory pools and add humongous and archive pools.
> - Make the archive pool part of the old gen, and generalize it to all collectors.
> - Split humongous regions off from the old gen pool into their own pool. The old gen and humongous pools are disjoint region sets.
> - Keep the existing "G1 Young Generation" and "G1 Old Generation" collectors and their associated memory pools (net of this patch). We add the humongous pool to them.
> - Add "G1 Full" as an alias of the existing "G1 Old Generation" collector.
> - Add the "G1 Young", "G1 Mixed" and "G1 Concurrent Cycle" collectors.
> - Set the G1 old gen memory pool max size to -Xmx, the archive space max size to whatever it is, and the rest of the G1 memory pool max sizes to -1 == undefined, as now.
>
> The resulting memory pools:
>
> "G1 Eden Space"
> "G1 Survivor Space"
> "G1 Old Gen"
> "G1 Humongous Space"
> "Archive Space"

The "Space" suffix is unfortunate, but acceptable. I'm least happy about the "Gen" suffix for the "G1 Old Gen", since G1's old regions differ from a generation in the traditional sense as applied to e.g. Serial, Parallel and CMS. I would be happier with a consistent naming scheme in the form of "G1 Old Space" (having only one pool ending in "Gen" raises the question of how it differs from the others ending in "Space"). Again, we could introduce a flag such as -XX:+UseG1LegacyPoolsAndBeans for those who really want the old names.

"Archive Space" should be named "G1 Archive Space" since it differs in implementation from the other collectors.
It would be unfortunate if users thought they could change collector and the "Archive Space" memory pool would keep the same behavior.

> The resulting collectors and their memory pools:
>
> "G1 Young Generation" (the existing young/mixed collector), "G1 Old Generation"/"G1 Full", "G1 Mixed"
> - "G1 Eden Space"
> - "G1 Survivor Space"
> - "G1 Old Gen"
> - "G1 Humongous Space"
> "G1 Young"
> - "G1 Eden Space"
> - "G1 Survivor Space"
> - "G1 Humongous Space"
> "G1 Concurrent Cycle"
> - "G1 Old Gen"
> - "G1 Humongous Space"
>
> I'm not religious about the names, but I like my suggestions. :)

I think it will be confusing for users to have both "G1 Old Generation" and "G1 Full", particularly for tools iterating over all GarbageCollectorMXBeans. There is no way to indicate that a GarbageCollectorMXBean is an alias of another GarbageCollectorMXBean (I thought about such a solution as well).

I guess I don't follow what the GarbageCollectorMXBean "G1 Young Generation" is meant to represent?

> The significant addition to my previous email, and an incompatible change, is splitting humongous regions off from the old gen pool. This means that apps that specifically monitor old gen occupancy will no longer see humongous regions. Monitoring apps that just add up info about all of a collector's pools won't see a difference. I may be corrected (by Kirk, perhaps), but imo it's not as bad a compatibility issue as one might think, because the type of app that uses a lot of humongous regions isn't all that common. E.g., apps that cache data in the heap in the form of large compressed arrays, and apps with large hashmap bucket list arrays. The heaps such apps use are very often large enough to use 32 MB regions, hence need really big objects to actually allocate humongous regions.

So why not enable backwards compatibility by allowing a user to set the flag -XX:+UseG1LegacyPoolsAndBeans? It is not that cumbersome for us to maintain the current definition of memory pools and collectors.
Such a flag allows us to start over and do this right, and a user who relies on the current behavior can get that by just setting a flag. Doing such a change in a major release also allows us to clearly highlight the change in the release notes (users are more prepared for larger changes in a major release, and for the possibility that they might have to add flags to keep old behavior).

It is not uncommon for memory pools to change in major releases. The perm gen pool was removed in JDK 8, the default pools changed when Parallel Old became the default old collector way back in JDK 7 and changed again when G1 became the default collector in JDK 9.

Thanks,
Erik

> Thanks,
>
> Paul
>
> On 1/30/18, 5:51 AM, "Erik Helin" wrote:
>
> On 01/30/2018 03:07 AM, Hohensee, Paul wrote:
> > That's one reviewer who's ok with a short term patch. Anyone else? And,
> > any reviewers for said short term patch? :)
>
> Well, the patch is not really complete as it is. The problem is the
> definitions of the MemoryPoolMXBeans and GarbageCollectorMXBeans, which,
> as I tried to hint at in my first email, are a mess for G1. The names and
> implementations of these MemoryPoolMXBeans and GarbageCollectorMXBeans
> for G1 are very old; G1 has changed a lot since those were implemented
> (hence my suggestion for finally fixing this).
>
> The issue with your patch is that the MemoryPoolMXBean named "G1 Old
> Gen" consists of both old and humongous regions (it will also include
> archive regions). Old regions can be collected by mixed, concurrent and
> full collections. Humongous regions can be collected by young, mixed or
> full collections and the concurrent cycle. Archive regions will never be
> collected.
> Your patch will update the pool in the case of a mixed
> collection collecting old regions or humongous regions, but misses the
> following cases:
> - a young collection collecting humongous regions
> - a concurrent cycle collecting humongous regions
> - a concurrent cycle collecting old regions
>
> Unfortunately I could not come up with a good way to solve the above
> without re-designing the pools. I'm not sure about accepting your patch
> as is, since it might cause even more confusion for users compared to
> the current (already confusing) situation.
>
> One idea we have discussed is to implement the re-design but also add a
> flag, -XX:+UseG1LegacyPoolsAndBeans (false by default), to allow for a
> smoother transition. Would that solution work for you?
>
> Thanks,
> Erik
>
> > Thanks,
> >
> > Paul
> >
> > *From: *mandy chung
> > *Organization: *Oracle Corporation
> > *Date: *Monday, January 29, 2018 at 1:41 PM
> > *To: *"Hohensee, Paul"
> > *Cc: *"serviceability-dev at openjdk.java.net", "hotspot-gc-dev at openjdk.java.net"
> > *Subject: *Re: RFR (S): 8195115: G1 Old Gen MemoryPool
> > CollectionUsage.used values don't reflect mixed GC results
> >
> > I created JDK-8196362 to look into whether it makes sense to provide
> > some categorization to differentiate eden space vs the heap space for
> > long-lived objects.
> >
> > W.r.t. JDK-8195115, I have to defer to the GC team to comment on the
> > mixed collection update. If they are okay, I have no objection to
> > implementing a short-term fix and doing the proper G1 memory pools as a
> > separate patch.
> >
> > Mandy
> >
> > On 1/29/18 1:02 PM, Hohensee, Paul wrote:
> >
> > We don't use getType, and you guessed correctly: we use the memory
> > pool name as an indicator of the specific characteristics of a
> > memory pool, in particular eden.
> >
> > What we want is an indication of long term heap occupancy. We
> > calculate it using CollectionUsage for non-eden heap memory pools,
> > regardless of collector.
> > We don't use JMX notifications; rather, we
> > periodically poll CollectionUsage for memory pools with names that
> > contain "Old", "Tenured", or "Survivor". We get the memory pools
> > from the GarbageCollectorMXBeans (we don't care what the collector
> > names are). For the named memory pools, we sum CollectionUsage.used
> > and divide by the sum of CollectionUsage.max to get a long term heap
> > occupancy percentage. We don't want to include eden because it's
> > really just an allocation buffer and not part of the storage for
> > long-lived objects. I suppose we could use a negative test instead
> > by using all memory pools with names that don't include "Eden".
> >
> > The bug is that the "G1 Old Gen" memory pool isn't being updated
> > when the "G1 Young Generation" collector runs a mixed collection. As
> > far as JMX is concerned, that collector only knows about eden and
> > the survivor space. The patch adds the old gen to the memory pools
> > it knows about and has mixed collections update the old gen's
> > CollectionUsage.
> >
> > I managed to get a submit repo run to succeed last week and it found
> > a problem. I've uploaded a new webrev that fixes the failure of the
> > jtreg test TestMemoryMXBeansAndPoolsPresence.java due to the young
> > gen collector being expected to know only about eden and the
> > survivor space.
> >
> > http://cr.openjdk.java.net/~phh/8195115/webrev.hs.01/
> >
> > Waiting on the submit repo to come back with a result on it.
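
The polling scheme described above can be sketched roughly as follows. This is only an illustration built on the standard java.lang.management API, not the monitoring code referred to in the thread. The class and method names (LongTermOccupancy, isLongLived, ratio, poll) are made up for the example, and skipping pools whose max is undefined (-1) is an assumption, since the thread does not say how those are handled.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; names are illustrative, not from the thread.
class LongTermOccupancy {

    // Name test taken from the description: pools whose names contain
    // "Old", "Tenured", or "Survivor" count as long-lived storage.
    static boolean isLongLived(String poolName) {
        return poolName.contains("Old")
            || poolName.contains("Tenured")
            || poolName.contains("Survivor");
    }

    // Occupancy as a fraction, or -1.0 when no pool reported a usable max.
    static double ratio(long used, long max) {
        return max > 0 ? (double) used / max : -1.0;
    }

    static double poll() {
        // Collect pool names via the collectors, ignoring collector names.
        Set<String> managed = new HashSet<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            managed.addAll(Arrays.asList(gc.getMemoryPoolNames()));
        }
        long used = 0;
        long max = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            if (!managed.contains(name) || !isLongLived(name)) {
                continue;
            }
            // getCollectionUsage() returns null if the pool does not support it.
            MemoryUsage usage = pool.getCollectionUsage();
            if (usage == null) {
                continue;
            }
            used += usage.getUsed();
            // Assumption: a max of -1 (undefined) is left out of the sum.
            if (usage.getMax() > 0) {
                max += usage.getMax();
            }
        }
        return ratio(used, max);
    }

    public static void main(String[] args) {
        System.out.println("long-term occupancy: " + poll());
    }
}
```

With G1 this reads "G1 Old Gen" and "G1 Survivor Space". Since CollectionUsage for "G1 Old Gen" is the value the bug leaves stale after mixed collections, a poller of this shape would keep reporting the old occupancy until the next full collection.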
> >
> > Thanks,
> >
> > Paul
> >
> > *From: *mandy chung
> > *Organization: *Oracle Corporation
> > *Date: *Monday, January 29, 2018 at 10:52 AM
> > *To: *"Hohensee, Paul", Erik Helin, David Holmes
> > *Cc: *"serviceability-dev at openjdk.java.net", "hotspot-gc-dev at openjdk.java.net"
> > *Subject: *Re: RFR (S): 8195115: G1 Old Gen MemoryPool
> > CollectionUsage.used values don't reflect mixed GC results
> >
> > On 1/29/18 10:35 AM, mandy chung wrote:
> >
> > Thanks for the reply, Paul. I'm trying to understand a little more about
> > the specific gc-specific memory pools you depend on.
> >
> > On 1/29/18 8:27 AM, Hohensee, Paul wrote:
> >
> > A name change would affect Amazon's heap monitoring, and
> > thus I expect it would affect other users as well.
> >
> > As long as there are gc-specific memory pools, we're going
> > to need to be able to identify them, and right now that's
> > done via name.
> >
> > MemoryPoolMXBean::getType returns "heap" memory type for
> > GC-specific memory pools. Are you using this method? Do you
> > use the name to build in specific characteristics of a memory
> > pool (e.g. eden vs old gen)?
> >
> > All the mxbeans are identified by name, so that's a general
> > design principle. The only way I can think of to get rid of
> > name dependency would be to figure out what abstract metrics
> > users want to monitor and implement them for all collectors.
> > HeapUsage (instantaneous occupancy) is one, CollectionUsage
> > (long-lived occupancy) is another, both of these for the
> > entire heap, not just particular memory pools.
> >
> > The sum of HeapUsage and CollectionUsage of all heap memory
> > pools was expected to give an incorrect approximation for the
> > entire heap usage. Are you seeing an issue/bug with the sum result?
> >
> > typo: s/an incorrect approximation/an approximation.
> >
> > Mandy
> >
> > Mandy
> >
> > That said, imo there will always be a demand for the ability
> > to get collector and memory pool specific details, so I
> > don't see a way to get around providing named entities.
> >
> > Paul
> >
> > *From: *mandy chung
> > *Organization: *Oracle Corporation
> > *Date: *Friday, January 26, 2018 at 2:38 PM
> > *To: *"Hohensee, Paul", Erik Helin, David Holmes
> > *Cc: *"serviceability-dev at openjdk.java.net", "hotspot-gc-dev at openjdk.java.net"
> > *Subject: *Re: RFR (S): 8195115: G1 Old Gen MemoryPool
> > CollectionUsage.used values don't reflect mixed GC results
> >
> > On 1/25/18 1:04 PM, Hohensee, Paul wrote:
> >
> > > The JMX API spec doesn't specify what the memory pool or garbage
> > > collector names are, but the current names are de-facto part of the
> > > API, so if we change the existing ones, imo a CSR should be filed.
> >
> > The names are implementation details but I can see how an application
> > might be impacted if it depends on them. CSR approval is not strictly
> > necessary, but I think filing one to document the change would be
> > good.
> >
> > Does the name change impact any application you know of? I'm trying to
> > understand if any improvement to the API is needed so that applications
> > don't need to depend on the names.
> >
> > Mandy
> >

From leo.korinth at oracle.com Wed Jan 31 15:58:29 2018
From: leo.korinth at oracle.com (Leo Korinth)
Date: Wed, 31 Jan 2018 16:58:29 +0100
Subject: RFR: 8196341: Add JFR events for parallel phases of G1
Message-ID:

Hi,

I am adding events for the parallel phases of G1. The phases that are covered are most of the values of the GCParPhases enum. Exceptions are: GCWorkerStart, GCWorkerTotal, GCWorkerEnd and ExtRootScan -- these phases overlap more specific phases specified in the enum and are thus omitted.
YoungFreeCSet and NonYoungFreeCSet are represented as one big phase under the name FreeCSet (there would be lots of short phases if they were reported individually). The enum value Other is not reported. One extra phase is reported as BufferedRootProcessing.

This change depends on JDK-8196337, which is out for review, and thus cannot be applied to the tree as-is.

Enhancement: https://bugs.openjdk.java.net/browse/JDK-8196341
Webrev: http://cr.openjdk.java.net/~lkorinth/8196341/00/
Testing: mach5 hs-tier1,hs-tier2

Thanks,
Leo