Question regarding G1 option to run parallel Old generation garbage collection?

Vitaly Davidovich vitalyd at gmail.com
Fri Oct 26 12:02:54 UTC 2012


Hi Kirk,

Not sure how much I can add given Ramki's excellent replies, but I've never
seen copying be a big problem at the heap sizes I use (~2.5-3g eden).
The reason is that young GC cost, as you know, is a function of the amount
of live data.  If the application satisfies the generational hypothesis and
the heap sizes are properly tuned to ensure that when eden fills up there
aren't many survivors, the copying cost of the remaining few survivors is
very cheap (relatively speaking).  In other words, in a properly tuned app
the copy cost should be minimal.
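To make that concrete, here's a back-of-the-envelope model (my own illustration, nothing from the G1 code): per-GC copy work scales with surviving bytes, so a large eden with a low survival rate can be cheaper to collect than a much smaller eden with a high one.

```java
// Illustrative only: young GC copy cost tracks live (surviving) bytes,
// not eden size. The model and the numbers are hypothetical.
class CopyCostModel {
    // Bytes that must be copied out of eden at a young GC.
    static long survivingBytes(long edenBytes, double survivalRate) {
        return (long) (edenBytes * survivalRate);
    }
}
```

For example, a 3 GB eden with 2% survivors copies roughly 64 MB per young GC, while a 512 MB eden with 40% survivors copies over 200 MB.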

Now, if you did have liveness stats a priori, as Ramki says, you could
potentially do better by not moving regions that are completely full of
survivors and simply treat them as survivor space.  But then you have to
weigh the cost of this bookkeeping both in terms of CPU and extra memory
required.

As for regions that have a mix of survivors and dead objects, you do need
evacuation so that the region can be used for bump allocation; if you
fragment it by not moving survivors, you lose this fast alloc capability.
So in the common case I'd venture you won't find regions completely full of
survivors (even if they're age based) unless the allocation patterns and
heap tuning of the application are just fortunate enough to be spot
on; therefore, you'll almost always have to do some copying.
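To illustrate why evacuation matters here (a minimal sketch, not G1's actual allocator): a fully evacuated region can hand out memory by bumping a single pointer, which is exactly what fragmentation would destroy.

```java
// Minimal bump-pointer allocator over a single region. Illustrative
// sketch only: allocation is a bounds check plus a pointer bump,
// which a fragmented region cannot offer.
class BumpRegion {
    private final long regionSize;
    private long top; // offset of the next free byte

    BumpRegion(long regionSize) { this.regionSize = regionSize; }

    // Returns the start offset of the new block, or -1 if it doesn't fit.
    long allocate(long size) {
        if (top + size > regionSize) return -1;
        long start = top;
        top += size;
        return start;
    }
}
```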

Thanks

Sent from my phone
On Oct 26, 2012 2:48 AM, "Kirk Pepperdine" <kirk at kodewerk.com> wrote:

> Hi Ramki,
>
> Ok, if you put it that way.....  I can see that maybe the copy cost would
> be the worst of the evils. But I can't help thinking that things are less
> random than you might expect. I think this would definitely require some
> experimentation.
>
> Thanks,
> Kirk
>
> On 2012-10-26, at 8:15 AM, Srinivas Ramakrishna <ysr1729 at gmail.com> wrote:
>
> Kirk, young regions are always collected at each collection pause, and
> they are not subject to the global concurrent marking, so you do not
> really know their liveness a priori; you just start evacuating them on
> the fly as you discover reachable objects in them: your classic copying
> collection with no a priori liveness information.
>
> Of course, one could keep track of statistics of where each live object
> came out of when evacuating it, and build
> a post-facto map of the statistical distribution of live objects in
> regions perhaps based on the "age" of those regions (i.e.
> when they were first allocated out of the free region pool). Then, using
> stats of that kind one could perhaps make a good
> guess as to what kind of young region (e.g. the very youngest regions)
> might typically have a very high liveness ratio and then
> use that kind of information to avoid evacuating those regions at all, and
> treating them, e.g. as
> "from" survivor spaces for the next collection. However, I really don't
> see those techniques working for tenuring
> young (survivor) regions into the old generation, unless you are
> enormously lucky and end up with all objects in
> a survivor region having the same age, which is unlikely for typical
> programs. Now, people have played around with the
> idea of segregating objects by age into age-based regions, and perhaps
> something like what you envisage might work in those kinds of cases
> (but again it would have to be based on measured historical statistics
> rather than a priori known liveness information).
>
> I'll let John, Thomas et al. correct my misunderstandings on how current
> G1 works because it's constantly changing and
> my recollections are necessarily vague, being based on hallway discussions
> rather than on ever having worked on the code, and from quite a while ago
> at that, so possibly obsolete.
>
> -- ramki
>
> On Thu, Oct 25, 2012 at 10:46 PM, Kirk Pepperdine <kirk at kodewerk.com> wrote:
>
>> Hi Vitaly,
>>
>> Well, depending on the liveness of the region, I would suggest that you
>> simply don't copy. If you do, copy to a new young gen region. Tenure as per
>> the tenuring threshold. As for fragmentation... with regions being so
>> small, would it really be that big a problem? And if so, would it be a
>> problem for very long? IOWs, are we willing to accept a certain level of
>> fragmentation to reduce copy costs?
>>
>> The problem with the current adaptive sizing policy (AFAICT from various
>> production logs) is that it doesn't account for premature promotion and
>> therefore continuously undersizes survivor spaces. Well, with no survivor
>> spaces, no problem... right? Well, not quite: the tenuring threshold would
>> still have to be calculated in order to have a minimal amount of space
>> retained for new object creation.
>>
>> On 2012-10-25, at 11:11 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>>
>> So if some objects survive young GC and need to be copied so that new
>> allocations are bump the pointer and fragmentation is avoided, where would
>> they be copied to if there's no concept of survivor space (or something
>> akin to it)? You don't want to tenure them prematurely if they ultimately
>> would die fairly soon but you can't keep them in place if you're employing
>> a copying collector.
>>
>> Sent from my phone
>> On Oct 25, 2012 5:04 PM, "Kirk Pepperdine" <kirk at kodewerk.com> wrote:
>>
>>> Not necessarily; IBM doesn't have survivor spaces and they don't promote
>>> objects until they hit a tenuring threshold. They use a hemispheric scheme
>>> in young, which I'm wondering would even be necessary in G1. For example,
>>> objects would get evacuated from a young gen region when they hit a
>>> tenuring threshold. Until then, young gen regions would get collected
>>> based on occupancy just as old gen regions are. Given the weak
>>> generational hypothesis I'm not sure there would be a lot of savings here,
>>> except when you ran into blocks of long-lived objects.
>>>
>>> -- Kirk
>>>
>>> On 2012-10-25, at 10:58 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>>>
>>> Kirk,
>>>
>>> Unless I misunderstood your question, not having survivor spaces means
>>> you need to promote to tenured which may be undesired for non long-lived
>>> objects.  The 2 survivor spaces allow for an object to survive a few young
>>> GCs and then die before promotion (provided its tenuring threshold is not
>>> breached).
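As an aside, the aging-and-promotion mechanics described above can be sketched as a toy model (purely illustrative; names and structure are invented, not HotSpot's implementation):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of two survivor spaces with age-based tenuring.
// Everything here is invented for illustration.
class TenuringModel {
    static final int TENURING_THRESHOLD = 3; // cf. -XX:MaxTenuringThreshold

    static class Obj { int age; boolean reachable = true; }

    // One young GC: reachable objects in 'from' are copied to 'to' and
    // aged; objects reaching the threshold are promoted to 'old' instead.
    static void youngGc(List<Obj> from, List<Obj> to, List<Obj> old) {
        for (Obj o : from) {
            if (!o.reachable) continue; // dead objects are simply not copied
            o.age++;
            if (o.age >= TENURING_THRESHOLD) old.add(o);
            else to.add(o);
        }
        from.clear(); // 'from' is now empty; the two spaces swap roles
    }
}
```

An object that stays reachable bounces between the two spaces, and only on its third survived GC (in this sketch) does it move to the old generation; if it dies before that, it is never promoted at all.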
>>>
>>> Sent from my phone
>>> On Oct 25, 2012 4:43 PM, "Kirk Pepperdine" <kirk at kodewerk.com> wrote:
>>>
>>>> This is a nice explanation... I would think that not necessarily having
>>>> a to/from survivor space would cut back on some copy costs?
>>>>
>>>> On 2012-10-25, at 7:53 PM, Srinivas Ramakrishna <ysr1729 at gmail.com>
>>>> wrote:
>>>>
>>>> Kirk, Think of Eden as the minimum space available for allocation
>>>> before a young GC becomes necessary. Think of a survivor space as the
>>>> minimum space set aside to hold objects surviving in the young generation
>>>> and not being tenured. G1 does take advantage of the fact that you do not
>>>> necessarily need to keep the "To" survivor space in reserve separately, but
>>>> draw from a common pool of free regions. In practice, it might be sensible
>>>> to reuse recently collected Eden regions (can't recall how hard G1 tries to
>>>> do that) because it's possible that some caches are warm, but with today's
>>>> huge young generation sizes, maybe it doesn't make sense to talk about
>>>> cache reuse. In the presence of paging, reusing Eden and survivor pages
>>>> becomes more important to reduce the cost of inadvertently picking a region
>>>> whose physical pages may need to be faulted in because they had been paged
>>>> out or are being touched for the first time. (This may be more important on
>>>> Windows because of its proclivity to evict pages that haven't been touched
>>>> in a while even when there is no virtual memory pressure.)
>>>>
>>>> John might be able to tell us whether or how hard G1 tries to reuse
>>>> Eden/Survivor pages (I had often lobbied for that because AFAIR old G1 code
>>>> did not make any such attempts, but G1 has seen many recent improvements
>>>> since I last looked).
>>>>
>>>> -- ramki
>>>>
>>>> On Fri, Oct 19, 2012 at 12:38 PM, Kirk Pepperdine <kirk at kodewerk.com> wrote:
>>>>
>>>>> Hi Charlie,
>>>>>
>>>>> I was thinking that as long as you're evacuating regions, is there a
>>>>> need to make a distinction between Eden and survivor? They are all just
>>>>> regions in young gen. The distinction seems somewhat artificial.
>>>>>
>>>>> As for your use case... it makes sense on one hand, but on the other I'm
>>>>> wondering if it's akin to calling System.gc()... time will tell, methinks.
>>>>> ;-)
>>>>>
>>>>> Regards,
>>>>> Kirk
>>>>>
>>>>> On 2012-10-19, at 9:28 PM, Charlie Hunt <chunt at salesforce.com> wrote:
>>>>>
>>>>> Perhaps if you're really, really ... really squeezed for available heap
>>>>> space and wanted stuff cleaned from old asap, then
>>>>> InitiatingHeapOccupancyPercent=0 could be justified?
>>>>>
>>>>> Btw, I thought about your question you asked at J1 on "why use
>>>>> survivor spaces with G1?" ... I'll offer an answer, John Cu or Bengt along
>>>>> with Monica are free to offer their thoughts as well.
>>>>>
>>>>> By using survivor spaces you should (and I'd expect that to be the
>>>>> case) reduce the number of concurrent cycles you do.  Without survivor
>>>>> spaces you would likely visit long-lived objects more frequently, as a
>>>>> result of doing more concurrent cycles.  In addition, the total number
>>>>> of different regions you evacuate may be higher without survivor spaces,
>>>>> and you may evacuate the same (live) objects more times.  In short, I
>>>>> would expect that in most cases, with survivor spaces, you end up
>>>>> evacuating each object fewer times and doing fewer concurrent cycles,
>>>>> all of which saves CPU cycles for application threads.  Of course, I'm
>>>>> sure we could write an application where it would be advantageous not to
>>>>> have survivor spaces in G1.  But we could also write one that never
>>>>> needs a concurrent cycle in a G1 heap that has survivor spaces.
>>>>>
>>>>> Thanks again for the question!
>>>>>
>>>>> charlie ...
>>>>>
>>>>> On Oct 19, 2012, at 2:16 PM, Kirk Pepperdine wrote:
>>>>>
>>>>> Thanks Charlie,
>>>>>
>>>>> I only had a cursory look at the source and found the initial
>>>>> calculation but stopped there, figuring someone here would know off the
>>>>> top of their heads. I didn't expect someone to spelunk through the code,
>>>>> so a big thanks for that.
>>>>>
>>>>> Again, I'm struggling to think of a use case for this behaviour.
>>>>>
>>>>> Regards,
>>>>> Kirk
>>>>>
>>>>> On 2012-10-19, at 8:56 PM, Charlie Hunt <chunt at salesforce.com> wrote:
>>>>>
>>>>> Don't mean to jump in front of Monica. :-/   But, she can confirm. ;-)
>>>>>
>>>>> A quick look at the G1 source code suggests that if
>>>>> InitiatingHeapOccupancyPercent=0, the following will happen:
>>>>> - The first minor GC will initiate a concurrent cycle, meaning you'll
>>>>> see a young GC with an initial-mark in the GC log w/ +PrintGCDetails.
>>>>> - Every minor GC thereafter, as long as there is not an active
>>>>> concurrent cycle, will initiate the start of a concurrent cycle.
>>>>> * So, in other words, concurrent cycles will run back to back.
>>>>>  Remember that there needs to be a minor GC to initiate the concurrent
>>>>> cycle, i.e. the initial-mark (with at least one caveat, explained next).
>>>>> So, once a concurrent cycle completes, the next concurrent cycle will
>>>>> not start until the next minor GC, or until a humongous allocation
>>>>> occurs as described next.
>>>>> - If there is a humongous object allocation, a concurrent cycle will
>>>>> be initiated (if InitiatingHeapOccupancyPercent=0). This happens before
>>>>> the humongous allocation itself.
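The behavior Charlie describes can be modeled in a few lines (a hypothetical sketch, not the actual G1 trigger code): with the threshold at zero, every young GC that finds no marking cycle in progress becomes an initial-mark pause, so cycles run back to back.

```java
// Hypothetical model of the InitiatingHeapOccupancyPercent trigger.
// The real G1 logic is more involved; this only shows why a value of
// 0 means back-to-back concurrent cycles.
class IhopTrigger {
    private final int initiatingOccupancyPercent;
    private boolean concurrentCycleActive;

    IhopTrigger(int percent) { this.initiatingOccupancyPercent = percent; }

    void cycleStarted()  { concurrentCycleActive = true; }
    void cycleFinished() { concurrentCycleActive = false; }

    // Asked at each young GC: should this pause also be an initial mark?
    boolean shouldStartConcurrentCycle(long usedBytes, long capacityBytes) {
        if (concurrentCycleActive) return false; // only one cycle at a time
        long occupancyPercent = usedBytes * 100 / capacityBytes;
        return occupancyPercent >= initiatingOccupancyPercent; // 0 => always
    }
}
```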
>>>>>
>>>>> charlie ...
>>>>>
>>>>> On Oct 19, 2012, at 12:58 PM, Kirk Pepperdine wrote:
>>>>>
>>>>> Hi Monica,
>>>>>
>>>>> Can you comment on what a value of 0 means?
>>>>>
>>>>> Regards,
>>>>> Kirk
>>>>>
>>>>> On 2012-10-19, at 2:55 PM, Monica Beckwith <monica.beckwith at oracle.com>
>>>>> wrote:
>>>>>
>>>>>  A couple of quick observations and questions -
>>>>>
>>>>>    1. G1 is officially supported as of 7u4. (There are numerous
>>>>>    performance improvements, so I recommend updating to the latest jdk7
>>>>>    update, if possible.)
>>>>>    2. What do you mean by old gen collection? Are you talking about
>>>>>    MixedGCs?
>>>>>    3. Instead of setting InitiatingHeapOccupancyPercent to zero, have
>>>>>    you tried resizing your young generation?
>>>>>       1. I see the NewRatio, but that fixes the nursery at 640; have
>>>>>       you instead tried a smaller nursery (lower than the default
>>>>>       minimum) using the NewSize option?
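For reference, the 640 figure follows from how NewRatio divides the heap: the young generation gets 1/(NewRatio + 1) of it. A quick sanity check of the arithmetic (my own helper, not a HotSpot API):

```java
// Sanity check of the NewRatio arithmetic: the young generation gets
// 1/(NewRatio + 1) of the heap, and the old generation gets the rest.
class NewRatioMath {
    static long youngGenSizeMb(long heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }
}
```

With -Xmx1280M and -XX:NewRatio=1 this gives a 640 MB nursery, matching the number above.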
>>>>>
>>>>> -Monica
>>>>>
>>>>>
>>>>> On 10/19/2012 12:13 AM, csewhiz wrote:
>>>>>
>>>>>  Hello All,
>>>>>   Sorry for posting this question to this mailing list; I have been
>>>>> unable to find an answer for it. I am trying to tune our application
>>>>> for G1GC as we need very small pauses, below 500 msec.
>>>>>   The problem is that when we run with G1GC (under JDK 6u37), old
>>>>> generation garbage collection only happens when the heap reaches its
>>>>> maximum size. On JDK 6u37 I noticed that if the max heap size is 1GB the
>>>>> pauses are close to 1 sec, and with 2GB close to 2 sec.
>>>>>
>>>>>   Is there any parameter to force old GC to happen regularly?
>>>>>
>>>>> I am trying the following settings:
>>>>>
>>>>> -Xms1280M -Xmx1280M -XX:+UseG1GC -XX:MaxTenuringThreshold=15
>>>>> -XX:SurvivorRatio=8 -XX:NewRatio=1 -XX:GCPauseIntervalMillis=7500
>>>>> -XX:MaxGCPauseMillis=500 -XX:InitiatingHeapOccupancyPercent=0
>>>>> -XX:ParallelGCThreads=7 -XX:ConcGCThreads=7
>>>>>
>>>>> If anyone can give insight into how full GC is triggered internally,
>>>>> it would be of great help.
>>>>>
>>>>> PS: I have tried without any G1-specific options, but that was not of
>>>>> much help; hence this attempt at being aggressive, which has also not
>>>>> helped much.
>>>>>
>>>>>
>>>>> Regards,
>>>>> Soumit
>>>>>
>>>>>
>>>>> --
>>>>> Monica Beckwith | Java Performance Engineer
>>>>> VOIP: +1 512 401 1274
>>>>> Texas
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
>


More information about the hotspot-gc-dev mailing list