Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred

Fri Jan 11 17:50:23 UTC 2013

Hi Bengt --

Try computing the GC overhead by normalizing wrt the work done (for which
the net allocation volume might be a good proxy). As you state, the
performance numbers will then likely make sense. Of course, they still
won't explain why ParNew does better.  As Vitaly conjectures, the
difference is likely in better object co-location with ParNew's slightly
more DFS-like evacuation compared with DefNew's considerably more BFS-like
evacuation because of the latter's use of a pure Cheney scan compared with
the use of (a) marking stack(s) in the former, as far as i can remember the
code. One way to tell if that accounts for the difference is to measure the
cache-miss rates in the two cases (and may be use a good tool like Solaris
perf analyzer to show you where the misses are coming from as well).

Also curious if you can share the two sets of GC logs, by chance? (specJBB
is a for-fee benchmark so is not freely available to the individual
developer.)

thanks.
-- ramki

On Fri, Jan 11, 2013 at 4:57 AM, Bengt Rutisson
<bengt.rutisson at oracle.com>wrote:

>
> Hi Vitaly,
>
>
> On 1/11/13 1:45 PM, Vitaly Davidovich wrote:
>
> Hi Bengt,
>
> Regarding the benchmark score, are you saying ParNew has longer cumulative
> GC time or just the average is higher? If it's just average, maybe the
> total # of them (and cumulative time) is less.  I don't know the
> characteristics of this particular specjbb benchmark, but perhaps having
> fewer total GCs is better because of the overhead of getting all threads to
> a safe point, going go the OS to suspend them, and then restarting them.
> After they're restarted, the CPU cache may be cold for it because the GC
> thread polluted it.  Or I'm entirely wrong in my speculation ... :).
>
>
> You have a good point about the number of GCs. The problem in my runs is
> that ParNew does more GCs than DefNew. So there are both more of them and
> their average time is higher, but the score is still better. That ParNew
> does more GCs is not that strange. It has a higher score, which means that
> it had higher throughput and had time to create more objects. So, that is
> kind of expected. But I don't understand how it can have higher throughput
> when the GCs take longer. My current guess is that it does something
> differently with how objects are copied in a way that is beneficial for the
> execution time between GCs.
>
> It also seems like ParNew keeps more objects alive for each GC. That is
> either the reason why it does more and more frequent GCs than DefNew, or it
> is an effect of the fact that more objects are created due to the higher
> throughput. This is the reason I started looking at the tenuring threshold.
>
> Bengt
>
>
>  Thanks
>
> Sent from my phone
> On Jan 11, 2013 6:02 AM, "Bengt Rutisson" <bengt.rutisson at oracle.com>
> wrote:
>
>>
>> Hi Ramki,
>>
>> Thanks for looking at this!
>>
>> On 1/10/13 9:28 PM, Srinivas Ramakrishna wrote:
>>
>> Hi Bengt --
>>
>> The change looks reasonable, but I have a comment and a follow-up
>> question.
>>
>> Not your change, but I'd elide the "half the real survivor size" since
>> it's really a configurable parameter based on TargetSurvivorRatio with
>> default half.
>> I'd leave the comment as "set the new tenuring threshold and desired
>> survivor size".
>>
>>
>> I'm fine with removing this from the comment, but I thought the "half the
>> real survivor size" aimed at the fact that we pass only the "to" capacity
>> and not the "from" capacity in to compute_tenuring_threshold(). With that
>> interpretation I think the comment is correct.
>>
>> Would you like me to remove it anyway? Either way is fine with me.
>>
>> I'm curious though, as to what performance data prompted this change,
>>
>> Good point. This change was preceded by an internal discussion in the GC
>> team, so I should probably have explained the background more in my review
>> request to the open.
>>
>> I was comparing the ParNew and DefNew implementation since I am seeing
>> some strange differences in some SPECjbb2005 results. I am running ParNew
>> with a single thread and get much better score than with DefNew. But I also
>> get higher average GC times. So, I was trying to figure out what DefNew and
>> ParNew does differently.
>>
>> When I was looking at DefNewGeneration::collect() and
>> ParNewGeneration::collect() I saw that they contain a whole lot of code
>> duplication. It would be tempting to try to extract the common code out
>> into DefNewGeneration since it is the super class. But there are some minor
>> differences. One of them was this issue with how they handle the tenuring
>> threshold.
>>
>> We tried to figure out if there is a reason for ParNew and DefNew to
>> behave different in this regard. We could not come up with any good reason
>> for that. So, we needed to figure out if we should change ParNew or DefNew
>> to make them consistent. The decision to change ParNew was based on two
>> things. First, it seems wrong to use the data from a collection that got
>> promotion failure. This collection will not have allowed the tenuring
>> threshold to fulfill its purpose. Second, ParallelScavenge works the same
>> way as DefNew.
>>
>> BTW, the difference between DefNew and ParNew seems to have been there
>> from the start. So, there is no bug or changeset in mercurial or TeamWare
>> to explain why the difference was introduced.
>>
>> (Just to be clear, this difference was not the cause of my performance
>> issue. I still don't have a good explanation for how ParNew can have longer
>> GC times but better SPECjbb score.)
>>
>> and whether it might make sense, upon a promotion failure to do something
>> about the tenuring threshold for the next scavenge (i.e. for example make
>> the tenuring threshold half of its current value as a reaction to the fact
>> that promotion failed). Is it currently left at its previous value or is it
>> asjusted back to the default max value (which latter may be the wrong thing
>> to do) or something else?
>>
>>
>> As far as I can tell the tenuring threshold is left untouched if we get a
>> promotion failure. It is probably a good idea to update it in some way. But
>> I would prefer to handle that as a separate bug fix.
>>
>> This change is mostly a small cleanup to make DefNewGeneration::collect()
>> and ParNewGeneration::collect() be more consistent. We've done the thinking
>> so, it's good to make the change in preparation for the next person that
>> comes a long and has a few cycles over and would like to merge the two
>> collect() methods in some way.
>>
>> Thanks again for looking at this!
>> Bengt
>>
>>
>> -- ramki
>>
>> On Thu, Jan 10, 2013 at 1:30 AM, Bengt Rutisson <
>> bengt.rutisson at oracle.com> wrote:
>>
>>>
>>> Hi everyone,
>>>
>>> Could I have a couple of reviews for this small change to make DefNew
>>> and ParNew be more consistent in the way they treat the tenuring threshold:
>>>
>>> http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/
>>>
>>> Thanks,
>>> Bengt
>>>
>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20130111/828168ad/attachment.htm>