Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred

Fri Jan 11 13:05:33 UTC 2013

That's very strange and unexpected indeed.  Perhaps you're right that
ParNew moves objects in a better manner (e.g. preserves locality).

Let us know if you manage to figure it out - I'd be very interested.

Thanks

Sent from my phone
On Jan 11, 2013 7:57 AM, "Bengt Rutisson" <bengt.rutisson at oracle.com> wrote:

>
> Hi Vitaly,
>
> On 1/11/13 1:45 PM, Vitaly Davidovich wrote:
>
> Hi Bengt,
>
> Regarding the benchmark score, are you saying ParNew has longer cumulative
> GC time or just the average is higher? If it's just average, maybe the
> total # of them (and cumulative time) is less.  I don't know the
> characteristics of this particular specjbb benchmark, but perhaps having
> fewer total GCs is better because of the overhead of getting all threads to
> a safe point, going go the OS to suspend them, and then restarting them.
> After they're restarted, the CPU cache may be cold for it because the GC
> thread polluted it.  Or I'm entirely wrong in my speculation ... :).
>
>
> You have a good point about the number of GCs. The problem in my runs is
> that ParNew does more GCs than DefNew. So there are both more of them and
> their average time is higher, but the score is still better. That ParNew
> does more GCs is not that strange. It has a higher score, which means that
> it had higher throughput and had time to create more objects. So, that is
> kind of expected. But I don't understand how it can have higher throughput
> when the GCs take longer. My current guess is that it does something
> differently with how objects are copied in a way that is beneficial for the
> execution time between GCs.
>
> It also seems like ParNew keeps more objects alive for each GC. That is
> either the reason why it does more and more frequent GCs than DefNew, or it
> is an effect of the fact that more objects are created due to the higher
> throughput. This is the reason I started looking at the tenuring threshold.
>
> Bengt
>
>  Thanks
>
> Sent from my phone
> On Jan 11, 2013 6:02 AM, "Bengt Rutisson" <bengt.rutisson at oracle.com>
> wrote:
>
>>
>> Hi Ramki,
>>
>> Thanks for looking at this!
>>
>> On 1/10/13 9:28 PM, Srinivas Ramakrishna wrote:
>>
>> Hi Bengt --
>>
>> The change looks reasonable, but I have a comment and a follow-up
>> question.
>>
>> Not your change, but I'd elide the "half the real survivor size" since
>> it's really a configurable parameter based on TargetSurvivorRatio with
>> default half.
>> I'd leave the comment as "set the new tenuring threshold and desired
>> survivor size".
>>
>>
>> I'm fine with removing this from the comment, but I thought the "half the
>> real survivor size" aimed at the fact that we pass only the "to" capacity
>> and not the "from" capacity in to compute_tenuring_threshold(). With that
>> interpretation I think the comment is correct.
>>
>> Would you like me to remove it anyway? Either way is fine with me.
>>
>> I'm curious though, as to what performance data prompted this change,
>>
>> Good point. This change was preceded by an internal discussion in the GC
>> team, so I should probably have explained the background more in my review
>> request to the open.
>>
>> I was comparing the ParNew and DefNew implementation since I am seeing
>> some strange differences in some SPECjbb2005 results. I am running ParNew
>> with a single thread and get much better score than with DefNew. But I also
>> get higher average GC times. So, I was trying to figure out what DefNew and
>> ParNew does differently.
>>
>> When I was looking at DefNewGeneration::collect() and
>> ParNewGeneration::collect() I saw that they contain a whole lot of code
>> duplication. It would be tempting to try to extract the common code out
>> into DefNewGeneration since it is the super class. But there are some minor
>> differences. One of them was this issue with how they handle the tenuring
>> threshold.
>>
>> We tried to figure out if there is a reason for ParNew and DefNew to
>> behave different in this regard. We could not come up with any good reason
>> for that. So, we needed to figure out if we should change ParNew or DefNew
>> to make them consistent. The decision to change ParNew was based on two
>> things. First, it seems wrong to use the data from a collection that got
>> promotion failure. This collection will not have allowed the tenuring
>> threshold to fulfill its purpose. Second, ParallelScavenge works the same
>> way as DefNew.
>>
>> BTW, the difference between DefNew and ParNew seems to have been there
>> from the start. So, there is no bug or changeset in mercurial or TeamWare
>> to explain why the difference was introduced.
>>
>> (Just to be clear, this difference was not the cause of my performance
>> issue. I still don't have a good explanation for how ParNew can have longer
>> GC times but better SPECjbb score.)
>>
>> and whether it might make sense, upon a promotion failure to do something
>> about the tenuring threshold for the next scavenge (i.e. for example make
>> the tenuring threshold half of its current value as a reaction to the fact
>> that promotion failed). Is it currently left at its previous value or is it
>> asjusted back to the default max value (which latter may be the wrong thing
>> to do) or something else?
>>
>>
>> As far as I can tell the tenuring threshold is left untouched if we get a
>> promotion failure. It is probably a good idea to update it in some way. But
>> I would prefer to handle that as a separate bug fix.
>>
>> This change is mostly a small cleanup to make DefNewGeneration::collect()
>> and ParNewGeneration::collect() be more consistent. We've done the thinking
>> so, it's good to make the change in preparation for the next person that
>> comes a long and has a few cycles over and would like to merge the two
>> collect() methods in some way.
>>
>> Thanks again for looking at this!
>> Bengt
>>
>>
>> -- ramki
>>
>> On Thu, Jan 10, 2013 at 1:30 AM, Bengt Rutisson <
>> bengt.rutisson at oracle.com> wrote:
>>
>>>
>>> Hi everyone,
>>>
>>> Could I have a couple of reviews for this small change to make DefNew
>>> and ParNew be more consistent in the way they treat the tenuring threshold:
>>>
>>> http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/
>>>
>>> Thanks,
>>> Bengt
>>>
>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20130111/5429c562/attachment.htm>