deferred updates? (was Re: value of dense prefix address used?)
Srinivas Ramakrishna
ysr1729 at gmail.com
Thu Mar 1 13:37:12 PST 2012
Turning off maximal compaction seems to have worked to get rid of the
outlier times, just as we had conjectured.
(Charlie may want to add that to the next edition of his book ;-) We'll see
how well it holds up.
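Roughly, the workaround amounts to something like the following (an illustrative sketch, not our exact command line; the interval value is arbitrary and just needs to exceed the number of full GCs the process will ever see, and PrintParallelOldGCPhaseTimes may not be accepted by all product builds):

```shell
# Illustrative sketch: push HeapMaximumCompactionInterval high enough
# that maximal (full) compaction never triggers, and print per-phase
# ParOld times. Flag names are those mentioned elsewhere in this
# thread; verify spelling/availability against your HotSpot build.
# "MyApp.jar" is a placeholder.
java -XX:+UseParallelOldGC \
     -XX:HeapMaximumCompactionInterval=1000000 \
     -XX:+PrintGCDetails \
     -XX:+PrintParallelOldGCPhaseTimes \
     -jar MyApp.jar
```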
On a related note (hence the changed subject line), addressed mainly to John
Coomes (I think, and maybe Jon Masamitsu?):
why do we not update the pointers inside partial objects at the end of a
destination compaction region at the time
that we copy the object, rather than deferring that work until later? Can
something be done about this?
Secondly (this would be moot if we didn't have to defer), even if these
updates must be deferred, why aren't they
done in parallel rather than single-threaded, as they are currently? Am I
missing something, or is this wrinkle merely
an expediency that never got straightened out but is easily fixed? If so,
are there any plans to address this soon?
Certain experiments seem to indicate that this phase is highly variable,
which makes parallel old GC times variable as well. It would
seem at this point that if this were addressed, the variance/outliers in
these times would shrink considerably.
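For concreteness, the "parallelism" figure referred to later in the thread is just (user+sys)/real from the GC timing line; a throwaway sketch with made-up numbers:

```shell
# Sketch: the (user + sys) / real "parallelism" metric, computed for
# hypothetical ParOld timing lines. A well-parallelized collection on
# an N-core box should approach N; the outlier collections show this
# ratio dropping sharply. Inputs are seconds of user, sys, real time.
parallelism() {
  awk -v u="$1" -v s="$2" -v r="$3" 'BEGIN { printf "%.2f\n", (u + s) / r }'
}

parallelism 12.4 0.6 2.0   # healthy collection: ratio 6.50
parallelism 13.8 0.7 9.5   # outlier: ratio 1.53 despite similar work
```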
Question to Charlie: have you seen this trend in any of your performance
explorations?
thanks!
-- ramki
On Sun, Feb 19, 2012 at 2:13 AM, Srinivas Ramakrishna <ysr1729 at gmail.com>wrote:
> Hi Jon --
>
> After looking at the code and the pattern we observed, we are pretty
> confident now that the maximal
> compaction was the root of the problem. We are going to effectively turn
> off the maximal compaction
> and see if it does any harm (don't think it will), and use that to work
> around the problem of extreme degradation
> when doing parallel compaction. It's interesting that maximal compaction
> would degrade parallel compaction
> by so much... some experiments would be useful, and might help pin down the
> specific issue that the lack of initial parallelism
> may be causing that makes the whole collection so much more inefficient.
> Hopefully we'll be able to collect some
> numbers that might help you folks address the issue.
>
> later.
> -- ramki
>
>
> On Fri, Feb 17, 2012 at 12:48 PM, Srinivas Ramakrishna <ysr1729 at gmail.com>wrote:
>
>> Hi Jon, thanks for those suggestions...
>>
>> So far the pattern has not repeated, but it occurred on two different
>> servers (in each case it was the
>> same full gc ordinal too, albeit at a different time). There didn't seem
>> to be anything external that would
>> explain the difference observed. Yes, we'll play around a bit with the
>> compaction related parameters and look at the phase times
>> as well. I am also looking at how the dense prefix address is computed, to
>> see if that sheds a bit of light,
>> but it could also be something that happens early in the life
>> of the process and not later
>> that causes this... it's all a bit of a mystery at the moment.
>> Thanks!
>>
>> -- ramki
>>
>>
>> On Fri, Feb 17, 2012 at 12:10 PM, Jon Masamitsu <jon.masamitsu at oracle.com
>> > wrote:
>>
>>> Ramki,
>>>
>>> I didn't find a product flag that would print the end of the dense
>>> prefix.
>>> Don't know about jmx.
>>>
>>> The phase accounting (PrintParallelOldGCPhaseTimes)
>>> as you say is a good place to start. The summary phase is
>>> serial so look for an increase in that phase. Does this pattern
>>> repeat?
>>>
>>> You could also try changing HeapMaximumCompactionInterval
>>> and see if it affects the pattern.
>>>
>>> Jon
>>>
>>>
>>> On 2/17/2012 9:46 AM, Srinivas Ramakrishna wrote:
>>>
>>> Hi Jo{h}n, all --
>>>
>>> Is there some way I can get at the dense prefix value used for ParOld in
>>> each (major) collection? I couldn't find an obvious product flag for
>>> eliciting that info, but wondered if you knew/remembered.
>>> JMX would be fine too -- as long as the info can be obtained in a product
>>> build.
>>>
>>> I am seeing a curious looking log where one specific major gc seems to have
>>> greater user and real time, lower "parallelism" [=(user+sys)/real] and
>>> takes much longer than the rest of the ParOld's. It
>>> also lowers the footprint a tad more (but definitely not proportionately
>>> more vis-a-vis the user time ratios) than the gc's preceding (but not
>>> succeeding) that one, so one conjecture was that perhaps
>>> something happens with the dense prefix computation at that time and we
>>> somehow end up copying more. We'll see if we can get some data with
>>> printing the ParOld phase times, but I wondered if
>>> we might also be able to elicit the dense prefix address/size. I'll
>>> continue to dig around meanwhile.
>>>
>>> thanks for any info!
>>> -- ramki
>>>
>>>
>>>
>>> _______________________________________________
>>> hotspot-gc-use mailing list
>>> hotspot-gc-use at openjdk.java.net
>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>>
>>>
>>
>