Very slow promotion failures in ParNew / ParallelGC
tprintezis at twitter.com
Wed Jan 13 23:52:59 UTC 2016
On January 13, 2016 at 11:47:40 AM, Thomas Schatzl (thomas.schatzl at oracle.com) wrote:
On Wed, 2016-01-13 at 11:11 -0500, Tony Printezis wrote:
> Thanks for the reply. Inline.
> On January 13, 2016 at 5:08:04 AM, Thomas Schatzl (
> thomas.schatzl at oracle.com) wrote:
> > Hi,
> > On Tue, 2016-01-12 at 13:15 -0500, Tony Printezis wrote:
> > > Thomas,
> > >
> > > Inline.
> > >
> > > On January 12, 2016 at 7:00:45 AM, Thomas Schatzl (
> > > thomas.schatzl at oracle.com) wrote:
> > > >
> > [...]
> > > >
> > > > > The fix is to use a different default mark value when biased
> > > > > locking is enabled (0x5) or disabled (0x1, as it is now).
> > > > > During promotion failures, marks are not preserved if they are
> > > > > equal to the default value, and the mark of forwarded objects
> > > > > is set to the default value post promotion failure and before
> > > > > the preserved marks are re-instated.
> > > >
> > > > You mean the value of the mark as it is set during promotion
> > > > failure for the new objects?
> > > Not sure what you mean by “for new objects”.
> > > Current state: When we encounter promotion failures, we check
> > > whether the mark is the default (0x1). If it is, we don’t preserve
> > > it. If it is not, we preserve it. After promotion failure, we
> > > iterate over the young gen and set the mark of all objects (ParNew)
> > > or all forwarded objects (ParallelGC) to the default (0x1), then
> > > apply all preserved marks.
> > > What I’m proposing is that in the process I just described, the
> > > default mark will be 0x5, if biased locking is enabled (as most
> > > objects will be expected to have a 0x5 mark) and 0x1, if biased
> > > locking is disabled (as it is the case right now).
> > As you mentioned, the default value for new objects is typically
> > not 0x1 when biased locking is enabled, but
> > klass()->prototype_header().
> (OK, I now understand what you meant by “new objects”.) Indeed. But
> that’s not only the case for new objects. I’d guess that most objects
> will retain their initial mark? Maybe?
Hopefully they will :)
> > One other "problem" seems to be that some evacuation failure
> > recovery code unconditionally sets the header of the objects that
> > failed promotion but are not in the preserved headers list to 0x1....
> It’d be hard to do otherwise? You’d have to do a look-up on a table
> to see whether the object’s mark should be set to the default or a
> stored value. I think, assuming that most objects have a default mark
> word, setting the mark word of all (forwarded?) objects in the young
> gen to the default, then applying the (hopefully small number of)
> preserved marks afterwards is not unreasonable.
The description in markOop.hpp indicates that, when biased locking is
enabled, a value of 0x1 means that the object cannot be biased any
more in the future, which hurts future performance after evacuation
failure. So it might not be ideal to unconditionally store 0x1 there.
I never actually suggested to set the mark word to 1 unconditionally. :-)
We may be talking in circles here about the same thing, but one other
thought... one may (here I am not completely sure because the existing
code to determine that is somewhat complicated :)) only need to
preserve marks if they are different from the default value (the
condition in markOopDesc::must_be_preserved() may ultimately just boil
down to this). This obviously needs the gc to compare with the existing
The assumption of your changes that either 0x5 or 0x1 is most common is
just a short-cut to that, but by only putting values into the preserved
mark lists that are non-default (regardless of whether biased locking
is enabled or not), you may get an even lower amount of entries in that
Sure, something to consider. I’d love to get some extensive testing on that change though. :-) (hint, hint)
Now the question to me would be which is more expensive: just assuming
a particular default value at the start (e.g. 0x5 with BiasedLocking
enabled, 0x1 if disabled), or checking whether the current value is
default or not (or using the existing
markOopDesc::must_be_preserved()) and profiting from the reduced amount
of entries later.
Comparing against a constant (or a local field) will most definitely be cheaper (not dereferencing). But, I don’t think that check will have a huge overhead whichever way it’s done. Reducing the number of preserved marks will be more important than optimizing said check. BTW, the check right now is actually quite expensive (checks the UseBiasedLocking flag, etc.). So, we can probably do better whichever way we implement it.
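The scheme discussed above can be illustrated with a small, self-contained C++ sketch. It is not HotSpot code: Mark, Obj, default_mark, preserve_marks and restore_marks are hypothetical names, objects are identified by their index in a vector rather than by address, and the only values taken from the discussion are the 0x5 and 0x1 defaults. The default is chosen once up front, so the per-object check is a compare against a local value with no flag test or dereference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Illustrative stand-ins only; these are NOT the HotSpot types.
using Mark = std::uintptr_t;
constexpr Mark kUnlockedMark = 0x1;  // default when biased locking is disabled
constexpr Mark kBiasableMark = 0x5;  // default when biased locking is enabled

struct Obj {
  Mark mark;
};

// (object index, saved mark) pairs; a real collector would store addresses.
using PreservedList = std::vector<std::pair<std::size_t, Mark>>;

// Choose the default mark once, outside the per-object loop.
Mark default_mark(bool use_biased_locking) {
  return use_biased_locking ? kBiasableMark : kUnlockedMark;
}

// Save only the marks that differ from the chosen default.
PreservedList preserve_marks(const std::vector<Obj>& young, Mark dflt) {
  PreservedList preserved;
  for (std::size_t i = 0; i < young.size(); ++i) {
    if (young[i].mark != dflt) {
      preserved.emplace_back(i, young[i].mark);
    }
  }
  return preserved;
}

// After the failure: reset every mark to the default, then re-apply
// the (hopefully few) preserved marks.
void restore_marks(std::vector<Obj>& young, Mark dflt,
                   const PreservedList& preserved) {
  for (Obj& o : young) {
    o.mark = dflt;
  }
  for (const auto& entry : preserved) {
    young[entry.first].mark = entry.second;
  }
}
```

In this model, if most objects in the young gen still carry 0x5, then passing default_mark(true) leaves only the objects with other marks in the list, whereas comparing against 0x1 puts every one of them in it.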
> FWIW, it’d be nice if we could completely avoid self-forwarding (and
> a lot of those problems will just go away…).
... and adds new problems :)
Of course. But something to consider in the long-term.
> > > > That [disabling biased locking] may be an option in some cases
> > > > in addition to these suggested changes.
> > > Not sure what you mean.
> > In some cases, a "fix" to long promotion failure times might be to
> > disable biased locking - because biased locking may not even be
> > advantageous in some cases due to its own overhead.
> Well, if biased locking doesn’t pay off for an application (and we do
> have evidence that biased locking might not pay off for our
> services), then I assume a lot of classes will end up being unbiased
> and their prototype header set to 0x1 which might prevent the high
> amount of marks being preserved issue.
Without biased locking, no object header will be marked as biased :)
> > > > A larger segment size may be a better trade-off for current,
> > > > larger applications though.
> > > Is there any way to auto-tune the segment size? So, the larger
> > > the stack grows, the larger the segment size?
> > Could be done, however is not implemented yet. And of course the
> > basic promotion failure handling code is very different between the
> > collectors. Volunteers welcome :]
> I factored out some of the logic to a PreservedMarks class which can
> be re-used by all GCs to somewhat cut down on the code replication...
There is already OopAndOopMark or so in G1 btw that might be used
Way ahead of you. ;-)
> > > Should I create a new CR per GC (ParNew and ParallelGC) for the
> > > per-worker preserved mark stacks and we’ll take it from there?
> > Please do.
> JDK-8146989 and JDK-8146991. I’ll post a webrev for the first one
> later today.
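For readers who want a concrete picture of per-worker preserved mark stacks, here is a minimal C++ sketch. It is an assumption-laden illustration, not the code in the webrev: PreservedMarksSketch, PreservedMarksSetSketch, SavedMark and all method names are hypothetical, and Obj / Mark are simplified stand-ins for the real object and mark word types. Each worker pushes only onto its own stack, so in this sketch the push path needs no locking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-ins; NOT the HotSpot types.
using Mark = std::uintptr_t;

struct Obj {
  Mark mark;
};

struct SavedMark {
  Obj* obj;
  Mark mark;
};

// One instance per GC worker (hypothetical class).
class PreservedMarksSketch {
 public:
  explicit PreservedMarksSketch(Mark dflt) : default_(dflt) {}

  // Record the mark only if it differs from the default.
  void push_if_necessary(Obj* obj, Mark m) {
    if (m != default_) {
      stack_.push_back(SavedMark{obj, m});
    }
  }

  // Re-instate all saved marks and empty the stack.
  void restore() {
    for (const SavedMark& s : stack_) {
      s.obj->mark = s.mark;
    }
    stack_.clear();
  }

  std::size_t size() const { return stack_.size(); }

 private:
  Mark default_;
  std::vector<SavedMark> stack_;
};

// Owns one stack per worker (hypothetical class).
class PreservedMarksSetSketch {
 public:
  PreservedMarksSetSketch(std::size_t workers, Mark dflt)
      : per_worker_(workers, PreservedMarksSketch(dflt)) {}

  PreservedMarksSketch& get(std::size_t worker_id) {
    return per_worker_[worker_id];
  }

  // Called once after the promotion failure has been handled.
  void restore_all() {
    for (PreservedMarksSketch& pm : per_worker_) {
      pm.restore();
    }
  }

  std::size_t total() const {
    std::size_t n = 0;
    for (const PreservedMarksSketch& pm : per_worker_) {
      n += pm.size();
    }
    return n;
  }

 private:
  std::vector<PreservedMarksSketch> per_worker_;
};
```

A collector would call get(worker_id).push_if_necessary(obj, mark) before overwriting a mark, and restore_all() once the young gen marks have been reset to the default.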
Tony Printezis | JVM/GC Engineer / VM Team | Twitter
tprintezis at twitter.com