RFR(S): 7041879: G1: introduce stress testing parameter to cause frequent evacuation failures
John Cuthbertson
john.cuthbertson at oracle.com
Tue Aug 21 22:42:04 UTC 2012
Hi Bengt,
Thanks for reviewing these changes. Detailed responses are inline....
On 08/16/12 03:48, Bengt Rutisson wrote:
>
> Hi John,
>
> Overall I think this looks fine. It's definitely a good stress option
> to have. Thanks for fixing this!
>
> Functionally it looks good to me, but I have some style questions:
>
> * Why do we add the new methods as inline methods in the
> g1CollectedHeap.inline.hpp file? These are only debug methods and they
> are only used from g1CollectedHeap.cpp. I would prefer to place the
> implementations in g1CollectedHeap.cpp and let the compiler decide
> whether to inline or not.
I followed the style of PromotionFailureALot. This is speculation...
The code would be included in non-product builds (which includes jvmg,
fastdebug, and optimized). If we are using fastdebug and/or optimized
builds of hotspot to chase down a timing sensitive failure then adding
overhead to the fast path of the copy code might affect our attempts to
reproduce the failure. I don't think we can assume that all compilers
will inline routines that are in the same cpp file (IIRC the Solaris
Studio compiler only does it at certain optimization levels _and_ if it
has already seen the definition of the callee routine). So by declaring
the routines as "inline" and placing them in an include file, we don't
need to rely upon the compiler (and the optimization level) to keep the
overhead of the code as low as we can. As I said though, this is
speculation.
>
> * In the constructor for G1CollectedHeap I think we could call
> reset_evacuation_should_fail() instead of these lines:
>
> #ifndef PRODUCT
> _evacuation_failure_alot_for_current_gc = false;
> _evacuation_failure_alot_count = 0;
> _evacuation_failure_alot_gc_number = 0;
> #endif // !PRODUCT
Good observation. Done.
>
>
> * G1CollectedHeap::evacuation_should_fail()
> Why do we need evacuation_should_fail(volatile size_t* count)? It is
> always called with _evacuation_failure_alot_count as parameter. How
> about using that directly instead of passing it as a parameter? Also,
> the nesting level is getting three levels deep. We could flatten that
> by bailing out early if we are not triggering evacuation failures. So,
> I think I would prefer evacuation_should_fail() to look something like:
>
> bool G1CollectedHeap::evacuation_should_fail() {
> if (!G1EvacuationFailureALot ||
> !_evacuation_failure_alot_for_current_gc) {
> return false;
> }
> // Access to _evacuation_failure_alot_count is not atomic; the value
> does not have to be exact.
> if (++_evacuation_failure_alot_count >= G1EvacuationFailureALotCount) {
> _evacuation_failure_alot_count = 0;
> return true;
> }
> return false;
> }
>
OK. How about also testing the negative of the second condition:
inline bool
G1CollectedHeap::evacuation_should_fail() {
if (!G1EvacuationFailureALot ||
!_evacuation_failure_alot_for_current_gc) {
return false;
}
// G1EvacuationFailureALot is in effect for current GC
// Access to _evacuation_failure_alot_count is not atomic;
// the value does not have to be exact.
if (++_evacuation_failure_alot_count < G1EvacuationFailureALotCount) {
return false;
}
_evacuation_failure_alot_count = 0;
return true;
}
?
>
> * G1CollectedHeap::evacuation_failure_alot_for_gc_type() and
> G1CollectedHeap::set_evacuation_failure_alot_for_current_gc()
> I am not sure that the method evacuation_failure_alot_for_gc_type()
> actually makes the code easier to understand. I think I would prefer
> to inline that logic in set_evacuation_failure_alot_for_current_gc().
> Something like:
>
> void G1CollectedHeap::set_evacuation_failure_alot_for_current_gc() {
> if (G1EvacuationFailureALot) {
> // Note we can't assert that _evacuation_failure_alot_for_current_gc
> // is clear here. It may have been set during a previous GC but
> that GC
> // did not copy enough objects (i.e. G1EvacuationFailureALotCount) to
> // trigger an evacuation failure and clear the flags and and counts.
> _evacuation_failure_alot_for_current_gc = false;
>
> // Check if we have gone over the interval.
> const size_t gc_num = total_collections();
> const size_t elapsed_gcs = gc_num -
> _evacuation_failure_alot_gc_number;
>
> if (elapsed_gcs >= G1EvacuationFailureALotInterval) {
> const bool gcs_are_young = g1_policy()->gcs_are_young();
> const bool during_im = g1_policy()->during_initial_mark_pause();
> const bool during_marking = mark_in_progress();
>
> if (gcs_are_young) {
> _evacuation_failure_alot_for_current_gc =
> G1EvacuationFailureALotDuringYoungGC
> || (during_marking && G1EvacuationFailureALotDuringConcMark)
> || (during_im && G1EvacuationFailureALotDuringInitialMark);
> } else {
> _evacuation_failure_alot_for_current_gc =
> G1EvacuationFailureALotDuringMixedGC;
> }
> }
> }
I disagree. The conditional statement above is more complicated IMO. In
the current version you don't need to know that during_initial_mark will
be true iff gcs_are_young is true and so you don't need to encode that
in a nested if. It's sufficient say for an initial mark pause - check
the value of G1EvacuationFailureALotDuringInitialMark.
>
> * If we keep any changes in g1CollectedHeap.inline.hpp it should get
> its copyright year updated
Oops. Thanks. Done.
JohnC
More information about the hotspot-gc-dev
mailing list