RFR(S): 7041879: G1: introduce stress testing parameter to cause frequent evacuation failures
Bengt Rutisson
bengt.rutisson at oracle.com
Wed Aug 22 06:48:23 UTC 2012
Hi John,
On 2012-08-22 00:42, John Cuthbertson wrote:
> Hi Bengt,
>
> Thanks for reviewing these changes. Detailed responses are inline....
Thanks for your comments! With your proposed changes I think it looks
good. Ship it!
Bengt
>
> On 08/16/12 03:48, Bengt Rutisson wrote:
>>
>> Hi John,
>>
>> Overall I think this looks fine. It's definitely a good stress option
>> to have. Thanks for fixing this!
>>
>> Functionally it looks good to me, but I have some style questions:
>>
>> * Why do we add the new methods as inline methods in the
>> g1CollectedHeap.inline.hpp file? These are only debug methods and
>> they are only used from g1CollectedHeap.cpp. I would prefer to place
>> the implementations in g1CollectedHeap.cpp and let the compiler
>> decide whether to inline or not.
>
> I followed the style of PromotionFailureALot. This is speculation...
>
> The code would be included in non-product builds (which includes jvmg,
> fastdebug, and optimized). If we are using fastdebug and/or optimized
> builds of hotspot to chase down a timing sensitive failure then adding
> overhead to the fast path of the copy code might affect our attempts
> to reproduce the failure. I don't think we can assume that all
> compilers will inline routines that are in the same cpp file (IIRC the
> Solaris Studio compiler only does it at certain optimization levels
> _and_ if it has already seen the definition of the callee routine). So
> by declaring the routines as "inline" and placing them in an include
> file, we don't need to rely upon the compiler (and the optimization
> level) to keep the overhead of the code as low as we can. As I said
> though, this is speculation.
>>
>> * In the constructor for G1CollectedHeap I think we could call
>> reset_evacuation_should_fail() instead of these lines:
>>
>> #ifndef PRODUCT
>> _evacuation_failure_alot_for_current_gc = false;
>> _evacuation_failure_alot_count = 0;
>> _evacuation_failure_alot_gc_number = 0;
>> #endif // !PRODUCT
>
> Good observation. Done.
>
>>
>>
>> * G1CollectedHeap::evacuation_should_fail()
>> Why do we need evacuation_should_fail(volatile size_t* count)? It is
>> always called with _evacuation_failure_alot_count as parameter. How
>> about using that directly instead of passing it as a parameter? Also,
>> the nesting level is getting three levels deep. We could flatten that
>> by bailing out early if we are not triggering evacuation failures.
>> So, I think I would prefer evacuation_should_fail() to look something
>> like:
>>
>> bool G1CollectedHeap::evacuation_should_fail() {
>> if (!G1EvacuationFailureALot ||
>> !_evacuation_failure_alot_for_current_gc) {
>> return false;
>> }
>> // Access to _evacuation_failure_alot_count is not atomic; the
>> value does not have to be exact.
>> if (++_evacuation_failure_alot_count >=
>> G1EvacuationFailureALotCount) {
>> _evacuation_failure_alot_count = 0;
>> return true;
>> }
>> return false;
>> }
>>
>
> OK. How about also testing the negative of the second condition:
>
> inline bool
> G1CollectedHeap::evacuation_should_fail() {
> if (!G1EvacuationFailureALot ||
> !_evacuation_failure_alot_for_current_gc) {
> return false;
> }
> // G1EvacuationFailureALot is in effect for current GC
> // Access to _evacuation_failure_alot_count is not atomic;
> // the value does not have to be exact.
> if (++_evacuation_failure_alot_count < G1EvacuationFailureALotCount) {
> return false;
> }
> _evacuation_failure_alot_count = 0;
> return true;
> }
>
> ?
>
>>
>> * G1CollectedHeap::evacuation_failure_alot_for_gc_type() and
>> G1CollectedHeap::set_evacuation_failure_alot_for_current_gc()
>> I am not sure that the method evacuation_failure_alot_for_gc_type()
>> actually makes the code easier to understand. I think I would prefer
>> to inline that logic in set_evacuation_failure_alot_for_current_gc().
>> Something like:
>>
>> void G1CollectedHeap::set_evacuation_failure_alot_for_current_gc() {
>> if (G1EvacuationFailureALot) {
>> // Note we can't assert that _evacuation_failure_alot_for_current_gc
>> // is clear here. It may have been set during a previous GC but
>> that GC
>> // did not copy enough objects (i.e.
>> G1EvacuationFailureALotCount) to
>> // trigger an evacuation failure and clear the flags and and counts.
>> _evacuation_failure_alot_for_current_gc = false;
>>
>> // Check if we have gone over the interval.
>> const size_t gc_num = total_collections();
>> const size_t elapsed_gcs = gc_num -
>> _evacuation_failure_alot_gc_number;
>>
>> if (elapsed_gcs >= G1EvacuationFailureALotInterval) {
>> const bool gcs_are_young = g1_policy()->gcs_are_young();
>> const bool during_im = g1_policy()->during_initial_mark_pause();
>> const bool during_marking = mark_in_progress();
>>
>> if (gcs_are_young) {
>> _evacuation_failure_alot_for_current_gc =
>> G1EvacuationFailureALotDuringYoungGC
>> || (during_marking && G1EvacuationFailureALotDuringConcMark)
>> || (during_im && G1EvacuationFailureALotDuringInitialMark);
>> } else {
>> _evacuation_failure_alot_for_current_gc =
>> G1EvacuationFailureALotDuringMixedGC;
>> }
>> }
>> }
>
> I disagree. The conditional statement above is more complicated IMO.
> In the current version you don't need to know that during_initial_mark
> will be true iff gcs_are_young is true and so you don't need to encode
> that in a nested if. It's sufficient say for an initial mark pause -
> check the value of G1EvacuationFailureALotDuringInitialMark.
>
>>
>> * If we keep any changes in g1CollectedHeap.inline.hpp it should get
>> its copyright year updated
>
> Oops. Thanks. Done.
>
> JohnC
More information about the hotspot-gc-dev
mailing list