Very long (nearly 10s, abnormal) "Choose CSet" time in a young collection

chao chu chuchao333 at gmail.com
Thu May 9 16:43:00 UTC 2019


Hi,
We got a weird issue in our prod environment that we observed a nearly 10s
"Choose CSet" time in a young GC (it's GCLocker initiated though, not sure
if this is relevant), while the related "start choosing cset..." and
"finish choosing cset..." log lines happened in the same milli-sec, which
just doesn't make much sense.

Looking at the related G1 code, all related happened in this
`finalize_cset` function: http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot
/file/c16cc0cdcce5/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp#l1949

the reported value in the "Choose CSet" is simply the sum of these two:

phase_times()->record_young_cset_choice_time_ms((young_end_time_sec -
young_start_time_sec) * 1000.0);

...

phase_times()->record_non_young_cset_choice_time_ms((non_young_end_time_sec
- non_young_start_time_sec) * 1000.0);


possibly the jvm process got stuck for some reason before the `double
non_young_end_time_sec = os::elapsedTime();` ?


  ergo_verbose5(ErgoCSetConstruction,
<http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/c16cc0cdcce5/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp#l2146>
               "finish choosing CSet",
<http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/c16cc0cdcce5/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp#l2147>
               ergo_format_region("eden")
<http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/c16cc0cdcce5/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp#l2148>
               ergo_format_region("survivors")
<http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/c16cc0cdcce5/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp#l2149>
               ergo_format_region("old")
<http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/c16cc0cdcce5/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp#l2150>
               ergo_format_ms("predicted pause time")
<http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/c16cc0cdcce5/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp#l2151>
               ergo_format_ms("target pause time"),
<http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/c16cc0cdcce5/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp#l2152>
               eden_region_length, survivor_region_length,
<http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/c16cc0cdcce5/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp#l2153>
               old_cset_region_length(),
<http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/c16cc0cdcce5/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp#l2154>
               predicted_pause_time_ms, target_pause_time_ms);
<http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/c16cc0cdcce5/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp#l2155>
<http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/c16cc0cdcce5/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp#l2156>
 double non_young_end_time_sec = os::elapsedTime();
<http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/c16cc0cdcce5/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp#l2157>
 phase_times()->record_non_young_cset_choice_time_ms((non_young_end_time_sec
- non_young_start_time_sec) * 1000.0);
<http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/c16cc0cdcce5/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp#l2158>
 evacuation_info.set_collectionset_regions(cset_region_length());
<http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/c16cc0cdcce5/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp#l2159>}


>From your experience, what could be the causes for the stuck? any
particular os/host level metrics we should check? (only saw this once and
couldn't re-produce it). Many thanks in advance, any insights are welcome :)


The sample gc log sections below (due to our firm's policy, I couldn't post
the gc log directly to public, sorry about that):


   [Other:   9698.8 ms]
      [Choose CSet:   *9677.5 ms*]
      [Ref Proc:   0.3 ms]
      [Ref Enq:   0.0 ms]
      [Free CSet:   0.3 ms]
   [Eden: 18G(18G)->0B(352M) Survivors: 1232M->1280M Heap:
26.5G(32G)->8175.2M(32G)]
 [Times: user=5 sys=0.09, real=10.55 secs]

Regards,

-- 
ChuChao


More information about the hotspot-dev mailing list