Rare very big pause

Kirill A. Korinsky kirill at korins.ky
Thu Dec 21 22:30:08 UTC 2017


Hey,

You can find below an output of verbose:gc.

Short summary:

> Pause Final Mark (G)        =     4.59 s (a =     9911 us) (n =   463) (lvls, us =     3633,     5215,     6230,    10547,   329910)
> Pause Final Mark (N)        =     3.74 s (a =     8082 us) (n =   463) (lvls, us =     2969,     4102,     4766,     6504,   328861)
>   Finish Queues             =     0.90 s (a =     1935 us) (n =   463) (lvls, us =      275,      799,      975,     1348,   325530)

I have no idea why :(

> GC STATISTICS:
>   "(G)" (gross) pauses include VM time: time to notify and block threads, do the pre-
>         and post-safepoint housekeeping. Use -XX:+PrintSafepointStatistics to dissect.
>   "(N)" (net) pauses are the times spent in the actual GC code.
>   "a" is average time for each phase, look at levels to see if average makes sense.
>   "lvls" are quantiles: 0% (minimum), 25%, 50% (median), 75%, 100% (maximum).
> 
> Total Pauses (G)            =     8.66 s (a =     4678 us) (n =  1852) (lvls, us =      721,     2891,     3516,     5156,   329914)
> Total Pauses (N)            =     6.15 s (a =     3319 us) (n =  1852) (lvls, us =       87,      170,     2363,     3711,   328864)
> Pause Init Mark (G)         =     2.09 s (a =     4509 us) (n =   463) (lvls, us =     3184,     3828,     3945,     4473,    18191)
> Pause Init Mark (N)         =     1.33 s (a =     2880 us) (n =   463) (lvls, us =     2285,     2832,     2891,     2930,     9342)
>   Accumulate Stats          =     0.02 s (a =       53 us) (n =   463) (lvls, us =       47,       49,       50,       51,      154)
>   Make Parsable             =     0.03 s (a =       73 us) (n =   463) (lvls, us =       62,       69,       71,       74,      143)
>   Clear Liveness            =     0.08 s (a =      181 us) (n =   463) (lvls, us =      145,      172,      176,      180,     1377)
>   Scan Roots                =     1.15 s (a =     2474 us) (n =   463) (lvls, us =     1914,     2441,     2480,     2539,     8814)
>     S: Thread Roots         =     0.30 s (a =      639 us) (n =   463) (lvls, us =      471,      496,      510,      553,     2176)
>     S: String Table Roots   =     0.33 s (a =      713 us) (n =   463) (lvls, us =        0,      850,      865,      891,     2207)
>     S: Universe Roots       =     0.00 s (a =        2 us) (n =   463) (lvls, us =        2,        2,        2,        2,       61)
>     S: JNI Roots            =     0.00 s (a =        6 us) (n =   463) (lvls, us =        4,        5,        5,        6,        8)
>     S: JNI Weak Roots       =     0.02 s (a =       35 us) (n =   463) (lvls, us =        0,       40,       41,       43,      236)
>     S: Synchronizer Roots   =     0.02 s (a =       43 us) (n =   463) (lvls, us =       12,       36,       40,       46,      646)
>     S: Flat Profiler Roots  =     0.00 s (a =       10 us) (n =   463) (lvls, us =        6,       10,       10,       10,       24)
>     S: Management Roots     =     0.00 s (a =        3 us) (n =   463) (lvls, us =        2,        2,        3,        3,       50)
>     S: System Dict Roots    =     0.05 s (a =      104 us) (n =   463) (lvls, us =       15,       16,       17,       21,      501)
>     S: CLDG Roots           =     0.38 s (a =      825 us) (n =   463) (lvls, us =      201,      912,      926,      947,     4637)
>     S: JVMTI Roots          =     0.00 s (a =        1 us) (n =   463) (lvls, us =        1,        1,        1,        1,        2)
>   Resize TLABs              =     0.02 s (a =       34 us) (n =   463) (lvls, us =       29,       31,       33,       35,      134)
> Pause Final Mark (G)        =     4.59 s (a =     9911 us) (n =   463) (lvls, us =     3633,     5215,     6230,    10547,   329910)
> Pause Final Mark (N)        =     3.74 s (a =     8082 us) (n =   463) (lvls, us =     2969,     4102,     4766,     6504,   328861)
>   Finish Queues             =     0.90 s (a =     1935 us) (n =   463) (lvls, us =      275,      799,      975,     1348,   325530)
>   Weak References           =     0.14 s (a =     1549 us) (n =    92) (lvls, us =      908,     1152,     1289,     1504,     7248)
>     Process                 =     0.13 s (a =     1379 us) (n =    92) (lvls, us =      783,      994,     1113,     1367,     7108)
>     Enqueue                 =     0.02 s (a =      166 us) (n =    92) (lvls, us =      104,      146,      160,      174,      577)
>   System Purge              =     1.21 s (a =    13126 us) (n =    92) (lvls, us =    12500,    12891,    12891,    13086,    15472)
>     Unload Classes          =     0.03 s (a =      378 us) (n =    92) (lvls, us =      334,      344,      348,      357,     1531)
>     Parallel Cleanup        =     1.17 s (a =    12738 us) (n =    92) (lvls, us =    11719,    12500,    12695,    12695,    14853)
>       Code Cache            =     0.22 s (a =     2411 us) (n =    92) (lvls, us =     1836,     2324,     2383,     2461,     4410)
>       String/Symbol Tables  =     0.55 s (a =     6027 us) (n =    92) (lvls, us =     5898,     5977,     5996,     6035,     6708)
>       Clean Classes         =     0.39 s (a =     4202 us) (n =    92) (lvls, us =     3750,     4160,     4199,     4219,     4368)
>     CLDG                    =     0.00 s (a =        6 us) (n =    92) (lvls, us =        4,        4,        5,        5,       84)
>   Prepare Evacuation        =     0.21 s (a =      456 us) (n =   463) (lvls, us =      270,      414,      451,      500,      722)
>   Initial Evacuation        =     1.25 s (a =     2710 us) (n =   463) (lvls, us =     1973,     2598,     2676,     2734,     8715)
>     E: Thread Roots         =     0.23 s (a =      492 us) (n =   463) (lvls, us =      166,      457,      471,      500,     2367)
>     E: Code Cache Roots     =     0.92 s (a =     1993 us) (n =   463) (lvls, us =      957,     1914,     1973,     2051,     4452)
> Pause Init  Update Refs (G) =     0.51 s (a =     1092 us) (n =   463) (lvls, us =      719,      865,      891,      949,     7584)
> Pause Init  Update Refs (N) =     0.04 s (a =       96 us) (n =   463) (lvls, us =       85,       90,       92,       95,      170)
> Pause Final Update Refs (G) =     1.48 s (a =     3192 us) (n =   463) (lvls, us =     2500,     3008,     3066,     3145,     9424)
> Pause Final Update Refs (N) =     1.03 s (a =     2214 us) (n =   463) (lvls, us =     1992,     2109,     2148,     2207,     7563)
>   Update Roots              =     0.75 s (a =     1627 us) (n =   463) (lvls, us =     1484,     1543,     1562,     1602,     6927)
>     UR: Thread Roots        =     0.21 s (a =      443 us) (n =   463) (lvls, us =      271,      420,      432,      445,     1406)
>     UR: String Table Roots  =     0.22 s (a =      465 us) (n =   463) (lvls, us =      301,      451,      457,      475,      937)
>     UR: Universe Roots      =     0.00 s (a =        1 us) (n =   463) (lvls, us =        1,        1,        1,        1,       15)
>     UR: JNI Roots           =     0.00 s (a =        3 us) (n =   463) (lvls, us =        2,        3,        3,        3,       35)
>     UR: JNI Weak Roots      =     0.01 s (a =       17 us) (n =   463) (lvls, us =       12,       15,       16,       16,       91)
>     UR: Synchronizer Roots  =     0.02 s (a =       38 us) (n =   463) (lvls, us =        7,       37,       38,       39,       71)
>     UR: Flat Profiler Roots =     0.00 s (a =       10 us) (n =   463) (lvls, us =        2,        9,       10,       10,      118)
>     UR: Management Roots    =     0.00 s (a =        2 us) (n =   463) (lvls, us =        1,        2,        2,        2,       32)
>     UR: System Dict Roots   =     0.01 s (a =       13 us) (n =   463) (lvls, us =        9,       12,       12,       13,       77)
>     UR: CLDG Roots          =     0.24 s (a =      508 us) (n =   463) (lvls, us =      422,      482,      492,      506,     4392)
>     UR: JVMTI Roots         =     0.00 s (a =        1 us) (n =   463) (lvls, us =        1,        1,        1,        1,        2)
>   Recycle                   =     0.25 s (a =      549 us) (n =   463) (lvls, us =      432,      508,      539,      574,      895)
> Concurrent Marking          =   546.57 s (a =  1180492 us) (n =   463) (lvls, us =   474609,   964844,  1230469,  1425781,  2008314)
> Concurrent Precleaning      =     0.07 s (a =      800 us) (n =    92) (lvls, us =      174,      613,      807,      932,     2517)
> Concurrent Evacuation       =    45.53 s (a =    98326 us) (n =   463) (lvls, us =    56445,    76562,    98828,   115234,   272469)
> Concurrent Update Refs      =   334.94 s (a =   723406 us) (n =   463) (lvls, us =   304688,   625000,   757812,   847656,  1095144)
> Concurrent Reset Bitmaps    =     1.83 s (a =     3961 us) (n =   463) (lvls, us =      107,     3184,     3809,     4336,    13480)
> 
> 0 allocation failure and 0 user requested GCs
> 463 successful and 0 degenerated concurrent markings
> 463 successful and 0 degenerated update references


--
wbr, Kirill


> On 19 Dec 2017, at 21:51, Aleksey Shipilev <shade at redhat.com> wrote:
> 
> On 12/19/2017 06:43 PM, Kirill A. Korinsky wrote:
>> Well, it helps decrease pauses, but it is still about 150-200 ms.
>> 
>> For example:
>> 
>> Concurrent marking triggered. Free: 1718M, Free Threshold: 1720M; Allocated: 1718M, Alloc Threshold: 0M
>> 2017-12-19T16:59:02.785+0000: 3055.494: [Pause Init Mark, 2.553 ms]
>> 2017-12-19T16:59:02.788+0000: 3055.496: [Concurrent marking 4418M->4723M(6144M), 916.628 ms]
>> 2017-12-19T16:59:03.710+0000: 3056.419: [Pause Final MarkTotal Garbage: 3555M
>> Immediate Garbage: 1374M, 688 regions (41% of total)
>> Garbage to be collected: 1849M (52% of total), 965 regions
>> Live objects to be evacuated: 80M
>> Live/garbage ratio in collected regions: 4%
>> 4723M->3351M(6144M), 175.951 ms]
>> 2017-12-19T16:59:03.886+0000: 3056.595: [Concurrent evacuation 3352M->3453M(6144M), 71.376 ms]
>> 2017-12-19T16:59:03.959+0000: 3056.667: [Pause Init Update Refs, 0.099 ms]
>> 2017-12-19T16:59:03.959+0000: 3056.667: [Concurrent update references  3453M->3489M(6144M), 578.340 ms]
>> 2017-12-19T16:59:04.538+0000: 3057.246: [Pause Final Update Refs 3489M->1560M(6144M), 2.241 ms]
>> 2017-12-19T16:59:04.540+0000: 3057.249: [Concurrent reset bitmaps 1560M->1560M(6144M), 3.516 ms]
>> Capacity: 6144M, Peak Occupancy: 4723M, Lowest Free: 1420M, Free Threshold: 1228M
>> Adjusting free threshold to: 25% (1536M)
> 
> This one looks like legit non-degraded Final Mark pause. Seeing 175ms for that pause seems odd,
> unless you are running fastdebug builds (in which case you will spend some time zapping the memory).
> With -verbose:gc, there would be the GC stats table after JVM exits, which will dissect that pause
> much better.
> 
> -Aleksey
> 



More information about the shenandoah-dev mailing list