Degenerated GC pauses for 5-10 seconds
Alexander Yakushev
alex at bytopia.org
Tue Jan 15 21:11:25 UTC 2019
In the second log, it looks like it happened not during the concurrent
reset, but between the other two phases. From the excerpt (
https://gist.github.com/alexander-yakushev/549a1a600f53ed61249b8ae8184a3971
):
2019-01-07T18:02:57.859+0000: 258123.422: [Concurrent update
referencesUsing 25 of 25 workers for concurrent reference update
49G->50G(55G), 95.928 ms]
Failed to allocate 2048K
Cancelling GC: Allocation Failure
2019-01-07T18:03:03.153+0000: 258128.717: [Pause Final Update RefsUsing 25
of 25 workers for final reference update
, 2.967 ms]
In this case, the delay is not calculated, but there is a 5-second
difference between two timestamps. I lost the full log for this case,
unfortunately.
Best regards,
Alex Yakushev
On Tue, 15 Jan 2019 at 22:42, Aleksey Shipilev <shade at redhat.com> wrote:
> Hi,
>
> On 1/11/19 3:03 PM, Alexander Yakushev wrote:
> > *Setup: *
> > - openjdk-shenandoah-jdk8-latest-linux-x86_64-release from
> > builds.shipilev.net, downloaded on 2018-12-31.
> > - -Xmx55g -Xms55g -XX:+UseShenandoahGC -XX:+AlwaysPreTouch
>
> Ok, this is recent enough.
>
> > // Also, not sure if relevant:
> > -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints
>
> This does not seem relevant.
>
> > *Problem:*
> > Shenandoah does an excellent job dealing with ~4GB/sec allocation rate on
> > a 55GB heap staying under 100ms max pause time and ~500ms max cycle time.
> > However, on particularly nasty spikes, it drops into the degenerated
> mode
> > which causes a 5-10 second STW pause. So far I observed that three times,
> > each with a huge pause. There doesn't seem to be other "normal"
> > degenerated cycles, so I assume every degenerate cycle causes a pause
> like
> > that.
>
> Here's the excerpt, verbatim:
>
> [Concurrent reset
> Using 25 of 25 workers for concurrent reset
> Failed to allocate 2048K
> Cancelling GC: Allocation Failure
> 46G->52G(55G), 11069.097 ms]
>
> So this is not the pause per se. It looks like cancellation triggered when
> "Concurrent reset", the
> first action in the cycle, was running. Why was it running for 11 seconds
> is not clear right now,
> let me look around.
>
> Can you grep your other logs and see if those 5-10 second "pauses" were
> also "Concurrent reset"s?
>
> Cheers,
> -Aleksey
>
>
More information about the shenandoah-dev
mailing list