Very long safepoint pauses for RevokeBias

David Holmes david.holmes at oracle.com
Wed May 20 07:26:07 UTC 2015


On 20/05/2015 11:29 AM, Christopher Berner wrote:
> Yep, we often have a hundred or so threads active. If I read the output
> correctly, it was only waiting on 6 threads which took 13secs and then
> the revocation took 8secs. Is that normal to take even 8secs? Our GC and
> other safe point pauses are typically in the hundreds of milliseconds.

8 seconds seems an extreme amount of time, but it suggests to me that 
that the VMThread couldn't get enough CPU cycles. How many active 
threads and how many processors on the system overall?

13 seconds to reach a safepoint is also extreme.

It is very hard to try and remote diagnoze these things. You really need 
much more fine-grained information about the actual VM operation as it 
executes. :(

David

> Thanks for all the help!
> Christopher
>
> Sent from my phone
>
> On May 19, 2015 1:01 AM, "David Holmes" <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
>     On 19/05/2015 2:24 AM, Christopher Berner wrote:
>
>         Don't think it got truncated, but I only included the first line
>         of log
>         about the safepoint statistics. Turning off biased locks seem to
>         have
>         fixed those pauses, thanks!
>
>         Just to help me understand, what circumstances would cause biased
>         locking to induce 28sec pauses? Is that because a thread was
>         holding the
>         lock for 28sec?
>
>
>     No. Based on the stats you showed the actual revocation is only a
>     fraction of the time spent. It is taking a long time to get the
>     system to a safepoint and then a reasonable amount of time is also
>     being spent on safepoint cleanup tasks.
>
>     Do you have hundreds of active threads?
>
>     David
>     -----
>
>         On Sun, May 17, 2015 at 1:54 PM, David Holmes
>         <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
>         <mailto:david.holmes at oracle.com
>         <mailto:david.holmes at oracle.com>>> wrote:
>
>              Hi Christopher,
>
>
>              On 16/05/2015 10:38 AM, Christopher Berner wrote:
>
>                  I work on the Presto project, which is a distributed
>         SQL engine, and
>                  intermittently (roughly once an hour) I see very long
>                  application stopped
>                  times (as reported by
>                       -XX:+PrintGCApplicationStoppedTime). I enabled
>         safepoint
>                  statistics, and
>                  see that the pause seems to be coming from a RevokeBias
>         safepoint.
>
>                  Any suggestions as to how I can debug this? I already
>         tried adding
>                  -XX:+PerfDisableSharedMem, in case this was related to
>         https://bugs.openjdk.java.net/browse/JDK-8076103
>
>                  See below for safepoint statistics:
>
>                             vmop                    [threads: total
>         initially_running
>                  wait_to_block]    [time: spin block sync cleanup vmop]
>                  page_trap_count
>
>                  2528.893: RevokeBias                       [     872
>                0
>                      6    ]      [     0 12826 13142  6476  8177    ]  0
>
>
>              Your email seems to be truncated?
>
>              But just to be sure, does the application run okay with
>              biased-locking disabled?
>
>              David
>
>


More information about the hotspot-dev mailing list