hsperfdata causing long GC/safepoint times: Don't use mmap?
Alex Bagehot
ceeaspb at gmail.com
Wed Apr 1 14:32:09 UTC 2015
Hi Florian,
I looked into this with perf - see the comments eg.
this was on a laptop, just to see if I could capture the event, you might
get a different result on a more production like environment.
there were 2 different stacks - one for safepoint and one for gc itself.
This provided the [kind of] stack we are looking for showing java provoking
mmap activty and then being delayed by it:
java 3634 [002] 4018.313162: sched:sched_stat_iowait: comm=java pid=3641
delay=3233713566 [ns]
7fff810ab390 enqueue_entity ([kernel.kallsyms])
7fff810ab83c enqueue_task_fair ([kernel.kallsyms])
7fff810978da enqueue_task ([kernel.kallsyms])
7fff8109bd43 activate_task ([kernel.kallsyms])
7fff8109c033 ttwu_do_activate.constprop.87 ([kernel.kallsyms])
7fff8109ee99 try_to_wake_up ([kernel.kallsyms])
7fff8109f092 default_wake_function ([kernel.kallsyms])
7fff810b3806 autoremove_wake_function ([kernel.kallsyms])
7fff810b3867 wake_bit_function ([kernel.kallsyms])
7fff810b3028 __wake_up_common ([kernel.kallsyms])
7fff810b327d __wake_up ([kernel.kallsyms])
7fff810b32e1 __wake_up_bit ([kernel.kallsyms])
7fff810b3315 wake_up_bit ([kernel.kallsyms])
7fff812178a7 unlock_buffer ([kernel.kallsyms])
7fff812afeb0 do_get_write_access ([kernel.kallsyms])
7fff812b0297 jbd2_journal_get_write_access ([kernel.kallsyms])
7fff812970bb __ext4_journal_get_write_access ([kernel.kallsyms])
7fff8126a800 ext4_reserve_inode_write ([kernel.kallsyms])
7fff8126a874 ext4_mark_inode_dirty ([kernel.kallsyms])
7fff8126db60 ext4_dirty_inode ([kernel.kallsyms])
7fff81210472 __mark_inode_dirty ([kernel.kallsyms])
7fff811ff301 update_time ([kernel.kallsyms])
7fff811ff500 file_update_time ([kernel.kallsyms])
7fff8126dd75 ext4_page_mkwrite ([kernel.kallsyms])
7fff81199add do_page_mkwrite ([kernel.kallsyms])
7fff8119bda8 do_wp_page ([kernel.kallsyms])
7fff8119e16c handle_mm_fault ([kernel.kallsyms])
7fff810613f4 __do_page_fault ([kernel.kallsyms])
7fff81061811 do_page_fault ([kernel.kallsyms])
7fff817811e8 page_fault ([kernel.kallsyms])
832a1f _ZN10PSScavenge16invoke_no_policyEv
(/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/amd64/server/libjvm.
832b18 _ZN10PSScavenge6invokeEv
(/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/amd64/server/libjvm.so)
7e9203 _ZN20ParallelScavengeHeap19failed_mem_allocateEm
(/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/amd64/s
94c3d4 _ZN29VM_ParallelGCFailedAllocation4doitEv
(/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/amd64/server/l
9508d7 _ZN12VM_Operation8evaluateEv
(/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/amd64/server/libjvm.so)
94f248 _ZN8VMThread18evaluate_operationEP12VM_Operation
(/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/amd64/s
94f6a9 _ZN8VMThread4loopEv
(/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/amd64/server/libjvm.so)
94faf2 _ZN8VMThread3runEv
(/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/amd64/server/libjvm.so)
7c4a42 _ZL10java_startP6Thread
(/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/amd64/server/libjvm.so)
8182 start_thread (/lib/x86_64-linux-gnu/libpthread-2.19.so)
On Wed, Apr 1, 2015 at 2:22 PM, Florian Weimer <fweimer at redhat.com> wrote:
> On 03/26/2015 04:14 PM, Evan Jones wrote:
> > At Twitter, I recently discovered that the hsperfdata file that is
> created
> > by default by the JVM causes long safepoint and GC pauses. It turns out
> > that writes to mmap-ed files can block until disk I/O completes, even if
> > the I/O is to another disk. For details see:
> > http://www.evanjones.ca/jvm-mmap-pause.html
>
> Can you try to get a kernel backtrace when this happens? Preferably on
> ext4 or XFS.
>
> Scheduling a page for writeback probably requires a kernel memory
> allocation, so this behavior isn't too surprising.
>
> --
> Florian Weimer / Red Hat Product Security
>
More information about the hotspot-runtime-dev
mailing list