RFD: 8252768: Fast, asynchronous heap dumps
Laurence Cable
larry.cable at oracle.com
Thu Sep 3 19:07:16 UTC 2020
On 9/3/20 9:03 AM, Volker Simonis wrote:
> Hi,
>
> I'd like to get your opinion on a POC I've done in order to speed up
> heap dumps on Linux:
>
> https://bugs.openjdk.java.net/browse/JDK-8252768
> http://cr.openjdk.java.net/~simonis/webrevs/2020/8252768/
>
> Currently, heap dumps can be taken by the SA tools from a frozen
> process or core file or directly from a running process with jcmd,
> jconsole & JMX, jmap, etc. If the heap of a running process is dumped,
> this happens at a safepoint (see VM_HeapDumper). Because the time to
> produce a heap dump is roughly proportional to the size and fill ratio
> of the heap, this leads to safepoint times which can range from ~100ms
> for a 100 MB heap to ~1s for a 1 GB heap, up to 15s and more for an 8 GB
> heap (measured on my Core i7 laptop with SSD).
>
> One possibility to decrease the safepoint time is to offload the
> dumping work to an asynchronous process. On Linux (and probably any
> other OS which supports fork()) this can be achieved by forking and
> offloading the heap dumping to the child process. Forking still needs
> to happen at a safepoint, but forking is considerably faster compared
> to the dumping process itself. The fork performance is still
> proportional to the size of the original Java process because although
> fork won't copy any memory pages, the kernel still needs to duplicate
> the page table entries of the process.
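
A rough back-of-the-envelope sketch of why fork time still scales with process size: it counts the leaf page table entries the kernel would have to duplicate. The 4 KiB page size and 8-byte entry size are assumptions (typical for x86-64 without huge pages), upper-level tables are ignored, and the function names are made up for illustration.

```python
def page_table_entries(mapped_bytes, page_size=4096):
    """Number of leaf page table entries needed to map `mapped_bytes`.

    Assumes every page is mapped and no huge pages are in use.
    """
    return -(-mapped_bytes // page_size)  # ceiling division


def page_table_bytes(mapped_bytes, page_size=4096, entry_size=8):
    """Approximate size of the leaf page tables fork() has to copy."""
    return page_table_entries(mapped_bytes, page_size) * entry_size


if __name__ == "__main__":
    for label, size in [("100 MiB", 100 * 2**20), ("1 GiB", 2**30), ("8 GiB", 8 * 2**30)]:
        entries = page_table_entries(size)
        kib = page_table_bytes(size) / 1024
        print(f"{label}: {entries} entries, about {kib:.0f} KiB of page tables")
```

Under these assumptions an 8 GiB heap needs about two million entries (16 MiB of page tables), which is far less data than the heap itself but still grows linearly with it.
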
Curious: what is the behavior of the parent/target JVM process after
it executes the fork() at the safepoint? I.e., what does it do next?
> Linux uses a “copy-on-write” technique for the creation of a forked
> child process. This means that right after creation, the child process
> will have exactly the same memory image as its parent process. But
> at the same time, the child process won’t use any additional physical
> memory, as long as it doesn’t change (i.e. write to) its memory.
> Since heap dumping only reads the child process's memory and then
> exits immediately, this technique can be applied even if the Java
> process already uses almost all of the available physical memory.
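
The fork-and-dump idea described above can be sketched in a few lines. This is only an illustration of the copy-on-write snapshot behavior on a fork()-capable OS, not the POC's HotSpot code: `dump_async` and `demo` are hypothetical names, and a `bytearray` stands in for the Java heap. Unlike a native dumper, the Python child also writes to some of its own memory (object bookkeeping), so it is not a purely read-only child.

```python
import os
import tempfile


def dump_async(heap, path):
    """Fork; the child writes a snapshot of `heap` to `path` and exits.

    The parent gets the child's pid back as soon as fork() returns.
    """
    pid = os.fork()
    if pid == 0:
        # Child: sees the parent's memory exactly as it was at fork() time.
        status = 1
        try:
            with open(path, "wb") as f:
                f.write(bytes(heap))
            status = 0
        finally:
            os._exit(status)  # exit without running the parent's cleanup handlers
    return pid


def demo():
    heap = bytearray(b"A" * 1024)   # stand-in for the heap contents
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        pid = dump_async(heap, path)
        heap[:] = b"B" * 1024       # parent keeps mutating after the fork
        _, wait_status = os.waitpid(pid, 0)
        with open(path, "rb") as f:
            dumped = f.read()
    finally:
        os.unlink(path)
    return os.WEXITSTATUS(wait_status), dumped, bytes(heap)


if __name__ == "__main__":
    code, dumped, current = demo()
    print("child exit code:", code)
    print("dump matches pre-fork state:", dumped == b"A" * 1024)
    print("parent state changed:", current == b"B" * 1024)
```

The parent's writes after the fork land on its own copies of the affected pages, so the child's dump reflects the state at fork time regardless of how the two processes are scheduled.
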
>
> The POC I've created (see
> http://cr.openjdk.java.net/~simonis/webrevs/2020/8252768/) decreases
> the aforementioned ~100ms, ~1s and 15s for a 100 MB, 1 GB and 8 GB heap
> to ~3ms, ~15ms and ~60ms on my laptop which I think is significant.
> You can try it out by using the new "-async" or "-async=true" option
> of the "GC.heap_dump" jcmd command.
>
> Of course this change will require a CSR for the additional jcmd
> GC.heap_dump "-async" option which I'll be happy to create if there's
> any interest in this enhancement. Also, logging in the child process
> might potentially interfere with logging in the parent VM and probably
> will have to be removed in the final version, but I've left it in for
> now to better illustrate what's happening. Finally, we can't output
> the size of the created dump any more if we are using asynchronous
> dumping but from my point of view that's not such a big problem. Apart
> from that, the POC works surprisingly well :)
>
> Please let me know what you think and whether there's something I've overlooked.
>
> Best regards,
> Volker
>
> PS: by the way, asynchronous dumping combines just fine with
> compressed dumps. So you can easily use "GC.heap_dump -async=true
> -gz=6"
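
For anyone scripting this, a minimal sketch of assembling the jcmd invocation mentioned in the PS. It only builds the argument list and does not run anything. Note that "-async" exists only in the POC build, and that the helper name, the pid and the file path are placeholders chosen for illustration.

```python
import shlex


def heap_dump_command(pid, dump_file, async_dump=True, gz_level=None):
    """Build the argv for `jcmd <pid> GC.heap_dump [options] <file>`."""
    cmd = ["jcmd", str(pid), "GC.heap_dump"]
    if async_dump:
        cmd.append("-async=true")      # POC-only option, not in a stock JDK
    if gz_level is not None:
        cmd.append(f"-gz={gz_level}")  # compression level for the dump
    cmd.append(dump_file)
    return cmd


if __name__ == "__main__":
    # 1234 and the path are placeholders.
    print(shlex.join(heap_dump_command(1234, "/tmp/heap.hprof.gz", gz_level=6)))
```

The resulting list can be handed to `subprocess.run` on a machine where the POC build's jcmd is on the PATH.
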