RFD: 8252768: Fast, asynchronous heap dumps
Laurence Cable
larry.cable at oracle.com
Thu Sep 3 19:38:53 UTC 2020
On 9/3/20 12:36 PM, Thomas Stüfe wrote:
>
>
>
> On Thu, Sep 3, 2020, 21:27 Laurence Cable <larry.cable at oracle.com
> <mailto:larry.cable at oracle.com>> wrote:
>
>
>
> On 9/3/20 12:25 PM, Thomas Stüfe wrote:
>>
>>
>> On Thu, Sep 3, 2020, 21:07 Laurence Cable <larry.cable at oracle.com
>> <mailto:larry.cable at oracle.com>> wrote:
>>
>>
>>
>> On 9/3/20 9:03 AM, Volker Simonis wrote:
>> > Hi,
>> >
>> > I'd like to get your opinion on a POC I've done in order to
>> speed up
>> > heap dumps on Linux:
>> >
>> > https://bugs.openjdk.java.net/browse/JDK-8252768
>> > http://cr.openjdk.java.net/~simonis/webrevs/2020/8252768/
>> >
>> > Currently, heap dumps can be taken by the SA tools from a
>> frozen
>> > process or core file or directly from a running process
>> with jcmd,
>> > jconsole & JMX, jmap, etc. If the heap of a running process
>> is dumped,
>> > this happens at a safepoint (see VM_HeapDumper). Because
>> the time to
>> > produce a heap dump is roughly proportional to the size and
>> fill ratio
>> > of the heap, this leads to safepoint times which can range
>> from ~100ms
>> > for a 100 MB heap to ~1s for a 1 GB heap up to 15s and more
>> for an 8 GB
>> > heap (measured on my Core i7 laptop with SSD).
>> >
>> > One possibility to decrease the safepoint time is to
>> offload the
>> > dumping work to an asynchronous process. On Linux (and
>> probably any
>> > other OS which supports fork()) this can be achieved by
>> forking and
>> > offloading the heap dumping to the child process. Forking
>> still needs
>> > to happen at a safepoint, but forking is considerably
>> faster compared
>> > to the dumping process itself. The fork performance is still
>> > proportional to the size of the original Java process
>> because although
>> > fork won't copy any memory pages, the kernel still needs to
>> duplicate
>> > the page table entries of the process.
>>
>>
>> curious what is the behavior of the parent/target JVM
>> process "after"
>> it executes the fork() at the safepoint? i.e what does it do
>> next?
>>
>>
>> It just continues its life. It will periodically try to reap the
>> child process, but apart from that it will just run on.
> so then the state of the (parent's) heap may change "under" the
> dumping child?
>
> what am I missing?
>>
>
> When forking, the child process gets a *copy* of the parent's address
> space. That is if one were to implement fork() naively. Since copying
> the parent address space would be prohibitively expensive, especially
> for parent processes with a large footprint, modern OSes do a
> copy-on-write.
>
> The child gets a copy of the address space of the parent at the time
> the fork was done. As long as the child only reads a page, and the
> page has not been modified by the parent, this is physically the same
> memory. If either parent or child writes to the page, only then is it
> physically duplicated.
>
> Effectively, Volker uses the OS to get a frozen snapshot of the java
> heap at fork time.
>
thanks for the clarification... I had brain fade on COW semantics...
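
For reference, the snapshot behavior described above can be reproduced with a few lines of Python on any OS that supports fork(). This is only an illustrative sketch, not code from the POC: the name snapshot_demo and the dict standing in for the Java heap are made up for this example. The parent changes its data after the fork, and the child still reports the value it had at fork time.

```python
import os


def snapshot_demo():
    """Fork, mutate state in the parent, and return (parent_view, child_view)."""
    heap = {"counter": 1}              # hypothetical stand-in for the Java heap
    go_r, go_w = os.pipe()             # parent -> child: "I have mutated my copy"
    res_r, res_w = os.pipe()           # child -> parent: value the child observes

    pid = os.fork()                    # child gets a copy-on-write image of `heap`
    if pid == 0:
        # Child: wait until the parent has written to its copy, then report
        # what this process still sees.
        os.close(go_w)
        os.close(res_r)
        os.read(go_r, 1)
        os.write(res_w, str(heap["counter"]).encode())
        os._exit(0)

    # Parent: modify the data after the fork, then let the child look.
    os.close(go_r)
    os.close(res_w)
    heap["counter"] = 2
    os.write(go_w, b"x")
    child_view = int(os.read(res_r, 16).decode())
    os.waitpid(pid, 0)                 # reap the child
    os.close(go_w)
    os.close(res_r)
    return heap["counter"], child_view
```

The parent ends up with 2 while the child reports 1, i.e. the child reads a frozen image from the moment of the fork.
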
>> > Linux uses a “copy-on-write” technique for the creation of
>> a forked
>> > child process. This means that right after creation, the
>> child process
>> > will have exactly the same memory image as its parent
>> process. But
>> > at the same time, the child process won’t use any
>> additional physical
>> > memory, as long as it doesn’t change (i.e. write into) its
>> memory.
>> > Since heap dumping only reads the child process's memory
>> and then
>> > exits immediately, this technique can be applied even if
>> the Java
>> > process already uses almost the whole free physical memory.
>> >
>> > The POC I've created (see
>> > http://cr.openjdk.java.net/~simonis/webrevs/2020/8252768/)
>> decreases
>> > the aforementioned ~100ms, ~1s and 15s for a 100 MB, 1 GB and
>> 8 GB heap
>> > to ~3ms, ~15ms and ~60ms on my laptop which I think is
>> significant.
>> > You can try it out by using the new "-async" or
>> "-async=true" option
>> > of the "GC.heap_dump" jcmd command.
>> >
>> > Of course this change will require a CSR for the additional
>> jcmd
>> > GC.heap_dump "-async" option which I'll be happy to create
>> if there's
>> > any interest in this enhancement. Also, logging in the
>> child process
>> > might potentially interfere with logging in the parent VM
>> and probably
>> > will have to be removed in the final version, but I've left
>> it in for
>> > now to better illustrate what's happening. Finally, we
>> can't output
>> > the size of the created dump any more if we are using
>> asynchronous
>> > dumping but from my point of view that's not such a big
>> problem. Apart
>> > from that, the POC works surprisingly well :)
>> >
>> > Please let me know what you think and if there's something
>> I've overlooked?
>> >
>> > Best regards,
>> > Volker
>> >
>> > PS: by the way, asynchronous dumping combines just fine with
>> > compressed dumps. So you can easily use "GC.heap_dump
>> -async=true
>> > -gz=6"
>>
>
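
The overall pattern discussed in this thread (fork, let the child write the dump and exit, let the parent carry on and reap the child later without blocking) can be sketched as follows. This is a toy model under stated assumptions, not the HotSpot change in the webrev: dump_async, try_reap and run_example are hypothetical names, a dict stands in for the heap, and gzip level 6 merely mirrors the "-gz=6" example above.

```python
import gzip
import os
import tempfile
import time


def dump_async(heap, path):
    """Fork a child that writes a gzip-compressed dump of `heap`; return its pid at once."""
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            with gzip.open(path, "wt", compresslevel=6) as out:
                for key, value in heap.items():
                    out.write(f"{key}={value}\n")
            status = 0
        finally:
            os._exit(status)           # the child must never return into the parent's code
    return pid


def try_reap(pid):
    """Non-blocking reap: exit code of the child, or None if it is still running."""
    done, status = os.waitpid(pid, os.WNOHANG)
    if done == 0:
        return None
    return os.waitstatus_to_exitcode(status)   # requires Python 3.9+


def run_example():
    heap = {"a": 1, "b": 2}
    path = os.path.join(tempfile.mkdtemp(), "heap.dump.gz")
    pid = dump_async(heap, path)
    heap["a"] = 99                     # the parent keeps running and mutating
    code = try_reap(pid)
    while code is None:                # periodic, non-blocking reap attempts
        time.sleep(0.01)
        code = try_reap(pid)
    with gzip.open(path, "rt") as dump:
        return code, dump.read(), heap["a"]
```

Because dump_async returns before the child has written anything, the caller cannot report the dump size at that point, which matches the limitation mentioned above.
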
More information about the serviceability-dev
mailing list