RFR: JDK-8306441: Two phase segmented heap dump [v14]
Thomas Stuefe
stuefe at openjdk.org
Sat Jul 15 15:19:23 UTC 2023
On Mon, 10 Jul 2023 06:07:35 GMT, Yi Yang <yyang at openjdk.org> wrote:
>>> Hi @alexmenkov,
>>>
>>> > It restricts use of parallel dumping only to attach case.
>>>
>>> I'm not sure what you mean. Using handshake won't limit it to only attach cases; it can also be used in other cases like HeapDumpBeforeFullGC. The only difference is that previously, VMThread merged the dump files, and now it's the Attach listener that merges them. For the file merging process, my candidate options are as follows:
>>>
>>> 1. Execute with VMThread outside the safepoint, which will block VMThread from executing other vmoperations.
>>>
>>> 2. Execute with Attach listener thread outside the safepoint, which will block Attach listener from processing requests like jcmd/jstack.
>>>
>>> 3. Create a new temporary thread to execute the file merging outside the safepoint, which will have some resource consumption.
>>>
>>>
>>> I don't have a strong motivation to use 2, and 3 may be a good solution with the availability of virtual threads. If you have any concerns, we can consider using the most conservative option 1 to simplify the review process, and then optimize the file merging process in a follow-up patch.
>>
>> I mean that you can't be sure that you can use attach listener thread.
>> My concerns about attach listener thread are:
>> - AttachListener can be disabled at all or fail to initialize;
>> - attach listener thread may be not yet available when we need to perform heap dump;
>> - need to ensure attach listener thread can't be blocked (for example waiting for next command)
>>
>> I think it makes sense to go option 1 now and add optimizations as follow-up changes (they will be much smaller and easier to review).
>
> Thanks @alexmenkov for the reviews! I added corresponding jtreg for it, also I found verifyHeapDump is duplicated in several tests, I filed https://bugs.openjdk.org/browse/JDK-8311775 as a follow-up test improvement.
>
> @plummercj @kevinjwalls Can I have a second review when you have time? Thanks.
Hi, @y1yang0!
Interesting idea! But I have some questions I could not answer by reading through your description:
I don't understand the performance numbers. I assume they correlate with the Y axis of the diagrams? Are these seconds? Of what, the new solution? Or are these percentages? You summarize with "reduces 71~83% application pause time". How do you reckon that? You also talk about parallel and serial dump. Does that mean parallel = your patch, serial = stock?
Bottom line, unless I missed it, it would be nice to see how your patched VM stacks up against the existing heap dump mechanism to see if the effort is worth it.
The design: I understand that you write physically separate files, right? Why do we then, in the second phase, need an AttachListener thread to merge them? Why a VM operation? Could you please explain the merge process in more detail?
>Pauseless heap dump solution?
>An alternative pauseless solution is to fork a child process, set the parent process heap to read-only, and dump the heap in child process. Once writing happens in parent process, child process observes them by userfaultfd and corresponding pages are prioritized for dumping. I'm also looking forward to hearing comments and discussions about this solution.
I assume the whole complexity about setting the heap read-only, `userfaultfd`, etc. is because you plan to use `vfork(2)`? I would advise against it for many reasons, mainly because getting a bug-free, safe solution will be a pain, and errors will be fiendishly difficult to solve.
That leaves us with a solution based on standard `fork(2)`, which is maybe possible. There are cons, though:
- Since this heapdumper optimization is targeted toward processes with very large heaps, even cloning the page tables may be too much. And an active parent may lead to more COW activity, which you must pay for with CPU and memory. Heapdumps are often done in low-resource situations (e.g. as a last resort for post-crash analysis before killing the JVM due to OOM). Paying the extra footprint for the forked process may be too much.
- I fear unforeseen side effects of process forking. Just one example would be the child JVM inheriting all parent file descriptors. The child would keep them open for as long as it needs to take a heap dump. If there are mechanisms that rely on file descriptors being closed, those will time out. This is certainly solvable, but I am wary of similar effects we don't foresee.
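To make the file descriptor concern concrete, here is a minimal sketch in Python rather than HotSpot code. It assumes a POSIX system; the function name `eof_delay_with_forked_child` and the pipe standing in for "a mechanism that relies on a descriptor being closed" are made up for illustration. The forked child plays the role of the dumping process: as long as it holds the inherited write end, the reader does not see EOF, even though the parent closed its own copy right away.

```python
import os
import time


def eof_delay_with_forked_child(child_hold_seconds: float, close_in_child: bool) -> float:
    """Return how long a reader waits for EOF after the parent closes its write end.

    Hypothetical helper for illustration only; the child stands in for a
    forked process that is busy writing a heap dump.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: it inherited copies of both pipe descriptors.
        try:
            os.close(read_fd)
            if close_in_child:
                # Explicitly drop the inherited descriptor before the long task.
                os.close(write_fd)
            # Stand-in for the time spent dumping the heap.
            time.sleep(child_hold_seconds)
        finally:
            # Leave without running any parent-side cleanup handlers.
            os._exit(0)

    # Parent: close our write end, then wait for EOF on the read end.
    os.close(write_fd)
    start = time.monotonic()
    # Blocks until every copy of the write end is closed, including the child's.
    data = os.read(read_fd, 1)
    elapsed = time.monotonic() - start
    os.close(read_fd)
    os.waitpid(pid, 0)
    assert data == b""  # nothing was written, so this is EOF
    return elapsed


if __name__ == "__main__":
    print("child keeps fd :", eof_delay_with_forked_child(0.5, close_in_child=False))
    print("child closes fd:", eof_delay_with_forked_child(0.5, close_in_child=True))
```

In the first call the reader is stuck for roughly as long as the child runs; in the second it returns almost immediately. Closing everything in the child is the obvious fix, but it only helps for the descriptors we remember to close, which is the kind of effect I am wary of.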
Cheers, Thomas
-------------
PR Comment: https://git.openjdk.org/jdk/pull/13667#issuecomment-1636797625