RFR: JDK-8306441: Segmented heap dump [v5]

Yi Yang yyang at openjdk.org
Mon May 15 02:19:45 UTC 2023


On Wed, 10 May 2023 08:29:43 GMT, Yi Yang <yyang at openjdk.org> wrote:

>> Hi, a heap dump pauses application execution (STW), which is a well-known pain point. JDK-8252842 added parallel support to heap dump in an attempt to alleviate this issue. However, all concurrent threads compete to write heap data to the same file, and more memory is required to maintain the concurrent buffer queue. In our experiments, we did not see a significant performance improvement from it.
>> 
>> The minor-pause solution, which is presented in this PR, is a two-stage segmented heap dump:
>> 
>> 1. Stage One (STW): Concurrent threads write data directly to multiple heap files.
>> 2. Stage Two (Non-STW): Merge the multiple heap files into one complete heap dump file.
>> 
>> Concurrent worker threads no longer need to maintain a buffer queue, which would add memory overhead, nor do they need to compete for locks. This reduces application pause time by 73~80%.
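
To make the two stages concrete, here is a minimal Python sketch of the idea, not the HotSpot implementation. It assumes that segments can simply be concatenated byte-for-byte, which may not match the real HPROF layout. The names `write_segment`, `merge_segments`, `segmented_dump` and the `.p<N>` segment suffix are hypothetical.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def write_segment(path, records):
    """Stage one: a worker writes its share of records to its own file."""
    with open(path, "wb") as f:
        for record in records:
            f.write(record)
    return path


def merge_segments(segment_paths, out_path):
    """Stage two: append each segment to the final file, then delete it."""
    with open(out_path, "wb") as out:
        for path in segment_paths:
            with open(path, "rb") as seg:
                shutil.copyfileobj(seg, out)  # plain read+write loop
            os.remove(path)


def segmented_dump(records, out_path, num_workers):
    # One contiguous slice of records per worker (hypothetical policy).
    chunk = max(1, -(-len(records) // num_workers))  # ceiling division
    slices = [records[i:i + chunk] for i in range(0, len(records), chunk)]
    seg_paths = [f"{out_path}.p{i}" for i in range(len(slices))]

    # Stage one (would run during the pause): each worker has a private
    # file, so there is no shared buffer queue and no lock on the output.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        list(pool.map(write_segment, seg_paths, slices))

    # Stage two (would run after the application has resumed).
    merge_segments(seg_paths, out_path)
    return seg_paths
```

Only the first stage needs a consistent view of the heap, which is why the merge can be moved out of the pause in this sketch.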
>> 
>> | Heap size | Threads | STW | Total |
>> | --- | --- | --- | --- |
>> | 8g | 1 | 15.612 secs | 15.612 secs |
>> | 8g | 32 | 2.5617250 secs | 14.498 secs |
>> | 8g | 96 | 2.6790452 secs | 14.012 secs |
>> | 16g | 1 | 26.278 secs | 26.278 secs |
>> | 16g | 32 | 5.2313740 secs | 26.417 secs |
>> | 16g | 96 | 6.2445556 secs | 27.141 secs |
>> | 32g | 1 | 48.149 secs | 48.149 secs |
>> | 32g | 32 | 10.7734677 secs | 61.643 secs |
>> | 32g | 96 | 13.1522042 secs | 61.432 secs |
>> | 64g | 1 | 100.583 secs | 100.583 secs |
>> | 64g | 32 | 20.9233744 secs | 134.701 secs |
>> | 64g | 96 | 26.7374116 secs | 126.080 secs |
>> | 128g | 1 | 233.843 secs | 233.843 secs |
>> | 128g | 32 | 72.9945768 secs | 207.060 secs |
>> | 128g | 96 | 67.6815929 secs | 336.345 secs |
>> 
>>> **Total** is the total heap dump time, including both phases.
>>> **STW** is the time of the first phase only.
>>> For a parallel dump, **Total** = **STW** + **Merge**. For a serial dump, **Total** = **STW**.
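
As a small worked example of the relation Total = STW + Merge, the snippet below derives the implied merge time and the pause reduction for one configuration. The figures are copied from the 64g rows of the table above; the helper names `merge_time` and `pause_reduction` are made up for illustration.

```python
def merge_time(stw_secs, total_secs):
    """Merge time implied by Total = STW + Merge."""
    return total_secs - stw_secs


def pause_reduction(serial_stw_secs, parallel_stw_secs):
    """Fraction of the serial pause removed by the parallel first stage."""
    return 1.0 - parallel_stw_secs / serial_stw_secs


# 64g heap: serial (1 thread) versus parallel (32 threads), from the table.
serial_stw = 100.583
parallel_stw, parallel_total = 20.9233744, 134.701

print(f"merge: {merge_time(parallel_stw, parallel_total):.3f} secs")
print(f"pause reduction: {pause_reduction(serial_stw, parallel_stw):.1%}")
```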
>> 
>> ![image](https://user-images.githubusercontent.com/5010047/234534654-6f29a3af-dad5-46bc-830b-7449c80b4dec.png)
>> 
>> In actual testing, the two-stage solution can increase the overall heap dump time (see the table above). However, considering the reduction in STW time, I think it is an acceptable trade-off. Furthermore, there is still room for optimization in the second (merge) stage (e.g. sendfile/splice/copy_file_range instead of a read+write combination). Since number of...
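
The merge-stage optimization mentioned above can be illustrated with a rough Python sketch that is not taken from the patch. It tries an in-kernel copy through `os.copy_file_range` (available on Linux with Python 3.8+) and falls back to a read+write loop elsewhere. The function names `append_file` and `merge` and the 1 MiB chunk size are assumptions, and segments are again assumed to be plain concatenable files.

```python
import os

O_BINARY = getattr(os, "O_BINARY", 0)  # only defined on Windows


def append_file(src_path, out_fd, chunk=1 << 20):
    """Append the contents of src_path to out_fd; return bytes copied."""
    src_fd = os.open(src_path, os.O_RDONLY | O_BINARY)
    try:
        total = remaining = os.fstat(src_fd).st_size
        # In-kernel copy where the platform offers it.
        if hasattr(os, "copy_file_range"):
            try:
                while remaining > 0:
                    n = os.copy_file_range(src_fd, out_fd, min(chunk, remaining))
                    if n == 0:
                        break
                    remaining -= n
            except OSError:
                pass  # not supported here; finish with read+write below
        # Fallback read+write loop for anything not yet copied.
        while remaining > 0:
            buf = os.read(src_fd, min(chunk, remaining))
            if not buf:
                break
            view = memoryview(buf)
            while len(view) > 0:
                written = os.write(out_fd, view)
                view = view[written:]
            remaining -= len(buf)
        return total - remaining
    finally:
        os.close(src_fd)


def merge(segment_paths, out_path):
    """Concatenate all segments into out_path; return total bytes written."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_BINARY
    out_fd = os.open(out_path, flags, 0o644)
    try:
        return sum(append_file(p, out_fd) for p in segment_paths)
    finally:
        os.close(out_fd)
```

The output file is deliberately not opened with `O_APPEND`, because `copy_file_range` rejects a destination opened in append mode.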
>
> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
> 
>   execute VM_HeapDumper directly

Hi, can I have a review for this patch?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13667#issuecomment-1547101136

