RFR: 8262386: resourcehogs/serviceability/sa/TestHeapDumpForLargeArray.java timed out
Serguei Spitsyn
sspitsyn at openjdk.java.net
Wed Mar 17 06:36:06 UTC 2021
On Tue, 9 Mar 2021 13:57:24 GMT, Lin Zang <lzang at openjdk.org> wrote:
>> Updated the patch to reduce the memory consumption issue.
>> As mentioned in the CR, there is an internal buffer used for the segmented heap dump. The dumped object data are first written into this buffer and then flush()-ed once the size is known. When the internal buffer is full, the current implementation does the following:
>>
>> - allocate a larger buffer, copy the data from the old buffer into the new one, and then use the new one as the internal buffer.
>>
>> This can cause large memory consumption, because the old buffer's data are copied and the old buffer cannot be freed until the next GC.
>>
>> For example, if the internal buffer's length is 1MB, then when it is full a new 2MB buffer is allocated, so 3MB of memory is actually taken (old buffer + new buffer). And in this case, for the ~4GB large array, it keeps allocating new buffers and copying, which is both CPU- and memory-intensive and causes the timeout.
>>
>> This patch optimizes it by keeping an array list of byte[]. When the old buffer is full, it is saved into the list and a new buffer is created and used as the internal buffer. In this case, the above example takes only 2MB (1MB for the old buffer, saved in the list, and 1MB for the new buffer).
>>
>> Together with the "write through" mode introduced in this PR, by which all arrays are written through to the underlying stream so that no extra buffer is required, the PR fixes the LargeArray issue and also saves memory.
>>
>> Thanks!
>> Lin
>
> As discussed in CR https://bugs.openjdk.java.net/browse/JDK-8262386, the byte[] list is more of an optimization. I will revert it in this PR and create a separate CR and PR for it.
>
> Thanks,
> Lin
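The chunked-buffer approach described in the quoted message can be sketched roughly as follows (an illustration with made-up names, not the actual HeapHprofBinWriter code): when the active chunk fills up, it is parked in a list and a fresh chunk is allocated, so peak usage stays at (saved chunks + one active chunk) instead of (old buffer + doubled new buffer) during a grow-and-copy.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a segment buffer that keeps full chunks in a
// list instead of growing by copy.
class ChunkedSegmentBuffer {
    private static final int CHUNK_SIZE = 1 << 20; // 1MB chunks, as in the example
    private final List<byte[]> fullChunks = new ArrayList<>();
    private byte[] current = new byte[CHUNK_SIZE];
    private int pos = 0;

    void write(byte[] data, int off, int len) {
        while (len > 0) {
            if (pos == current.length) {
                fullChunks.add(current);        // park the full chunk; no copy
                current = new byte[CHUNK_SIZE]; // allocate a fresh chunk
                pos = 0;
            }
            int n = Math.min(len, current.length - pos);
            System.arraycopy(data, off, current, pos, n);
            pos += n; off += n; len -= n;
        }
    }

    long size() {
        return (long) fullChunks.size() * CHUNK_SIZE + pos;
    }

    // Once the segment size is known, flush everything to the real stream.
    void flushTo(OutputStream out) throws IOException {
        for (byte[] chunk : fullChunks) out.write(chunk);
        out.write(current, 0, pos);
        fullChunks.clear();
        pos = 0;
    }
}
```

With 1MB chunks, buffering 3MB of data allocates exactly three chunks and never copies previously written bytes, matching the 2MB-peak argument in the message above.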
Hi Lin,
One concern I have is the naming used in the fix.
First, the term "write-through" you use is unusual and confusing.
Would it be better to say 'direct' or 'unbuffered' output instead?
612 // Now the total size of data to dump is known and can be filled to segment header.
613 // Enable write-through mode to avoid internal buffer copies.
614 if (useSegmentedHeapDump) {
615 long longBytes = length * typeSize + headerSize;
616 int bytesToWrite = (int) (longBytes);
617 hprofBufferedOut.fillSegmentSizeAndEnableWriteThrough(bytesToWrite);
618 }
Why 'longBytes'? The scope of this local variable is very short, and it is clear that its type is long.
Why not something like 'size' or 'totalSize'?
There is no need for the parentheses around 'longBytes' at L616.
Also, there is no need for one more local variable, 'bytesToWrite'.
Something like below is clearer:
int size = (int)(length * typeSize + headerSize);
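The condensed form suggested above can be wrapped in a tiny helper to make the intent explicit (the identifiers follow the quoted snippet; this helper class itself is only for illustration): the arithmetic is done in long and cast to int once, with a single local at the call site.

```java
// Illustrative helper, not part of the patch: compute the segment size
// (array payload plus header) with long arithmetic, cast once to int.
class SegmentSizeDemo {
    static int segmentSize(long length, long typeSize, long headerSize) {
        return (int) (length * typeSize + headerSize);
    }
}
```

At the call site this reduces to the single line suggested above: `int size = (int)(length * typeSize + headerSize);`.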
I agree with Chris; it is confusing why the decision about write-through mode is made in this method.
The method fillSegmentSizeAndEnableWriteThrough() is called for any array if useSegmentedHeapDump == true,
so any segmented output for arrays is direct (in write-through mode).
Does this new dimension really need to be introduced?
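For reference, the write-through (direct) behavior under discussion might look roughly like this (hypothetical names modeled on the snippet above, not the patch's actual code): before the segment size is known, bytes go to an internal buffer; once the size is filled in, the buffered bytes are flushed and subsequent writes bypass the buffer entirely.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch of "direct / unbuffered" (write-through) output.
class SegmentOutput {
    private final OutputStream underlying;
    private ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private boolean writeThrough = false;

    SegmentOutput(OutputStream underlying) { this.underlying = underlying; }

    void fillSegmentSizeAndEnableWriteThrough(int size) throws IOException {
        // The size would be patched into the segment header here (omitted);
        // from now on, no buffering is needed.
        buffer.writeTo(underlying);
        buffer = null;
        writeThrough = true;
    }

    void write(byte[] b, int off, int len) throws IOException {
        if (writeThrough) {
            underlying.write(b, off, len); // direct: no copy into a buffer
        } else {
            buffer.write(b, off, len);
        }
    }
}
```

In this sketch the mode is a property of the output object, which may make it clearer where the buffered-versus-direct decision actually lives.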
Thanks,
Serguei
-------------
PR: https://git.openjdk.java.net/jdk/pull/2803
More information about the serviceability-dev
mailing list