RFR (M): 8129419: heapDumper.cpp: assert(length_in_bytes > 0) failed: nothing to copy

Mon Jan 18 12:26:22 UTC 2016

I believe I then have to cast it to size_t at 496, 497 and 498?
I'm not sure how write works in regards to returning ssize_t vs a size_t 
length.
Maybe I should change it back to checking for return value < 0 (instead 
of only checking for OS_ERR), and then keep 'n' as ssize_t?

- Andreas

On 2016-01-18 13:11, Dmitry Samersoff wrote:
> Андреас,
>
> 481: It's better to declare n as ssize_t and remove cast at 488
>
> Looks good for me!
>
> -Dmitry
>
>
> On 2016-01-18 14:42, Andreas Eriksson wrote:
>> Hi,
>>
>> New webrev: http://cr.openjdk.java.net/~aeriksso/8129419/webrev.02/hotspot/
>>
>> Write_internal handles writes less than len, and EINTR.
>>
>> EINTR is taken care of by os::write, but to use it I had to remove an
>> assert from the solaris os::write, which checked that a JavaThread was
>> in_native. Since heap dumping is done from the vm_thread (not a
>> JavaThread) the assert crashed the VM.
>>
>> Regards,
>> Andreas
>>
>> On 2016-01-13 13:39, Andreas Eriksson wrote:
>>> Hi,
>>>
>>> On 2015-12-29 21:27, Dmitry Samersoff wrote:
>>>> Andreas,
>>>>
>>>> Great work. All but write_internal looks good for me (see below).
>>>>
>>>>
>>>> HprofReader.java:
>>>>
>>>> Nit -  length -= skipped; should be after if skipped == 0
>>>>
>>>> heapDumper.cpp:480
>>>>
>>>> 1. For windows you can do :
>>>>       assert(len < (size_t)UINT_MAX, ... );
>>>>       ::write( ..., (uint) (len & UINT_MAX));
>>>>
>>>> 2. You should check for n < len and if it is true,
>>>> deal with errno. n = 0 is legitimate case,
>>>> so assert is not necessary.
>>> I only think n = 0 is valid if write is called with length 0, which
>>> write_internal will never do.
>>> Otherwise, according to posix, n will be -1 if an error occurred, or
>>> greater than zero if some bytes were written. [1]
>>> If some bytes but not all were written then the while(len > 0) loop
>>> will make sure we try to write the rest of the bytes.
>>> It looks like windows write should never return 0 when len != 0
>>> either. [2]
>>>
>>> I should however handle the -1 result with errno EINTR, working on a
>>> new webrev.
>>>
>>> Thanks,
>>> Andreas
>>>
>>> [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html
>>> If write() is interrupted by a signal before it writes any data, it
>>> shall return -1 with errno set to [EINTR].
>>> If write() is interrupted by a signal after it successfully writes
>>> some data, it shall return the number of bytes written.
>>>
>>> [2] https://msdn.microsoft.com/en-us/library/1570wh78(v=vs.80).aspx
>>>
>>>> -Dmitry
>>>>
>>>> On 2015-12-28 17:02, Andreas Eriksson wrote:
>>>>> Hi,
>>>>>
>>>>> Here is the webrev for type changes.
>>>>> Top-repo:
>>>>> http://cr.openjdk.java.net/~aeriksso/8129419/webrev.01/jdk9-hs-rt/
>>>>> Hotspot:
>>>>> http://cr.openjdk.java.net/~aeriksso/8129419/webrev.01/hotspot/
>>>>>
>>>>> The windows_x64 native write function uses a length of type uint, not
>>>>> size_t. I added an ifdef for that case and handled it, but better
>>>>> suggestions are welcome.
>>>>>
>>>>> Also found and fixed another problem in the hprof parser when skipping
>>>>> over bytes.
>>>>>
>>>>> I've not included the tests, they have OOM issues on several JPRT
>>>>> hosts,
>>>>> and until I've figured out a way to fix that I wont be pushing them.
>>>>>
>>>>> Thanks,
>>>>> Andreas
>>>>>
>>>>> On 2015-12-14 19:34, Andreas Eriksson wrote:
>>>>>> Hi Dmitry, thanks for looking at this.
>>>>>>
>>>>>>
>>>>>> On 2015-12-14 18:30, Dmitry Samersoff wrote:
>>>>>>> Andreas,
>>>>>>>
>>>>>>> Nice cleanup.
>>>>>>>
>>>>>>> Some generic comments below.
>>>>>>>
>>>>>>> 1. It would be excellent if you can split webrev into two parts, one
>>>>>>> part with types cleanup and other part with array truncation related
>>>>>>> changes or ever push these changes separately as it address different
>>>>>>> problems.
>>>>>>>
>>>>>>> Type cleanup could be reviewed quickly, but review of array
>>>>>>> truncation
>>>>>>> requires some time.
>>>>>>>
>>>>>>> (We already have two different CRs: JDK-8129419 for type cleanup and
>>>>>>> JDK-8144732 for array truncation)
>>>>>> Yes, that's a good point.
>>>>>>
>>>>>>> 2.
>>>>>>> it might be better to use size_t and jlong (or julong) across all
>>>>>>> file
>>>>>>> and get rid of all other types with a few exclusion.
>>>>>> I'll see what I can do, and we can look at it closer once I've split
>>>>>> the type changes into a separate webrev.
>>>>>>
>>>>>>> 3.
>>>>>>> Each assert costs some time in nightly testing, so we probably don't
>>>>>>> need as much asserts.
>>>>>> Alright.
>>>>>>
>>>>>>> 4. write_internal:
>>>>>>>
>>>>>>>      There are couple of cases where ::write writes less bytes than
>>>>>>> requested and doesn't return -1.
>>>>>>>      So we should check if ::write returns a value less that len
>>>>>>> and if it
>>>>>>> happens deal with errno - either repeat entire write
>>>>>>> attempt(EINTR/EAGAIN) or abort writing.
>>>>>> Actually, I meant to ask about that, but forgot:
>>>>>> I tried using os::write, which handles EINTR/EAGAIN if necessary
>>>>>> (depending on platform), but Solaris has an assert that fails:
>>>>>> http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/file/5a42c1dde332/src/os/solaris/vm/os_solaris.cpp#l5692
>>>>>>
>>>>>>
>>>>>> It checks that os::write is called by a JavaThread that is in_native,
>>>>>> which we are not, since we are in the VM_thread doing a safepoint_op.
>>>>>> Does anyone know if that assert is really needed? It's only done for
>>>>>> Solaris.
>>>>>>
>>>>>>> 5. we should handle zero-length array in calculate_array_max_length -
>>>>>>> it's a legitimate case
>>>>>> Not sure what you mean here, I believe it does handle zero-length
>>>>>> arrays.
>>>>>> It should just fall through without taking any of the if-clauses.
>>>>>> (Or do you mean that I should add a return immediately at the top if
>>>>>> length is zero?)
>>>>>>
>>>>>>> 6. max_juint is less than SegmentedHeapDumpThreshold so non-segmented
>>>>>>> heapdump can't contain huge array and it's better to check for
>>>>>>> segmented
>>>>>>> dump before any other checks.
>>>>>> Correct me if I'm wrong here, but SegmentedHeapDumpThreshold could in
>>>>>> theory be set so that we have a non-segmented heap dump while still
>>>>>> having huge arrays.
>>>>>> Not sure if this is worth taking into account, since it most likely
>>>>>> would lead to other problems as well, and the flag is develop only, so
>>>>>> it can't happen in product builds.
>>>>>> What do you think would be best to do here?
>>>>>>
>>>>>> - Andreas
>>>>>>
>>>>>>> -Dmitry
>>>>>>>
>>>>>>>
>>>>>>> On 2015-12-14 18:38, Andreas Eriksson wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Please review this fix for dumping of long arrays, and general
>>>>>>>> cleanup
>>>>>>>> of types in heapDumper.cpp.
>>>>>>>>
>>>>>>>> Problem:
>>>>>>>> At several places in heapDumper.cpp overflows could happen when
>>>>>>>> dumping
>>>>>>>> long arrays.
>>>>>>>> Also the hprof format uses an u4 as a record length field, but
>>>>>>>> arrays
>>>>>>>> can be longer than that (counted in bytes).
>>>>>>>>
>>>>>>>> Fix:
>>>>>>>> Many types that were previously signed are changed to equivalent
>>>>>>>> unsigned types and/or to a larger type to prevent overflow.
>>>>>>>> The bulk of the change though is the addition of
>>>>>>>> calculate_array_max_length, which for a given array returns the
>>>>>>>> number
>>>>>>>> of elements we can dump. That length is then used to truncate arrays
>>>>>>>> that are too long.
>>>>>>>> Whenever an array is truncated a warning is printed:
>>>>>>>> Java HotSpot(TM) 64-Bit Server VM warning: cannot dump array of type
>>>>>>>> object[] with length 1,073,741,823; truncating to length 536,870,908
>>>>>>>>
>>>>>>>> Much of the rest of the change is moving functions needed by
>>>>>>>> calculate_array_max_length to the DumpWriter or DumperSupport
>>>>>>>> class so
>>>>>>>> that they can be accessed.
>>>>>>>>
>>>>>>>> Added a test that relies on the hprof parser, which also had a
>>>>>>>> couple of
>>>>>>>> overflow problems (top repo changes).
>>>>>>>> I've also run this change against a couple of other tests, but
>>>>>>>> they are
>>>>>>>> problematic in JPRT because they are using large heaps and lots of
>>>>>>>> disk.
>>>>>>>>
>>>>>>>> Bug:
>>>>>>>> 8129419: heapDumper.cpp: assert(length_in_bytes > 0) failed:
>>>>>>>> nothing to
>>>>>>>> copy
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8129419
>>>>>>>>
>>>>>>>> Also fixed in this change is the problems seen in these two bugs:
>>>>>>>> 8133317: Integer overflow in heapDumper.cpp leads to corrupt HPROF
>>>>>>>> dumps
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8133317
>>>>>>>>
>>>>>>>> 8144732: VM_HeapDumper hits assert with bad dump_len
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8144732
>>>>>>>>
>>>>>>>> Webrev:
>>>>>>>> Top repo:
>>>>>>>> http://cr.openjdk.java.net/~aeriksso/8129419/webrev.00/jdk9-hs-rt/
>>>>>>>> Hotspot:
>>>>>>>> http://cr.openjdk.java.net/~aeriksso/8129419/webrev.00/hotspot/
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Andreas
>