PING: RFR: 8204531: Remove unused chars following '\0'

Thu Jun 14 07:37:07 UTC 2018

Thanks David!
I'm waiting for second reviewer.

Yasumasa

2018-06-14 15:52 GMT+09:00 David Holmes <david.holmes at oracle.com>:
> Seems fine.
>
> Thanks,
> David
>
>
> On 14/06/2018 3:01 PM, Yasumasa Suenaga wrote:
>>
>> Hi David,
>>
>> Thank you for your comment.
>>
>> I removed the assertion from PerfDataEntry.java .
>> Could you review again?
>>
>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8204531/webrev.03/
>>
>>
>> Thanks,
>>
>> Yasuamsa
>>
>>
>>
>> 2018-06-14 13:41 GMT+09:00 David Holmes <david.holmes at oracle.com>:
>>>
>>> Hi Yasumasa,
>>>
>>> This seems mostly okay - though like Thomas this is not an area I'm that
>>> familiar with. But it seems this addresses the problem without impacting
>>> other things.
>>>
>>> This assert is redundant:
>>>
>>>   366                 if (Assert.ASSERTS_ENABLED) {
>>>   367                     Assert.that(vectorLength() > 0, "not a byte
>>> vector");
>>>
>>> You can't reach it unless len != 0 and I assume you're not really
>>> checking
>>> that len is not-negative - if you are then the assert needs to go further
>>> up
>>> after:
>>>
>>> 328         int len = vectorLength();
>>>
>>> Thanks,
>>> David
>>>
>>>
>>> On 14/06/2018 2:18 PM, Yasumasa Suenaga wrote:
>>>>
>>>>
>>>> Hi Thomas,
>>>>
>>>> Thank you for your comment.
>>>> I uploaded new webrev:
>>>>
>>>>     http://cr.openjdk.java.net/~ysuenaga/JDK-8204531/webrev.02/
>>>>
>>>>
>>>> 2018-06-14 3:34 GMT+09:00 Thomas Stüfe <thomas.stuefe at gmail.com>:
>>>>>
>>>>>
>>>>> Hi Yasumasa,
>>>>>
>>>>> looks good to me.
>>>>>
>>>>> small nit:
>>>>> import java.nio.charset.* -> import java.nio.charset.Charset;
>>>>
>>>>
>>>>
>>>> Fixed.
>>>>
>>>>
>>>>> We loose the assert for type now from byteArrayValue(). Can you re-add
>>>>> it to the location where you call CStringUtils?
>>>>
>>>>
>>>>
>>>> I added the assertion to check vector length.
>>>> I did not add type assertion because if-statement branches by data type.
>>>>
>>>>
>>>>> We do not seem to need byteArrayValue(), can this be removed?
>>>>
>>>>
>>>>
>>>> I cannot agree that.
>>>> byteArrayValue() is public method (but it is not used AFAICS).
>>>> However, similar getter methods are privided from PerfDataEntry (e.g.
>>>> charArrayValue()). So I think we should not remove byteArrayValue().
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>>> Best Regards, Thomas
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jun 13, 2018 at 3:33 PM, Yasumasa Suenaga <yasuenag at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> Thanks Thomas,
>>>>>>
>>>>>>> That said, I'd probably just add a variant to
>>>>>>> CStringUtilities::getString which takes encoding as an argument, and
>>>>>>> then pass "US-ASCII" to remain compatible with the current version of
>>>>>>> jsnap.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> I agree with you. My opinion is one of the ideals.
>>>>>> Anyway, I uploaded new webrev which is added charset support in
>>>>>> CStringUtils:
>>>>>>
>>>>>>     http://cr.openjdk.java.net/~ysuenaga/JDK-8204531/webrev.01/
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2018/06/13 21:46, Thomas Stüfe wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Disclaimer: I am Reviewer but not versed in this corner of the jdk,
>>>>>>> nor have knowledge of the history, so take what I suggest with a
>>>>>>> grain
>>>>>>> of salt.
>>>>>>>
>>>>>>> That said, I'd probably just add a variant to
>>>>>>> CStringUtilities::getString which takes encoding as an argument, and
>>>>>>> then pass "US-ASCII" to remain compatible with the current version of
>>>>>>> jsnap.
>>>>>>>
>>>>>>> ..Thomas
>>>>>>>
>>>>>>> On Wed, Jun 13, 2018 at 2:37 PM, Yasumasa Suenaga
>>>>>>> <yasuenag at gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Thomas,
>>>>>>>>
>>>>>>>> Thank you for your comment.
>>>>>>>>
>>>>>>>> CStringUtilities.java says the String which is constructed from byte
>>>>>>>> array
>>>>>>>> should use Charset.defaultCharset(). [1]
>>>>>>>> I'm not convinced how should we fix - US-ASCII or default charset.
>>>>>>>>
>>>>>>>> IMHO we should use default charset. However I concern I need to get
>>>>>>>> CSR
>>>>>>>> approvement.
>>>>>>>> (In addition, I want to refactor CStringUtilities. I guess we can
>>>>>>>> make
>>>>>>>> more
>>>>>>>> simple code :-)
>>>>>>>>
>>>>>>>>
>>>>>>>> What do you think about it?
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> [1]
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/7bf4f1b5e438/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/CStringUtilities.java#l73
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2018/06/13 21:18, Thomas Stüfe wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi Yasumasa,
>>>>>>>>>
>>>>>>>>> your patch will change behavior of jsnap if file.encoding is set to
>>>>>>>>> something different than US-ASCII.
>>>>>>>>>
>>>>>>>>> Old version uses, hard wired, US-ASCII.
>>>>>>>>>
>>>>>>>>> You now use jvm/hotspot/utilities/CStringUtilities.java, which uses
>>>>>>>>>
>>>>>>>>> private static String encoding =
>>>>>>>>> System.getProperty("file.encoding",
>>>>>>>>> "US-ASCII");
>>>>>>>>>
>>>>>>>>> and thus may use a different encoding.
>>>>>>>>>
>>>>>>>>> Best Regards, Thomas
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jun 13, 2018 at 2:09 PM, Yasumasa Suenaga
>>>>>>>>> <yasuenag at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> PING: Could you review it?
>>>>>>>>>> I want to merge this change before JDK 11 RDP 1.
>>>>>>>>>>
>>>>>>>>>>>>>         webrev:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8204531/webrev.00/
>>>>>>>>>>>>>
>>>>>>>>>>>>>         JBS:
>>>>>>>>>>>>>           https://bugs.openjdk.java.net/browse/JDK-8204531
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2018/06/08 15:38, Yasumasa Suenaga wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hi David,
>>>>>>>>>>>
>>>>>>>>>>> 2018-06-08 10:47 GMT+09:00 David Holmes
>>>>>>>>>>> <david.holmes at oracle.com>:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/06/2018 10:43 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>
>>>>>>>>>>>>>         webrev:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8204531/webrev.00/
>>>>>>>>>>>>>
>>>>>>>>>>>>>         JBS:
>>>>>>>>>>>>>           https://bugs.openjdk.java.net/browse/JDK-8204531
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> We can use `jhsdb jsnap` to check all PerfData.
>>>>>>>>>>>>> String values in PerfData are defined as jbyte array, but JSnap
>>>>>>>>>>>>> cannot
>>>>>>>>>>>>> handle it well as following:
>>>>>>>>>>>>>
>>>>>>>>>>>>> ```
>>>>>>>>>>>>> $ jhsdb jsnap --pid 28542 --all | less
>>>>>>>>>>>>>
>>>>>>>>>>>>> sun.gc.cause=No
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> GC^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
>>>>>>>>>>>>> ```
>>>>>>>>>>>>>
>>>>>>>>>>>>> You can see this value via `less` and `vim` on Linux. `^@`
>>>>>>>>>>>>> shows
>>>>>>>>>>>>> it
>>>>>>>>>>>>> is
>>>>>>>>>>>>> non-ascii character.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> So your synopsis says "remove unused chars following \0". Is
>>>>>>>>>>>> that
>>>>>>>>>>>> really
>>>>>>>>>>>> what is happening? I would have expected "new String(bytes,
>>>>>>>>>>>> encoding)"
>>>>>>>>>>>> to
>>>>>>>>>>>> stop processing bytes when it encounters a NUL!
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> String c'tor will continue to parse byte array even if it finds
>>>>>>>>>>> '\0'.
>>>>>>>>>>> You can see it on JShell as below:
>>>>>>>>>>>
>>>>>>>>>>> -------------
>>>>>>>>>>> jshell> var bytes = new byte[]{(byte)'a', (byte)'b', (byte)'c',
>>>>>>>>>>> (byte)'\0', (byte)'d', (byte)'e', (byte)'f'}
>>>>>>>>>>> bytes ==> byte[7] { 97, 98, 99, 0, 100, 101, 102 }
>>>>>>>>>>>
>>>>>>>>>>> jshell> new String(bytes, "US-ASCII")
>>>>>>>>>>> $2 ==> "abc\000def"
>>>>>>>>>>> -------------
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>> PerfDataEntry has null-terminated C string. So we should
>>>>>>>>>>>>> restore
>>>>>>>>>>>>> as
>>>>>>>>>>>>> it
>>>>>>>>>>>>> in
>>>>>>>>>>>>> Java layer.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> If the issue is really "junk" beyond the \0 nul-terminator, how
>>>>>>>>>>>> did
>>>>>>>>>>>> we
>>>>>>>>>>>> get
>>>>>>>>>>>> such values? Shouldn't the byte array already be terminated at
>>>>>>>>>>>> the
>>>>>>>>>>>> NUL?
>>>>>>>>>>>> Or
>>>>>>>>>>>> is this just a raw representation of an array in the VM using
>>>>>>>>>>>> the
>>>>>>>>>>>> exact
>>>>>>>>>>>> length of the array regardless of content?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I think JSnap shows the array regardless of content.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> I see CStringUtilities.getString() both stops when it encounters
>>>>>>>>>>>> a
>>>>>>>>>>>> 0
>>>>>>>>>>>> value,
>>>>>>>>>>>> and uses the default file encoding (else ASCII).
>>>>>>>>>>>>
>>>>>>>>>>>> The fix seems perfectly reasonable, I'm just unclear where
>>>>>>>>>>>> exactly
>>>>>>>>>>>> this
>>>>>>>>>>>> "bad" data should have been stopped.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>
>