PING: RFR: 8204531: Remove unused chars following '\0'

Wed Jun 13 12:09:45 UTC 2018

PING: Could you review it?
I want to merge this change before JDK 11 RDP 1.

>>>     webrev:
>>>       http://cr.openjdk.java.net/~ysuenaga/JDK-8204531/webrev.00/
>>>
>>>     JBS:
>>>       https://bugs.openjdk.java.net/browse/JDK-8204531

Yasumasa

On 2018/06/08 15:38, Yasumasa Suenaga wrote:
> Hi David,
> 
> 2018-06-08 10:47 GMT+09:00 David Holmes <david.holmes at oracle.com>:
>> Hi Yasumasa,
>>
>> On 7/06/2018 10:43 PM, Yasumasa Suenaga wrote:
>>>
>>> Hi all,
>>>
>>> Please review this change:
>>>
>>>     webrev:
>>>       http://cr.openjdk.java.net/~ysuenaga/JDK-8204531/webrev.00/
>>>
>>>     JBS:
>>>       https://bugs.openjdk.java.net/browse/JDK-8204531
>>>
>>>
>>> We can use `jhsdb jsnap` to check all PerfData.
>>> String values in PerfData are defined as jbyte array, but JSnap cannot
>>> handle it well as following:
>>>
>>> ```
>>> $ jhsdb jsnap --pid 28542 --all | less
>>>
>>> sun.gc.cause=No
>>> GC^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
>>> ```
>>>
>>> You can see this value via `less` and `vim` on Linux. `^@` shows it is
>>> non-ascii character.
>>
>>
>> So your synopsis says "remove unused chars following \0". Is that really
>> what is happening? I would have expected "new String(bytes, encoding)" to
>> stop processing bytes when it encounters a NUL!
> 
> String c'tor will continue to parse byte array even if it finds '\0'.
> You can see it on JShell as below:
> 
> -------------
> jshell> var bytes = new byte[]{(byte)'a', (byte)'b', (byte)'c',
> (byte)'\0', (byte)'d', (byte)'e', (byte)'f'}
> bytes ==> byte[7] { 97, 98, 99, 0, 100, 101, 102 }
> 
> jshell> new String(bytes, "US-ASCII")
> $2 ==> "abc\000def"
> -------------
> 
> 
>>> PerfDataEntry has null-terminated C string. So we should restore as it in
>>> Java layer.
>>
>>
>> If the issue is really "junk" beyond the \0 nul-terminator, how did we get
>> such values? Shouldn't the byte array already be terminated at the NUL? Or
>> is this just a raw representation of an array in the VM using the exact
>> length of the array regardless of content?
> 
> I think JSnap shows the array regardless of content.
> 
> 
>> I see CStringUtilities.getString() both stops when it encounters a 0 value,
>> and uses the default file encoding (else ASCII).
>>
>> The fix seems perfectly reasonable, I'm just unclear where exactly this
>> "bad" data should have been stopped.
> 
> Thanks!
> 
> 
> Yasumasa
> 
> 
>> Thanks,
>> David
>>
>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>