[rfc][icedtea-web] Console Output Encoding Fix

Jiri Vanek jvanek at redhat.com
Tue Jul 15 15:58:13 UTC 2014


On 07/15/2014 05:53 PM, Jacob Wisor wrote:
> On 07/15/2014 05:25 PM, Jiri Vanek wrote:
>> On 07/15/2014 05:07 PM, Jacob Wisor wrote:
>>> On 07/15/2014 04:10 PM, Jie Kang wrote:
>> Hello,
>>>  >
>>>  > This patch resolves the bug here
>>> http://icedtea.classpath.org/bugzilla/show_bug.cgi?id=1858
>>>  >
>>>  > Characters such as 'ó' were not appearing in the Java Console due to the
>>> implementation of
>>> TeeOutputStream appending bytes to a StringBuffer in a byte-by-byte fashion
>>> ignoring the fact that
>>> the encodings involve multi-byte characters.
>>>  >
>>>  > Also, as far as I can tell the StringBuffer is not used by multiple threads
>>> and has been replaced
>>> by StringBuilder (see
>>> http://docs.oracle.com/javase/7/docs/api/java/lang/StringBuilder.html)
>>>
>>> No, I would rather advise to stay with StringBuffer. There is no need for
>>> StringBuilder as long as
>>> TeeOutputStream is thread-safe.
>>
>> TeeOutputStream itself is not thread safe, but all access to the "string"
>> is (should be!) already synchronised, so it should be OK to move to
>> StringBuilder, which has much less overhead.
>>
>>>
>>> Generally speaking, what this code should be doing is this:
>>>
>>> import java.util.Arrays;
>>> [...]
>>> this.string.append(new String(Arrays.copyOfRange(b, off, off + len)));
>>>
>>> Flushing on every '\n' can also be incorporated. ;-)
>>
>> See my much longer reply :) -
>> http://mail.openjdk.java.net/pipermail/distro-pkg-dev/2014-July/028521.html
>>
>> Maybe you have arguments why write(int b) should not be fixed too.
>
> I have not said a word about write(int b). ;-) Please read carefully.

I know. But I did (and was not sure about it).
>
> As a matter of fact, I cannot see much that needs fixing in write(int). It flush()es after every '\n'
> as write(byte[],int,int) does. Well, you could do String.valueOf(b) before appending though. ;-)

String.valueOf will not help when just one byte of a multi-byte char arrives (???).
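
To illustrate what I mean, here is a minimal sketch (the class and method names are made up for this mail, this is not the TeeOutputStream code, and UTF-8 is assumed). Decoding each byte on its own cannot rebuild 'ó', and String.valueOf on the int only gives the decimal digits of the byte value:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of icedtea-web.
class SplitByteDemo {

    // Decodes every byte separately, the way a byte-by-byte append would.
    static String decodeByteByByte(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(new String(new byte[] {b}, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Decodes the complete byte sequence in one step.
    static String decodeWhole(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 'ó' (U+00F3) is two bytes in UTF-8: 0xC3 0xB3.
        byte[] utf8 = "\u00f3".getBytes(StandardCharsets.UTF_8);

        System.out.println(decodeWhole(utf8).equals("\u00f3"));      // whole sequence decodes fine
        System.out.println(decodeByteByByte(utf8).equals("\u00f3")); // single bytes do not
        System.out.println(String.valueOf(0xC3));                    // digits of the byte value, not a char
    }
}
```

So whatever is done in write(int), a single incoming byte has to be kept somewhere until the rest of the character arrives.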
>
>> And maybe you also know if write(byte[], int, int) may end in the middle of a character. I
>> think it may.
>
> Yes, this can happen. Or, the offset may point to just some part of a character's multi-byte encoding
> sequence. But, this is not for us to worry about. The caller should worry about it and make sure to
> pass the correct offset and length. This is something developers need to take into account when
> working with multi-byte encoded character sequences (or strings).
>

I doubt they do :)

Anyway, now I'm even more sure that going with a ByteBuffer instead of StringBuilder/StringBuffer and
converting it to a string once \n or -1 arrives is the correct thing to do.
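
Roughly what I have in mind, as a sketch only and not a patch: the class name LineDecodingOutputStream and its getText() method are hypothetical, I use ByteArrayOutputStream here as a growable stand-in for the ByteBuffer, and flush() stands in for the "-1 arrives" case. It also assumes an encoding like UTF-8, where the '\n' byte never occurs inside a multi-byte sequence.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the real TeeOutputStream.
class LineDecodingOutputStream extends OutputStream {

    // Raw bytes collected since the last decode.
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    // Text decoded so far.
    private final StringBuilder text = new StringBuilder();
    private final Charset charset;

    LineDecodingOutputStream(Charset charset) {
        this.charset = charset;
    }

    @Override
    public synchronized void write(int b) {
        pending.write(b);          // keep the raw byte, do not decode yet
        if (b == '\n') {
            decodePending();       // a full line arrived, decode it at once
        }
    }

    @Override
    public synchronized void write(byte[] b, int off, int len) {
        for (int i = off; i < off + len; i++) {
            write(b[i]);
        }
    }

    @Override
    public synchronized void flush() {
        decodePending();           // stand-in for end of input
    }

    // Converts all buffered bytes to characters in one step.
    private void decodePending() {
        if (pending.size() > 0) {
            text.append(new String(pending.toByteArray(), charset));
            pending.reset();
        }
    }

    synchronized String getText() {
        return text.toString();
    }

    public static void main(String[] args) {
        LineDecodingOutputStream out = new LineDecodingOutputStream(StandardCharsets.UTF_8);
        for (byte b : "\u00f3k\n".getBytes(StandardCharsets.UTF_8)) {
            out.write(b);          // bytes arrive one at a time
        }
        System.out.println(out.getText().equals("\u00f3k\n"));
    }
}
```

One caveat with this sketch: a flush() that lands in the middle of a character would still decode a partial sequence, so the real fix has to decide what flush means there.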


J.

