[rfc][icedtea-web] Console Output Encoding Fix

Jacob Wisor gitne at gmx.de
Tue Jul 15 15:53:15 UTC 2014


On 07/15/2014 05:25 PM, Jiri Vanek wrote:
> On 07/15/2014 05:07 PM, Jacob Wisor wrote:
>> On 07/15/2014 04:10 PM, Jie Kang wrote:
>>  > Hello,
>>  >
>>  > This patch resolves the bug here
>> http://icedtea.classpath.org/bugzilla/show_bug.cgi?id=1858
>>  >
>>  > Characters such as 'ó' were not appearing in the Java Console due to the
>> implementation of
>> TeeOutputStream appending bytes to a StringBuffer in a byte-by-byte fashion
>> ignoring the fact that
>> the encodings involve multi-byte characters.
>>  >
>>  > Also, as far as I can tell the StringBuffer is not used by multiple threads
>> and has been replaced
>> by StringBuilder (see
>> http://docs.oracle.com/javase/7/docs/api/java/lang/StringBuilder.html)
>>
>> No, I would rather advise staying with StringBuffer. There is no need for
>> StringBuilder as long as
>> TeeOutputStream is thread-safe.
>
> TeeOutputStream itself is not thread-safe, but all access to the "string"
> is (or should be!) already synchronised, so it should be OK to move to
> StringBuilder, which has much less overhead.
>
>>
>> Generally speaking, what this code should be doing is this:
>>
>> import java.util.Arrays;
>> [...]
>> this.string.append(new String(Arrays.copyOfRange(b, off, off + len)));
>>
>> Flushing on every '\n' can also be incorporated. ;-)
>
> See my much longer reply :) -
> http://mail.openjdk.java.net/pipermail/distro-pkg-dev/2014-July/028521.html
>
> Maybe you have arguments why write(int b) should not be fixed too.

I have not said a word about write(int b). ;-) Please read carefully.

As a matter of fact, I cannot see much that needs fixing in write(int). It
flush()es after every '\n', as write(byte[],int,int) does. Well, you could do
String.valueOf(b) before appending though. ;-)

> And maybe you also know whether write(byte[], int, int) may end in the middle of a character. I
> think it may.

Yes, this can happen. Or the offset may point into the middle of a character's
multi-byte encoding sequence. But this is not for us to worry about. The caller
should worry about it and make sure to pass the correct offset and length. This 
is something developers need to take into account when working with multi-byte 
encoded character sequences (or strings).
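
To illustrate the point about offsets and lengths, here is a minimal sketch,
not the actual TeeOutputStream code. The class SplitSequenceDemo and its
decode() helper are hypothetical names, and UTF-8 is assumed as the encoding
(the real code may rely on the platform default). It decodes a byte range the
same way as the snippet quoted above and shows what the caller gets when the
range cuts through the two-byte sequence for 'ó'.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SplitSequenceDemo {

    // Hypothetical helper: decodes b[off .. off+len) as UTF-8 (an assumption).
    static String decode(byte[] b, int off, int len) {
        return new String(Arrays.copyOfRange(b, off, off + len),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "a\u00F3b" is "aób"; in UTF-8 it is the four bytes 61 C3 B3 62.
        byte[] b = "a\u00F3b".getBytes(StandardCharsets.UTF_8);

        // The whole range decodes cleanly.
        System.out.println(decode(b, 0, b.length));

        // A length that ends after the first byte of 'ó' leaves a truncated
        // sequence, which the decoder replaces with U+FFFD.
        System.out.println(decode(b, 0, 2));

        // An offset that starts on the second byte of 'ó' has the same effect.
        System.out.println(decode(b, 2, 2));
    }
}
```

With a range that respects character boundaries the text survives intact;
otherwise the decoder substitutes the replacement character, which is why the
caller has to pick offset and length with the encoding in mind.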

Jacob

