[rfc][icedtea-web] Console Output Encoding Fix

Jie Kang jkang at redhat.com
Wed Jul 16 18:06:43 UTC 2014



----- Original Message -----
> On 07/16/2014 06:09 PM, Jie Kang wrote:
> > ----- Original Message -----
> >> On 07/16/2014 04:27 PM, Jie Kang wrote:
> >>> Hello
> >>>
> >>> Here is the revised patch with unit tests. If it looks okay I will work
> >>> on
> >>> backporting it to 1.5.
> >>>
> >>>
> >>> Thanks,
> >>>
> >>   > diff --git
> >>   > a/netx/net/sourceforge/jnlp/util/logging/TeeOutputStream.java
> >>   > b/netx/net/sourceforge/jnlp/util/logging/TeeOutputStream.java
> >>   > --- a/netx/net/sourceforge/jnlp/util/logging/TeeOutputStream.java
> >>   > +++ b/netx/net/sourceforge/jnlp/util/logging/TeeOutputStream.java
> >>   > […]
> >>   > @@ -55,7 +54,7 @@
> >>   >
> >>   >      // Everthing written to TeeOutputStream is written to our log too
> >>   >
> >>   > -    private final StringBuffer string = new StringBuffer();
> >>   > +    private final ByteArrayOutputStream byteStorage = new
> >>   > ByteArrayOutputStream();
> >>
> >> Naming: byteStorage is very misleading. What is a byte storage? Please
> >> rename
> >> this member variable to what it is, e.g. byteOutputStream.
> >
> >
> > Well to me a byte storage is some object that stores bytes. Personal style
> > prefers not naming objects after the object identifier regardless, I have
> > changed the name to byteOutputStream.
> >
> > I guess this is situational but 'Jpanel jpanel' is something I hope to
> > never see.
> 
> Yes, every rule has exceptions. ;-) And yes, every rule or style can be
> abused.
> But, in this case I think it is a good choice for a name because it describes
> a
> part of TeeOutputStream's core functionality.
> 
> >>
> >>   >      private final boolean isError;
> >>   >
> >>   >      public TeeOutputStream(PrintStream stdStream, boolean isError) {
> >>   > […]
> >>   > @@ -82,32 +80,27 @@
> >>   >
> >>   >      @Override
> >>   >      public synchronized void write(byte[] b, int off, int len) {
> >>   > -        if (b == null) {
> >>   > -            throw new NullPointerException();
> >>   > -        } else if ((off < 0) || (off > b.length) || (len < 0)
> >>   > -                || ((off + len) > b.length) || ((off + len) < 0)) {
> >>   > -            throw new IndexOutOfBoundsException();
> >>   > -        } else if (len == 0) {
> >>
> >> Err, testing len for 0 is indeed a good optimization. ;-)
> >
> > Readded the test for 0 length.
> >
> >>
> >>   > -            return;
> >>   > -        }
> >>   > -        for (int i = 0; i < len; i++) {
> >>   > -            appendChar(b[off + i]);
> >>   > -        }
> >>   > +        appendByteArray(b, off, len);
> >>   >          super.write(b, off, len);
> >>   >      }
> >>   >
> >>   >      @Override
> >>   >      public synchronized void write(int b) {
> >>   > -        appendChar(b);
> >>   > +        appendByte(b);
> >>
> >> Yep, proper renaming. Good. :-)
> >>
> >>   >          super.write(b);
> >>   >      }
> >>   >
> >>   >      private void flushLog() {
> >>   > -        if (string.length() <= 0 ){
> >>   > -            return;
> >>   > +        String s = "";
> >>
> >> Why initialize here with an empty string? Strings are immutable, remember?
> >> Besides, any of the ByteArrayOutputStream.toString() methods will always
> >> return
> >> a String, never null, so need to worry about null here.
> >>
> >>   > +        try {
> >>   > +            s = byteStorage.toString("UTF-8");
> >>
> >> Err, why do you suddenly assume the passed in byte array to be UTF-8
> >> encoded?
> >> If
> >> anything, only the current default encoding can be assumed. So no, you
> >> cannot
> >> do
> >> this.
> >>
> >>   > +        } catch (UnsupportedEncodingException e) {
> >>   > +            s = byteStorage.toString();
> >>
> >> Just stick to ByteArrayOutputStream.toString() only.
> >> And, even if byteStorage.size() is 0 ByteArrayOutputStream.toString() will
> >> still
> >> return an empty string ("").
> >>
> >>   >          }
> >>   > -        log(string.toString());
> >>   > -        string.setLength(0);
> >>   > +        if (s.length() > 0 ){
> >>
> >> Formatting: Please remove the space before ) and add a space between ) and
> >> {.
> >>
> >> Why do you check for s' length to be greater than 0? Cannot log handle
> >> empty
> >> strings?
> >
> > I think there is no reason to log empty strings. There is no reason to
> > propagate this to log.
> >
> >>
> >>   > +            log(s.toString());
> >>
> >> ??? s is already of type String?
> >>
> >>   > +            byteStorage.reset();
> >>   > +        }
> >>   >      }
> >>   >
> >>   >      @Override
> >>   > @@ -121,11 +114,23 @@
> >>   >          return isError;
> >>   >      }
> >>   >
> >>   > -    private void appendChar(int b) {
> >>   > -         if ( b <= 0 || b == '\n'){
> >>   > +    private void appendByte(int b) {
> >>   > +        byteStorage.write(b);
> >>   > +        if ( b <= 0 || b == '\n'){
> >>
> >> Unfortunately, this is not correct. Like Jiri and me have been debating,
> >> it
> >> is
> >> not that simple in this case. With b being part of a multi-byte encoded
> >> character we cannot assume it to be denoting the 0 (\u0000) or new line
> >> (\u000a,
> >> \000a\u000c, or \u000c\u000a) code point.
> >>
> >> So, what you actually would need to do is to convert the byte array
> >> (byteStorage) to a String and then check for the String's last character
> >> to
> >> be 0
> >> or new line. And yes, for every /single/ byte appended.\
> >
> > Done.
> >
> >>
> >> Besides, incorrect formatting still remains.
> >>
> >>   >              flushLog();
> >>   > -        } else {
> >>   > -            string.append((char)b);
> >>   > +        }
> >>   > +    }
> >>   > +
> >>   > +    private void appendByteArray(byte[] b, int off, int len) {
> >>   > +        byteStorage.write(b, off, len);
> >>   > +        String s = "";
> >>
> >> Again, no need to initialize with an empty string here.
> >>
> >>   > +        try {
> >>   > +            s = new String(b, off, len, "UTF-8");
> >>
> >> Again, why do you suddenly assume the passed in byte array to be UTF-8
> >> encoded?
> >>
> >>   > +        } catch (UnsupportedEncodingException e) {
> >>   > +            s = new String(b, off, len);
> >>   > +        }
> >>   > +        if(s.endsWith(System.getProperty("line.separator"))) {
> >>
> >> Yep, something like this would need to happen in appendByte(int) too. ;-)
> >>
> >> Yet, for better performance System.getProperty("line.separator") should be
> >> put
> >> into a final String member on TeeOutputStream's constuction.
> >>
> >> Again, formatting: Add a space between if and (.
> >>
> >>   > +            flushLog();
> >>   >          }
> >>   >      }
> 
> Overall the patch looks good. One last nit though:
> 
>  > diff --git a/netx/net/sourceforge/jnlp/util/logging/TeeOutputStream.java
>  > b/netx/net/sourceforge/jnlp/util/logging/TeeOutputStream.java
>  > --- a/netx/net/sourceforge/jnlp/util/logging/TeeOutputStream.java
>  > +++ b/netx/net/sourceforge/jnlp/util/logging/TeeOutputStream.java
>  > […]
>  > +    public ByteArrayOutputStream getByteArrayOutputStream() {
>  > +        return this.byteArrayOutputStream;
>  > +    }
> 
> Do we really want to expose this internal ByteArrayOutputStream instance? I
> cannot think of any good reason now. Actually, it is a helper for
> TeeOutputStream to be able to flush() on every new line. What if a caller
> calls
> ByteArrayOutputStream.reset() before TeeOutputStream has flushed? Then
> possibly
> data could get lost. Because of this, in any case, this method should return
> a
> clone of this.byteArrayOutputStream.


Hmm good catch on the reset use case. I was having a lot of trouble figuring out how to properly unit test TeeOutputStream which is a sign that the whole class needs to be rewritten but at the same time I have yet to think of an acceptable reimplementation. So this might be considered a temporary patch that resolves the bug before a rework of the whole class. And also we must consider that the backport to 1.5 should probably not have a rewrite of the class, which will use this patch.

In terms of the problem with unit testing, we need to make sure calls to TeeOutputStream result in successful writes to the ByteOutputStream. However, given the current implementation there really isn't any way that I can see to validate that ByteOutputStream contains the correct write information without either exposing something or rewriting the class. I decided the getter was the best solution for the moment. It's also possible to expose flushLog() and have it return the flushed contents, but this sidesteps the PrintStream and doesn't feel as good to me.

I've made it return a copy and also set the method to be protected since it's really only used for testing. Just to make it clear, I also really dislike this approach so if you have any suggestions I will be happy to hear them.


> 
> Jacob
> 

-- 

Jie Kang
-------------- next part --------------
A non-text attachment was scrubbed...
Name: console-1858-5.patch
Type: text/x-patch
Size: 6124 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/distro-pkg-dev/attachments/20140716/69efa836/console-1858-5-0001.patch>


More information about the distro-pkg-dev mailing list