Review/comment needed for the new public java.util.Base64 class

Wed Oct 24 03:50:17 UTC 2012

I'm eager to use this! 

Some comments:

- <tt>code</tt> is used in some places rather than {@code }

- The initial <ul> would perhaps be better as a <dl>. I don't know if definition lists are allowed in javadoc.

- public OutputStream wrap(OutputStream os) closing the underlying output stream precludes many potential uses. FilterOutputStreams do not consistently close the underlying stream.

- spelling error "methold"

- InputStream wrap() it might be nice if the return stream EOF when the padding character(s) are consumed but not close() or otherwise alter the source InputStream

- private static byte[] getBytes(ByteBuffer bb) should document, even though it's a private method, that the bytes returned should be not considered mutable since they might be shared.

- the "if (ret != dst.length) // TBD: not necessary" is worrisome as it will generate garbage.

- I'm still wishing String encode(byte[] buf) didn't use an intermediate byte array. :-)

- "stream after use, in which" -> "stream after use, during which"

On Oct 23 2012, at 19:56 , Xueming Shen wrote:

> Hi Alan,
> 
> Thanks for the review. I hope I addressed most of the comments in the
> updated webrev at
> 
> http://cr.openjdk.java.net/~sherman/4235519/webrev
> 
> mainly
> 
> (1) Pulled the base64 "terms" up to the class doc and then be referenced from various methods
> (2) Gave up the C style de/encode(byte[], null), just leave the invoker to have enough space. I
>     remember Paul once suggested to have a convenient method to return the estimated length
>     of the byte[] needed to en/decode a specified input byte array/buffer. I think we can do it
>     later when it is really desired. I agreed your argument that if people need to get the size and
>     prepare a "new" array, they bet off just call de/encode(byte[])
> (3) Gave up the "liberal" decoding design, tight the spec/impl to treat the input as "illegal base64"
>     if the padding character appears in the middle of input.
> (4) Some clarification of spec for thread safe, null exception and the pos/limit of input/output byte
>     buffer as suggested.
> (5) Fixed the typos
> 
> I still don't have a better name for method "getUrlEn/Decoder()", Base64.getBase64UrlEn/Decoder()
> does not feel good for me.
> 
> -Sherman
> 
> 
> On 10/23/12 6:04 AM, Alan Bateman wrote:
>> On 18/10/2012 03:10, Xueming Shen wrote:
>>> :
>>> 
>>> webrev:
>>> http://cr.openjdk.java.net/~sherman/4235519/webrev
>> I took another pass over this, focusing on the API as that is what we have to get right. Performance is important too but I think the priority has to be the API first.
>> 
>> Overall I think it is quite nice and easy to use.
>> 
>> I wonder if the Base64 class description should try to establish the terms "base64" and "base64url" so that they can be referenced from the various methods? That would avoid needing to duplicate references to the RFC 4668 and RFC 2045 in so many places.
>> 
>> I think it would also be useful if the specification indicated whether Encoders and Decoders are safe for use by concurrent threads. Also make it clear that NPE is thrown if any of the parameters are passed as null (unless otherwise specified of course).
>> 
>> I'm not sure that getUrlEncoder is the most suitable name to get a base64url encoder. The reason is that the method name makes it sound like it returns a URLEncoder or or at least an encoder for HTML forms. While more verbose, getBase64UrlEncoder is clear that it returns a base64url encoder.
>> 
>> Do you think getEncoder(int,byte[]) will be used much? If not then perhaps it should be left out for the first version (I guess part of this is getting used to providing the line separate as a byte array).
>> 
>> For the javadoc then Encoder and Decoder will need @since 1.8. They should probably cross reference each other too.
>> 
>> encode(byte[]) should be clearer that it encodes all bytes in the given array. Also I think it needs to be clear that the returned byte array is appropriately sized -- as currently worded it doesn't make it clear that they can't be extra elements at the end (odd as it might be).
>> 
>> Typo at line 215, "byter array" -> "byte array"
>> 
>> Typo at line 246, "methold" -> "method".
>> 
>> Typo at line 247, "arry" -> "array"
>> 
>> Type at line 254, "encocde" -> "encode"
>> 
>> Typo at line 277, "buffter" -> "buffer". Another one at line 702.
>> 
>> Minor comment, but I assume you should move the @SuppressWarnings("deprecation") on encodeToString to after the method comment rather than before. I see the same thing in decode(String).
>> 
>> I think encode(ByteBuffer) needs to be clear as to the position/limit/capacity of the returned buffer.
>> 
>> I'm not sure so about encode(ba, null) returning the required length, it feels odd and a bit like some of the win32 APIs. If the intention is that the caller allocates the byte[] and then calls encode again then it would be easier to just encode(ba).
>> 
>> For the decoder methods then IllegalArgumentException may be thrown mid-flight, meaning that some bytes may have been written into output buffer or array even though an exception is thrown. I think this needs to be made clear in the spec. It also makes me wonder if this is the right exception, it feels like a more specialized malformed input exception.
>> 
>> Another point about the decode methods is that they stop at the padding bytes and so this allows for bytes after the padding. I'm curious about this choice and whether you considered this as an error? I see how this influences decode(ByteBuffer,ByteBuffer) to return a boolean and I just wonder if there are other choices to consider.
>> 
>> That's mostly it for now. I didn't check in the IDE but there are lot of imports that are don't appear to be  used, perhaps left over from previous iterations?
>> 
>> -Alan
>