[Base64] Codereview request for 8006315 8006530

Fri Feb 1 16:29:23 UTC 2013

On 2/1/13 1:30 AM, Chris Hegarty wrote:
> The spec clarifications look good to me, and will be much more user 
> friendly.
>
> Trivially, this is the first mention of padding in the API, "Base64 
> padding character". Should this be qualified as '='? Or maybe this is 
> just another case of deferring to "The Base64 Alphabet" in the RFC.
>
> The test seems inconsistent with the spec clarification, "The Base64 
> padding character is accepted and interpreted as the end of the 
> encoded byte data"
>
> +    checkEqual(decM.decode(encoded), src[i], "Non-base64 char is not ignored");
> +                try {
> +                    dec.decode(encoded);
> +                    throw new RuntimeException("No IAE for non-base64 char");
> +                } catch (IllegalArgumentException iae) {}
>
> Why, in the case of src input "A" would you expect the mime decoder to 
> ignore the trailing character, and not the Basic decoder?

The current spec and implementation expect everything passed in (byte array/
String/ByteBuffer) to be valid base64 data. If anything is left over
(trailing bytes) that the decoder cannot handle, it is an "error".
In the MIME case, the spec explicitly says that any non-base64 characters are
ignored. So even though they are not valid base64 bits, they should be
accepted and ignored as part of the valid base64 data, rather than treated as
"trailing" stuff (the MIME decoder already skips those bytes in the middle of
the data, so it is consistent to skip them at the end as well).
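
A minimal sketch of the distinction above, assuming java.util.Base64 behaves
as described in this thread. The class and method names are hypothetical and
not part of the webrev. "QQ==" is the base64 encoding of "A", and the "\n"
stands in for a trailing non-base64 character.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class; not part of the webrev under review.
class TrailingCharDemo {

    // "QQ==" encodes "A"; '\n' is not in the base64 alphabet.
    static final String ENCODED = "QQ==\n";

    // MIME decoder: non-base64 characters are skipped, wherever they appear.
    static String decodeMime(String s) {
        byte[] out = Base64.getMimeDecoder().decode(s);
        return new String(out, StandardCharsets.US_ASCII);
    }

    // Basic decoder: returns true if the input is rejected with an IAE.
    static boolean basicRejects(String s) {
        try {
            Base64.getDecoder().decode(s);
            return false;
        } catch (IllegalArgumentException iae) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("mime decoded: " + decodeMime(ENCODED));
        System.out.println("basic rejects: " + basicRejects(ENCODED));
    }
}
```

The same input is accepted by one decoder and rejected by the other, which
is the asymmetry the test in the webrev checks for.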

-Sherman

>
> -Chris.
>
> On 01/02/2013 07:37, Xueming Shen wrote:
>> Hi,
>>
>> This is the webrev for
>>
>> 8006530: Base64.getMimeDecoder().decode() throws exception for
>> non-base64 character after adding =
>> 8006315: Base64.Decoder decoding methods are not consistent in treating
>> non-padded data
>>
>> http://cr.openjdk.java.net/~sherman/8006315_8006530/webrev
>>
>> The change is to
>>
>> (1) explicitly specify line feed is not added to the end of mime encoded
>> output (no surprise)
>>
>> (2) mime decoder now ignores any non-base64 character after padding =
>> (same as it ignores those non-base64 characters within the data, per the
>> mime base64 spec). Convenient for use cases like a trailing \n at the end
>> of input data from a file, as suggested by real base64 use cases.
>>
>> (3) explicitly specify that the padding character at the end of the
>> base64 encoded data is optional when DECODING (the encoder always adds
>> the padding character = when necessary). The "decoding" part of the RFC
>> really does not make it a MUST, so be liberal.
>>
>> (4) update the decoder inputstream to behave the same way the other
>> decoders do, to accept AA and AAA the same as AA== and AAA=.
>>
>> Please help review.
>>
>> -Sherman
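
Points (1) and (3) in the list above can be exercised with a short sketch
like the following, assuming java.util.Base64 behaves as that list describes.
The class and helper names are hypothetical. "QQ==" and "QUI=" are the padded
encodings of "A" and "AB"; 57 input bytes fill exactly one 76-character MIME
line.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class; names are illustrative only.
class PaddingDemo {

    // Decodes with the basic decoder and returns the bytes as ASCII text.
    static String decodeBasic(String s) {
        byte[] out = Base64.getDecoder().decode(s);
        return new String(out, StandardCharsets.US_ASCII);
    }

    // Encodes n zero bytes with the MIME encoder (76-char lines, CRLF).
    static String encodeMimeZeros(int n) {
        return Base64.getMimeEncoder().encodeToString(new byte[n]);
    }

    public static void main(String[] args) {
        // (3) padding is optional when decoding
        System.out.println(decodeBasic("QQ").equals(decodeBasic("QQ==")));
        System.out.println(decodeBasic("QUI").equals(decodeBasic("QUI=")));

        // (1) no line separator is appended to the end of MIME output
        System.out.println(encodeMimeZeros(57).endsWith("\r\n"));
        System.out.println(encodeMimeZeros(58).endsWith("\r\n"));
    }
}
```

With 58 bytes the output spills onto a second line, so a separator appears
between the lines but still not after the last one.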