RFR 8025003: Base64 should be less strict with padding

Wed Nov 13 19:37:46 UTC 2013

Xueming Shen wrote on 11/13/13 11:11:
> On 11/13/2013 10:41 AM, Bill Shannon wrote:
>>
>>
>>> The other thought is the charset API where a charset decoder can be configured
>>> to ignore, replace or report malformed or unmappable input. Having support
>>> for all these actions is important for charset encoding/decoding but seems way
>>> too much for Base64 where I think the API should be simple for the majority of
>>> usages.
>> We started this with a request for a strict/lenient option.  That may still be
>> simpler than figuring out how to do strict decoding and report the error in a
>> way that users of the API can ignore the error and provide as much data as
>> possible.
>>
>>> In any case, it's not clear what we can do this late in the schedule. It might
>>> be prudent to just fix the MIME decoder to throw IAE consistently and re-visit
>>> the API support for a lenient decoder in JDK 9.
>> When we started this conversation there was plenty of time to fix this.  :-(
> 
> The issue here is that we disagree on the specification of what lenient
> should be and what the API should look like.
>
> Here is the proposed change to undo the "lenient padding handling for mime"
> change we did earlier, to leave the option open for a complete "lenient
> base64" in a future release, when we have a consensus.

What other implementors of base64 MIME decoding software have you consulted,
or do you intend to consult in the future?

What experiments have you done with other base64 MIME decoding software or
applications to determine how they handle these cases?

I'm trying to determine how we're going to reach consensus in the future.

My base64 MIME decoding software has evolved over time based on customer
requirements.  I'm trying to give you the benefit of that experience so
that you don't need to waste years getting to the same point I got to.
I started in a similar place as you, believing that applications would
want to know about improperly encoded data.  I learned that many do not,
and that most end-user applications simply want to be as lenient as possible
to provide the best data possible to the user.
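
To make "as lenient as possible" concrete, here is a minimal sketch of one
possible lenient policy. It is not the JDK's behavior and not the decoder
described above. The class name LenientBase64Sketch and the method
decodeLenient are hypothetical, and the policy itself is an assumption: skip
any character outside the base64 alphabet, treat the first '=' as the end of
the data, and drop a dangling final character that cannot form a whole byte.
It relies only on java.util.Base64.getDecoder(), which accepts input without
padding.

```java
import java.util.Base64;

// Hypothetical helper illustrating one possible "lenient" decoding policy.
final class LenientBase64Sketch {
    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Returns as much decoded data as possible and never throws on bad padding.
    static byte[] decodeLenient(String input) {
        StringBuilder kept = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '=') {
                break; // assumption: the first pad character ends the data
            }
            if (ALPHABET.indexOf(c) >= 0) {
                kept.append(c); // keep alphabet characters, skip everything else
            }
        }
        // A single leftover character carries only 6 bits, not a full byte,
        // so it is dropped rather than reported as an error.
        if (kept.length() % 4 == 1) {
            kept.setLength(kept.length() - 1);
        }
        // The basic decoder does not require trailing '=' padding.
        return Base64.getDecoder().decode(kept.toString());
    }
}
```

With this policy, "SGVsbG8=", "SGVsbG8" and "SGVsbG8==" all decode to the same
five bytes, and truncated input yields whatever whole bytes were present. A
strict decoder would instead reject the missing or extra padding, which is the
trade-off under discussion.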