RFR 8025003: Base64 should be less strict with padding

Thu Nov 14 23:27:08 UTC 2013

Xueming Shen wrote on 11/14/2013 11:20 AM:
> On 11/14/2013 11:12 AM, Bill Shannon wrote:
>> Alan Bateman wrote on 11/14/2013 06:18 AM:
>>> On 13/11/2013 20:28, Xueming Shen wrote:
>>>> Yes, the plan is to see what other implementations do.
>>> I think we've run out of road on this for JDK 8. Even if we had agreement on
>>> dealing with corrupt input then there is little/no time to get feedback and do
>>> any further adjustments. Technically only showstopper API changes have been
>>> allowed since October so we have been on borrowed time anyway. Also we're coming
>>> up on RDP2 so we'd have to justify any changes as showstoppers.
>>>
>>> So what would you think about just leaving it strict for JDK 8 and then continuing
>>> the work to see how lenient support should be exposed in the API so that it can
>>> go into JDK 9 early. That would allow you to consider whether to have a means
>>> to get a Decoder that will consume all sewage or just decode up to the point
>>> where invalid chars or underflow is detected. Also it probably is a bit
>>> inconsistent to have only the decode buffer method stop (as proposed) so that could
>>> be looked at too.
>>>
>>> If you agree then there is a bit of clean-up to do with the changes for 8025003
>>> that were pushed but I think that can be justified.
>> Making it strict is fine, but right now it's half-lenient, and you need a way
>> to use/wrap the APIs to ignore the errors and provide as much data as possible.
> 
> The webrev I posted yesterday is to put the mime decoder back to "strict".
> However it keeps
> the change in decode(buffer, buffer), which leaves the position of src and dst
> at the place that
> the malformed input occurred (-1 is being returned now; an alternative is to return
> the negative value
> of the bytes written...). With the assumption that the "decoded bytes" might
> still be valuable
> for some use scenario, given decode(buffer, buffer) is supposed to be an
> "advanced" api with
> some degree of "recovery" functionality, such as the output buffer is not big
> enough...
> 
> But if the consensus is that this is kind of inconsistent compared to other decode
> variants (which
> throw away any decoded bytes if an error occurs), I'm happy to back out this
> change and go back
> to the original spec/implementation: (1) throw IAE if malformed input is detected, (2)
> reset the pos of
> src and dst buffer to their original position when the method is invoked.
> 
> http://cr.openjdk.java.net/~sherman/base64_malformed2/webrev/

I'd prefer that all variants of the API report the error in a way that allows
the users of the API to ignore the error, access the data that caused the error,
and supply replacement data if desired.

For most of the APIs, decoding as much data as possible and throwing an
exception with details about how much has been decoded and where the error
was detected would be best.  I understand that designing this and getting
it right might exceed what you can do in this release.  In that case, just
throw an exception with no details, and we can figure out what details to
provide later.  Returning a negative number is kind of a hack, and unlike
most other APIs.  Plus, if we decide we need to return two numbers (e.g.,
input and output positions), there's no way to extend it.
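
To make the suggestion concrete, here is a rough sketch of what "an exception
with details" could look like. None of this is an existing or proposed JDK API:
the names Base64DecodeException, getInputPosition, getDecoded and
decodeDetailed are all hypothetical, and the wrapper only approximates the
error position by looking for the first character outside the basic Base64
alphabet.

```java
import java.util.Base64;

class DetailedDecodeSketch {

    /** Hypothetical exception type; java.util.Base64 does not define this. */
    static class Base64DecodeException extends IllegalArgumentException {
        private final int inputPosition; // index of first offending char, or input length if truncated
        private final byte[] decoded;    // bytes decoded before the error

        Base64DecodeException(String msg, int inputPosition, byte[] decoded) {
            super(msg);
            this.inputPosition = inputPosition;
            this.decoded = decoded;
        }

        int getInputPosition() { return inputPosition; }

        byte[] getDecoded() { return decoded.clone(); }
    }

    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    /** Index of the first char outside the basic alphabet ('=' included), or src.length(). */
    static int firstIllegalIndex(String src) {
        for (int i = 0; i < src.length(); i++) {
            if (ALPHABET.indexOf(src.charAt(i)) < 0) {
                return i;
            }
        }
        return src.length();
    }

    /**
     * Hypothetical wrapper around the strict basic decoder. On failure it
     * decodes the complete 4-char groups that precede the error and reports
     * both the partial output and the approximate input position.
     */
    static byte[] decodeDetailed(String src) {
        try {
            return Base64.getDecoder().decode(src);
        } catch (IllegalArgumentException e) {
            int bad = firstIllegalIndex(src);
            int usable = (bad / 4) * 4; // whole groups made only of alphabet chars
            byte[] partial = Base64.getDecoder().decode(src.substring(0, usable));
            throw new Base64DecodeException(e.getMessage(), bad, partial);
        }
    }

    public static void main(String[] args) {
        try {
            decodeDetailed("QUJD!!!!");
        } catch (Base64DecodeException e) {
            // The caller can ignore the error, inspect it, or substitute data.
            System.out.println("error at input index " + e.getInputPosition()
                + ", decoded so far: " + new String(e.getDecoded()));
        }
    }
}
```

Because the details ride on the exception rather than on the return value, a
second or third number can be added later without changing any method
signature, which is the extensibility a negative return value lacks.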