Review request for 8035807: Convert use of sun.misc.BASE64Encoder/Decoder with java.util.Base64

Fri Mar 21 09:23:40 UTC 2014

You’re right, but we’ve never received a report of any charset interoperability issues.
Probably such a scenario has never been encountered by customers.
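
To make the concern concrete, here is a minimal sketch (not taken from the webrev; the class and method names are made up for illustration). It compares the low-order-8-bits conversion performed by the deprecated String.getBytes(int, int, byte[], int) with String.getBytes(Charset), using UTF-16BE as a readily available stand-in for a non-ASCII-compatible default charset, and it assumes the basic java.util.Base64 decoder is used:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class CharsetDecodeSketch {

    // Same conversion as the deprecated String.getBytes(int, int, byte[], int):
    // keep only the low-order 8 bits of each char.
    static byte[] lowOrderBytes(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = (byte) s.charAt(i);
        }
        return out;
    }

    // True if converting the Base64 text with the given charset yields the
    // same bytes as the low-order-byte conversion above.
    static boolean sameBytesAs(String base64Text, Charset cs) {
        return Arrays.equals(lowOrderBytes(base64Text), base64Text.getBytes(cs));
    }

    // True if the basic decoder accepts the text once converted with cs.
    static boolean basicDecoderAccepts(String base64Text, Charset cs) {
        try {
            Base64.getDecoder().decode(base64Text.getBytes(cs));
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = Base64.getEncoder()
                .encodeToString("hello".getBytes(StandardCharsets.US_ASCII));
        System.out.println(sameBytesAs(text, StandardCharsets.US_ASCII));
        System.out.println(sameBytesAs(text, StandardCharsets.UTF_16BE));
        System.out.println(basicDecoderAccepts(text, StandardCharsets.US_ASCII));
        System.out.println(basicDecoderAccepts(text, StandardCharsets.UTF_16BE));
    }
}
```

With an ASCII-compatible charset the two conversions agree, so the behaviour only diverges on a platform whose default charset is something like UTF-16 or EBCDIC.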

On 21 Mar 2014, at 05:54, Xueming Shen <xueming.shen at oracle.com> wrote:

> Obj.java:#482
>    It appears sun.misc.BASE64Decoder.decodeBuffer(String) uses String's deprecated
>    String.getBytes(int srcBegin, int srcEnd, byte[] dst, int dstBegin). The proposed change
>    now uses the JVM's default charset. It might trigger incompatible behavior if the default
>    charset is not an ASCII-compatible charset.  But if the "Java object in LDAP was encoded
>    with the platform default charset" (as the new comment suggests), the old implementation
>    actually did not work on platforms whose default encoding is not ASCII-compatible, such
>    as IBM EBCDIC.
> 
> -Sherman
> 
> On 3/20/14 3:48 PM, Mandy Chung wrote:
>> On 3/19/14 12:28 PM, Xueming Shen wrote:
>>> On 03/19/2014 11:37 AM, Mandy Chung wrote:
>>>> https://bugs.openjdk.java.net/browse/JDK-8035807
>>>> 
>>>> Webrev at:
>>>> http://cr.openjdk.java.net/~mchung/jdk9/webrevs/8035807/webrev.00/
>>>> 
>>>> This patch converts the last 2 references to sun.misc.BASE64Encoder/Decoder in the jdk repo to java.util.Base64. We should also update the tests, and I have filed JDK-8037873 for that.
>>>> 
>>>> Thanks
>>>> Mandy
>>> 
>>> The sun.misc.BASE64En/Decoder is MIME type, so it outputs \r\n every 76
>>> characters during encoding, and ignores/skips \r or \n when decoding. The new
>>> Base64.getEncoder/Decoder() returns the "basic" Base64 coder, which never
>>> inserts line separators in its output, and throws an exception for any
>>> non-base64-alphabet character, including \r and \n.
>>> 
>>> The only disadvantage/incompatibility (j.u.Base64.getMimeDecoder() vs
>>> sun.misc.BASE64Decoder) of switching to the j.u.Base64 MIME type en/decoder
>>> is that the Base64 MIME decoder ignores/skips any non-base64-alphabet
>>> character (including \r and \n), while sun.misc.BASE64Decoder appears to simply
>>> use the init value "-1" for any non-base64-alphabet character when decoding.
>>> 
>>> I'm not familiar with the use scenario of ldap's Obj class, so I'm not sure if
>>> it matters (whether it ever outputs/inputs > 76 characters of data and, even if
>>> it does, whether the difference matters).
>>> 
>>> Btw, except for getMimeEncoder(int ...), all other Base64.getXXXEn/Decoder()
>>> methods return a singleton, so the de/encoder cache might not be necessary.
>> 
>> Thanks Sherman.  Vinnie confirms that it should retain the current behavior, as there could be long-lived Java objects in LDAP encoded with JDK 8, for example, and then retrieved with JDK 9.
>> 
>> Here is the updated webrev:
>> http://cr.openjdk.java.net/~mchung/jdk9/webrevs/8035807/webrev.01/
>> 
>> Thanks
>> Mandy
>
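
For reference, the basic vs. MIME difference described in the quoted thread above can be sketched as follows. This is only an illustration against the public java.util.Base64 API, not code from either webrev; the class and method names are hypothetical, and the 100-byte payload is an arbitrary choice that is long enough to exceed one 76-character line:

```java
import java.util.Arrays;
import java.util.Base64;

public class MimeVsBasicSketch {

    // True if the basic decoder accepts the text; it rejects any character
    // outside the Base64 alphabet, including \r and \n.
    static boolean basicDecoderAccepts(String text) {
        try {
            Base64.getDecoder().decode(text);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // True if data survives a MIME encode/decode round trip.
    static boolean mimeRoundTrips(byte[] data) {
        String text = Base64.getMimeEncoder().encodeToString(data);
        return Arrays.equals(data, Base64.getMimeDecoder().decode(text));
    }

    public static void main(String[] args) {
        byte[] data = new byte[100];
        String basic = Base64.getEncoder().encodeToString(data);
        String mime = Base64.getMimeEncoder().encodeToString(data);

        // The basic encoder emits one unbroken line; the MIME encoder
        // inserts \r\n after every 76 characters.
        System.out.println("basic has CRLF: " + basic.contains("\r\n"));
        System.out.println("mime has CRLF:  " + mime.contains("\r\n"));

        // MIME output is rejected by the basic decoder but round-trips
        // through the MIME decoder.
        System.out.println("basic decoder accepts mime text: " + basicDecoderAccepts(mime));
        System.out.println("mime round trip ok: " + mimeRoundTrips(data));
    }
}
```

Data written by the old encoder with line breaks would therefore need the MIME decoder to be read back, which is consistent with retaining the current behavior.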