Review/comment needed for the new public java.util.Base64 class

Fri Oct 12 20:12:57 UTC 2012

Hi,

It appears to be possible to do something like

boolean de/encode(ByteBuffer src, ByteBuffer dst);

which returns true if all remaining bytes in src have been en/decoded, or
false if dst is not big enough for all output bytes. In either case
src.position() is advanced to the position of the next un-en/decoded byte,
and dst.position() is updated accordingly as well.

This would avoid having the en/decoder hold internal state.
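
A minimal sketch of how the encode side of such a method might behave, for
illustration only. The class and helper names (StatelessBase64Sketch,
encodeInChunks) are hypothetical and not part of the proposed API. It
assumes src holds all of the remaining input, so a trailing 1- or 2-byte
group is padded, and it only stops on a 3-byte boundary when dst is full,
which is why no state needs to be carried between calls.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class StatelessBase64Sketch {
    private static final byte[] ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
            .getBytes(StandardCharsets.US_ASCII);

    /**
     * Hypothetical stateless encode: returns true if every remaining byte
     * of src was encoded, false if dst ran out of room first. On return,
     * src.position() is at the next un-encoded byte.
     */
    static boolean encode(ByteBuffer src, ByteBuffer dst) {
        while (src.hasRemaining()) {
            if (dst.remaining() < 4) {
                return false; // dst is full; caller drains it and calls again
            }
            // Take up to 3 input bytes; missing bytes count as zero bits.
            int n = Math.min(3, src.remaining());
            int b0 = src.get() & 0xff;
            int b1 = (n > 1) ? (src.get() & 0xff) : 0;
            int b2 = (n > 2) ? (src.get() & 0xff) : 0;
            int bits = (b0 << 16) | (b1 << 8) | b2;
            dst.put(ALPHABET[(bits >>> 18) & 0x3f]);
            dst.put(ALPHABET[(bits >>> 12) & 0x3f]);
            // Only the final group can be short, so padding appears only there.
            dst.put((n > 1) ? ALPHABET[(bits >>> 6) & 0x3f] : (byte) '=');
            dst.put((n > 2) ? ALPHABET[bits & 0x3f] : (byte) '=');
        }
        return true;
    }

    /** Caller-side loop: reuse one small dst until encode() reports completion. */
    static String encodeInChunks(byte[] input, int dstCapacity) {
        if (dstCapacity < 4) {
            throw new IllegalArgumentException("dst must hold at least 4 bytes");
        }
        ByteBuffer src = ByteBuffer.wrap(input);
        ByteBuffer dst = ByteBuffer.allocate(dstCapacity);
        StringBuilder sb = new StringBuilder();
        boolean done;
        do {
            done = encode(src, dst);
            dst.flip();
            while (dst.hasRemaining()) {
                sb.append((char) dst.get());
            }
            dst.clear();
        } while (!done);
        return sb.toString();
    }
}
```

The trade-off in this sketch is that the caller cannot feed partial input:
anything left in src at the end is treated as the final group. Supporting
refills of src would need either an endOfInput flag or internal state.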

-Sherman

On 10/12/2012 12:47 PM, Ariel Weisberg wrote:
> Hi,
>
> Thanks for doing this BTW.
>
> I think that including ByteBuffer API even if it isn't as efficient as
> raw byte arrays is better than not having it in the API at all. If that
> means allocating a byte array for the output and then doing a put on the
> ByteBuffer that is fine.
>
> Down the line if someone has a particularly powerful itch to scratch WRT
> performance they can add more code to the library to make it more
> efficient at handling them and then everyone will benefit or they can do
> their own implementation.
>
> Thanks,
> Ariel
>
>
> On Fri, Oct 12, 2012, at 02:56 PM, Xueming Shen wrote:
>> Hi,
>>
>> That is exactly the reason I was trying to skip en/decode(ByteBuffer in,
>> ByteBuffer out) for now. I'm struggling with/can't make up my mind on
>> whether or not the en/decoder should have internal state, like the
>> charset en/decoder. It appears the API is being pushed in that direction
>> though :-)
>>
>> -Sherman
>>
>> On 10/12/2012 11:39 AM, Michael Schierl wrote:
>>> Hello,
>>>
>>> (sorry if the threading is broken, but I was not subscribed to the list
>>> and only found the discussion on Twitter and read it in the mailing list
>>> archive)
>>>
>>> Ariel Weisberg wrote on Thu Oct 11 11:30:56 PDT 2012:
>>>> I know that ByteBuffers are a pain, but I did notice that you can't
>>>> specify a source/dest pair when using ByteBuffers and that ByteBuffers
>>>> without arrays have to be copied. I don't see a simple safe way to
>>>> normalize access to them the way you can if everything is a byte array.
>>> Agreed. One of the advantages of using byte buffers is reducing
>>> allocations, resulting in fewer garbage collections.
>>>
>>> In addition, in this implementation the ByteBuffers have to contain the
>>> full data.
>>>
>>> What I like about most ByteBuffer APIs is that I can pass in a
>>> ByteBuffer with incomplete data or maybe an output ByteBuffer that is
>>> too small to hold the complete result, and it will just process as much
>>> as it can, and leave the rest for the next round (which should work well
>>> for Base64, too, as it always processes chunks of 3 or 4 bytes).
>>>
>>> So, a useful ByteBuffer API in my opinion needs a method like
>>>
>>> public boolean encode(ByteBuffer in, ByteBuffer out,
>>>     boolean endOfInput);
>>>
>>> public boolean decode(ByteBuffer in, ByteBuffer out,
>>>     boolean endOfInput);
>>>
>>> (similar to CharsetEncoder#encode) that can process partial input and
>>> will return true if all processable input has been processed (i.e. in
>>> has to be refilled) or false if some input could not be processed
>>> (i.e. out has to be flushed).
>>>
>>> Users have to call it again and again until they call it with
>>> endOfInput=true and get true back. (Using an enum as the result, similar
>>> to CoderResult#UNDERFLOW and CoderResult#OVERFLOW, might be another
>>> option if the boolean results are too cryptic.)
>>>
>>> Having a ByteBuffer Base64 API might be useful (although I'm not sure
>>> yet if I will ever need it), but as it is now, it is mostly useless for
>>> serious ByteBuffer usage: if I have to split and copy the data
>>> manually anyway, I might as well use the array APIs.
>>>
>>>
>>> Just my 0.02 EUR,
>>>
>>>
>>> Michael