RFR JDK-8164278: java.util.Base64.EncOutputStream/DecInputStream is slower than corresponding version in javax.mail package
Xueming Shen
xueming.shen at oracle.com
Thu Feb 8 03:32:34 UTC 2018
Hi Roger,
Given the concern of the possible incompatible behavior change of over
reading bytes
from the underlying stream. I decided to give up last proposed changes
for DecInputStream
for now. With some "minor" cleanup and tuning I still have about 10%+
improvement with
various input size sampling. Let's address the decoder stream in
separate rfe later, if desired.
The updated webrev and the jmh results are at
http://cr.openjdk.java.net/~sherman/8164278/webrev
http://cr.openjdk.java.net/~sherman/8164278/base64.bm
Last version of webrev and corresponding jmh result can be found at
http://cr.openjdk.java.net/~sherman/8164278/webrev.02
http://cr.openjdk.java.net/~sherman/8164278/base64.bm.old
Thanks,
Sherman
On 2/5/18, 6:00 PM, Xueming Shen wrote:
> Hi,
>
> Please help review the change for JDK-8164278.
>
> issue: https://bugs.openjdk.java.net/browse/JDK-8164278
> webrev: http://cr.openjdk.java.net/~sherman/8164278/webrev
>
> jmh.src: http://cr.openjdk.java.net/~sherman/8164278/Base64BM.java
> jmh.result: http://cr.openjdk.java.net/~sherman/8164278/base64.bm
>
> Base64.Decoder.decode0:
> Adopted the "similar" optimization approach we took in
> Base64.Encoder.encode0()
> to add a "fast path" to decode a block of 4-byte units together
> (current implementation
> decodes one single byte per while/loop. The jmh benchmark result
> indicates a big speed
> boost (those decodeArray/Mime/Url results, from 30% to 2 times
> faster, depends on
> input size).
>
> Base64.Encoder.encode0()
> It appears encode0() was fully optimized in 1.8. Can't get it
> faster :-) Tried to use
> Unsafe.getLong/putLong instead of byte by byte access. But it
> appears the 8-byte
> "vectorization" does not bring us enough speed up, the performance
> is the same as the
> current one. See encode00() at
> http://cr.openjdk.java.net/~sherman/8164278/webrev.00
>
> Base64.Encoder.wrap(OutputStream)/EncOutputStream.write():
> If my memory serves me right, the current implementation was under
> the assumption that
> the underlying output stream probably is buffered (if invoker
> cares). It would be a redundant
> if EncOutputStream buffers bytes again. It appears this is a wrong
> assumption. It is much
> slower to write 4 bytes separately, compared to bundle them
> together in a byte[4] and write
> into underlying, even the underlying output stream is a
> ByteArrayOutputStream.
> Again, the proposed change is to add a fast path loop, as we do in
> encode0(), to decode/
> write a block of 3-byte->4-byte unites. It appears this fast loop
> can help the compiler to
> optimize away some boundary checks, therefor is much faster.
> The jmh result Base64BM.encodeOS suggests the new implementation
> is almost 4 times faster
> and is almost the same as java.mail's stream encoder.
>
> Base64.Decoder.wrap(InputStream)/DecInputStream.read():
> Same as the approach we take for decode0(), to add a fast path
> decode block of 4-byte unites
> together.
> The jmh result Base64BM.decodeOS (the name probably should be
> decodeIS, for InputStream,
> but anyway...) shows the proposed one is 4 times faster than the
> existing impl and double
> the java.mail (Base64BM.decodeOS_javamail) implementation.
>
> However, there is a side-effect of adding a buffering mechanism
> into DecInputStream. The
> current implementation read bytes from the underlying stream one
> by one, it never reads
> more bytes than it needs, which means it should/is supposed to
> just stop at the last byte
> that it needs to decode, when there is "=" present in the stream.
> With buffering, it's possible more bytes (after "=", which
> indicates "end of base64 stream") might
> be read/consumed in and buffered. A concern? if this is indeed a
> concern, the only alternative
> might be to add a separate method to support this
> "faster-buffered-decoder"?
>
> Thanks,
> Sherman
>
>
>
More information about the core-libs-dev
mailing list