RFR(S): 8242848: Improve performance of InflaterOutputStream.write()
Vyom Tiwari
vyommani at gmail.com
Thu Apr 16 04:40:26 UTC 2020
Hi Volker,
Thanks for doing this, i think the below code change is not required .
Please do let me know if i am not reading it correctly.
if (inf.finished() || (len == 0)/* no more input */) {
If you check the native code "inf.finished() will be true only of the
corresponding native call inflate(strm, Z_PARTIAL_FLUSH) returns
"Z_STREAM_END".
After your code change write may return even if finished() is not true.
Thanks,
Vyom
On Wed, Apr 15, 2020 at 11:19 PM Volker Simonis <volker.simonis at gmail.com>
wrote:
> Hi,
>
> can I please have a review for the following simple performance
> enhancement:
>
> http://cr.openjdk.java.net/~simonis/webrevs/2020/8242848/
> https://bugs.openjdk.java.net/browse/JDK-8242864
>
> InflaterOutputStream has an internal buffer which is used for the
> inflated data. This buffer can be configured to a custom size (the
> default is 512 bytes) by using the specialized
> "InflaterOutputStream(OutputStream out, Inflater infl, int bufLen)"
> constructor. A bigger buffer, results in fewer native calls to
> "write()" on the associated OutputStream and better overall
> performance.
>
> Unfortunately "InflaterOutputStream.write(byte[] b, int off, int len)"
> unnecessarily chunks its own byte input buffer "b" into pieces of
> maximum 512 bytes before feeding them to the underlying Inflater. This
> unnecessarily increases the number of native calls to
> Inflater.inflate() and limits the benefit of the configurable internal
> buffer to a size of ~(512 * compression-ratio) bytes.
>
> Instead, we could easily pass the complete input buffer "b" as input
> to the underlying Inflater. This simplifies the code and results in up
> to ~25% performance improvements for bigger internal buffers (see the
> JMH benchmark attached to the bug).
>
> Regarding the implementation, I initially wanted to completely remove
> the "for (;;)" loop after removing the chunking of the input buffer.
> But I think it is still necessary, because an InflaterOutputStream can
> be created with a custom Inflater which already has some input data.
> In that case, the Inflater will first consume its initial input data
> in the first "for" loop iteration while "write()"s input buffer "b"
> will only be consumed in the second "for" loop iteration.
>
> While doing this change, I've also realized that all the streams in
> java.util.zip (i.e. DeflaterInputStream, GZIPInputStream,
> GZIPOutputStream, InflaterInputStream, DeflaterOutputStream) use an
> internal byte buffer of 512 bytes by default. Looking at the benchmark
> attached to JDK-8242864, I think that increasing this default to a
> bigger size (e.g. 4096 bytes) will considerably speed up (up to 50%)
> read and write operations on these streams when they are created with
> the default buffer size. I think the value "512" is a legacy of old
> times when memory was more precious :) so I've opened "JDK-8242864:
> Increase the default, internal buffer size of the Streams in
> java.util.zip" to track that as as separate issue. Do you think it
> makes sense to increase that default value?
>
> Thank you and best regards,
> Volker
>
> PS: do you think it makes sense to contribute the benchmark attached
> to JDK-8242864 to the code-tools/mh-jdk-microbenchmarks [1] project?
>
> [1] http://openjdk.java.net/projects/code-tools/jmh-jdk-microbenchmarks/
>
--
Thanks,
Vyom
More information about the core-libs-dev
mailing list