RFR(S): 8242848: Improve performance of InflaterOutputStream.write()

Mon Apr 20 18:46:46 UTC 2020

On Mon, Apr 20, 2020 at 5:41 AM Vyom Tiwari <vyommani at gmail.com> wrote:
>
> Hi Volker,
>
> Latest changes looks good to me. I notices the copyright in new test Streams.java is "
>
> Copyright (c) 2020, Amazon and/or its affiliates." instead is regular Oracle copyright.
>

Hi Vyom,

thanks a lot for your review. The non-Oracle copyright is actually
fine for new files which are not derived from other,
Oracle-copyrighted files. Other examples are:

http://hg.openjdk.java.net/jdk/jdk/file/f499459eda7a/test/hotspot/jtreg/serviceability/jvmti/8036666/
http://hg.openjdk.java.net/jdk/jdk/file/f499459eda7a/test/jdk/java/text/Format/DateFormat/Bug8235699.java

Best regards,
Volker

> Thanks,
> Vyom
>
> On Fri, Apr 17, 2020 at 4:23 PM Volker Simonis <volker.simonis at gmail.com> wrote:
>>
>> Thanks everybody for looking at this change!
>>
>> Please find an updated webrev (with a new test and micro-benchmark)
>> and my answers to your comments below:
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/2020/8242848.01/
>>
>> On Thu, Apr 16, 2020 at 6:40 AM Vyom Tiwari <vyommani at gmail.com> wrote:
>> > Thanks for doing this, i think the below code change is not required .
>> >
>> > Please do let me know if i am not reading it correctly.
>> >
>> > if (inf.finished() || (len == 0)/* no more input */) {
>> >
>> > If you check the native code "inf.finished() will be true only of the corresponding native call inflate(strm, Z_PARTIAL_FLUSH) returns "Z_STREAM_END".
>> >
>> > After your code change  write may return even if finished() is not true.
>>
>> Yes, that's true, but that's what we must do if there's no more input
>> available. Before my change this break on "len < 1" was in the "if
>> (inf.needsInput())" branch.
>>
>> On Thu, Apr 16, 2020 at 8:22 AM Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
>> >  252                 // Check the decompressor
>> >  253                 if (inf.finished() || (len == 0)/* no more input */) {
>> >  254                     break;
>> >  255                 }
>> >
>> > Not sure but I think this is wrong because now you bypass the followup handling of inf.needsDirectory.
>> >
>> > Inflater.inflate() returns 0 if either needsInput or needsDirectory. We have to ignore needsInput since we have no input anymore, but needsDirectory has to be handled, no?
>>
>> You're absolutely right Thomas. Thanks for catching this! I've moved
>> the check for "needsDictionary" in front of the "finished() || len ==
>> 0" check.
>>
>> Unfortunately there is not very good test coverage for zip with preset
>> dictionaries (jtreg and submit repo passed without problems). So I
>> added a new test for this use case to "
>> test/jdk/java/util/zip/DeflateIn_InflateOut.java".
>>
>> On Thu, Apr 16, 2020 at 8:48 AM Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
>> >
>> > As for increasing the buffer size, it makes sense but IMHO needs a CSR and a release note.
>>
>> I don't think so. This is an internal, implementation specific setting
>> which has never been exposed or documented before so I don't see why
>> we should document it now. But let's discuss this separately in the
>> corresponding JBS issue (see below).
>>
>> On Thu, Apr 16, 2020 at 1:18 PM Claes Redestad
>> <claes.redestad at oracle.com> wrote:
>> >
>> > Hi Volker,
>> >
>> > On 2020-04-15 19:48, Volker Simonis wrote:
>> > > While doing this change, I've also realized that all the streams in
>> > > java.util.zip (i.e. DeflaterInputStream, GZIPInputStream,
>> > > GZIPOutputStream, InflaterInputStream, DeflaterOutputStream) use an
>> > > internal byte buffer of 512 bytes by default. Looking at the benchmark
>> > > attached to JDK-8242864, I think that increasing this default to a
>> > > bigger size (e.g. 4096 bytes) will considerably speed up (up to 50%)
>> > > read and write operations on these streams when they are created with
>> > > the default buffer size. I think the value "512" is a legacy of old
>> > > times when memory was more precious:)  so  I've opened "JDK-8242864:
>> > > Increase the default, internal buffer size of the Streams in
>> > > java.util.zip" to track that as as separate issue. Do you think it
>> > > makes sense to increase that default value?
>> >
>> > Seems reasonable. 8192 seems to be the buffer size we've been converging
>> > on elsewhere (see InputStream, BufferedInputStream, Files, ..). I also
>>
>> That seems reasonable. Alan commented on the JBS issue so we can
>> continue the discussion there.
>>
>> > found an instance of 8096, which is either a typo or a clever
>> > optimization to keep the array + array object header fit snugly within
>> > 8Kb. You chose. :-)
>> >
>>
>> I like how you try to be positive :)
>>
>> > >
>> > > Thank you and best regards,
>> > > Volker
>> > >
>> > > PS: do you think it makes sense to contribute the benchmark attached
>> > > to JDK-8242864 to the code-tools/mh-jdk-microbenchmarks [1] project?
>> > >
>> > > [1]http://openjdk.java.net/projects/code-tools/jmh-jdk-microbenchmarks/
>> >
>> > I'd definitely welcome the micro as part of the patch under
>> > test/micro/org/openjdk/bench/java/util/zip - additionally contributing
>>
>> I knew that "jmh-jdk-microbenchmarks" has been copied to the jdk repo
>> but somehow I did found "jmh-jdk-microbenchmarks" before looking in
>> the obvious place :)
>>
>> So I've added the benchmark to the patch now. There's one thing I'm
>> not sure however. The benchmark requires a "big" (several 100k) with
>> good compression ratio (e.g. a large text file). I've decided to use a
>> big Java source file from "src.zip" but I'm not sure if "src.zip" is
>> always available in the jdk images which are used to run the
>> microbenchmarks. Do you think the test it is fine this way or do you
>> have a better idea? I saw that "ZipFind" uses "microbenchmarks.jar"
>> (i.e. the container of the test itself) but that file is already
>> compressed so the compression rate won't be that good.
>>
>> Another thing I couldn't figure out is a good way to skip the
>> benchmark when I realize that I can't load the expected file in the
>> "@Setup" method. Do you now anything better than just throwing an
>> exception?
>>
>> Thank you and best regards,
>> Volker
>>
>> > to jmh-jdk-microbenchmarks could enable you to test the micro on 8 or
>> > 11.
>> >
>> > /Claes
>> >
>
>
>
> --
> Thanks,
> Vyom