RFR: 8322256: Define and document GZIPInputStream concatenated stream semantics
Eirik Bjørsnøs
eirbjo at openjdk.org
Fri Aug 30 10:31:48 UTC 2024
On Fri, 30 Aug 2024 07:27:11 GMT, Eirik Bjørsnøs <eirbjo at openjdk.org> wrote:
> Please review this PR, which picks up on the excellent work done by @archiecobbs in #18385
>
> The proposed changes aim to solve two issues with the current `java.util.zip.GZIPInputStream`:
>
> * The class parses multiple concatenated GZIP files as a single stream. This behavior is not documented in the API specification.
> * Any additional bytes following a trailer which do not form a valid header are discarded and the stream behaves as if the end of stream has been reached. This behavior is not documented in the API specification.
>
> Testing:
>
> * A new test `GZIPInputStreamConcat` verifies the behaviors being specified in this PR.
> * A new test `GZIPInputStreamGzipCommand` verifies decompression of various GZIP files created using the `gzip` command.
@LanceAndersen @jaikiran
I have updated the API documentation in this PR, inspired by the following comment from @jaikiran in Archie's PR:
https://github.com/openjdk/jdk/pull/18385#issuecomment-2265378324
I aimed to keep this at a high level, avoiding any details of the GZIP file format and the parsing logic involved in the implementation:
* <p>
* The InputStream passed to the constructor of this class may represent a
* single GZIP file or multiple consecutive GZIP files. When the end of a
* GZIP file is immediately followed by a new GZIP file, this class continues
* to decode compressed data into a single, concatenated stream of uncompressed
* data. Otherwise, any additional trailing bytes following a GZIP file are
* discarded as if the end of stream is reached.
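To illustrate the two behaviors the proposed text describes, here is a minimal sketch using only public `java.util.zip` and `java.io` APIs. The class `GzipConcatSketch` and its helpers `gzip`, `concat` and `gunzip` are hypothetical names made up for this example; they are not part of the PR or its tests, and the trailing bytes used are arbitrary.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration class, not part of the PR.
class GzipConcatSketch {

    // Compress a string into one complete GZIP file (header, data, trailer).
    static byte[] gzip(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Append b directly after a, with nothing in between.
    static byte[] concat(byte[] a, byte[] b) {
        byte[] result = new byte[a.length + b.length];
        System.arraycopy(a, 0, result, 0, a.length);
        System.arraycopy(b, 0, result, a.length, b.length);
        return result;
    }

    // Read everything GZIPInputStream yields before it reports end of stream.
    static String gunzip(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Case 1: a GZIP file immediately followed by another GZIP file.
        byte[] twoFiles = concat(gzip("Hello"), gzip("World"));
        System.out.println(gunzip(twoFiles));

        // Case 2: a GZIP file followed by bytes that do not form a GZIP header.
        byte[] withTrailingBytes = concat(gzip("Hello"), new byte[] {1, 2, 3});
        System.out.println(gunzip(withTrailingBytes));
    }
}
```

With the current implementation I would expect the first case to yield the uncompressed data of both files as one stream, and the second to yield only the data of the first file, with the three extra bytes silently dropped.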
What do you think?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/20787#issuecomment-2320776918